By John Thompson.
The education sector is plagued by “astroturf” think tanks that issue “reports” that repeatedly conclude that (surprise!) reform is working. To do so, pro-reform scholars cherry-pick data to produce attractive, multi-colored graphics that show an upward trajectory, or at least seem to show growth to readers who don’t carefully study the charts. The worst example may be the “meteor” theory of school improvement, which attributes growth from the late-1990s boom years to the No Child Left Behind (NCLB) Act of 2001. Yes! They claim that improved scores on tests taken years before NCLB were caused by the law. This post hoc explanation rests on the claim that the 1990s gains were due to “consequential accountability” that was not dissimilar to NCLB. But reformers can’t even agree on which states supposedly had consequential accountability, or even on what it supposedly was.
The new report from the Urban Institute on increased student performance in the District of Columbia, by Kristin Blagg and Matthew Chingos, is just as brazen an act of misrepresentation. Blagg and Chingos acknowledge that “the proportion of white and Hispanic students in DC has roughly doubled, while the proportion of black students has declined.” Their analysis of NAEP scores from 2005 to 2013 indicates that those demographic changes account for only four to six points of the increase over that eight-year span. They thus minimize the effects of D.C.’s gentrification on test score growth. But hold on there! Doesn’t gentrification include more than a shift in children’s skin color?
Blagg and Chingos do not adjust for “any measures of family income!” And, that is not even the most intellectually dishonest part of their spin!
Any education scholar would understand why policy makers would be interested in whether D.C.’s rising scores were due to gentrification or to improvements in the classroom. D.C. became the epicenter of the clash over test-driven, competition-driven reform when Michelle Rhee took charge of the school system. Her lavishly funded, accountability-driven reforms were continued by Kaya Henderson. Headlines such as “Does Gentrification Explain Rising Test Scores in D.C.?” are clearly drafted to imply that the Rhee/Henderson reforms worked. The Urban Institute’s graph is especially effective in promoting Rheeism when presented, without much explanation, like this. So, it’s not hard to figure out why Blagg and Chingos borrowed the NCLB “meteor” tactic of including gains from the years before Rhee was hired.
Even conservative reformer Matthew Ladner once acknowledged “that the 2011 NAEP would be the first real book on Michelle Rhee’s tenure running DCPS. The 2009 NAEP was a little early, and the 2013 numbers and those going forward will be owned increasingly by those in charge after Rhee, for better or worse.” Since Rhee was hired in 2007, the 2007 NAEP scores should be the baseline. And that raises the question of what the Blagg and Chingos methodology (inadequate as it is) would reveal if it had begun with a proper baseline and had placed the changes in historical perspective.
Blagg and Chingos write that changes in fourth-grade math demographics would predict a four-point increase, “but scores increased 17 points.” They include a chart showing an equally dramatic finding for eighth-grade math, but acknowledge that reading gains not attributable to demographics would be smaller.
Even with fourth-grade math, the choice of 2007 as a baseline would have made a significant difference in their study. In the first place, subtracting the three-point increase that occurred before Rhee arrived would reduce the gains by about 17%. I don’t claim to have Blagg and Chingos’s access to the database or their expertise, so my account should be taken as quick and dirty, but it is an analysis that puts the six-year gains in perspective. During the seven-year economic boom between 1996 and 2003, D.C.’s fourth-grade math scores increased by 18 points. So, before NCLB, and before philanthropists and the federal government poured incomprehensible amounts of money into their reforms, scores were increasing faster than during the Rhee/Henderson era.
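For readers who want to check that arithmetic, here is a minimal back-of-the-envelope sketch in Python. It uses only the point totals quoted in this post, not a fresh pull from the NAEP database, so treat it as illustrative:

```python
# Back-of-the-envelope check of the baseline argument, using only the
# D.C. fourth-grade math NAEP figures quoted in this post.

gain_2005_2013 = 17   # points: the Urban Institute's 2005-2013 window
gain_2005_2007 = 3    # points gained before Rhee was hired in 2007
gain_1996_2003 = 18   # points gained during the pre-NCLB boom years

# Re-baselining to 2007 strips out the pre-Rhee gain:
gain_2007_2013 = gain_2005_2013 - gain_2005_2007       # 14 points
reduction_pct = 100 * gain_2005_2007 / gain_2005_2013  # ~17.6%

print(f"2007-2013 gain: {gain_2007_2013} points")
print(f"Reduction from re-baselining: {reduction_pct:.1f}%")
print(f"1996-2003 boom-era gain: {gain_1996_2003} points")
```

On these numbers, the pre-Rhee boom years produced a larger raw gain (18 points over seven years) than the re-baselined Rhee/Henderson window (14 points over six years).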
Moreover, reading scores further argue against Blagg and Chingos’s spin. Eighth-grade reading scores are much more meaningful (I’d argue they are the most valuable metric for estimating learning gains), but it is harder to raise middle school reading scores. Had Blagg and Chingos used 2007 as their baseline, fourth-grade reading gains would decline by two-thirds and eighth-grade gains would drop by three-fourths! During the six years from 2007 to 2013, unadjusted fourth-grade reading scores increased by only nine points, the same gain posted in the four years that preceded Rhee. In that case, even their not-ready-for-prime-time guesstimate of the effect of gentrification looks huge in comparison. Finally, the gap between the higher- and lower-ranked eighth-grade reading scores (between the 75th and 25th percentiles) had increased from 44 points in 2002 to 53 points by 2013! (Again, since I have not broken the data down into outputs by charters and traditional public schools, I can’t address these metrics with the precision that Blagg and Chingos could, if they chose to do so.)
As a former historian, I’m perplexed that Big Data researchers don’t attempt to study the relationships between NAEP increases, the economic booms of the recent past, and gentrification. Neither can I understand why fair-minded scholars wouldn’t apply their methodology to low-performing students across multiple eras and compare the resulting patterns with those of high-performing students. It would be “misnaepery” to use NAEP scores alone to prove that outcomes were caused by a certain education policy. But to use such a primitive methodology, ignore the huge costs of a policy, and imply that it yielded big gains is a new low.
What do you think? NAEP score growth largely declined after NCLB, and then stagnated after the Obama administration put test-driven reform on steroids. Isn’t economics still considered a part of demographics? Why would scholars start a study of D.C. student performance in 2005, ignore the era before Michelle Rhee was hired, leave out family income, and still claim they were controlling for demographics?