Can we trust the District’s yardstick for school quality?
There are many reasons to be wary of relying too heavily on the School District of Philadelphia’s main tool for measuring school quality – especially when it comes to making high-stakes decisions about closures, staffing shake-ups and charter conversions.
Presented with a NewsWorks analysis of the publicly available results of the first three years of the School Progress Report, top District officials acknowledged that the tool cannot be used reliably to evaluate the effectiveness of schools over time.
Because of the way the District has manipulated the inner workings of the SPR each year, leaders say reports should be considered standalone snapshots in time.
Reading the reports, though – which showcase outcomes over years – can leave the public with a much different impression.
Take Ellwood Elementary School in Oak Lane as an example. Judging by this year’s report, the school seems to be on an upward trajectory. Ellwood gets especially high marks for student “progress” – which the SPR gives the most weight in calculating an overall score.
In each of three categories – “achievement,” “progress” and “climate” – schools can receive four designations: “model,” “reinforce,” “watch,” or “intervene.”
On the report, Ellwood is now considered “reinforce” in progress – up from the yellow dot that represented “watch” the previous year.
But, according to District officials, that is an essentially meaningless distinction and a misleading visual representation. Ellwood only moved up a progress tier because the District recalibrated its scoring system between 2013-14 and 2014-15, easing expectations.
In technical statistical terms, the “floors” and “targets” were lowered.
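To see how lowering floors and targets alone can lift a school into a higher tier, consider the rough sketch below. It is not the District's actual formula: the linear scaling between floor and target, the four equal-width tier cutoffs, and every number in it are assumptions chosen only for illustration.

```python
# Illustrative sketch only. The scaling rule, tier cutoffs and all numbers
# below are assumptions, not the District's published SPR methodology.

def share_of_points(value, floor, target):
    """Scale a raw metric linearly between floor and target, clamped to 0..1."""
    if target <= floor:
        raise ValueError("target must be greater than floor")
    return max(0.0, min(1.0, (value - floor) / (target - floor)))


def tier(share):
    """Map a 0..1 share of points to a tier (hypothetical equal-width cutoffs)."""
    if share >= 0.75:
        return "model"
    if share >= 0.50:
        return "reinforce"
    if share >= 0.25:
        return "watch"
    return "intervene"


raw_growth = 40.0  # hypothetical raw result, identical in both years

# Year one: stricter floor and target (hypothetical values).
old_share = share_of_points(raw_growth, floor=20.0, target=80.0)

# Year two: floor and target both lowered (hypothetical values).
new_share = share_of_points(raw_growth, floor=10.0, target=60.0)

print(tier(old_share), "->", tier(new_share))
```

Under these made-up numbers, the same raw result lands in “watch” one year and “reinforce” the next, even though nothing about the school's performance changed.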
Does that recalibration negate the entire idea of calculating “progress” this year?
District officials say no. For some schools, they say, the appearance of “progress” is a fiction, while for others, it’s a reality.
“You’d have to unpack it by going into the specific metrics,” said Jura Chung, the District’s chief performance officer and architect of the SPR.
The idea is further complicated because the state tests taken by elementary and middle-school students changed dramatically between 2013-14 and 2014-15.
All of this has come to a head leading up to the much-anticipated vote on whether Wister Elementary in Germantown should be converted into a neighborhood-based charter school run by Mastery.
One of the essential questions in that debate has been: Is the progress shown by Wister in the 2014-15 SPR real? Is it, as it is for Ellwood, a trick of technical tinkering? Or, possibly, something related to the test change?
More on that later.
In the days before Thursday’s School Reform Commission meeting – one that will feature a series of hotly debated, high-stakes votes, including the vote on Wister – these are some of the oddities in this year’s SPR that should be considered closely.
The bottom-line question is: How much stock should decision-makers put in the SPR in a year when the metric was revised and the assessments were dramatically altered?
SPR was implemented three years ago on the heels of significant budget cuts in Philadelphia.
Leaders wanted to create an accountability tool that would highlight meaningful progress and not shame and blame the schools that serve the most needy, at-risk student populations.
The state’s School Performance Profile metric gives equal weight to achievement and growth, and analyses consistently show that performance on SPP correlates strongly with student poverty.
Pivoting away from that model, SPR gives “progress” the most weight. And it derives those growth scores from the state’s value-added metric, PVAAS – which judges schools on how their students’ state test results compare with their results the previous year, relative to students across the rest of the state.
PVAAS results show very little correlation to student poverty in reading and math in non-high schools.
“It’s important for us to understand who’s growing children, because otherwise, if we just do it on achievement, then it paints a very different picture,” said Superintendent William Hite. “We need to acknowledge where children are growing.”
But like most metrics that purport to evaluate school quality, the SPR has its share of critics – especially because of the year-to-year volatility of PVAAS.
The District has increased that volatility by adjusting the inner workings of the metric.
As a result, the most consistent thing about school “progress” as captured by SPR is inconsistency.
Of the 170 non-high schools that have participated in SPR since its inception, the largest group moved into a different “progress” tier each year – both up and down.
(Only District schools participated in year one.)
This volatility is what drove the District to give growth less emphasis, changing it from 50 percent of a school’s overall score to 40 percent in 2014-15.
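The arithmetic behind that change can be sketched in a few lines. The 50 and 40 percent weights come from the District; everything else here is an assumption for illustration, including lumping all non-progress categories into a single “other” score and the sample scores themselves.

```python
# Illustrative sketch only. The 0.50 and 0.40 progress weights are from the
# article; the single "other" score and all sample values are hypothetical.

def overall_score(progress, other, progress_weight):
    """Weighted average of a progress score and everything else (both 0..1)."""
    return progress_weight * progress + (1.0 - progress_weight) * other


other = 0.50                 # hypothetical score on all non-progress categories
low, high = 0.20, 0.80       # hypothetical year-to-year swing in progress

# How far the overall score moves when progress swings from low to high.
swing_at_50 = overall_score(high, other, 0.50) - overall_score(low, other, 0.50)
swing_at_40 = overall_score(high, other, 0.40) - overall_score(low, other, 0.40)

print(round(swing_at_50, 2), round(swing_at_40, 2))
```

In this made-up case, the same swing in progress moves the overall score by about 0.30 at a 50 percent weight and about 0.24 at 40 percent. The lower weight dampens the volatility but does not remove it.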
But looking at the three years of SPR “progress” data raises questions like this:
If SPR tells us that, for instance, John Hancock Elementary has progress scores that moved from “reinforce” down to “intervene” before surging up to “model,” is it telling us anything meaningful about what’s going on inside the school?
For School Reform Commissioner Bill Green, the inconsistency calls into question the ways that the District has tinkered with the SPR.
“It’s not fair to the schools not to lock it down at a certain point,” said Green. “It’s hard on decision-makers and policy-makers when you have a score, but the underlying reason for that score is shifting.”
Highlighting a rarely seen rift between the SRC and the Hite administration, Jura Chung pushed back against Green’s recommendation.
“It’s a tradeoff, right? Do I keep the tool constant for the sake of longitudinal analysis?” she asked. “Or do I enhance the tool because I think these changes are making it better and giving us a better snapshot of school quality? I don’t want to give up one for the other, necessarily. I can commit to minimizing the changes. I can’t guarantee that we’ll have no changes from here on out.”
Chung added that she shares much more nuance and context with decision-makers than appears in the public reports.