Guest Blog Post: A-F School Letter Grades

Guest Blog Post by Jeff Dean.

In April 2013, the Arkansas Legislature passed Act 696 (Ark. Code Ann. § 6-15-2105), which requires A-F school letter grades on the annual school report cards the state issues. The subtitle of the law states that letter grades are intended “to clarify for parents the public school rating system on annual school report cards.” Letter grades will replace the two-category school rating system established by Act 35 of the 2nd Extraordinary Session in 2003 (Ark. Code Ann. § 6-15-2012-2013).

Precisely because letter grades are easily understood by everyone, they are potent. A-F letter grades aren’t inherently good or bad. They greatly increase the public visibility of the ratings placed on schools. If the underlying rating system is deeply flawed, then letter grades make things worse by increasing the impact of bad ratings. The inverse is also true: a good system can make a greater positive impact by the use of highly visible letter grades in place of more ambiguous labels.

Arkansas joins fourteen other states that have developed letter grading systems for schools. These letter grading systems have met with varying degrees of success over the past fifteen years. Those that have succeeded have had to strike an acceptable balance of simplicity, fairness, and meaning. They need to be simple, in order to be explained and understood by the public. They should be fair, so that schools are not penalized or rewarded for factors beyond their control. They also should be meaningful, so that leaders, educators, and communities can use them to guide and motivate improvement. No single priority can be satisfied perfectly. Given this tension, finding the right balance may seem like “Mission: Impossible,” but this need not be the case.

What’s simplest is not always best. Act 696 charged the State Board of Education with adopting “rules necessary to implement” an A-F system, giving the Board latitude to hear and approve a new model for school ratings. If the Board failed to adopt a new model, the letter grades assigned to schools would default to align with the labels given schools under federal accountability (ESEA Flexibility). This is the simplest possibility of all, given current accountability. Exemplary schools would earn an “A”, Achieving schools would earn a “B”, Needs Improvement schools would earn a “C”, Focus schools would earn a “D”, and Priority schools would earn an “F”. Hypothetically, if schools earned the same Flexibility labels in 2014 as they did in 2013, then only eight schools in Arkansas (out of over 1,000 schools total) would earn an “A”. The state would have 137 “B” schools, and 75% of all schools in the state (790 schools) would earn a “C”.

This distribution, although compliant with state and federal law, does not meaningfully describe and differentiate among Arkansas’ public schools. There are also problems of alignment. Focus schools, for instance, were identified by a different set of criteria (achievement gaps) than schools with other labels. To put those five labels on a continuum (which letter grading implicitly does) is misleading. Assigning a “D” to Focus schools implies that they are somehow less effective than “C” schools and more effective than “F” schools, when the issue in question for Focus schools is not effectiveness but equity. Simple, perhaps, but hardly fair or meaningful.

Given the possibility of these outcomes, the state decided to develop a grading model that would replace the repealed rating system, as well as provide more appropriate differentiation among schools as compared to ESEA Flexibility labels. Beginning in September 2013, policymakers and stakeholders were brought together by the Department of Education to discuss concerns and possibilities for school letter grades. School leaders stated a preference for a model that was intelligible to the public and that offered schools multiple ways to earn their grades. Veteran stakeholders of student testing and school accountability were anxious to improve upon past models yet not create a model that would be so different as to cause confusion. All parties realized an overriding need to balance simplicity with fairness, two factors which are in tension in any sort of rating system.

The end result of this process was a grading system consisting of up to four components:

Weighted Performance. Proficiency rates only consider whether a student scores above or below the proficiency cut point. Weighted performance gives additional consideration to other cut points. Schools earn points for students scoring Basic rather than Below Basic, as well as Advanced rather than Proficient.
ESEA Improvement. Schools earn points by meeting ESEA Flexibility targets (AMOs) in up to six categories, depending on size and grades served: Literacy – All Students, Literacy – TAGG Students, Math – All Students, Math – TAGG Students, Graduation – All Students, and Graduation – TAGG Students.
Four-Year Adjusted Cohort Graduation Rate (where applicable).
Gap Adjustments (where applicable). Schools with above-average gaps between TAGG and non-TAGG students on achievement and/or graduation receive a penalty. Schools with smaller-than-average gaps receive a bonus. Schools with average gaps receive no adjustment.
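
To see why weighted performance can differentiate schools that a plain proficiency rate cannot, consider a minimal sketch. The point values and the two example schools below are hypothetical, chosen only for illustration; the actual weights in the proposed model are not given here.

```python
# Hypothetical point values per performance level (illustration only;
# these are NOT the weights used in the proposed Arkansas model).
HYPOTHETICAL_POINTS = {
    "Below Basic": 0.0,
    "Basic": 0.5,
    "Proficient": 1.0,
    "Advanced": 1.25,
}


def proficiency_rate(levels):
    """Share of students scoring at or above the proficiency cut point."""
    passing = sum(1 for level in levels if level in ("Proficient", "Advanced"))
    return passing / len(levels)


def weighted_performance(levels, points=HYPOTHETICAL_POINTS):
    """Average points per student, giving credit at every cut point."""
    return sum(points[level] for level in levels) / len(levels)


# Two made-up schools with the same proficiency rate.
school_a = ["Below Basic"] * 50 + ["Proficient"] * 50
school_b = ["Basic"] * 50 + ["Advanced"] * 50

for name, school in (("A", school_a), ("B", school_b)):
    print(name, proficiency_rate(school), weighted_performance(school))
```

Both made-up schools have half of their students at or above Proficient, so a proficiency rate treats them as identical. Under these assumed weights, the second school scores higher because its students sit on the upper side of the other two cut points.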

By drawing upon measures and concepts that should be familiar to Arkansas leaders and educators, the grading system aims to be meaningful; a complete reinvention of school accountability might be so different that it would mean less to those who already know accountability in Arkansas. But the model also refines those familiar measures. It differentiates meaningfully among schools. And ultimately, it translates into a letter grade that has meaning for the public, for whom the law was intended to provide clearer information. It seeks to give a fair chance to all schools regardless of the challenges they may face. While people may fairly disagree on the merits of the model, it nonetheless represents the best efforts and input of a wide variety of stakeholders and policymakers around the state.

Perhaps the most important test of fairness for any grading model is its relationship to student poverty, a disadvantage over which schools have very little control. If school grades are highly correlated with poverty, then those grades say far more about the challenges schools face than they do about schools’ effectiveness in educating the students who walk in the door. In the model proposed by the Department of Education to the State Board, the correlation between schools’ grades and poverty levels is -0.36, which as a rule of thumb is considered modest. This modestly negative correlation tells us that schools with higher poverty rates tend to receive somewhat lower grades. Still, among all models considered, the one that was chosen exhibited the lowest correlation with poverty. Squaring the correlation (0.36² ≈ 0.13) shows that only about 13% of the differences in school grades can be accounted for by school poverty levels. The remaining 87% of variation arises from factors other than poverty. When the Office of Innovation examined these sources of variation, we found that school letter grades explained about 40% of the variation in student achievement (math and literacy) between schools after accounting for demographics including poverty. While this doesn’t show that the grading system is perfect, it does show that it clears a fundamental hurdle in terms of fairness.
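
For readers who want to check the 13% figure, the arithmetic is short. This sketch assumes the usual reading of a squared correlation as the share of variance accounted for in a simple linear relationship between two measures; the variable names are ours.

```python
# Correlation between school grades and poverty levels, as reported above.
r = -0.36

# Squaring the correlation gives the share of variation in grades that is
# associated with poverty in a simple linear relationship.
r_squared = r ** 2

# Whatever is left over is associated with factors other than poverty.
other_factors = 1 - r_squared

print(f"accounted for by poverty: {r_squared:.1%}")
print(f"other factors:            {other_factors:.1%}")
```

The sign of the correlation drops out when it is squared, which is why a correlation of -0.36 and one of +0.36 would account for the same share of variation.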

One of the concerns raised frequently by educators and policymakers is what to expect with letter grades given the arrival of the new PARCC tests this school year. A new test certainly presents challenges, and no one will know what effect the tests will have on letter grades until students actually take the test. But the A-F law, as well as the proposed model, gives full freedom to the State Board and the ADE to make adjustments as necessary to ensure a fair distribution of grades. The transition may require a “pause” in letter grades during 2015 to establish baselines for future improvement targets. To ensure schools have an opportunity to improve upon their 2014 letter grade, other methods could be used during the transition year to identify improvements in schools, and where appropriate, assign a higher grade for the pause year.

Looking beyond the first year of PARCC tests, the state will have the opportunity to use a more refined model of student learning growth which, if given greater emphasis, could compare schools with advantaged and disadvantaged populations in a way that improves upon the current model. As with any transition, uncertainty lies ahead, but the hope is that with a new test the state will be able to refine the proposed method for letter grades as well as the method for determining federal accountability. The transition will allow the state to integrate new possibilities while drawing upon the lessons learned from past models.

The goal of the process for determining letter grades remains the same: to balance simplicity with fairness while providing meaningful differentiation among schools. The goal of the letter grades which result from this process also remains the same: to clarify for parents the public school rating system. Whatever course our state chooses to pursue, giving attention and weight to the priorities outlined here will make our mission far more possible than we previously thought.