After a month of speculation, President Obama unveiled his plan to “shake up” higher education last week. As promised, the proposal contained some highly controversial elements, none greater than an announcement that the U.S. Department of Education will begin to rate colleges and universities in 2015 and tie financial aid to those results three years later. The announcement prompted typical clichéd Beltway commentary from the higher education industry of “the devil is in the details” and the need to avoid “unintended consequences,” which should rightfully be attributed as, “We are not going to outright object now when everyone’s watching but instead will nitpick to death later.”
But the ratings threat is more substantive than past announcements to put colleges “on notice,” if for no other reason than it is something the department can do without Congressional approval. Though it cannot actually tie aid received directly to these ratings without lawmakers (and the threat to do so would occur after Obama leaves office), the department can send a powerful message both to the higher education community and consumers nationwide by publishing these ratings.
Ratings systems, however, are no easy matter and require lots of choices in their methodologies. With that in mind, here are a few recommendations for how the ratings should work.
Ratings aren’t rankings.
Colleges have actually rated themselves in various forms for well over a hundred years. The Association of American Universities  is an exclusive club of the top research universities that formed in 1900. The more in-depth Carnegie classifications , which group institutions based upon their focus and types of credentials awarded, have been around since the early 1970s. Though they may not be identified as such by most people, they are forms of ratings — recognitions of the distinctions between universities by mission and other factors.
A federal rating system should be constructed similarly. There’s no reason to bother with ordinal rankings like the U.S. News and World Report because distinguishing among a few top colleges is less important than sorting out those that really are worse than others. Groupings that are narrow enough to recognize differences but sufficiently broad to represent a meaningful sample are the way to go. The Department could even consider letting colleges choose their initial groupings, as some already do for the data feedback reports  the Department produces through the Integrated Postsecondary Education Data System (IPEDS).
It’s easier to find the bottom tail of the distribution than the middle or top.
There are around 7,000 colleges in this country. Some are fantastic world leaders. Others are unmitigated disasters that should probably be shut down. But the vast majority fall somewhere in between. Sorting out the middle part is probably the hardest element of a ratings system — how do you discern within averageness?
We probably shouldn’t. A ratings system should sort out the worst of the worst by setting minimum performance standards on a few clear measures. It would clearly demonstrate that there is some degree of results so bad thatit merits being rated poorly. This standard could be excessively, laughably low, like a 10 percent graduation rate. Identifying the worst of the worst would be a huge step forward from what we do now. An ambitious ratings system could do the same thing on the top end using different indicators, setting very high bars that only a tiny handful of colleges would reach, but that’s much harder to get right.
Don’t let calls for the “right” data be an obstructionist tactic.
Hours after the President’s speech, representatives of the higher education lobby stated the administration’s ratings “have an obligation to perfect data.”  It’s a reasonable requirement that a rating system not be based only on flawed measures, like holding colleges accountable just for the completion of first-time, full-time students. But the call for perfect data is a smokescreen for intransigence by setting a nearly unobtainable bar. Even worse, the very people calling for this standard are the same ones representing the institutions that will be the biggest roadblock to obtaining information fulfilling this requirement. Having data demands come from those keeping it hostage creates a perfect opportunity for future vetoes in the name of making perfect be the enemy of the good. It’s also a tried and true tactic from One Dupont Circle. Look at graduation rates, where the higher education lobby is happy to put out reports critiquing their accuracy  after getting Congress to enact provisions  that banned the creation of better numbers during the last Higher Education Act reauthorization.
To be sure, the Obama administration has an obligation to engage in an open dialogue with willing partners to make a good faith effort at getting the best data possible for its ratings. Some of this will happen anyway thanks to improvements to the department’s IPEDS database . But if colleges are not serious about being partners in the ratings and refuse to contribute the data needed, they should not then turn around and complain about the results.
Stick with real numbers that reflect policy goals.
Input-adjusted metrics are a wonk’s dream. Controlling for factors and running regressions get us all excited. But they’re also useless from a policy implementation standpoint. Complex figures that account for every last difference in institutions will contextualize away all meaningful information until all that remains is a homogenous jumble where everyone looks the same. Controlling for socioeconomic conditions also runs the risk of just inculcating low expectations for students based upon their existing results. Not to mention any modeling choices in an input-adjusted system will add another dimension of criticism to the firestorm that will already surround the measures chosen.
That does not mean context should be ignored. There are just better ways to handle it. First and foremost is making ratings on measures based on performance relative to peers. Well-crafted peer comparisons can accomplish largely the same thing as input adjustment since institutions would be facing similar circumstances, but still rely on straightforward figures. Second, unintended consequences should be addressed by measuring them with additional metrics and clear goals. For example, afraid that focusing on a college's completion rate will discourage enrolling low-income students or unfairly penalize those that serve large numbers of this type of students? The ratings should give institutions credit for the socioeconomic diversity of their student body, require a minimum percentage of Pell students, and break out the completion rate by familial income. Doing so not only provides a backstop against gaming, it also lays out clearer expectations to guide colleges' behavior, something the U.S. News rankings experience has shown that colleges clearly know how to do with less useful measures like alumni giving (sorry, Brown, for holding you back on that one).
Mix factors a college can directly control with ones it cannot.
Institutions have an incentive to improve on measures included in a rating system. But some subset of colleges will also try to evade or “game” the measure. This is particularly true if it’s something under their control — look at the use of forbearances or deferments to avoid sanctions under the cohort default rate. No system will ever be able to fully root out gaming and loopholes, but one way to adjust for them is by complementing measures under a college’s control with ones that are not. For example, concerns about sacrificing academic quality to increase graduation rates could be partially offset by adding a focus on graduates’ earnings or some other post-completion behavior that is not under the college’s control. Institutions will certainly object to being held accountable for things they cannot directly influence. But basing the uncontrollable elements on relative instead of absolute performance should further ameliorate this concern.
Focus on outputs but don’t forget inputs.
Results matter. An institution that cannot graduate its students or avoid saddling them with large loan debts they cannot repay upon completion is not succeeding. But a sole focus on outputs could encourage an institution to avoid serving the neediest students as a way of improving its metrics and undermine the access goals that are an important part of federal education policy.
To account for this, a ratings system should include a few targeted input metrics that reflect larger policy goals like socioeconomic diversity or first-generation college students. Giving colleges “credit” in the ratings for serving the students we care most about will provide at least some check against potential gaming. Even better, some metrics should have a threshold a school has to reach to avoid automatic classification into the lowest rating.
Put it together.
A good ratings system is both consistent and iterative. It keeps the core pieces the same year to year but isn’t too arrogant to include new items and tweak ones that aren’t working. These recommendations present somewhere to start. Group the schools sensibly — maybe even rely on existing classifications like those done by Carnegie. The ratings should establish minimum performance thresholds on the metrics we think are most indicative of an unsuccessful institution — things like completion rates, success with student loans, time to degree, etc. They should consist of outcomes metrics that reflect their missions—such as transfer success for two-year schools, licensure and placement for vocational offerings, earnings, completion and employment for four-year colleges and universities. But they should also have separate metrics to acknowledge policy challenges we care about — success in serving Pell students, the ability to get remedial students college-ready, socioeconomic diversity, etc. — to discourage creaming. The result should be something that reflects values and policy challenges, acknowledges attempts to find workarounds, and refrains from dissolving into wonkiness and theoretical considerations that are divorced from reality.
Ben Miller is a senior policy analyst in the New America Foundation's education policy program, where he provides research and analysis on policies related to postsecondary education. Previously, Miller was a senior policy advisor in the Office of Planning, Evaluation, and Policy Development in the U.S. Department of Education.