In 2000, the National Center for Public Policy and Higher Education released the first 50-state report card on higher education performance. Measuring Up 2000 assessed states on how well they were doing in preparing students for college, providing access to college, making college affordable, and promoting completion of certificates and degrees. That groundbreaking report also highlighted areas where objective, comparative data were lacking, most notably student learning outcomes.
Eight years later, a number of states have used the report card to drive conversations about improving public higher education policy. But despite all the talk in Washington and state capitals about the need for better data and more robust accountability systems, we have made little progress in filling critical information gaps and have even moved backward in some areas. Our efforts to make higher education more accessible and affordable will stall unless we change this.
These information gaps extend from high school through graduate school and indicate that for every step forward, we have taken a step back in having the data states and institutions need to better inform decision-making about how to increase college access and success while containing costs.
In the area of college preparation, state-level data have moved forward and backward. On the positive side, states do have better data about who is making it through high school. But they know less about course-taking in high school math and science because fewer states are participating in national surveys in these areas today than in 2000. And while most states administer the 12th grade version of the National Assessment of Educational Progress (NAEP), state-level results are still not available.
The nation has made no headway in getting a better handle on who is making it to college. Nationally, we can track college enrollment rates by race/ethnicity and income, but at the state level, enrollment rates by income are still not available. Additionally, we can say little (if anything) about what happens to students who cross state lines for college after they first enroll.
There have been small steps forward in gauging college affordability, but there are more steps to be taken. We can now track undergraduate and graduate student loan borrowing separately, a significant improvement over 2000. The 2004 edition of the National Postsecondary Student Aid Study (NPSAS) provided valuable information about whether and how aid packages change for students after the freshman year; unfortunately, the study only covered about a quarter of the states. Moreover, we are still not in a position to assess unmet financial need for students at the state level.
Even on critical issues such as determining whether students are completing programs on time, or at all, we know a little more, but still not enough. The U.S. Department of Education’s Graduation Rate Survey now provides comparative data on first-time, full-time students completing degrees at three-, four-, five-, and six-year intervals. But these data provide a limited picture, both because a six-year timeframe is too short for many students (especially working adults) and because the survey cannot account for transfers or, more importantly, for those who start as part-time students. We also have an incomplete picture of progress and completion for students who move across states during their college careers, since not all states participate in the data-sharing effort run through the National Student Clearinghouse (and that database was not created to serve the analytic function we’re now asking of it).
Despite all the hue and cry about student learning since 2000, we have actually taken a step backward in gathering comparable state-level data. Most of the movement in the last eight years has focused on individual campuses and systems, through efforts such as the Collegiate Learning Assessment and the Voluntary System of Accountability. Perhaps the biggest step backward has been in the measurement of adult skills. The number of states participating in the National Assessment of Adult Literacy (NAAL) fell from 13 in 1992 to just six in 2003, and the 2003 data are still not available for all the participating states.
In just a few weeks, state legislatures will convene to face the biggest budget crisis in a generation. Unfortunately, they will have to make difficult decisions about priorities without the benefit of better information about the most urgent needs for getting more students to and through college at a price they can afford. This makes it more likely that we will see the usual responses -- raising tuition, capping enrollment, making across-the-board cuts -- that will put states further behind in the race to grow a competitive work force.
We can fix this. It is time for every state -- and the nation -- to commit to getting the information needed to increase the size of our college-educated population, and to halt the worrisome slide of the United States relative to other advanced nations on higher education outcomes.
Dennis P. Jones is president of the National Center for Higher Education Management Systems, a research and development center founded to improve the management effectiveness of colleges and universities. He serves on the National Advisory Group for Measuring Up 2008, the national and state report card on higher education issued by the National Center for Public Policy and Higher Education. Measuring Up 2008 will be released on December 3, when it will be available for viewing and download on the National Center's Web site.
In the movie "Ghostbusters," Dan Aykroyd commiserates with Bill Murray after the two lose their jobs as university researchers. “Personally, I like the university. They gave us money and facilities, and we didn’t have to produce anything. You’ve never been out of college. You don’t know what it’s like out there. I’ve worked in the private sector. They expect results.” I can find some amusement in this observation, in a self-deprecating sort of way, recognizing that this perception of higher education is shared by many beyond the characters in this 1980s movie.
Members of Secretary Spellings’ Commission on the Future of Higher Education were very clear about their expectations for higher education when they wrote, “Students increasingly care little about the distinctions that sometimes preoccupy the academic establishment, from whether a college has for-profit or nonprofit status to whether its classes are offered online or in brick-and-mortar buildings. Instead, they care -- as we do -- about results.”
This expectation of assessment as accountability has forced many faculty members and administrators to seek ways to balance assessment for “us,” or assessment for “improvement,” with assessment for “them,” or assessment for “accountability.” We do assessment for “us” in our classrooms to provide feedback to students on their progress, in our programs to provide direction for improvement efforts, for each other when we review one another’s articles, and for ourselves when we evaluate our own performance.
Conversely, assessment for “them” is done in response to an external demand to prove “how much students learn in colleges and whether they learn more at one college than another," as the Spellings Commission put it in its final report.
When we perform assessment for "us" we are not afraid to discover bad news. In fact, when we assess for "us," it is more stimulating to discover bad news about our students' performance because it provides clear direction for our improvement efforts. In contrast, when we perform assessment for "them," we try our best to hide bad news and often put a positive face on the bad news that we can’t hide.
When we perform assessment for "us" we do our best to create valid and reliable assessments but don’t let the technical details, particularly when they are not up to exacting research standards, derail our efforts. When we perform assessment for "them," if there is any deviation from strict standards for validity, reliability, norming group selection, sampling approach, testing procedures or scoring techniques, we are quick to dismiss the results, particularly when they are unfavorable.
We know the "us" -- faculty members, students, department chairs, deans -- and we know how to talk about what goes on at our institution with each other. Even amid the great diversity of institutions we often find a common core of experience and discover that we speak each other’s language.
But the "them" is largely a mystery. We may have some guesses about the groups that make up "them" -- parents, boards of regents, taxpayers, legislatures -- but we cannot be sure because accountability is usually described generically, not specifying any particular group, and because our interaction with any of these groups is limited or nonexistent.
When we perform assessment for "us," we operate under a known set of possible consequences. Some of these consequences could be severe, such as a budget reduction or a reprimand from our superior, but in general the possible consequences are a known and acceptable risk.
When we perform assessment for "them," the consequences are much more terrifying because we do not control who uses these data or the purposes of their use. One of the uses of assessment for "them" is for accreditation, which can bring particularly negative consequences. We wake up in the middle of the night with visions of newspaper headlines publicly disclosing our poor performance.
At best this would bring years of embarrassment and shame that would hang over our heads like the cloud of dust that follows Charles Schulz’s Pig-Pen. At worst we face losing accreditation and having our school labeled a “diploma mill,” making our students ineligible for federal student aid and prompting a mass exodus from our institution. Assessment for "them" brings high levels of risk and low levels of reward.
Finding the balance between assessment for "us" and assessment for "them" is a significant challenge, and one full of uncertainty as the Department of Education pursues negotiated rulemaking and as the Higher Education Act comes up for renewal in Congress. It can feel a bit like the Eliminator challenge on the television game show "American Gladiators," which had contestants navigating a balance beam while Gladiators attempted to knock them off with swinging medicine balls. There have, however, been a number of efforts by university systems and by individual institutions to find ways to balance assessment for "us" with assessment for "them."
The State University of New York (SUNY) Assessment Initiative seeks to strike a balance between assessment for "us," or assessment for “improvement,” and assessment for "them," or assessment for “accountability.” The SUNY Assessment Initiative can be divided into two parts: assessment of general education and assessment within academic majors.
For assessment of general education, SUNY first developed a set of learning outcomes for general education programs at undergraduate degree-granting institutions. All SUNY institutions are required to use “externally referenced measures” to determine whether their students are achieving in the areas of Critical Thinking, Basic Communication and Mathematics. To keep this approach in balance, however, the Assessment Initiative does not require all institutions to use the same measure. Rather, institutions can select, from nationally normed exams or rubrics developed by a panel, the measures that best represent their mission within the state. This holds institutions accountable for demonstrating student achievement in foundational areas, but the results will not be used to “punish, publicly compare, or embarrass students, faculty, courses, programs, departments or institutions either individually or collectively,” according to a description of the program.
Institutions are also required to perform local assessment of their general education programs. They are held accountable for attending to the process of assessment -- examining student learning on specific objectives and making decisions about ways to improve based on those data -- by an external group called the General Education Assessment Review group (GEAR). GEAR, composed primarily of faculty members from SUNY institutions, reviews and approves campus assessment plans but not the actual assessment outcomes. In this way, SUNY documents say, “emphasis is placed on assessment best practice without introducing an element of possible defensiveness campuses might feel if their assessment program does not yield evidence to support optimal student learning.”
At the institutional level, Colorado State University and the University of Nebraska-Lincoln partnered to implement the Plan for Researching Improvement and Supporting Mission (PRISM) and Program Excellence through Assessment, Research and Learning (PEARL), respectively. PRISM and PEARL engage faculty members in assessment of the academic major -- assessment for "us." Faculty members select learning outcomes that are important for students in that major, assess student learning on those outcomes, and then make improvements to their program based on those data. A panel of faculty members from each institution holds the academic majors accountable by reviewing assessment plans and encouraging the use of higher-quality assessment practices.
To balance assessment for "us" with assessment for "them," PRISM and PEARL utilize an online software system that allows for the classification of the academic major assessment activity for aggregation at higher levels. In this way the institutions can describe the kind of learning that is going on within the institution, the assessment instruments that are being used to examine that learning and the improvement activities that were performed in response to the assessment data.
The SUNY Assessment Initiative and the PRISM and PEARL approaches balance assessment for "us" and assessment for "them" by demonstrating a commitment to student learning, not by achieving benchmark scores on a specific assessment or by earning a particular ranking. In both of these examples participants are held accountable for engaging in the process of assessing student learning, a process that is reviewed for best practices by an external panel.
Dan Aykroyd and the members of Secretary Spellings’ Commission on the Future of Higher Education are correct in expecting “results.” If discussions of how to demonstrate these “results” continue to emphasize narrow and prescriptive assessment for "them," institutions will face large amounts of work, risk and agony for little benefit. However, if assessment for "them" can be about demonstrating a commitment to student learning and being accountable for a process, then institutions will be able to place their time and energy where it belongs: with the students.
Alarmed echoes of “the feds are coming” reverberate in the halls of academe as Secretary of Education Margaret Spellings’s Department of Education confers with higher education representatives about quality and public accountability. This so-called “negotiation” process follows on the heels of the Spellings Commission’s report, "A Test of Leadership," which called for improved quality and public accountability. The department’s discussions and hearings have morphed into recommendations for a new degree of federal intervention. Campus leaders see the intrusion as unprecedented -- but so too is the problem it aims to address.
Education -- pre-school through college -- is the primary means of improving human capital and is therefore understood to be the single most important ingredient in the ability of America to compete in the global economy. But there is a growing unease about what now passes for quality in undergraduate education, a vocal concern led not by angry students, as in the ‘60s, but by parents, business, political and academic leaders who sense a dangerous hollowing of an increasingly precarious ivory tower.
Virtually every study within and outside the academy acknowledges that we are not doing as well as we should and that we need to significantly improve our undergraduate colleges -- not only to compete globally, but equally importantly, to enrich an active democracy here at home, a public life marked by liberty, dissent, and robust civic engagement. Former Govs. James B. Hunt (of North Carolina) and Garrey Carruthers (of New Mexico), the Association of American Colleges and Universities and the Business-Higher Education Forum are among the groups and policy makers that have acknowledged a major performance gap in undergraduate education.
Higher education has neither developed adequate metrics to assess learning nor demonstrated a willingness to publish such results when they are available, content instead to rely on and participate in -- while at the same time damning -- spurious college guides and reputation rankings. And it is not uncommon to hear faculty and administrators across the country protest that most of what we teach is too complex to be measured, that the diversity of college and university missions precludes one-size-fits-all assessment, or that the marketplace is the only arbiter of quality required. This implicit "trust us" attitude is now confronted by a marketplace that questions quality and no longer accepts what amounts to a “faith-based” entitlement.
Joining the critics and jumping into the vacuum created by higher education leaders perceived as unwilling to take on the necessary reform agenda to substantially improve quality, the Spellings Commission identified accountability as the fundamental issue, dependent, it said, on assessment of value-added learning.
The Commission’s logic on this is as follows: (1) undergraduate education quality is inadequate given the challenges we face in the 21st century; (2) the solution to quality improvement requires a more transparent accountability; (3) assessment, especially value-added learning assessment, is fundamental to the improvement of quality and accountability. (I should note here, in full disclosure, that the Collegiate Learning Assessment, with which I am closely affiliated, was cited as an assessment tool by the commission.) As the commission itself wrote straightforwardly:
We believe that improved accountability is vital to ensuring the success of all the other reforms we propose. Colleges and universities must become more transparent about cost, price, and student success outcomes, and must willingly share this information with students and families. Student achievement, which is inextricably connected to institutional success, must be measured by institutions on a “value-added” basis that takes into account students’ academic baseline when assessing their results.
The Spellings Commission got it right -- quality needs to improve, accountability must become far more transparent, and assessing learning, including value-added assessment, is crucial to both. This is not to say, however, that this requires that a single test be imposed on all institutions or that we know how to measure all that is worth learning. But it is to say that transparent, systematic learning assessment can be a powerful force for improvement and is necessary for regaining public trust in the public good served by higher education.
There is an apparent conflict between assessment for improvement and assessment for accountability. I say “apparent” because I do not think this is an either/or situation; assessment for improvement and accountability are inextricably related. The public has every right to expect that it is higher education’s educational and professional duty to systematically assess its impact on student learning as an essential condition for improvement and transparent accountability.
From an improvement perspective, student learning is higher education’s raison d’être, and we know that appropriate and timely feedback to students and faculty increases student learning and can usefully inform institutional change. From an accountability perspective, professional training and the sanctioning status it confers obligate the academy to be transparent in its endeavors, something expected of all professions. Moreover, colleges and universities are subsidized by the public, whether directly through tax revenues or through tax exemptions, and thus we do have a responsibility for rigorous student and institutional assessment and public accountability. The difficult issue is to make sure appropriate assessments are used and that the “stakes” are fair.
Timing is crucial. Lest the issues of learning assessment and institutional accountability be allowed to become the handmaiden of state and federal politics as many believe has occurred in the K-12 sector, the academy must act now. For this to happen, higher education needs to take the professional lead and control on issues of learning assessment and public accountability, a strategy endorsed a few weeks ago by the Modern Language Association.
It wrote: "It is hard to disagree with the argument that colleges should be held publicly accountable for the quality of education they provide and that careful assessment of what our students learn is a reasonable means of demonstrating such accountability. If these principles are applied in an intelligent fashion and with full cooperation by American colleges and universities, the report of the Spellings Commission can usefully spur them in their continuing effort to improve the education they offer."
The operative phrase is “intelligent fashion and with full cooperation by American colleges and universities.” During the commission hearings and after the issuance of its final report, many in the academy feared and argued strenuously against any imposition of a federally mandated test reminiscent of the NCLB regime of high-stakes state tests. The academy, however, is in a somewhat weak position to claim “foul,” given that it is the self-regulated arbiter of quality via accreditation standards, not to mention complicit in supplying data to, and de facto affirming, current rankings and college guidebooks as quality indicators that we know are invalid. Measures of quality such as reputation, retention and graduation rates, and alumni giving, for example, are predicted mostly by admissions selectivity and have not been shown to be predictors of learning.
Nor has the academy been a staunch defender of its own accreditation processes, accepted at best publicly as a necessary evil and, in private, loathed and demeaned, especially by the “elite” institutions that perceive the need for accreditation as beneath their presumed quality. The commission, too, excoriated the current accrediting process as ineffective, for relying too heavily on “input” variables and reputation. Interestingly (some have suggested “cynically”), Secretary Spellings’s strategy is not to create a new structure for control but rather to enlist the cooperation of colleges and universities by putting the treasured academic value of peer review to work in a more rigorous accreditation process! The academy seems to have been hoisted by its own petard.
To accomplish this feat, the Department of Education, through its legal authority to recognize and regulate accrediting agencies, is proposing a far more robust set of standards requiring emphasis on learning assessment and public disclosure of such data. The “negotiations” over accrediting principles and standards are in their late stages and for the moment there is relative quiet from the campuses. Whether or not this is the proverbial calm before the storm, fatigued acquiescence, or principled agreement remains to be seen. Or, some believe it might all go away, that the department has overstepped its legal authority and will be cut short by a Democratic Congress or delayed until a savior arrives in the next election cycle.
While I appreciate fully how the academy has led itself into what may be a box canyon, my own experience as a faculty member and administrator in public and private colleges and universities causes me to believe that neither higher education nor this country would be well served by additional federal control. I suggest, however, that higher education would be wise to jump at the chance to strengthen institutional peer review via the accreditation structure.
By this I mean that the academy not wait to have something imposed but rather take the offensive and quickly accept responsibility for developing appropriate standards and learning outcome measures, and reform the accreditation process by revising standards and increasing transparency. I am not naïve -- it may be too late to ward off federal intervention -- but a stance the academy ought to take would sound something like this: “We can and must improve our quality; we commit to making use of our considerable, collective research capabilities to develop, pilot and implement a variety of appropriate learning outcome measures; and we agree to construct protocols for sharing such data with the public.”
How best to begin? I propose that higher education be given five years for such a development process and financial support from federal incentive funds of $10 million per year, to be matched by institutions, corporations and foundations for consortiums to develop such measures. I propose a “summit” meeting of regional accreditors and the leading national organizations in higher education to be convened in the next few months to create a comprehensive and coherent action plan resulting in a framework and criteria for future self-selected consortium proposals to access the pool of funds designated for such purposes from the combined contributions of the federal government, corporations and foundations. We can argue later which organization would be best suited to holding and administering such funding. The summit would need to be perhaps a weeklong event (rather than the one-day affair recently held by Secretary Spellings) and certainly there are venues like Wingspread or the Aspen Institute that are equipped to facilitate such a gathering.
This proposal is hardly radical. The National Association of State Universities and Land-Grant Colleges (NASULGC) and the American Association of State Colleges and Universities (AASCU) have proposed a “Voluntary System of Accountability for Undergraduate Education” (VSA) that accepts the need for improved quality and accountability and the need to develop appropriate measures of learning. The Council for Independent Colleges (CIC) has for the past three years engaged many of its members in a consortium to try a variety of learning outcome measures.
The Teagle Foundation has been funding a number of consortiums across the country to develop and implement a variety of learning assessment measures. The Association of American Colleges and Universities (AAC&U) has for years convened conferences on assessment, teaching, and curricula and has issued a strong plea for use of student outcomes for accountability. The Council for Higher Education Accreditation (CHEA) is on record supporting improved assessment and accountability. And most of the states and each of the professional and regional accrediting associations have stipulated the need for such data as part of their accountability standards.
The work mentioned above is hardly exhaustive of what is taking place in this country, but clearly we need to see something more convergent, coherent, timely, and transparent. And what I am proposing is hardly sufficient to the task ahead. Surely we need to reform incentives and rewards for promotion and tenure, not to mention how faculty are educated in doctoral programs if learning outcomes are to be a fundamental criterion of institutional quality. And the incentives and rewards for public institutions dependent on state support must change as well.
What is needed before anything else, however, is for higher education to get its professional and collective act together immediately on the issues of learning assessment, accountability, and the role of accreditation lest the cry, “the feds are coming” result in a federal No College Left Behind.
Richard H. Hersh is a former president of Hobart and William Smith Colleges and Trinity College; former vice president for Academic Affairs at the University of New Hampshire and Drake University and former vice president for research at the University of Oregon; a co-director of the Collegiate Learning Assessment Project; and co-editor of Declining by Degrees: Higher Education at Risk.
As participants in the debate regarding appropriate strategies for assessing learning in higher education, we agree with some of the statements Trudy Banta made in her Inside Higher Ed op-ed: “A Warning on Measuring Learning Outcomes.” For example, she says that “it is imperative that those of us concerned about assessment in higher education identify standardized methods of assessing student learning that permit institutional comparisons.” We agree. Where we part company is on how that can best be achieved.
Banta recommends two strategies, namely electronic portfolios and measures based in academic disciplines. One of the many problems with the portfolio strategy is that it is anything but standardized and therefore unable to support institutional comparisons. For instance, the specific items in a student’s portfolio and the conditions under which those items were created (including the amount and types of assistance the student received) will no doubt differ across students within and between colleges. In short, the portfolio is not standardized and therefore cannot function as a benchmark for institutional comparisons.
The problem with Banta’s second strategy, discipline-specific measures, stems from the vast number of academic majors for which such measures would have to be created, calibrated to each other (so results can be combined across majors), and updated, as well as from the wide differences of opinion within and between institutions as to what should be assessed in each academic discipline. Banta is concerned that “if an institution’s ranking is at stake [as a result of its test scores], faculty may narrow the curriculum to focus on test content." However, that problem is certainly more likely to arise with discipline-specific measures than with the types of tests that she says should not be used, such as the critical thinking and writing exams employed in the Collegiate Learning Assessment (CLA) program, with which we are affiliated.
Thus, while we agree with Banta that there is a place for discipline-specific measures in an overall higher education assessment program, the CLA program continues to focus most of its efforts on the broad competencies that are mentioned in college and university mission statements. These abilities cut across academic disciplines and, unlike the general education exams Banta mentions, the CLA -- which she does not mention by name, but implicitly criticizes -- assesses these competencies with realistic open-ended measures that present students with tasks all college graduates should be able to perform, such as marshalling evidence from different sources to support a recommendation or thesis (see Figure 1 for sample CLA scoring criteria and this page for details).
We suspect that Banta’s criticism of the types of measures used in the CLA program stems from a number of misperceptions about their true characteristics. For example, Banta apparently believes that scores on tests of broad competencies would behave like SAT scores simply because they are moderately correlated with each other. However, the abilities measured by the CLA are quite different from those assessed by the general education tests discussed in Banta’s article, such as the SAT, ACT and the MAPP. Consequently, an SAT prep course would not help a student on the CLA and instruction aimed at improving CLA scores is unlikely to have much impact on SAT or ACT scores.
Moreover, empirical analyses with thousands of students show that the CLA’s measures are sensitive to the effects of instruction; for example, even after holding SAT scores constant, seniors tend to earn significantly higher CLA scores than freshmen. Differences are on the order of 1.0 to 1.5 standard deviation units. These very large effect sizes demonstrate that the CLA is not simply assessing general intellectual ability.
Banta also is concerned about score reporting methods, such as those used by the CLA, that adjust for differences among schools in the entering abilities of their students. In our view, score reporting methods that do not make this adjustment face very difficult (if not insurmountable) interpretive challenges. For example, without an adjustment for input, it would not be feasible to tell schools whether their students are generally doing better, worse, or about the same as would be expected given their entering abilities, or whether the amount of improvement between the freshman and senior years was more, less or about the same as would be expected.
The expected values for these analyses are based on the school’s mean SAT (or ACT) score and the relationship between mean SAT and CLA scores among all of the participating schools. This type of “value added” score reporting focuses on the school’s contribution to improving student learning by controlling for the large differences among colleges in the average ability of their entering students.
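To make the mechanics of this adjustment concrete, the sketch below shows a minimal regression-based value-added calculation at the school level. The numbers are invented and the use of a simple least-squares fit is an illustrative assumption for exposition; this is not actual CLA data, nor necessarily the exact estimation procedure the CLA program uses.

```python
# Illustrative sketch only: hypothetical school-level means, not actual CLA data.
# Each school contributes its mean SAT score and mean CLA score for one cohort
# (e.g., freshmen or seniors). The "expected" CLA score is predicted from mean
# SAT across all participating schools; the residual (actual minus expected)
# serves as that school's value-added indicator for the cohort.

import numpy as np

# Hypothetical school means for one cohort
mean_sat = np.array([1020, 1100, 1180, 1250, 1330], dtype=float)
mean_cla = np.array([1045, 1150, 1170, 1290, 1310], dtype=float)

# Fit mean CLA on mean SAT across schools (ordinary least squares, degree 1)
slope, intercept = np.polyfit(mean_sat, mean_cla, 1)

expected_cla = intercept + slope * mean_sat   # prediction from entering ability alone
residuals = mean_cla - expected_cla           # positive = above expectation

for sat, cla, exp, res in zip(mean_sat, mean_cla, expected_cla, residuals):
    print(f"mean SAT {sat:.0f}  mean CLA {cla:.0f}  expected {exp:.1f}  residual {res:+.1f}")

# A school's freshman cohort and senior cohort would each receive a residual
# computed this way; the difference (senior residual minus freshman residual)
# is the value-added metric of prime interest discussed below.
```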
Banta objects to adjusting for input. She says that “For nearly 50 years measurement scholars have warned against pursuing the blind alley of value added assessment. Our research has demonstrated yet again that the reliability of gain scores and residual scores -- the two chief methods of calculating value added -- is negligible (i.e., 0.1).”
We suspect the research she is referring to is not applicable to the CLA. For example, the types of measures she employed are quite different from those used in the CLA program. Moreover, much of the research Banta refers to uses individual-level scores, whereas the CLA program uses scores that are much more reliable because they are aggregated up to the program or college level.
Nevertheless, it is certainly true that difference scores (and particularly differences between residual scores) are less reliable than are the separate scores from which the differences are computed. But how much less? Is the reliability close to the 0.1 that Banta found with her measures or something else?
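As background on why difference scores lose reliability, classical test theory gives a standard formula for the reliability of a difference score D = X − Y; we state it here only as the textbook result, not as the exact computation used in the CLA analyses:

\[
\rho_{DD'} \;=\; \frac{\sigma_X^{2}\,\rho_{XX'} + \sigma_Y^{2}\,\rho_{YY'} - 2\,\rho_{XY}\,\sigma_X\,\sigma_Y}{\sigma_X^{2} + \sigma_Y^{2} - 2\,\rho_{XY}\,\sigma_X\,\sigma_Y}
\]

where σ_X and σ_Y are the standard deviations of the two scores, ρ_XX′ and ρ_YY′ their reliabilities, and ρ_XY the correlation between them. The formula makes the trade-off explicit: the more highly correlated the two scores, the less reliable their difference, while raising the component reliabilities raises the reliability of the difference. Aggregating scores to the school level does exactly that, which bears directly on the school-level estimates reported next.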
It turns out that Banta’s estimates are way off the mark when it comes to the CLA. For example, analyses of CLA data reveal that when the school is the unit of analysis, the reliability of the difference between the freshman and senior mean residual scores -- which is the value-added metric of prime interest -- is a very healthy 0.63, and the reliabilities of institution-level residual scores for freshmen and seniors are 0.77 and 0.70, respectively. All of these values are conservative estimates (see Klein et al., 2007, for details). Even so, these values are far greater than the 0.1 predicted by Banta, and they are certainly sufficient for the purpose for which CLA results are used, namely obtaining an indication of whether a college’s students (as a group) are scoring substantially (i.e., more than one standard error) higher or lower than would be expected relative to their entering abilities.
Banta concludes her op-ed piece by saying that standardized tests of generic intellectual skills (which she defines as writing, critical thinking, etc.) “do not provide valid evidence of institutional differences in the quality of education provided to students. Moreover, we see no virtue in attempting to compare institutions, since by design, they are pursuing diverse missions and thus attracting students with different interests, abilities, levels of motivation, and career aspirations.”
Some members of the academy may buy into Banta’s position that no standardized test of any stripe can be used productively to assess important higher education outcomes. However, the legislators who allocate funds to higher education, college administrators, many faculty, college bound students and their parents, the general public, and employers may have a different view. They are likely to conclude that regardless of a student’s academic major, all college graduates, when confronted with the kinds of authentic tasks the CLA program uses, should be able to do the types of things listed in Figure 1. They also are likely to want to know whether the students at a given school are generally making more or less progress in developing these abilities than are other students.
In short, they want some benchmarks to evaluate progress given the abilities of the students coming into an institution. Right now, the CLA is the best (albeit not perfect) source of that information.
Stephen Klein, Richard Shavelson and Roger Benjamin
Stephen Klein is director of research and Roger Benjamin is president of the Council for Aid to Education, and Richard Shavelson is Margaret Jacks Professor of Education at Stanford University.