Imagine that, instead of a college education as we now know it, we substituted a test-preparation course of study such as those offered by companies that prepare students for the SAT, ACT, and similar tests. The rationale for this course of study would be that the purpose of a college education is to improve performance on narrow cognitive assessments such as these. From this point of view, it makes sense to cut to the chase. Instead of studying English, history, mathematics, or science, students would prepare to do better on more advanced versions of the narrow cognitive tests used for college admissions. If the goal is to improve scores, why not teach directly to the tests?
When the goal is posed this way, few people probably would accept the substitution of test preparation for a genuine college education. It seems ill-advised. Yet the dominant trends in assessing learning in college might lead one to believe that, whatever educators may think, some of them act, perhaps inadvertently, as though this substitution of test-preparation for education would be a good idea. Which is to say: Oops, we already are moving in this direction!
Partially in response to pressure on the academy for accountability from the Spellings Commission, hundreds of institutions and entire state systems of higher education now assess learning in college via a standardized test, such as the Collegiate Learning Assessment (CLA), the ETS Proficiency Profile (ETS-PP, formerly the MAPP), or the Collegiate Assessment of Academic Proficiency (CAAP -- an ACT product). The CLA is intended to measure critical-thinking skills. The ETS-PP measures skills in critical thinking, reading, writing, and mathematics in the context of the humanities, social sciences, and natural sciences. The CAAP has modules measuring reading, writing skills, writing essay, mathematics, science, and critical thinking.
These are all reasonably valid and reliable tests, as far as they go, but they are narrow in what they measure. They achieve their reliability in part because they focus their assessments so narrowly. (So-called “internal-consistency reliability” rises to the extent that a test narrowly measures just a single construct.) So psychometrically, the tests are reasonably good ones. But the issue discussed here is not how “good” the tests are but how well they are used -- specifically, whether they have sufficient breadth to serve adequately as measures of learning in college.
Tests such as the CLA, ETS-PP, and CAAP measure skills similar to those measured by the SAT or ACT and are highly correlated with these tests. Moreover, data collected by the Voluntary System of Accountability (VSA) show the CLA, ETS-PP, and CAAP to be very highly correlated with each other. Other research by Douglas Detterman and his colleagues has shown that tests such as the SAT and ACT are highly correlated with IQ, meaning that, in the end, all these tests largely measure the same thing -- what psychologists call “general ability,” or g. What then can we conclude from scores on such tests?
A recent book, Academically Adrift, concludes that students learn frightfully little in college. Its conclusion is based in large part upon small or nonexistent gains on the CLA. The authors of the book point out several areas of genuine concern, such as lack of study time and writing experience on the part of college students. These concerns should not be ignored. But the book’s conclusion that higher education is “academically adrift” does not fully follow from its primary data. Although the authors recognize some of the limitations of their data, these limitations may not be fully recognized by readers and certainly have not been appreciated by reviewers. What is missing?
According to a carefully researched report recently released by the Lumina Foundation, which presents a “degree qualifications profile,” there are five areas in which college students should make demonstrable progress while in college: broad, integrative knowledge; specialized knowledge; intellectual skills; applied learning; and civic learning. Lumina further lists five intellectual skills: analytic inquiry, use of information resources, engaging diverse perspectives, quantitative fluency, and communication fluency. But one could consider an even more diverse set of intellectual skills. Consider four important kinds of thinking:
Analytical thinking. The tests measure analytical (or critical) thinking reasonably well, albeit somewhat narrowly defined. This kind of thinking is important in being able to analyze an argument, evaluate an article, or compare and contrast two ideas. Hence it is quite proper that the tests should measure this kind of thinking.
Creative thinking. We as college teachers and administrators want students to learn not only to analyze and evaluate what they read, but also to go beyond what they read -- to think creatively. Indeed, often our biggest complaint is that students have trouble getting beyond the book. Tests such as the CLA do not measure creative thinking.
Practical thinking. Students can learn in a way that produces good test results but then find themselves unable to use what they learn in practical settings. They could get an A in Spanish but be unable to speak the language; or an A in statistics but be unable to analyze their own data; or an A in English or history but be unable to persuade people to take their ideas about world events seriously. Tests such as the CLA do not measure practical thinking. Although the CLA uses scenarios that come from everyday life, it does not use scenarios from the students’ everyday lives, so the problems are, to the students, nevertheless relative abstractions.
Wise and ethical thinking. Students need not only to acquire a knowledge base, but also to learn how to direct this knowledge base in an ethical way toward a common good -- one that balances the student’s own interests with other people’s interests and larger interests, over the long as well as the short term. Tests such as the CLA do not measure wise or ethical thinking.
The importance of these four kinds of thinking has been well established through research on successful functioning in real-world educational and employment contexts. Individuals need creative thinking to generate new ideas, analytical thinking to ascertain whether their ideas are good ideas, practical thinking to implement their ideas and convince others of their value, and wise and ethical thinking to ensure that their ideas help to achieve a common good.
The CLA -- the measure used to establish the findings presented in Academically Adrift -- at best measures one-fourth of these essential intellectual skills. And it measures only a minuscule portion of the total range of outcomes highlighted in the Lumina Degree Qualifications Profile.
Creators of tests such as the CLA view themselves as assessing critical-thinking skills in serious contexts. But these are not the students’ real-world contexts, and moreover, they are not the rich contexts in which students are taught to think in the academic disciplines they study. The reason that students “major” in a discipline is not just to learn the content knowledge of that discipline but also to learn to think deeply in the context of that discipline: How, for example, would a physicist, or sociologist, or historian, or educator, or business executive think about a particular problem? Moreover, the Lumina Degree Qualifications Profile turns a spotlight on the importance of integrating knowledge across multiple disciplines and multiple sites of learning -- informal as well as formal.
One might argue that, in the first two years, most students do not yet major in any discipline. Even for those students who take two years of general education courses in multiple areas of study, however, the goal is to steep students in rich intellectual disciplines and their modes of inquiry. But the thinking measured by the CLA and similar cognitive tests pays no attention to the rich conceptual knowledge fostered in the disciplines.
Moreover, although we like to think that the main agenda of college is for students to learn formal disciplinary knowledge and to think with it, arguably, the agenda is as much for them to learn tacit knowledge — to learn the ropes, so to speak. Tacit knowledge is procedural. It deals with how you manage yourself so as to accomplish your goals and stay out of trouble, how you form relationships with people and network effectively, how and from whom you seek help when you need it, how you decide whom you can trust and of whom you should be suspicious, how you meet the demands of an organization (collegiate or otherwise) while maintaining a meaningful life, and so forth. These outcomes are largely the result of learning outside the classroom; so really, all those activities outside the classroom are not necessarily a waste of time or even time ill-spent. The Lumina Degree Qualifications Profile underscores the role that informal learning plays in developing essential competencies. But these skills are not measured by the CLA and its sister tests.
Of course, some will question whether the Lumina Foundation guidelines provide any kind of reasonable framework. But the leading organization for the promotion of the liberal arts in the United States, the Association of American Colleges and Universities, proposes through its Liberal Education and America’s Promise (LEAP) initiative the following critical areas of student progress: knowledge of human cultures and the physical and natural world; intellectual and practical skills; teamwork and problem solving; personal and social responsibility; and integrative and applied learning. These so-called “essential learning outcomes,” developed through a broad dialogue with the higher education community and with employers, are similar to those of the Lumina Foundation’s Degree Qualifications Profile. Indeed, this similarity to the LEAP essential learning outcomes is one of the strengths of the Lumina framework.
This nation made a serious mistake in introducing well-intentioned but poorly executed legislation, the No Child Left Behind Act, which has turned many of our elementary and secondary schools into glorified test-preparation centers. Do we dare now do the same for colleges? Do we really want to make preparation for narrowly conceived cognitive tests the primary goal of a college education? Or do we want to broaden assessment to include measures such as performances and portfolios, perhaps in addition to the narrower assessments? If we limit ourselves to narrow measures, we can say good-bye to our hopes of developing an internationally competitive, creative, and ethical society. We instead can say hello to creating a nation of excellent test-takers who will shine, but only in some dystopian world in which achieving high scores on tests is the measure of one’s contribution to society.
Ultimately, the goal of college education is to produce the active citizens and positive leaders of tomorrow — people who will make the world a better place. Narrow tests of cognitive skills do not measure the creative, practical, and wisdom-based and ethical skills that leaders need to succeed. We can and truly must assess much more broadly.
Robert J. Sternberg is provost, senior vice president and professor of psychology at Oklahoma State University, and a member of the board of the Association of American Colleges and Universities. The views expressed in the essay are entirely his own.