Robert J. Sternberg, the new provost of Oklahoma State University, has just finished a five-year term as dean of arts and sciences at Tufts University, during which time he had the opportunity to test out his ideas about non-cognitive evaluation of applicants. Sternberg has long argued that standardized testing and high school grades -- while appropriate considerations in determining who gets into which colleges -- tell only part of the story. At Tufts, he helped add optional admissions questions designed to measure creativity and other qualities that might well make someone an outstanding college student. In a new book -- College Admissions for the 21st Century (Harvard University Press) -- Sternberg discusses the experiment at Tufts and why he believes it shows the inadequacy of traditional college admissions tools. In an e-mail interview, in which he stressed that the book reflects his personal views as a scholar and not the views of Tufts, he discussed the work.
Q: For years, some colleges have boasted about having more creative essay questions on applications, or more insightful interviewers. Why do you think the "Kaleidoscope" system at Tufts goes beyond those sorts of application methods?
A: I should explain up front that Kaleidoscope consists of a system for evaluating applications to college that involves essays and other performances and products. It has been used at Tufts for the past five years (since my second year as dean of the School of Arts and Sciences there). Although I conceived of it, it was implemented by the terrific dean of admissions at Tufts, Lee Coffin, and his wonderful staff.
It is admirable that many colleges recognize the need to go beyond traditional measures such as G.P.A. and standardized test scores in admissions processes. The Kaleidoscope Project has three features that are perhaps distinctive. These features emanate from the view that the purpose of college/university education is to produce the leaders of tomorrow who will make a positive, meaningful, and enduring difference to the world.
First, the questions are based on a theory of leadership, WICS -- wisdom, intelligence, creativity, synthesized -- according to which positive leaders need a synthesis of (a) creative skills and attitudes in order to generate new ideas; (b) analytical skills and attitudes in order to ensure that the ideas are good ones; (c) practical skills and attitudes to implement their ideas and to persuade others of the value of these ideas; and (d) wisdom-based skills and attitudes to ensure that the ideas help to achieve a common good, over the long and short terms, through the infusion of positive ethical values. So the questions in Kaleidoscope are designed to measure these creative, analytical, practical, and wisdom-based skills and attitudes. The WICS theory is an extension of my theory of successful intelligence, which I have spent many years validating in empirical research.
An example of a creative question would be to write a story with a title such as "The End of MTV" or to draw an advertisement for a new product or service or to submit a creative video via YouTube. An example of an analytical item would be to state one’s favorite book and why it is one’s favorite book. An example of a practical item would be to explain how one convinced a friend of an idea the friend did not initially accept. And an example of a wisdom-based item would be to tell how one would take a current passion and transform it later to serve the common good.
Second, although responses to the questions are rated holistically, they are based on rubrics. The admissions raters were trained to use specific criteria in assessing responses. For example, creative strength is assessed in terms of novelty, quality, and task appropriateness; analytical strength is assessed in terms of organization, quality of analysis, logic, and balance.
Third, we have statistically validated how well the evaluations of the responses predict success in college. In this way, we can see which questions work well and which do not. Please note that the Kaleidoscope ratings of creative, analytical, practical, and wisdom-based skills are based on the whole application, not just on essays, drawings, and other products that we newly placed on the application.
Q: What do you see as the main lessons of your work at Tufts with alternative admissions?
A: I believe there are four main lessons.
First, it is possible to measure the qualities assessed in Kaleidoscope in a way that is practical and that is enjoyable for students; the students feel the essays and other products tell the college things that otherwise the college would not know about them. Second, we found that the responses predicted both academic and non-academic success and, moreover, these predictions were incremental over (on top of) those of standardized test scores and high school G.P.A.s. Third, evaluations of responses to the questions did not show significant differences between ethnic groups. In other words, our measures increase prediction of success but do not show adverse impact as a function of ethnic identity. Fourth, we found that the measures are cost-effective. As a dean, I found it relatively easy, in collaboration with the advancement office, to raise money for additional admissions personnel to score the assessments.
Our donors tended to be alumni and alumnae who themselves were leaders but whose test scores and G.P.A.s did not necessarily reflect their full range of leadership skills. The donations made it possible to fund the project without taking away resources that the university needed for other endeavors.
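The idea of predictions being "incremental over" test scores and G.P.A.s can be illustrated with a small sketch. It assumes entirely synthetic data and hypothetical variable names (sat, gpa, rating, outcome); none of the numbers come from the Tufts research, and ordinary least squares is used here only as one common way to check incremental validity, not necessarily the method the Kaleidoscope researchers used. The sketch fits one model on the traditional measures alone, a second model that adds the new rating, and compares how much outcome variance each explains.

```python
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def r_squared(predictors, outcome):
    """In-sample R^2 of a least-squares fit that includes an intercept."""
    rows = [[1.0] + list(p) for p in predictors]
    k = len(rows[0])
    # Normal equations: (X^T X) beta = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, outcome)) for i in range(k)]
    beta = solve_linear_system(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    mean_y = sum(outcome) / len(outcome)
    ss_res = sum((y - f) ** 2 for y, f in zip(outcome, fitted))
    ss_tot = sum((y - mean_y) ** 2 for y in outcome)
    return 1.0 - ss_res / ss_tot


def simulate(n=200, seed=0):
    """Synthetic applicants; every coefficient here is an invented assumption."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        sat = rng.gauss(0, 1)                # standardized test score (z-score)
        gpa = 0.5 * sat + rng.gauss(0, 1)    # high school G.P.A., correlated with sat
        rating = rng.gauss(0, 1)             # hypothetical non-cognitive rating
        outcome = 0.4 * sat + 0.4 * gpa + 0.5 * rating + rng.gauss(0, 1)
        data.append((sat, gpa, rating, outcome))
    return data


data = simulate()
outcome = [d[3] for d in data]
r2_base = r_squared([(d[0], d[1]) for d in data], outcome)        # traditional only
r2_full = r_squared([(d[0], d[1], d[2]) for d in data], outcome)  # plus new rating
delta_r2 = r2_full - r2_base

print(f"R^2, traditional measures only: {r2_base:.3f}")
print(f"R^2, with added rating:         {r2_full:.3f}")
print(f"Incremental R^2:                {delta_r2:.3f}")
```

A positive incremental R^2 in this toy setup simply reflects that the simulated outcome was built to depend on the rating. With real data, the gain would also need a significance test and a check on a held-out sample, since in-sample R^2 never decreases when a predictor is added.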
Q: How important are non-traditional measures to efforts to recruit and enroll more minority students at elite colleges?
A: Traditional standardized tests show rather substantial ethnic-group differences. Kaleidoscope measures do not. The reason, I believe, based on our research, is that members of different ethnic and socioeconomic groups have different prototypical conceptions of what it means to be smart and often need different skills to adapt to their environments as they grow up. For example, growing up in a challenging inner-city urban environment requires creative and practical survival skills that growing up in an upper-middle-class environment does not require. Growing up in an upper-middle-class environment may provide more opportunities for developing abstract analytical skills.
So members of different groups develop, on average, different skills, and Kaleidoscope assesses many of them, whereas traditional standardized tests assess primarily memory and analytical skills, but not creative, practical, or wisdom-based ones. That is, the traditional tests assess skills in which members of higher socioeconomic strata tend to have more of an advantage. I should add that Kaleidoscope measures are not designed to replace traditional assessments, but to supplement them.
Q: Many critics of traditional admissions systems have called on colleges to stop requiring the SAT or ACT. You haven't, and Tufts added the alternative system on top of a relatively traditional one for competitive institutions. Why don't you favor simply replacing all of the traditional testing measures?
A: Standardized testing such as that provided by the SAT and ACT was originally introduced to enhance fairness, not to diminish it. Different high schools grade in different ways, have students of different levels of academic skill, and provide different qualities of instruction, and so the idea was to provide a standardized measure that would be fair across all these different educational institutions. But three things did not go as planned by the original creators of the tests. First, the tests were created at a time when most of the applicants to college were middle- to upper-middle-class, white, and male. There was nowhere near the diversity among applicants taking the SAT and ACT that there is today. Second, the originators did not realize that scores would correlate highly with socioeconomic status, to the point that they would become near (but by no means complete) proxies for such status. Third, the originators probably did not realize that the creativity they showed in developing the tests would not be well matched by their successors -- the changes that have been introduced in the roughly hundred years since these tests started have been largely cosmetic, and psychometrically, the tests are very similar in their properties to the original tests.
Those who work for testing organizations might see this constancy of measurement as a positive thing. But imagine if other technologies, such as in telecommunications or medicine, were largely stuck a century in the past! The problem, as I see it, is that the skills measured by traditional tests are quite narrow and do not adequately reflect the full range of skills needed for college and life success.
That said, memory and analytical skills are important and so if these skills are not to be measured by standardized tests (because the tests are not used by a particular institution), then it is important that they be measured in some other valid way. In any case, I think the major problem with standardized tests has never been the tests themselves so much as their overinterpretation. Test scores can be useful if kept in perspective (as I believe they are at Tufts as well as at Oklahoma State).
It is easy to blame the testing companies, but really the problem is systemic: Our society as a whole has to move beyond narrow conceptions of the skills needed for life success. It is really something of a tragedy. For example, the meltdown on Wall Street and in the world in 2008 was in part a result of the work of people from highly prestigious colleges with outstanding standardized test scores whose analytical skills were not matched by comparable levels of wisdom. Do we really want to select for people who will use their outstanding analytical skills to enrich themselves at others' expense? I believe that, to some extent, we as a society have done so by our vast overemphasis on narrow test scores and our gross underemphasis on wisdom and ethical qualities.
Q: Much of the talk about admissions reforms comes at elite universities or small liberal arts colleges. Many public universities, in contrast, use fairly straightforward admissions systems involving some combination of grades and test scores. Does Kaleidoscope belong at these institutions? And can they afford this kind of very personal evaluation of students?
A: Oklahoma State, where I am now, has questions on its application that measure leadership skills and these are being refined for next year. That said, I think that public universities, or at least land-grant publics, such as where I am now, have a somewhat different mission from elite private schools. When I was at Yale and then at Tufts, our hope was that we could be as selective as possible. The goal was to select the very strongest students, and Kaleidoscope provided a way to broaden the selection process so as better to select the optimal leaders of tomorrow.
At a land-grant institution, access looms much larger. The institution wants to provide access to as many qualified students as possible, especially from within-state. So having a low selection ratio does not have the same meaning; our goal is to provide opportunity, not to exclude students from it. There is much more emphasis on value added. So at Oklahoma State, for example, we would have great pride in taking students who do not have sky-high grades or test scores and then providing an environment where, despite this fact, we could help them become outstanding leaders of tomorrow. Of course, we want to ensure that admitted students are qualified to do the work, and so we use traditional academic measures. But our mission is to serve the state (and the world), and it is not clear that we would best do so by rejecting as many students as possible. Doing so might improve our ratings in one magazine or another, but would actually take us in a direction that is contrary to our land-grant mission of access. So Kaleidoscope is still relevant; all schools should select tomorrow’s positive leaders. But it is useful in a different way, within the context of achieving access more than selectivity.