Numbers fascinate and inform. Numbers add precision and authority to an observation (although not necessarily as much as often perceived). The physical sciences revolve around the careful measurement of precise and repeatable observations, usually in carefully controlled experiments.
The social sciences, on the other hand, face a much more challenging task, dealing with the behavior of people who have an unfortunate tendency to think for themselves, and who refuse to behave in a manner predicted by elegant theories.
Under the circumstances, it's really quite remarkable that statistical predictions are as useful as they are. Advertisers ignore, at their peril, conclusions based on data gathered on large numbers of people acting alike. Supermarket shoppers or football fans behave in much the same way, no matter the infinite number of ways each member of the population differs in other respects. In their interaction with the location of shelved foods -- or forward passes caught -- few of these variations make a difference.
Population samples composed of large numbers of uniform members can be defined, observations made, statistics calculated, and policy deduced with astonishing accuracy.
Efforts have been made to extend this methodology to the classroom, and trillions of data elements have been gathered over the past 30 years describing K-12 activities, students, inputs, and outcomes. But judging from the state of K-12 education, little in the way of useful policy or teaching strategy has emerged. The reason is not immediately clear, but one surmises that while the curriculum path for K-12 children is similar, the natural variation among children, in teachers, in social circumstances and in school environment makes it impossible to create a uniform population out of which samples can be drawn.
At the postsecondary level, the problem facing the number gatherer is greatly exacerbated. Every student is different, almost intentionally so. A college might have 25 different majors each with three or four concentrations. Students take different core courses in different order, from different teachers. They mature differently, experience life differently and approach their studies differently. When all the variables which relate to college learning are taken into account, there is no broad student population. Put another way, the maximum size of the population to be examined is one!
This reality informed traditional accreditation. Experts in a field spoke to numbers of students, interviewed faculty, observed classroom lectures, and, using their own experience and expertise as backdrop, arrived at a holistic conclusion. There was nothing "scientific" about the process, but it proved remarkably successful. This is the accreditation that is universally acknowledged to have enabled American colleges and universities to remain independent, diverse, and the envy of the world.
In 1985, or thereabout, voices were heard offering a captivating proposal. Manufacturers, they said, are able to produce vast numbers of items successfully, with ever-decreasing numbers of defects, using counting and predictive strategies. Could not similar approaches enhance higher education, provided there were sufficient outcome data available? Some people, including then-Secretary of Education William Bennett, swallowed the argument whole. Others resisted, and the controversy played itself out (and was recorded!) in the proceedings of the National Advisory Committee on Accreditation and Institutional Eligibility (predecessor of the current National Advisory Committee on Institutional Quality and Integrity) between 1986 and 1990.
Advocates persisted, and states, one by one, were convinced of the necessity to measure student learning. And measure they did! Immense amounts of money, staff time, and energy went into gathering and storing numbers. Numbers that had no relevance to higher education, to effectiveness, to teaching or to learning. "Experts" claimed that inputs didn't count, and those who objected were derided as the accreditors who, clipboard in hand, wandered around "counting books in the library."
At one point, the U.S. Department of Education also adopted the quantitative "student outcomes" mantra, and accrediting agencies seeking recognition by the education secretary were told to "assess." "Measure student learning outcomes," the department ordered, "and base decisions on the results of these measurements."
Under duress, accreditors complied and subsequently imposed so-called accountability measures on defenseless colleges and universities. In essence, the recognition function was used as a club to force accreditation to serve as a conduit, instead of a barrier, to government intrusion into the affairs of independent postsecondary institutions.
Today, virtually all those who headed accreditation agencies in the 1990s are gone, and the new group of accreditors arrived with measured student learning outcomes and assessment requirements firmly in place. Similarly, college administrators hired in the last decade must profess fealty to the data theology. Both in schools and in accrediting agencies, a culture of assessment for its own sake has settled in.
But cautionary voices remain, arguing that the focus on quantitative measures, and the use of rubrics that have never been substantiated for reliability and validity, are costly to the goals of teaching and learning.
Numbers displace. Accreditors have been forced to rely on irrelevant numerical measures, rather than on the intense direct interaction that is one of the essentials of peer review. If there are failings to accreditation, they are at least partially due to decisions made on the basis of "data," rather than the intensely human interaction between site visitors and students, faculty, alumni, and staff.
Numbers mislead. Poor schools are able to provide satisfactory numbers, because the proxies proposed as establishing institutional success are, at best, remotely connected to quality and are therefore easily gamed. Bad schools can almost invariably produce good numbers.
Numbers distort. Participants at a national conference sponsored a few years ago by the U.S. Department of Education were astonished to learn that colleges had paid students to take the Collegiate Learning Assessment. Other researchers pointed out that seniors attributed no importance to the CLA and performed indifferently. Under the circumstances, it is impossible to use CLA results as a basis for a value-added conclusion. Can we legitimately have a national conversation about the "lack of evidence of growth of critical thinking" in college, based on such data?
Numbers distract. The focus on assessment has captured the center stage of national educational groups for almost two decades. A quick review of annual meeting agendas of major national education conferences reveals that pervasive assessment topics have pulled educators away from their proper concentration on learning and teaching. Seemingly, many people believe that effective assessment will result in improved teaching and learning. One observer compared this leap in logic to improving the health of a deathly ill person by taking his temperature. The current emphasis on "better" measures, then, would correspond to using an improved thermometer.
Numbers divert. Faculty members spend an untold number of hours outside of classroom time on useless assessment exercises. At least some of this time would otherwise have been available for engagement with students. Numbers divert our focus in other ways as well. Instead of conversations about deep thinking, lifelong learning, and carefully structured small experiments to address achievement gaps, faculty must focus on assessment and measurement!
Assessment has become a recognizable cost center at some institutions, still without any policy outcomes or improvements to teaching and learning, in spite of almost thirty years of effort.
This is not to be taken as a blanket attack on numbers. There are fields, particularly those with an occupational component, for which useful correlations between numerical outcomes and quality can be made. There are accrediting agencies which are instituting numerical measures in a carefully controlled, modest fashion, establishing correlations and realities, and building from there. Finally, there are fields with discrete, denumerable outcomes for which numbers can contribute to an understanding and a measure of effectiveness. But many other accreditors have been forced to impose measuring protocols that suffer from the flaws noted above.
It's time to restore balance. Government must begin to realize that while it is bigger than anyone else, it is not wiser. And those who triggered this thirty-year, devastatingly costly experiment should have the decency to admit they were wrong (as did one internationally known proponent at the February 4th NACIQI meeting, stating "with respect to measuring student learning outcomes, we are not there yet").
The past should serve as an object lesson for the future, particularly in view of the recently released Degree Qualifications Profile (DQP) bearing all the signs of another "proxy" approach to the judgment of quality.
Our costly "numbers" experience tells us that nothing should be done to implement this DQP until after a multi-year series of small experiments and pilot programs has been in place and preliminary conclusions drawn. Should benefits emerge, an iterative process with ever more relevant features can be presented to the postsecondary community. If not, not.
But never again should a social experiment be imposed on the American people without the slightest indication of reliability, validity or even relevance to reality.
Bernard Fryshman is an accreditor and a professor of physics.
In travels around the country, I’ve been seeing signs of a trend in higher education that could have profound implications: a growing interest in learning about learning. At colleges and universities that are solidly grounded in a commitment to teaching, groups of creative faculty are mobilizing around learning as a collective, and intriguing, intellectual inquiry.
This trend embraces the advances being made in the cognitive sciences and the study of consciousness. It resides in the fast-moving world of changing information technology and social media. It recognizes and builds upon new pedagogies and evolving theories of multiple ways of knowing and learning. It encompasses but transcends the evolution of new and better measures of student learning outcomes.
As more and more institutions sign on to administer the National Survey of Student Engagement and the Collegiate Learning Assessment, some see the resulting data as sufficient to close the books on the question of student learning, while others see them as no more than a rudimentary beginning. The advent of new instruments reflects in part the desire to unseat the commercial rating systems that wield enormous influence despite their well-known shortcomings and distortions. The new measurement regimes are responding, as well, to demands from accrediting and regulatory agencies for convincing data on "value-added educational outcomes." But educators know that assessing what students have learned is far less valuable than finding out how they learn.
Uri Treisman’s landmark study at Berkeley a quarter century ago validated this proposition. He compared how students of African and Chinese descent learned calculus, used the findings to export successful strategies from one group to the other, and evaluated the results. Richard Light’s studies at Harvard carry on the Treisman tradition.
Efforts to identify fruitful points of intervention in the classroom and in co-curricular offerings are picking up steam, importing into the councils of higher education -- and strengthening -- a line of educational research that had been largely overlooked by faculty and administrators whose disciplinary allegiances were with the liberal arts and sciences, not the study of pedagogical practice. A number of foundations, notably Teagle, Spencer, and Mellon, are funding empirical studies that are uniting these worlds. The Carnegie Foundation for the Advancement of Teaching has been a leading voice in this conversation for many years as, more recently, has the Association of American Colleges and Universities.
Faculty at Indiana University have since 1998 been fostering interdisciplinary communities for innovative course-focused research to improve undergraduate learning, and exporting the work through conferences of a growing International Society for the Scholarship of Teaching and Learning. Georgetown’s Center for New Design in Learning and Scholarship is hosting cutting-edge events to feed faculty interest in the scholarship of teaching and learning. John Seely Brown, former chief scientist at Xerox, has been exploring the edges of this new field, drawing, for example, on Polanyi’s distinction between "learning about" and "learning to be," activities that take place in iterative cycles ("I get stuck; I need to know more"). "Learning about" involves explicit knowledge; "learning to be" is more tacit: sensing an interesting question, feeling the rightness of an elegant solution. In this new digital age, he observes, we can enable with ease the "socially-constructed understanding" that fuels the cycles of being stuck and learning more through "interactions with others and the world."
"Something is in the air," adds Michael Wesch in a YouTube video that has been watched by over three million viewers. He’s standing in an old-fashioned auditorium at Kansas State University and the "something" that all teachers have no choice now but to reckon with is all of human knowledge instantly available to all students through their wi-fi connections. The pioneers on this new frontier are pursuing novel learning technologies that can be harnessed in the service of greater intellectual connection between students and faculty, enhanced student learning, less drudgery, more creativity, more freedom and more joy for students and faculty alike. Clay Christensen warns that if we fail at this task, "disruptive technologies" will do it for us, and eat our lunch.
Where might this lead? If groups of faculty were to think deeply and systematically over a number of years about student learning and student success, they could create for their own institutions and the wider field a more robust evidence-based culture of learning, a “science of improvement,” as groups of medical leaders are advancing for their profession.
An effort like this at one institution would require the gradual creation of highly-intentional learning (not teaching) cultures with explicit cycles of improvement in place throughout the college or university, starting with academic departments and working up from there. The results would be widely discussed by everyone: faculty, students, staff, trustees. Over time, and without much fanfare, they would influence hiring decisions and criteria for promotion and other rewards. Resources would be re-allocated to activities that were demonstrably advancing student learning in the context (not in lieu) of serious disciplinary scholarship.
This work would necessarily be multidisciplinary, iterative, and methodologically inventive and yet tight. It would come over time to define an inquisitive and ambitious learning community. The findings would not be available for use as a punitive club to force accountability to the state or federal government or to other external groups. Pressure for accountability must not be allowed to confound and corrupt the assessment and continuous improvement of learning outcomes.
I know that this essay is loaded with fighting words. But I believe we need, and are now beginning to see, ways to reframe the problem of learning outcomes, ways that might galvanize positive energy and support within a faculty. Imagine “the administration” saying to faculty, in effect: We want you to be learning all you can about who your students are now, how they learn and what they need to know in order to be successful in a world that is changing faster than we can imagine, much less anticipate. And we want you to have the resources and collegial connections you will need to make the pursuit of that question an exciting and fruitful complement to your scholarship. Learning science has produced stunning advances that need translation before they can be brought successfully into classrooms: findings and possibilities that at least some faculty might find inherently fascinating if they were approached in the right way and offered a supportive culture with meaningful incentives, rewards, and scholarly payoffs.
More than a decade ago, at Wellesley, I watched a group of faculty from several liberal arts colleges, with Trinity in the lead, take up the issue of how to close the academic achievement gap, an issue brought to attention by Bill Bowen and Derek Bok in The Shape of the River and one about which faculty cared deeply, an institutional failure they felt keenly as their responsibility. They found allies in their own and other institutions and created an organization (Consortium on High Achievement and Success), a collaborative learning group that invented an emergent process, adjusting as they went. They assembled data; consulted experts they could respect; found local champions in their own institutions and raised up their work; sought out promising strategies in other institutions; listened to their students’ accounts of challenges they were facing and developed student partnerships to address those issues. They pooled knowledge, shared data, assembled resources, designed honest conversations and entered them with inquiring minds. The element that was missing then was systematic research: testing pilot initiatives and developing intervention studies. Without solid research it’s impossible to know what really works. The learning initiative I have in mind would need to build this in from the start. But, first and foremost, it would have to be rooted, as was CHAS, in the belief among a group of faculty that their students could be better served.
I’m convinced that some faculty could become absorbed in a sophisticated intellectual collaboration to learn about learning. Throughout higher education, we fret about unsound expenditures we know are driven by crude rating systems and the fierce competitive dynamic they fuel. We are not going to eliminate competition between institutions of higher learning, even if we wanted to, which we probably don’t. But could we conceivably change the terms of the competition, put learning rather than amenities at the center of the arms race, spend less on making students more and more comfortable at college and more on making them more and more curious?
Now there’s a question worth asking.
Diana Chapman Walsh served as president of Wellesley College from 1993 to 2007.