With today's release of the National Survey of Student Engagement, hundreds of colleges and universities will be studying their results and considering whether they should change policies or approaches to better reach students. But a new study released Friday argues that NSSE (pronounced "nessie") is seriously flawed, lacking validity for its conclusions and asking questions of students in ways that are sure to doom the value of the data collected.
NSSE "fails to meet basic standards for validity and reliability," writes Stephen R. Porter, an associate professor in Iowa State University's educational leadership and policy studies department. Porter's study -- presented in Vancouver at the annual meeting of the Association for the Study of Higher Education -- raises questions about most research based on surveys of students, and he stresses that he does not believe the problems are unique to NSSE. He even goes so far as to say that he now doubts the validity of research he himself has done in the past based on student surveys.
But he uses NSSE as his focus, in part, because it is a student survey that has captured the attention of so many higher education leaders.
"Most academic research" using student surveys "is ignored, but this has a huge impact," he said in an interview. "If I get something wrong in a journal article, maybe two dozen people read it, but this is something they are using to tell colleges what to do. There are high stakes on how these surveys are used."
Alexander C. McCormick, NSSE director and associate professor of education at Indiana University at Bloomington, said in an interview that "some of the issues [Porter] raises have legitimacy," but he also said that Porter is overstating some of the problems and ignoring some evidence that backs NSSE's methodology.
In his paper, Porter gathers evidence from other researchers that suggests NSSE's basic approach to asking questions of students is flawed. One key issue he raises is that students don't necessarily know what it means when they are asked if certain practices or experiences are frequent or rare -- even though such measurements are critical to NSSE questions.
For example, Porter cites a 1982 study in which college students were asked the same question in two ways (over the course of a longer survey, so it wouldn't be obvious to students that they were answering the same question twice). The question was typical of NSSE questions in asking students how frequently they made an appointment to see a faculty member, and the first set of answer choices was similar to the kinds of answers NSSE uses: occasionally, often and very often. But in the second iteration of the question, students were given more specific answers, ranging from "once or twice a year" to "more than once a week."
As the results indicate, there is considerable variation among students on what they mean when they say "often," which in this study was almost as likely to mean "once a week" as "3 to 6 times a year." And students with the same meaning (when defined precisely) check different categories when given the NSSE-like answers.
Consistency of Student Interpretation of Occasionally, Often and Very Often
| Students Who Checked... | Occasionally | Often | Very Often |
| --- | --- | --- | --- |
| ... and also checked "never" for the same question | 4% | 0% | 0% |
| ... "once or twice a year" | 37% | 6% | 2% |
| ... "3 to 6 times a year" | 38% | 24% | 9% |
| ... "1 or 2 times a month" | 18% | 45% | 33% |
| ... "about once a week" | 3% | 21% | 35% |
| ... "more than once a week" | 0% | 4% | 21% |
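The overlap among the three vague answers can be made concrete with a little arithmetic on the table's percentages. The sketch below is illustrative only and is not taken from Porter's paper or the 1982 study: the function names and the overlap score (the sum of the smaller share in each row) are assumptions chosen for simplicity.

```python
# Percentages transcribed from the table above (each column sums to 100).
SPECIFIC_ANSWERS = [
    "never",
    "once or twice a year",
    "3 to 6 times a year",
    "1 or 2 times a month",
    "about once a week",
    "more than once a week",
]

VAGUE_ANSWERS = {
    "occasionally": [4, 37, 38, 18, 3, 0],
    "often": [0, 6, 24, 45, 21, 4],
    "very often": [0, 2, 9, 33, 35, 21],
}


def most_common_meaning(label):
    """Return (specific answer, percent) chosen most often for a vague label."""
    shares = VAGUE_ANSWERS[label]
    top = max(range(len(shares)), key=lambda i: shares[i])
    return SPECIFIC_ANSWERS[top], shares[top]


def overlap(label_a, label_b):
    """Hypothetical overlap score: sum of the smaller share in each row.

    0 means the two labels never map to the same specific frequency;
    100 means their distributions are identical.
    """
    pairs = zip(VAGUE_ANSWERS[label_a], VAGUE_ANSWERS[label_b])
    return sum(min(a, b) for a, b in pairs)


if __name__ == "__main__":
    for label in VAGUE_ANSWERS:
        meaning, pct = most_common_meaning(label)
        print(f"{label}: most often means '{meaning}' ({pct}%)")
    print("often vs. very often overlap:", overlap("often", "very often"))
```

By this rough measure, "often" and "very often" share 69 of 100 percentage points, and no vague answer maps to a single specific frequency for even half of the students who chose it.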
But it's not just that students don't measure frequency the same way, Porter argues. They also don't know (at least with precision) many other terms used by NSSE.
For instance, there is a question about whether students discuss grades or assignments with instructors. Porter points out that some students may count teaching assistants as instructors and others may not.
Or the question about "serious conversations with students." Of this question, Porter asks: "How does a student distinguish between serious and frivolous conversations? And what is a conversation? A chat in the bathroom? An hour-long bull session in a dorm room?"
And then there is the question about whether students believe that their college helps them learn to think critically and analytically. Porter writes that this question is "a good example of how we let educational jargon creep into our surveys, and then assume that students understand what we mean. Recently, a graduate student interviewed me for a class she was taking about teaching, and asked me how I taught critical thinking in my classes. We then proceeded to have a discussion about what she meant by critical thinking, because I wasn’t clear on what it meant, in terms of what she was asking. If I, as a higher education researcher, have trouble defining the phrase 'critical thinking,' how can we expect the average college student to understand the concept, much less ensure that this understanding is similar across college students?"
In these cases, students may be honestly answering questions, but their lack of knowledge may result in wide variations of what the data mean, Porter says.
In yet other cases, there is research to suggest that students may not be entirely honest. Porter writes of numerous studies suggesting that students engage in a bit of grade inflation when asked about their academic performance, and tend to answer questions in ways that make themselves look like slightly better students than they really are.
Given these and other issues, Porter writes that NSSE cannot be presumed to be valid -- and he questions the idea that there is any evidence that NSSE scores in their current form are a good indicator of student learning.
"The promise of a survey instrument that can quickly and relatively cheaply provide an alternative to actually measuring learning has, not surprisingly, been alluring to many colleges," he writes. "That an instrument that fails to meet basic standards of validity and reliability has been so quickly adopted by numerous institutions indicates the desire of many institutions for a solution to this issue."
What to do? In some cases, Porter argues for additional validity testing to see whether there are notable patterns -- by institutional competitiveness, sector or major, for example -- on how students respond to questions. In other cases, he argues for much more detailed definitions and instructions. And he suggests that other approaches -- such as time diaries, in which students carry a diary and record what they do over a period of time, rather than later remembering what they did -- are far more accurate.
In an interview, Porter stressed that he too believed (from his experience, not from research) that qualities promoted by NSSE, such as close student-faculty interaction, are important.
Porter went to Rice University, where he said he remembers many qualities of the type praised by NSSE in producing engaged students. "I would want my child to go to a college that has those characteristics," he said. "The question is: Can the NSSE help us identify those colleges?" He said that because of the "grand claims" made by NSSE and largely unquestioned by academic leaders, it has become the "gold standard," when it really needs an overhaul. Where is the evidence, he said, that students understand the questions, and that their answers lead not only to engagement but to learning?
The NSSE Response
McCormick, the NSSE director, said that he and his colleagues "are the last persons to say that NSSE is perfect," and that Porter had raised some issues "we need to deal with." But he also said that Porter was leaving out some key context about survey research generally and the realities of student surveys.
As to what Porter got right, McCormick said it was important for those who do survey-based research to regularly consider whether the wording of various questions makes them vulnerable to misinterpretation. He said, for example, that while he thinks the question on critical thinking isn't typical, it is the question where "one could make the strongest case that it's a jargon-ish question."
But McCormick said that NSSE has done extensive validity studies on its questions, typically through focus group "cognitive interviews" in which subjects are asked to think aloud about their answers to various questions. He said this process was used to verify that questions were being answered consistently, and to refine wording. Further, he said that earlier versions of NSSE included lengthier instructions and definitions, and that researchers found that students simply ignored the information, leading to the belief that more students will participate, and participate thoughtfully, with shorter introductions to the questions.
McCormick also questioned the cost and practicality of giving students time diaries to carry and fill out over some period of time. He noted that students can fill out a NSSE survey in 15 to 20 minutes, making it much easier to gather information from large numbers of students.
"It's certainly true that if we equipped all students with time use diaries and said 'Fill this out every half hour to tell me what you do,' we would get much more precise estimates of how many hours a week they spend. Or if we hired someone to follow them around, we would also get more information," he said, but he wondered how many students would go for this approach.
Porter was setting a false standard for NSSE (and other surveys) to live up to, McCormick said. "A lot of what's written about in the paper are problems that are common in social science research," he said. "Our measurement tools are blunt, and NSSE is certainly no exception to that."
Finally, McCormick said that a lot of the value of NSSE was in "relative comparisons" in how an individual college does from year to year, how various parts of a college do, or how groups of colleges do. So if a college sees relatively low scores in an area, and shifts policies, and sees an improvement on NSSE, that means something worked, and if there is no improvement, more work may be needed. Improvement may be valid, even if the students' answers to questions do not yield "some precise numeric comparison," he said.
While McCormick acknowledged the possibility that NSSE may need to be "more clear" about the value of relative comparisons as opposed to individual figures, he said "that's what most of the schools are doing."
It would be a problem, McCormick said, if research found that students at different kinds of institutions or in different types of academic programs answered questions in notably different ways, but he said he saw no evidence of this, and that this concern was "not a severe threat to the instrument."
Porter said he wasn't impressed with NSSE's response. Porter said that NSSE places too much emphasis on maximizing the number of respondents, as opposed to maximizing the chances that answers are accurate. "There seems to be this attitude that people doing surveys in the field should be held to lower standards. I don't agree," he said.