The Limitations of Portfolios

October 16, 2009

Colleges have come to realize the need to assess and improve student learning and to report their efforts to students, faculty, administrators, and the public, including policy makers, prospective students, and their parents.

The question is how to accomplish this. The roar of yesterday’s Spellings Commission and its vision of accountability is background noise to today’s cacophony of calls for more transparency and campus-based, authentic assessment of student learning. Some of the advocates for more authentic measures, such as Carol Schneider, president of the Association of American Colleges and Universities, have suggested using electronic portfolios -- collections of a student’s work products, such as term papers, research papers or descriptions, and the student’s written thoughts (“reflections”) about these work products and curricular experiences that are bundled together on an electronic platform. The presumed merits of portfolios, such as their supposed ability to drill down into the local curriculum, have been extolled elsewhere.

Portfolios are simply not up to the task of providing the necessary data for making a sound assessment of student learning. They do not and cannot yield the trustworthy information that is needed for this purpose. However, there are approaches that can provide some of the information that is required.

Portfolio Assessment’s Inherent Limitations

There are three major reasons portfolios are not appropriate for higher education assessment programs: They are (a) not standardized, (b) not feasible for large-scale assessment due to administration and scoring problems, and (c) potentially biased. Indeed, course grades, aggregated across an academic major or program, provide more reliable and better evidence of student learning than do portfolios. Here’s why.

Lack of Standardization

Standardization refers to assessments in which (a) all students take the same or conceptually and statistically parallel measures; (b) all students take the measures under the same administrative conditions (such as on-site proctors and time limits); (c) the same evaluation methods, graders, and scoring criteria are applied consistently to all of the students’ work; and (d) the score assigned to a student most likely reflects the quality of the work done by that student and that student alone (without assistance from others).

Portfolios do not and cannot meet the requirements for standardization because by their very nature, they are tailored to each student. AAC&U’s attempts at “metarubrics” are not even close to being an adequate solution to address this problem. Portfolio advocates simply ignore the evidence that valid comparisons in the level of learning achieved can only be made when students take the same or statistically “equated” measures (such as different versions of the SAT).

Without standardization, faculty and administrators at individual campuses cannot answer the fundamental questions: Is the amount of student learning and level of achievement attained by the students at our campus good enough? Could they do better, and if so, how much better? For example, are the critical writing skills of our students on a par with those of students at comparable institutions and if below, what might be done to improve their performance?

The reason that campuses using portfolio assessment cannot answer these types of questions is that determining how much learning has occurred has to be measured by comparison to some type of standardized benchmarks. For example, to assess whether seniors write better than freshmen, both groups need to respond to the same essay questions within the same time limits and have their answers mixed together before being graded by readers who do not know whether an answer was written by a freshman or senior.
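The blind comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record keys ("cohort", "text"), the `score_fn` stand-in for a human reader, and the function name `blind_score` are all hypothetical and do not describe any actual assessment program.

```python
import random
from statistics import mean


def blind_score(responses, score_fn, seed=0):
    """Score pooled responses without exposing cohort labels to the rater.

    responses: list of dicts with hypothetical keys "cohort" and "text".
    score_fn:  stand-in for a human reader; it receives only the text.
    Returns a dict mapping each cohort to its mean score.
    """
    rng = random.Random(seed)

    # Pool both cohorts and shuffle so reading order carries no cohort signal.
    pooled = list(enumerate(responses))
    rng.shuffle(pooled)

    # The rater sees the text only; the cohort label is never passed in.
    scores = {idx: score_fn(item["text"]) for idx, item in pooled}

    # Unblind only after all scoring is finished, then compare cohort means.
    by_cohort = {}
    for idx, item in enumerate(responses):
        by_cohort.setdefault(item["cohort"], []).append(scores[idx])
    return {cohort: mean(vals) for cohort, vals in by_cohort.items()}


if __name__ == "__main__":
    sample = [
        {"cohort": "freshman", "text": "short answer"},
        {"cohort": "senior", "text": "a longer, more developed answer"},
    ]
    # Word count is a toy scoring rule used only to make the sketch runnable.
    print(blind_score(sample, score_fn=lambda text: len(text.split())))
```

The point of the design is that cohort membership is attached to scores only after grading, so any freshman-senior difference cannot be traced to the reader's expectations.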

The same standardization is needed to assess whether the students at one school (or in one program within a school) are more proficient (or learned more) than students at similar schools. In short, learning has to be measured by some type of standardized, controlled, and unbiased comparison. There is no absolute scale (like weight and height) that is interpretable in and of itself.

Descriptions of scoring criteria are not sufficient to ensure comparable grading standards even when benchmark answers are used to train raters. To answer the “good enough” question, performance comparisons -- “benchmarking” -- are necessary. But benchmarking cannot occur without standardization, and it is needed to interpret differences in scores between programs within a campus and between peer campuses. Without standardization, differences might be due to variation in portfolio content, rater background and training, assistance provided to students for building their portfolios, bias (see below), and a host of other factors.

Valid interpretations of differences in scores between students, programs, and schools can only occur when the assessment is standardized. Only then can institutions monitor their students’ progress toward improving their skills and abilities relative to (a) their school’s academic standards, (b) the progress made by their classmates, and (c) the improvements in performance made by students in other programs and similar institutions. Ironically then, by eliminating the standardization that is necessary for benchmarking learning, the portfolio method prevents making the kinds of comparisons that are essential for assessing improvement.

We recognize that there are roles for portfolios. For example, they might be used to provide information about the range of tasks and activities students engage in and their views about the importance of different aspects of their education and campus experiences. This information may have heuristic value in providing possible insights into areas for improvement.

Not Feasible for Large-Scale Learning Assessment

By their un-standardized nature, portfolios (even electronic ones) are not practically feasible on a large scale. A moment’s reflection reveals why this is true. Because of their length, a single grader will typically need an hour or so to grade a single portfolio. To assure adequate score reliability, each portfolio needs at least two independent graders (and major differences between them should be resolved by a third). In addition, due to the potential interdisciplinary nature of a portfolio’s contents, raters with different areas of expertise might be needed, which could lead to even more scoring time and feasibility problems.
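To make the scale of the workload concrete, here is a rough back-of-the-envelope sketch. The one hour per reading and the two readers per portfolio come from the paragraph above; the cohort size and the share of portfolios needing a third reader are assumptions chosen purely for illustration, and `grader_hours` is a hypothetical helper.

```python
def grader_hours(n_portfolios, hours_per_read=1.0, readers=2, third_read_rate=0.0):
    """Estimate total grader-hours for one portfolio scoring round.

    third_read_rate is the assumed share of portfolios whose two scores
    differ enough to need a third reader (a hypothetical parameter).
    """
    if not 0.0 <= third_read_rate <= 1.0:
        raise ValueError("third_read_rate must be between 0 and 1")
    base_reads = n_portfolios * readers            # every portfolio is read twice
    extra_reads = n_portfolios * third_read_rate   # expected tie-breaking reads
    return (base_reads + extra_reads) * hours_per_read


if __name__ == "__main__":
    # Assumed cohort of 1,000 students, with 10% of portfolios sent to a third reader.
    print(grader_hours(1000, third_read_rate=0.10))
```

Under these assumptions the total grows linearly with enrollment, and the estimate does not yet include rater training or the extra readers that interdisciplinary content might require.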

For portfolios to be truly authentic, they have to relate to each student’s academic major or combination of majors. Hence, different teams of graders (and most likely different scoring rubrics) are needed for students with different majors. These and related concerns preclude combining results across students with different and perhaps unique combinations of majors.

Computer technology cannot solve portfolio feasibility and reliability problems. For example, computers with natural language processing software have been shown to provide a cost-effective and accurate way to grade large numbers of student responses to essay questions and other open-ended tasks. However, these machine grading methods require standardized prompts. They require that thousands of students respond to the same prompt and thus they are not applicable to portfolios.

Simply put, the time, content expertise, and other demands of grading portfolios substantially exceed those of grading constructed responses (e.g., essays) that are administered and scored under standardized conditions, and that is what makes portfolio grading infeasible. Incidentally, the solution to this problem does not lie in having local faculty grade portfolios, even when this is justified as part of a professor’s instructional and professional development responsibilities. The evidence is clear: in large-scale programs, portfolio assessment overwhelms faculty and is a source of faculty resistance and low morale. Portfolio assessment, then, is simply not a feasible or practical tool for large-scale assessment programs.

Bias

A portfolio may include a photograph, videoclip, or other information about student identities. Their gender, race, ethnicity, and other characteristics also may be known by those evaluating the portfolio. This lack of anonymity may bias results.

***

Faculty are understandably skeptical of standardized tests. In an article last year in Academe, Gerald Graff and Cathy Birkenstein pointed out that many faculty erroneously equate standardized exams with the highly questionable multiple-choice tests that characterize the implementation of the No Child Left Behind Act. Professors and administrators rightly celebrate the diversity of American higher education and therefore do not see how the same standardized test could be used across this range of institutions. However, colleges may share some important goals. For instance, virtually all faculty and college mission statements agree that critical thinking and writing skills are essential for all college graduates to possess. Graff and Birkenstein put it well:

A marketing instructor at a community college, a biblical studies instructor at a church-affiliated college, and a feminist literature instructor at an Ivy League research university would presumably differ radically in their disciplinary expertise, their intellectual outlooks, and the students they teach, but it would be surprising if there were not a great deal of common ground in what they regard as acceptable college-level work. They (these instructors) would probably agree -- or should agree -- that college-educated students, regardless of their background or major, should be critical thinkers, meaning that, at a minimum, they should be able to read a college-level text, offer a pertinent summary of its central claim, and make a relevant response, whether by agreeing with it, complicating its claims, or offering a critique.

If standardization is possible, the question arises as to whether it is possible to standardize “authentic” tasks. David C. McClelland's 1973 paper provided the key to authenticity with standardization. He argued for a “criterion-sampling” approach to assessment in which students confront “real-world” tasks like those they may face in their further education, work, and private and civil lives. As McClelland said, if you want to know if a person can drive a car, observe and evaluate his performance on a sample of tasks like starting the car, pulling out into traffic, turning left, parking and the like. Moreover, you can evaluate performance on these tasks in a standardized way. Put succinctly, he provided a strong argument for gaining authenticity through the assessment of criterion performances.

Performance assessment, then, represents an authentic, standardized testing paradigm in which students craft original responses to real-life (criterion-sampled) tasks. For example, most state bar examinations now include tasks in which candidates are given a realistic case situation and asked to use a library to perform a typical task, such as prepare deposition questions or a points-and-authorities brief, draft instructions for an investigator, or write a letter to opposing counsel. Candidates are given a “library” of documents and told to base their answers on the information in these documents. The library might include the opposing counsel’s brief, excerpts of relevant and irrelevant case law, letters, investigator reports, and other documents… just like they would review in practice. Performance tasks also have been used in credentialing teachers.

We applied this testing paradigm in developing the Collegiate Learning Assessment (CLA). This testing tool taps critical thinking, analytic reasoning, problem-solving and written communication skills of college students with standardized analytic writing and performance tasks that have been described elsewhere. Over 450 colleges with 200,000 students have participated in the CLA. Faculty and students recognize its authenticity and report that its tasks tap the kinds of thinking and reasoning they expect a college education will help students perform.

We are concerned about the suggestion to replace standardized higher education measures with electronic portfolios as a means for assessing the effects of campus programs and as a response to the demand for external accountability. Because of the inherent problems with portfolios, they do not and cannot provide trustworthy, unbiased, or cost-effective information about student learning. This is just not in their DNA.

Gathering valid data about student performance levels and performance improvement requires making comparisons relative to fixed benchmarks and that can only be done when the assessments are standardized. Consequently, we urge the higher education community to embrace authentic, standardized performance-assessment approaches so as to gather valid data that can be used to improve teaching and learning as well as meet its obligations to external audiences to account for its actions and outcomes regarding student learning.

Richard J. Shavelson is a professor of education at Stanford University. Stephen Klein and Roger Benjamin are director of research and development and president/CEO, respectively, at the Council for Aid to Education, which owns the Collegiate Learning Assessment.

Comments on The Limitations of Portfolios

  • Viewpoint or ad?
  • Posted by Merilee Griffin on October 16, 2009 at 7:30am EDT
  • I wish the disclosure that two of the authors have vested interests in the CLA had appeared at the beginning of this piece instead of at the end. I read about 3/4 of it agreeing with some points, disagreeing with others, and wondering where it was headed. When it turned out to be a rather lengthy ad for the CLA, I felt let down.

    The authors are right that performance benchmarking and time-to-read are problems with portfolio assessment, but the conclusion that the CLA is the answer is a leap. Although the intellectual tasks required by the CLA are definitely a cut above most tests, they are still far more simplistic than the real-life tasks of analyzing and writing about a messy world students face in college courses and life beyond college. For example, skill in sorting through the glut of information available online and in electronic libraries is a critical task, one that cannot be measured by any test in which students are isolated from reality.

    If higher education is harnessed to such tests and sanctions are applied to test results, we can expect greater emphasis on the relatively few, simple skills that can be measured by tests, while more complex intellectual work - far harder to teach and to learn - is neglected.

    Other alternatives are being developed in hundreds of programs and departments across the country and need only an electronic connection and a little more time to grow into assessments that can be both authentic and standardized. Stay tuned.

  • Assessment for What?
  • Posted by Victor Borden , University Planning, Institutional Research & Accountability at Indiana University on October 16, 2009 at 8:00am EDT
  • The authors, with vested interests as noted in the first comment, base their argument on a very narrow view of assessment, that is, one focusing primarily on assessment for accountability and comparability. Although there are clearly limitations and challenges to both portfolio assessment and assessment using standardized performance examinations, they each have their place and serve slightly overlapping but mostly differing purposes. For faculty and staff who are trying to understand at a deep level how students are gaining knowledge, skills and abilities from their programs, a portfolio is far more appropriate than a standardized exam (unless the program has some very narrow purposes that happen to align well with the exam). For institutional-level accountability purposes, I would argue that neither is currently very useful. However, I would agree that a standardized exam, like the CLA, can be used constructively to catalyze assessment efforts, stimulate vigorous discussion at many levels, and perhaps lead us to better solutions.

  • Posted by Henry Vandenburgh on October 16, 2009 at 9:00am EDT
  • Not standardized? Exactly the point!

  • The easy way out. . .
  • Posted by Cliff Adelman , Senior Associate at Institute for Higher Education Policy on October 16, 2009 at 9:15am EDT
  • This is a “please buy my test” article that goes after a straw man and conveniently ignores the very difficult and long-term strategy of developing degree qualifications frameworks that the Europeans have undertaken under the Bologna Process. They are now being imitated on other continents for a reason: there are ways to do this with public benchmarking of performance criteria and discrete learning outcomes. Probably more importantly, Bologna includes the “Tuning” process at the level of the discipline, where faculty across institutions determine templates with discrete reference points for student learning outcomes (not a straitjacket strategy, rather one that respects departmental autonomy while executing some convergence within disciplines). It is no accident that “Tuning,” too, has been adapted elsewhere (e.g. in 186 universities across 19 countries in Latin America), is under planning in Australia, and, most poignantly, is subject to pilot projects in three states (Indiana, Minnesota, and Utah) under Lumina Foundation sponsorship. Each team in the Tuning USA project includes everyone from the flagship state university to the community college system, with private institution participation, and a student. Six disciplines were selected by the state (2 in Minnesota and Utah, 1 in Indiana, and 1 overlap–history in Indiana and Utah). They will report out in December, and one expects these efforts to continue and expand. It’s a long haul (the European Tuning teams have been at it, in phases, for anywhere from 6 to 12 years, and the Latin Americans have been at it since 2004), but it’s a bottom-up effort with faculty ownership in the disciplines in which they have been trained and organized, and has a far better chance of real impact on the lives and learning of students -- something the CLA or any other standardized test, restricted or unrestricted response, does not.
Of course, it’s a lot of work, and we’d rather buy a test and put the issue to sleep, wouldn’t we?----Cliff Adelman, Institute for Higher Education Policy

  • The Proof is in the Portfolio
  • Posted by Carol Schneider , President at Association of American Colleges and Universities on October 16, 2009 at 10:00am EDT
  • For the convenience of readers who are just learning about AAC&U's approach to assessment, we're providing links to three resources available to download.

    1) The first is Our Students' Best Work: A Framework for Accountability Worthy of Our Mission. Revised and reissued last year, this is an official Board of Directors statement. It describes ways of focusing assessments on students' actual work, completed across the curriculum. The core idea is captured in the title. Assessments ought to motivate students to do their very best work, and higher education ought to make the production of such "best work" a focal point for the college curriculum. When students are producing "authentic work," that work can be assessed using validated rubrics by faculty who have been trained to apply rubrics to samples of student work.

    Recognizing the scalability challenge this approach to assessment presents, Our Students' Best Work recommends that each academic program build into the regular curriculum abundant opportunities for students to practice and produce work that deploys important college outcomes, such as analysis, communication, problem solving, engagement with difference, and integrative learning. For purposes of institutional assessment and external reporting, a random sample of portfolios can be scored and reported using rubrics and multiple blind raters.

    2) The second link takes you to the VALUE rubrics that have just been released through AAC&U's federally funded national project, Rising to the Challenge. These rubrics are keyed to the essential learning outcomes that AAC&U has developed--in concert with the higher education community--through its ongoing initiative, Liberal Education and America's Promise (LEAP). The LEAP VALUE rubrics feature "dimensions" of specific learning outcomes that faculty should take into account in determining a student's growth in competence through his or her studies. The VALUE project studied hundreds of existing campus rubrics for specific learning outcomes that faculty had already developed to assess student work and progress. The rubrics were developed by faculty-led expert teams and have been tested multiple times against actual student work at many different institutions.

    3) The third link takes you to my own essay, "The Proof is in the Portfolio," which I published last year to express my dismay that higher education, in the wake of the Spellings furor, was now piloting the use of a single test to be taken by student volunteers that would supposedly provide external evidence about what students have learned over time. While I respect my CLA colleagues for their psychometric fervor, I stand firmly by my view that no institution should use a single test, taken by a set of student volunteers, to form or report judgments about the quality of student achievement across the entire family of programs and majors.

    As I said in my essay, we are educators. As educators, we have a responsibility to help our constituents distinguish between good practice and bad practice. Using a single measure to capture the academic achievement of an entire college or university curriculum is bad practice.

    Our Students' Best Work: A Framework for Accountability Worthy of Our Mission
    http://www.aacu.org/About/statements/assessment.cfm
    VALUE Rubrics
    http://www.aacu.org/value/rubrics/index.cfm
    The Proof Is in the Portfolio
    http://www.aacu.org/liberaleducation/le-wi09/le-wi09_president.cfm

  • Accountability does not equal comparability and standardization
  • Posted by Jeremy Penn on October 16, 2009 at 10:15am EDT
  • Accountability does not require nationally comparable and standardized data. In K-12 education the state of Nebraska used an NCLB approach (called STARS) that allowed individual school districts to set their own standards (as long as they were equal to or higher than the state standards) and create or select measures (local assessments, multiple measures, non-high stakes, classroom based, and high quality) of student achievement that would demonstrate student achievement of those standards. The measures were carefully evaluated by a state-wide team of assessment and measurement experts for evidence of validity, reliability, and alignment to the district's standards. The approach appeared to have strong benefits for student learning and led to considerable involvement of teachers, resulting in improved teaching. See Roschewski, Isernhagen, and Dappen (2007, Nebraska STARS: Achieving Results, Phi Delta Kappan, 87(6), 433-437). The Nebraska STARS approach has mostly disappeared due to political pressure, but was a clear model for an accountability approach that did not focus on comparability and standardization and led to improvement of student learning (the true goal of any accountability program).

  • All of the Above
  • Posted by K. Scott Alberts , Associate Professor of Statistics and Portfolio Project Director at Truman State University on October 16, 2009 at 11:00am EDT
  • While the CLA is good for what it is good for, so are portfolios.
    The weaknesses you point out are also their strengths.
    Would that a single test do the trick.

    A school that really cares about improvement should do a variety of these things.
    Multiple measures tell a richer story, and richness is what is needed to encourage real change.
    I'm not sure that isn't the point you are thinking you are making, but it isn't the point I see here.

  • Assessment means letting institutions decide
  • Posted by Ashley Finley , Director of Assessment for Learning at AAC&U on October 16, 2009 at 12:15pm EDT
  • Many good points have already been made in the comments preceding this one – so I’ll resist rehashing those (e.g. standardization alone will not adequately assess learning outcomes, institutions should seek to examine a complex and meaningful array of learning outcomes that include but also go well beyond good writing). I was struck by the essay’s dismissal of e-portfolios as “potentially biased” while adding that standardization is beneficial because “(d) the score assigned to a student MOST LIKELY reflects the quality of the work done by that student and that student alone (without assistance from others).” The bottom line is validity is hard to achieve regardless of the instrument. Standardization provides a decent means of achieving reliability, but replication isn’t helpful if you’re just consistently missing the mark.
    My work with campuses has suggested that for assessment to work at an institution (meaning people actually do it, pay attention to results, and try to do something meaningful with the data), campus constituencies (administrators, faculty, and students) need to have input into how outcomes are defined, evaluated, and contextualized. Standardization makes this difficult by necessarily having to apply a one-size-fits-all model across varied and unique campus climates. So the idea that “Without standardization, faculty and administrators at individual campuses cannot answer the fundamental questions” is simply inaccurate. Institutions need to find a face validity that works for their institution and apply assessment accordingly. This may very well include standardization and it may very well include e-portfolios, but I think we’re all agreed it shouldn’t include just one thing.

  • Another perspective
  • Posted by Peter Ingle , Director of the Learning Coalition at Westminster College, Salt Lake City, Utah on October 16, 2009 at 1:30pm EDT
  • The move of many institutions to a focus on a central learning mission and a specified set of college-wide learning goals has run into many problems. For many, the notion of a grade on one assignment, or course or even a set of courses falls short of demonstrating the mission or college goals. Those assessments carry the same issue of reliability, validity, bias and standardization expressed by the author as problems with portfolios.
    A portfolio offers a creative and effective method of having students demonstrate learning over a period of time. Combined with rubrics utilized by a wide variety of individuals in a variety of disciplines, the portfolio can provide another important measure of learning. However, it is just one of a number of assessments that help the student to demonstrate knowledge and skills. It is not the only measure.

  • Maybe this approach to assessment is the real problem
  • Posted by Trent Batson , Executive Director at AAEEBL on October 16, 2009 at 1:45pm EDT
  • The argument would seem to be that because assessment requires
    standardization, portfolios fall short and should not be used for
    assessment.

    My question to this very intelligent but perhaps short-sighted article
    is this:

    If the standard assumptions about what assessment is don't fit with
    reality today -- more group work, more individual variation in
    learning experiences, a greater variety of kinds of work, more
    reflective opportunities -- maybe the problem is not the portfolio
    approach but that those in assessment need to re-think their entire
    field.

    And: is it more important to be able to see more about the student,
    now that we can, and to see more evidence of change, than to capture a
    static snapshot of a standardized performance? Do we want to assess
    a student based on a snapshot or on a fuller picture of the student?

    If assessment, as these authors represent it, means that the sweeping
    new educational values inherent in digital tools such as portfolios
    are irrelevant to good assessment, then we have to question the value
    of such assessment. Why do we always want to standardize what is not
    at all standardizeable?

    The article, to me, calls into question the value of assessment as
    these authors present it, not the value of portfolios.

    Trent Batson, Ex Dir. AAEEBL (www.aaeebl.org)

  • Not Credible
  • Posted by Ron Bramhall , Director of Business Honors Program at University of Oregon on October 16, 2009 at 3:45pm EDT
  • Trent Batson is right on point. The only thing the authors really say is that portfolios are limited because they can't measure what their test (the CLA) can. That claim has nothing to do with what students should actually be learning and doing, and whether we can assess all of that or not. It only has to do with what tool is best for measuring a limited set of isolated, "standardized" skills.

    The statement, "Portfolios are simply not up to the task of providing the necessary data for making a sound assessment of student learning." presumes so much. It presumes that the authors know exactly what the "necessary data" is and that they know exactly what "student learning" should be.

    On the 3 flaws:

    1. Standardization

    Consider this statement: "Standardization refers to assessments in which (a) all students take the same or conceptually and statistically parallel measures; (b) all students take the measures under the same administrative conditions (such as on-site proctors and time limits); (c) the same evaluation methods, graders, and scoring criteria are applied consistently to all of the students’ work; and (d) the score assigned to a student most likely reflects the quality of the work done by that student and that student alone (without assistance from others)."

    Let's assume that is all correct. So what? What do we gain from knowing that students, in a lab rat environment, can do certain things? Are they claiming we can somehow generalize something from that? In what real world setting would someone be performing the same task, under the same conditions, with the same evaluation criteria, with help from no one else? My job is never like that and I can't think of a job I would want my students to have that is like that. I want to know what a particular student can do in a variety of ambiguous, uncertain situations. Is that harder to measure? Yes. Is it less valuable than these standardized measurements? NO - it is decidedly more valuable.

    And, assessment for what? Do we want our students to be "successful"? How do we define that? There is little, if any, evidence that standard scores (GPA, SAT) are in any way predictive of career success.

    2. "Not feasible for large-scale assessment due to administration and scoring problems"

    Again, so what? I'm overstating it a bit - I know education needs to do a better job demonstrating its value. But this is an administration problem, not an education problem. It is also a classic, flawed argument for standardized measurement. The ultimate logical progression of this argument is a world where only the things that can be scored and administered on a large scale are worth doing. Sure would make things simpler - but simple isn't the only criterion.

    The public policy landscape is littered with bad ideas that were administered because it was easier than doing something that would actually solve the problem. Malcolm Gladwell wrote a wonderful piece on pit bulls. Many states/cities have statutes against owning pit bull-like dog breeds because of attacks by those breeds. Problem is, the data suggests that it's not a pit bull problem, it's an owner problem. These owners could make a poodle into a ferocious menace to society, so banning pit bulls doesn't actually solve the problem. But it is simpler.

    3. Bias
    "A portfolio may include a photograph, videoclip, or other information about student identities. Their gender, race, ethnicity, and other characteristics also may be known by those evaluating the portfolio. This lack of anonymity may bias results."

    Yep, all true. We are human beings - we have peculiar beliefs, perspectives, tendencies that all play into our behaviors. That's life. Do we want our students thinking that all of that doesn't play into "evaluations" of us every day? Dealing with that reality is another important skill. Let's teach that - not some sanitized, utopian ideal that doesn't exist.

    Finally, if I were to treat this piece as part of a "Library Case" for my students, I would ask them to assess the credibility of this piece given the information at hand. They would easily find that this piece is fundamentally self-serving and question its credibility. It may all be true, but the unquestionable "bias" of the authors makes it unbelievable.

  • Boo hoo! Why standardized tests still have not changed
  • Posted by Robert J. Sternberg, Dean of the School of Arts and Sciences at Tufts University on October 16, 2009 at 8:00pm EDT
  • The Shavelson et al. essay makes many good points, but it is profoundly disappointing because it illustrates so well (and depressingly) why assessments have changed so little over the course of more than a century. Standardized tests, such as the CLA and similar assessments, do have many advantages, which the authors point out. The problem is that they have been, and continue to be, narrow and somewhat parochial assessments of student learning outcomes. These tests, like their competitors, are largely measures of what psychologists call "g," or general intelligence. Conventional tests can and do have many different surface structures--the CLA and the SAT look different--but very similar deep structures. As the authors mention, the CLA, like the SAT, the GRE, and other tests with which it correlates, has the advantages of standardization, good reliability, and reasonable predictive validity. What they lack is breadth.

    Students need more than the critical-thinking skills that these tests measure to succeed in school and in life. For example, they need creative, practical, and wisdom-based and ethical skills. The financial meltdown of 2008 was not caused by people who lacked knowledge or critical-thinking skills; it was caused by people who could not creatively see outside the box and who lacked the wisdom to use their knowledge and skills ethically for the common good. In our own research, we have found that broader tests that assess these other skills increase prediction of academic and extracurricular performance and decrease ethnic-group differences.

    Comparing portfolios with the more conventional standardized tests is a little like comparing apples and oranges. They are both assessments (as apples and oranges are both fruits), but of different things. Portfolios provide an opportunity to assess some of these broader skills because they are inherently more creative, requiring students to design their own demonstration of learning. In the ideal, then, it would not be an "either-or" decision as to which kind of assessment to use, but a "both-and" decision. However, schools may not have the resources to do both kinds of assessments. Then they should decide what learning outcomes are important to them and create or select assessments that optimally measure these learning outcomes. The testing industry has a history of putting the cart before the horse--of creating measures and then post hoc figuring out what they measure. (For example, the SAT started out as the Scholastic Aptitude Test, then became the Scholastic Assessment Test, and then became just the SAT.) College teachers should decide what learning skills are important to them and only then seek to measure them. If they value broader creative, practical, and wisdom-based skills, portfolios may serve these purposes. They have weaknesses, as Shavelson et al. point out, but so do conventional standardized tests, which is why ideally the two would complement rather than conflict with each other.

  • What's in it for the students?
  • Posted by Doug Larkin, Doctoral Student/Morgridge Fellow at University of Wisconsin-Madison on October 17, 2009 at 8:00am EDT
  • While it certainly makes good sense to gather data about what our students know and are able to do, I'm not certain that there exists a shared purpose for the assessments discussed here, whether portfolio, CLA, or other variant. Let us differentiate between assessments, which are given in order to inform instruction, and evaluations. Lost in the above discussion is the notion that assessments can be useful to students themselves when they provide timely and useful feedback about what they've learned. Portfolios have unfortunately garnered an image of being repositories of "best work," when in fact, we might do better to think of them as workbenches rather than showcases. The criterion measures Dr. Shavelson references can indeed be used for assessing student portfolios, which is in fact something we've been doing with our science teacher education students at UW-Madison for the past five years. The difference is that we use our assessments as a tool for learning, with our institutional standards as the criteria, without the intention of comparing across different contexts. As for the evaluative component, Dr. Shavelson is correct in pointing out that other measures (in our case, coursework grades, fieldwork evaluations, and participation in the formative process to complete the portfolio) are up to this task.

    We live in an era where it is all too common to have one's performance evaluated for purposes of "accountability." Yet the time may be at hand for those being evaluated to hold the evaluators themselves accountable, and ask, what's in it for us?

  • Vision that is too narrow
  • Posted by mkt on October 17, 2009 at 8:00am EDT
  • Excellent comments; I'll add just one more, about the authors' example of using a driving test to assess drivers. That's precisely the simplistic, narrow, skills-based view of higher education that characterizes all too many proposed policies.

    When we give a test to a student driver, or a bar exam to a would-be lawyer, or a programming problem to someone seeking to be a Microsoft Certified Software Engineer, we have a specific set of skills that we are looking for and can test for. Most higher education programs, however, have a much broader and multi-dimensional set of desired learning outcomes.

    The driving analogy is a poor one. One can learn to drive in a few hours per week over a single term. A college education is a much more complex process; better analogies would be how we choose which political candidate to vote for, or how we choose a spouse. We don't let test scores tell us which candidate to vote for, and we don't give standardized exams to potential spouses. And we shouldn't assess colleges or students' learning outcomes by putting heavy emphasis on standardized tests.

    Tests can of course be used as an additional source of information -- similarly politicians are given numerical ratings by groups ranging from the Sierra Club to the National Rifle Association, and we might choose a first date on the basis of their E-Harmony questionnaire. But the crucial parts of choosing who to vote for, or choosing who to marry, or assessing institutional effectiveness, should not be (and thankfully are not, if we can keep the CLA lobbyists and sales force at bay) done using standardized tests.

  • Binet and standardization
  • Posted by sk on October 17, 2009 at 6:45pm EDT
  • Trent, Ron and others are pointing to the problems with the APA's definition of standardization and how it may be inappropriate for student assessment.
    Agreed. Let's not forget that APA standardization (same test, same test conditions, same scoring, and reliable measures of rank/standing in relation to the associated group) first came about in response to the need to bureaucratize the identification and sorting of mental defectives in French schools.
    Now, as in this case, the bureaucratic tail is wagging the methodological dog. Rather, students need to come first, not system requirements.

  • Quantitative vs Qualitative
  • Posted by Mercedes del Rosario, ePortfolio Project Director at LaGuardia Community College on October 18, 2009 at 3:15pm EDT
  • I hope the following papers (highlights of which are excerpted) will help shed light on how ePortfolios could be used as alternatives for large-scale assessment.

    STRATEGIC INITIATIVE GRANT: Final Report

    UMass Dartmouth Eportfolio: Phase Three

    http://media.umassp.edu/massedu/teachingandlearning/Carerra_final%20atgfy07.pdf

    Electronic portfolios (eportfolios) are a powerful assessment tool for colleges and departments as eportfolios make student learning visible to both students and faculty. Because learning takes place both inside and outside the classroom, we also recognize that learning occurs at different moments and formats for individual students. Eportfolios enable students, and faculty to reflect upon those moments and formats, while also contributing to the development of a record of student learning outcomes and to departmental assessment of how students accomplish the learning outcomes of their chosen majors.

    Eportfolios can be an excellent tool for holistic assessment by departments and colleges.

    ProfPort Webfolio System:

    Implementation, Curriculum and Assessment

    http://www.dock.net/gathercoal/profport/

    Criticisms surrounding the “validity and reliability” of portfolio assessment are best addressed by investigating the paradigm shift from traditional portfolio assessment to webfolio assessment, evaluation and reporting at once. Today, standardized tests are driving modern-day educational practices (Koretz, 1998). Webfolio systems provide a viable alternative to these standards-based tests. When webfolio systems are fully and properly implemented professional educators can do away with standardized tests in favor of webfolio systems that enable standards-based, authentic assessment, program and instructor evaluation and reporting as the driving force behind educational practices.

    Webfolio systems facilitate authentic assessment practices complementary with portfolio assessment; program and instructor evaluation complementary with evaluative observations used to inform instruction in standards-based teaching and learning settings; and authentic reporting of student academic achievement complementary with the practice of sharing student showcase and growth portfolios. The innate ability of webfolio systems to unite authentic assessment linked to educational standards, evaluation of educational programs and instructors, and the ability to report in “authentic ways” academic achievement linked to educational standards to those who have a need to know, irrevocably alters the traditional paradigm of portfolio assessment and denies the old criticisms of “validity and reliability.” It is this substantive improvement, recognizing and valuing the intrinsic links between portfolio assessment, program and instructor evaluation and the reporting of academic achievement that fortify the promise webfolio systems hold for being the next great innovation in education.

    “A well-designed curriculum embedded in a webfolio system, conveying academic standards, appropriate resources and providing vehicles for faculty mentoring, enables student’s development and upkeep of developmental, growth and showcase portfolios at once. A web-based electronic portfolio system acknowledges and appreciates the intrinsic links between student assessment, faculty and program evaluation and the meaningful reporting of assessments and evaluations to interested third parties.” In the capable hands of professional educators who have the best interests of their students at heart, webfolio systems may permanently transform assessment, evaluation and reporting to comprise authentic assessment, evaluation and reporting.

    To my mind, however, the debate between standards-based large-scale assessment and ePortfolio-based assessment goes back to the age-old debate between quantitative and qualitative methodologies. Countless studies have already pointed to the limitations of using a mono-method design, which is why we have recently seen the rise of mixed methods (Johnson and Onwuegbuzie, 2004). But if grant funders continue to prefer and demand quantitative outcome measures complete with colorful charts and graphs, qualitative evidence, including student work and reflections embedded in their ePortfolios, has, sad to say, a way to go to become a mainstream assessment instrument.

    The challenge lies in convincing funders that quantitative evidence mixed with the qualitative allows the confirmation or corroboration of data through triangulation; elaboration or development of analysis resulting in richer detail; and the initiation of fresh insight by catching surprises or paradoxes that could easily be overlooked if one is restricted to a mono-method design (Rossman and Wilson, 1984, 1991 as cited in Miles and Huberman, 1994).

  • Accurate standardised assessment using portfolio and e-portfolio
  • Posted by Terence Love, Postgrad Coordinator at Curtin University of Technology on November 21, 2009 at 4:45am EST
  • Trudi Cooper of Edith Cowan University has been successfully using and promoting an approach for standardized accurate assessment of portfolios and e-portfolios for over a decade. Her approach classically uses learning outcomes, performance indicators and evidence of learning outcomes being satisfied. She has several tools to: minimise assessment time; ensure students avoid collecting irrelevant evidence; ensure assessment and evidence validity; and guide student learning. The approach can be used from K-12 to postgraduate and works across a wide range of educational modalities. Her books on using portfolios are here:

    http://www.praxiseducation.com/catalog/portfolios-c-21.html

    Some of her academic papers are here:

    http://www.love.com.au/PublicationsTLminisite/Publications.htm

    Trudi has a new book on e-portfolios in press along with a new chapter on e-portfolios.