What AP Scorers Shouldn’t Grade

Some things -- even things that may seem worthwhile -- may not be graded with fairness, writes Laura Aull.

July 26, 2021

Last week I served as a consultant for the Advanced Placement English Language and Composition exam with a group of dedicated high school and college educators. Working with the group, my job was to determine the borderline responses that could differentiate between AP scores -- for instance, between a 2 and a 3 score, often the difference between receiving college credit or not, or between a 4 and a 5 score, which strives to approximate a B-minus versus A-level score in a college composition course.

To determine the borderline scores, we used sample student responses from this year’s pencil-and-paper exam. Before we did so, we each took the 2021 exam ourselves. Then we joined for a week of Zoom calls, which involved individual tasks, small group discussion and large group discussion.

Discussion focused on all questions and responses on the exam: multiple choice reading questions (focused on reading comprehension and genre awareness), multiple choice writing questions (focused on finding main ideas and revising, such as by deleting a sentence or not), and three timed essay tasks.

The essay tasks included a synthesis task, which was an argumentative essay expected to include at least three (provided) sources, a rhetorical analysis task of a (provided) speech and an argument task, which was an open-ended question to which students wrote an opinion using unspecified evidence, including personal experience.

These do not match the common writing tasks I see in college writing programs or composition research. For instance, it is common to assign synthesis tasks in college (such as in a literature review) that prioritize other sources’ ideas rather than using them primarily in service of students’ own argument. Indeed, education and linguistic research suggests that students do much less essayistic and argumentative writing in college, as they write research reviews, case studies and reports, for example.

There was some discussion about this in our Zoom calls, though it was clear from the College Board that the nature of the essay tasks was not the main goal for feedback. But it was the scoring portion of the process that compelled me to write this plea. It is here that perhaps scorers and consultants can make a difference, soon, in the criteria they prioritize in determining student opportunity.

I refer to what is termed the sophistication point -- a scoring category that appears to suggest that writing quality has something to do with high hats and arrow collars, white spats and lots of dollars. I entreat scorers to consider ignoring this area of evaluation, because it is poised to enact just the kind of inequitable, socially constructed rewards that education today needs to desperately claw itself away from.

AP exams in English are evaluated according to a thesis category, an evidence category and a sophistication category. The sophistication category, which can earn a point or not and appears on both the AP Language and Literature exams, is described as “identifying and exploring complexities or tensions” and “employing a style that is consistently vivid and persuasive,” for instance. Myriad resources, from YouTube videos to online tutoring resources, describe this as an “elusive” trait worth working for.

In my discussions with the experienced, dedicated educators in the group hired to determine the AP English Language and Composition cut scores, I asked a lot of questions about the sophistication point. What did it mean? How did they teach it? How did they score it? They said things like this: “I know it when I see it,” “I am not sure it is something that can be taught,” “it is a gifted way with words” and “it goes beyond the usual.”

It brought to mind similar examples that have circulated since student writing was first tested. In 1879, a Harvard University grader criticized “inelegant expression” as a reason for greater attention to writing in secondary and college study. (Like the AP English exams, those Harvard exams were also timed, though they did give students a bit longer than the 40 minutes per essay.)

In 1975, a Newsweek article described students as “semiliterates” who use “simplistic” style, based on results from the (timed) U.S. National Assessment of Educational Progress between 1969 and 1975. In 2012, Helen Sword’s survey of over 70 academics reported in Stylish Academic Writing suggested that they were “surprisingly consistent” in their recommendations about academic writing, including that it should include “elegant, carefully crafted sentences” and “convey a sense of energy, intellectual commitment, and even passion.”

Here’s the rub: it is much easier to describe these qualities than to consistently score them. Linguistic research shows language patterns vary across genres and fields of academic writing -- including use of the passive voice that Sword calls “stodgy.”

The particular development of U.S. English studies (still classified by the antique term English Language and Literature/Letters by the National Center for Education Statistics) means most English educators study literature rather than language, and many do not study assessment. Literacy researchers Mary Lea and Brian Street, who studied educators’ responses to academic writing across disciplines, show this challenge: the tendency for educators to use abstract terms and to criticize when “Faced with writing which does not appear to make sense” within their own academic framework. They use “familiar categories” like “clarity” and “analysis,” when in reality, these categories are “bound by their own understandings and not readily understood by students.”

On the flip side, in rare cases where reasons for failure are explained, they betray the social construction of academic usage conventions associated with belles-lettres traditions. The early Harvard graders, for instance, lamented “second-rate diction,” including “confusion of shall and will” and the use of ain’t -- in other words, class-specific language markers that were determining these students’ access to college opportunity (and which, in the case of shall and will, would evolve to become obsolete).

I am not saying that there are not some student writers who use language in such a way that it impresses their teachers vaguely but definitely. And I am not saying that teachers should not let students know, in their feedback, when this is the case. What I am saying is that undefinable qualities should not determine access to opportunity. I am saying that if it can’t be taught, it shouldn’t be assessed.

If every student -- any kind of student -- cannot with effort and practice learn “sophistication” in their writing in the course of the AP class, then it should not be a criterion for success, on the exam or in the course. If you believe good writing has a certain je ne sais quoi and must keep believing that, fine. But don’t try to score it.

My plea to AP scorers is this: consider assigning the sophistication point to everyone or no one. This would be a systematic approach. My related plea is to the College Board: in the spirit of fairness and justice, remove the sophistication point as you would remove any barrier to student success.


Laura Aull is associate professor of English and director of the writing program at the University of Michigan.

