Here was my day, in a sentence: “The past of the character greatly contributes to the meaning of the novel as a whole.” Indeed. Many, many AP English Lit essays I graded today began with the same startling insight. Please pass the hemlock.
The tests are supposed to have been shuffled by ETS before we receive them, 25 to a folder, but it seems clear to me that sometimes several in a row are from students at the same facility, who’ve been coached for the test by the same teacher. Today, for instance, I read six essays in a row on Marilynne Robinson’s Gilead, the only ones I’ve seen on that text this week. They were all average in the same way and started with identical sentences.
Now that we’ve gotten rolling, you can hear snorts and laughter around the tables as readers encounter clinkers in the writing. One of my essay books had written on its front cover: “The essays contained herein are MONEY.” The writer was right, at least for the one I graded. But others get a little excited: Jayne [sic] Eyre faced abuse and ridiculation; Gatsby’s ignoration of Daisy’s marriage caused problems; because Toni Morrison lived in Cincinnati, she understands what took place with slavery.
Remember when John Cusack meets up with his old teacher in Grosse Pointe Blank? “Still inflicting all that Ethan Frome damage?” he asks. I got a lot of collateral Frome damage today. Other works were done to death too. Holy crap, Charlotte Brontë, did you have to be so…gothic? You’ve still got students talking about it. And you, F. Scott Fitzgerald: I’m coming to dig you up and help you roll over in your grave, after what students made of your work. “I used to like The Great Gatsby,” my roommate said this afternoon.
The pit boss at each table of six readers is called a Table Leader. One told me I’d “do 1200 [student essay] books” this week but stressed that “accuracy is important, not speed.” He admitted that if a reader graded slowly enough, he or she would be “encouraged.”
For much of the first day, and several times a day since, we “norm”: we read the same practice essay or essays and vote, by a show of hands, on which score from 0 to 9 we’d assign. Even once we’d “gone to live books” (grading actual student tests), there was often a 70% differential in periodic practice grading sessions, though most scores clustered within three points of each other. Table Leaders read at least 10% (and often more) of our graded essays in “back checks” and consult with us individually on how better to read to the rubric.
Still, that meant that by Monday night, up to 90% of many thousands of essays had been read by only one person, who might or might not be at fairly severe odds with the normalized practice scores. (By Tuesday afternoon, we’d read nearly 25% of all essays.) This made me uncomfortable at first, not least because of the possibility that I sometimes wasn’t in perfect sync with the “official” score either. After all, I’ve never graded high school papers before. I asked my Table Leader what a point here or there means to a student’s future. He was vague, and I reassured him I wasn’t trying to make trouble; I only needed to know how much bourbon to drink that night to assuage my guilt.
After the first day, when our Question Leader (head of readers for each of three questions on the exam) asked us to raise our hands for the norming exercise, she tended to start not at 9, the top of the scale, and work downward, as she had the first day, but lower down the scale, where the anticipated score was. This saved time but also precluded anyone with oddly stray scores from showing their hands. And today, she cut off the lower end too, by ending, “Four and below?” instead of running through all the numbers.
The Table Leaders and Question Leader and Chief Reader (of AP Literature) all encourage us strongly to “use the range of scale” when we grade, since it doesn’t look good statistically if all the scores fall in the middle, and to “reward writers for what they do well.” Curiously, we’re never encouraged to grade downward, but are encouraged sometimes to grade up, often by being directed to “always go back to the rubric.”
With one practice essay, the Question Leader at the front of the room wanted to “completely persuade [us] that the sample was a 9,” the highest score, even though many of us saw it as lower. Another sample essay, on The Poisonwood Bible, was gushed over as a 9, but the student had analyzed some things poorly, including, “All the characters emerge from their trial wiser and more enlightened.” I beg to differ. Rachel, the eldest daughter in that novel, is a racist (and a pain in the ass) to the end. “The way I see Africa,” Rachel says, “you don’t have to like it but you sure have to admit it’s out there. You have your way of thinking and it has its, and never the train ye shall meet!”
Another time, some of us had issues with the above-average score given a practice essay on Gatsby. It was, in fact, good prose, but it failed to address the prompt at all, which asked about the characters’ relationship to the past. If a student had written it for my class, I would have had to ask him to start over. In this case, I asked if it might be indicative of an essay prepared before the AP prompt was known. We were told we “might want to look back” at the essay and rubric—a gentle way of saying, You’re not gonna win this.
Anyway, interesting things happen in the grading trenches. I read one essay on All the King’s Men and scored it too high by a point, I think, looking back. The next essay on the same text read better, so of course it received a higher score than the previous one, compounding the first error.
Having said all this, I have no doubt that we are, as a group, reading more similarly now than when we first convened. Also, the fact is, AP students took an objective multiple-choice exam in addition to writing discursive answers to three prompts. Those essays will be graded by three different people, and there’s about a 1-in-10 chance that any one of them will be read by another reader in a back check. If I were a student, I’d take the odds of that system, given what I know of the readers here. But I’m not sure I’d happily accept the conditions of the written portion of the test: forty minutes each for three essays in a row, without a real chance at revision of any kind. It sometimes takes me hours to write a blog post, let alone anything more significant.
I could get it done faster, but you’d have to let me begin, “The past of my day greatly contributes to the meaning of this posting as a whole.”