AI-generated essays are nothing to worry about (opinion)

September 2022 was apparently the month artificial intelligence essay angst boiled over in academia, as various media outlets published opinion pieces lamenting the rise of AI writing systems that will ruin student writing and pave the way toward unprecedented levels of academic misconduct. Then, on Sept. 23, academic Twitter exploded into a bit of a panic on this topic. The firestorm was prompted by a post to the OpenAI subreddit where user Urdadgirl69 claimed to be getting straight A’s with essays “written” using artificial intelligence. Professors on Reddit and Twitter alike expressed frustration and concern about how best to address the threat of AI essays. One of the most poignant and widely retweeted laments came from Redditor ahumanlikeyou, who wrote, “Grading something an AI wrote is an incredibly depressing waste of my life.”

As all this online hand-wringing was playing out, my undergraduate Rhetoric and Algorithms students and I were conducting a little experiment in AI-generated student writing. After reviewing 22 AI essays I asked my students to create, I can tell you confidently that AI-generated essays are nothing to worry about. The technology just isn’t there, and I doubt it will be anytime soon. For the aforementioned AI essay activity, I borrowed an assignment sheet from the University of Texas at Austin’s first-year writing class. The assignment asks students to submit an 1,800- to 2,200-word proposal about a local issue. Students usually tackle on-campus issues, advancing ideas like “It shouldn’t be so hard to get into computer science classes” or “Student fees should be lower” or “Campus housing should be more affordable.” For the purposes of the Rhetoric and Algorithms class, I asked students to rely on the AI as much as possible. They were free to craft multiple prompts to generate AI outputs. They were even welcome to use those prompts in their essays. The students were also free to reorder paragraphs, edit out obvious repetitions and clean up the formatting. The primary requirement was that they needed to make sure the bulk of the essay was “written” by AI.

The students in this class were mostly juniors and seniors, and many were majors in rhetoric and writing. They did great work, putting in a lot of effort. But, in the end, the essays they turned in were not good. If I had believed these were genuine student essays, the very best would have earned somewhere around a C or C-minus. They minimally fulfilled the assignment requirements, but that’s about it. What’s more, many of the essays had obvious red flags for AI generation: outdated facts about the cost of tuition, quotes from prior university presidents presented as current presidents, fictional professors and named student organizations that don’t exist. Few of the students in my class have experience with computer programming. As a result, they mostly gravitated toward freely accessible text generators such as EleutherAI’s GPT-J-6B. Several students also opted to sign up for free trials of AI writing services such as Jasper AI. However, regardless of the language model they used, the results were pretty consistently mediocre—and usually quite obvious in their fabrication.

At the same time, I asked my students to write short reflections on their AI essays’ quality and difficulty. Almost every student reported hating this assignment. They were quick to recognize that their AI-generated essays were substandard, and those used to earning top grades were loath to turn in their results. The students overwhelmingly reported that using AI required far more time than simply writing their essays the old-fashioned way would have. To get a little extra insight on the “writing” process, I also asked students to hand in all the collected outputs from the AI text generation “pre-writing.” The students were regularly producing 5,000 to 10,000 words (sometimes as many as 25,000 words) of outputs in order to cobble together essays that barely met the 1,800-word floor.

There has been a fair amount written about the supposed impressiveness of AI-generated text. There are even several high-profile AI-written articles, essays, scientific papers and screenplays that showcase this impressiveness. In many of these cases, the “authors” have access to higher-quality language models than most students are currently able to use. But, more importantly, my experience with this assignment tells me that it takes a good writer to produce good algorithmic writing. The published examples generally benefit from professional writers and editors crafting prompts and editing results into a polished form. In contrast, many of my students’ AI-generated essays showed the common problems of student writing: uncertainty about the appropriate writing style, issues with organization and transitions, and inconsistent paragraphing. Producing a quality essay with AI requires having enough fluency with the target writing style to craft prompts that will lead the model to produce appropriate outputs. It also requires having solid organizational and revising skills. As a result, the best writers among my students produced the best AI essays, and the developing writers generated essays with many of the same issues that would have been in their genuine writing.

All in all, this exercise tells us we are not on the verge of receiving a flood of algorithmically generated student submissions. It’s just too much work to cheat that way. The activity also tells me that the best defense against AI essays is the same as the best defense against essay repositories: a good assignment sheet. If your assignment is “For today’s homework assignment, please describe the reasons for the U.S. Civil War” (a literal stock prompt from the GPT-J model mentioned above), you are way more likely to get AI or downloaded essay submissions than if you craft a detailed assignment sheet specific to your classroom context. The assignment I used for my Rhetoric and Algorithms students was a substantial challenge because it asked them to address local issues of concern. There are just not enough relevant examples in the data the AI text generators are drawing from to generate plausible essays on this topic out of whole cloth.

Beyond worries about academic misconduct, this activity also showed me that using AI text generation can be a part of good writing pedagogy. Two of the most important and difficult things to teach about writing are genre awareness and best practices for revision. Developing writers don’t have the experience necessary to intuit the subtle differences between different essay or assignment types. This is why student essays often feel either over- or underwritten. Students are often still figuring out how to find the sweet spot and how to adjust their style for different writing activities. What’s more, the usual delay between submission and feedback doesn’t do a lot to help develop this intuition. Prompt crafting for AI text generators, however, provides mostly immediate feedback. Through experimenting with sentences that do and do not produce appropriate AI outputs, students can develop a sense of how to write differently for different genres and contexts. Lastly and regrettably, most of my students complete their writing assignments in a single session just before the deadline. It is hard to get them to practice revision. AI-generated text provides an interesting possibility for a sort of pedagogical training exercise. Students could be asked to quickly generate a few thousand words and then to craft those words into useable prose. This isn’t “writing” in the same way that line drills aren’t basketball. But that doesn’t mean there isn’t a useful pedagogical role here.

Ultimately, higher education is going to have to come to grips with AI text generation. At present, most of the efforts to engage these concerns seem to gravitate either toward AI evangelism or algorithmic despair. I suppose this parallels AI discourse more broadly. Nevertheless, neither evangelism nor despair strikes me as the ideal response. To those who despair, I think it’s very unlikely that we are (or will soon be) drowning in AI-generated essays. With current technology, producing one is simply much harder and more time-consuming than actually writing an essay. At the same time, I am deeply skeptical that even the best models will ever really allow students to produce writing that far exceeds their current ability. Effective prompt generation and revision are dependent on high-level writing skills. Even as artificial intelligence gets better, I question the extent to which novice writers will be able to direct text generators skillfully enough to produce impressive results. For the same reasons, I also question the enthusiasm of AI evangelists. It has been just over five years since Google Brain computer scientist Geoffrey Hinton declared, “We should stop training radiologists now. It’s just completely obvious that within five years, deep learning is going to do better than radiologists.” Well, we’re still training radiologists, and there’s no indication that deep learning is going to replace human doctors anytime soon. In much the same way, I strongly suspect full-on robot writing will always and forever be “just around the corner.”