“No.” “Nope.” “Not at this time.” “Not yet!” “Just discussing it now.” “I have not.” “I will do this in the future.” “Yes.” “No way.” “Not yet, but I have a lot of ideas …”
This is a representative sample of faculty responses to the question “If you have successfully integrated use of ChatGPT into your classes, how have you done so?” in a 2023 Primary Research Group survey of instructors on their views and use of AI writing tools. (Note: The survey is behind a paywall.) A few other responses of note were “It’s a little scary,” “Desperately interested!” and “I’m thinking of quitting!”
A few short months after OpenAI released ChatGPT—a large language model with an unusual ability to mimic human language and thought—the company released an upgrade known as GPT-4. Unlike the earlier product, which relied on an older generation of the tech, the latest product relies on cutting-edge research and “exhibits human-level performance,” according to the company.
GPT-4 is a large multimodal model, which means that it produces natural language in response not only to words but to visuals such as charts and images. This latest version largely outperforms the earlier model. For example, GPT-4 scored in the top decile on a simulated bar exam, while ChatGPT scored in the bottom decile. There are noteworthy exceptions. Both earned grades of 2 (out of 5) on a simulated Advanced Placement English Language and Composition exam, for example.
As the pace of artificial intelligence development accelerates, administrators and faculty members continue to grapple with the disruption to teaching and learning. Though many are at work updating their understanding of AI tools like ChatGPT, few have developed guidelines for their use. But by OpenAI’s own admission, humans are susceptible to overrelying on the tools, which could have unintended outcomes.
“It’s spring break,” Mity Myhr, professor of history and associate dean of the School of Behavioral and Social Sciences at St. Edward’s University, said last week when asked whether faculty members at the private nonprofit Texas institution were discussing the even more sophisticated AI tool. “I imagine that conversation will happen next week … But some are waiting for this summer to really dig in.”
Surveys: Faculty Want More AI Guidance
When ChatGPT appeared to be the most sophisticated AI writing tool in the college-writing landscape—only a couple of weeks ago—faculty were abuzz with conversation about how to design assignments that could evade the software, how to distinguish machine writing from human writing and how to protect students from AI’s sometimes disturbing replies.
Then came GPT-4.
“The old version from a few months ago could be a solid B student,” said Salman Khan, founder of Khan Academy, an American nonprofit focused on creating online educational content for students. “This one can be an A student in a pretty rigorous program.” Khan’s nonprofit is working on an AI assistant that seeks to ensure students do most of the work. (The tool’s name, Khanmigo, is a pun on con amigo, or “with friend” in Spanish, which echoes the company’s name.)
Primary Research Group’s survey considered the views of 954 instructors from colleges that grant associate, bachelor’s, master’s, doctoral and specialized degrees. The poll took place between Jan. 28 and March 8, with most (87 percent) responding in February.
Few college administrations (14 percent) have developed institutional guidelines for the use of ChatGPT or similar programs in classrooms, according to the faculty respondents. Smaller colleges and public colleges were less likely to have developed guidelines than larger or private colleges.
Further, few instructors (18 percent) have developed guidelines for their own use or that of their students, according to the report. Community college instructors were the most likely to have developed guidelines. The likelihood of having developed policies was inversely related to age. That is, younger instructors were more likely to have developed policies than were older instructors.
Faculty respondents were split about whether they should integrate ChatGPT into educational strategy or encourage students to use it. Approximately one-quarter (24 percent) felt that they should. A slightly larger group (30 percent) felt that they should not do either. Close to half (44 percent) had no opinion.
Many professors are in a wait-and-see mode concerning AI writing tools in the classroom, though some are waiting for guidance, according to the survey. Most (63 percent) have no opinion on their colleges’ efforts to deal effectively with the educational consequences of AI writing tools’ availability. But some (22 percent) are dissatisfied or very dissatisfied. Those who are satisfied or very satisfied (6 percent) are the smallest population.
To be sure, 2023 is still young, and some students, professors and colleges are hard at work drafting artificial intelligence policies. An undergraduate Data, Society and Ethics class at Boston University, for example, has drafted a blueprint for academic use of ChatGPT and similar AI models that they hope will be a starting point for university discussions.
Individual faculty members have also shared online resources. Ryan Watkins, professor of educational technology leadership at George Washington University, for example, offers advice on updating course syllabi. Anna Mills, an English instructor at California’s College of Marin, offers educators starting points for inquiry. Faculty have also formed Google Groups to share sources for stimulating discussion among teachers.
Critical AI, an interdisciplinary journal based at Rutgers University, suggests some next steps for educators in the large-language-model era. The University of California, Berkeley, has launched an AI policy hub with a mission “to cultivate an interdisciplinary research community to anticipate and address policy opportunities for safe and beneficial AI.” Despite the survey report indicating that few colleges and instructors have policies for generative AI writing in place, many appear to be making strides in this direction.
Like its predecessor, GPT-4 has flaws. It can produce convincing prose that is wrong, biased, hateful or dangerous. But GPT-4 exhibits these flaws in ways that are “more convincing and believable than earlier GPT models,” according to an OpenAI paper published this month. As a result, students—indeed, all humans—could overrely on the tool. They may be less vigilant or not notice mistakes while using the software, or they may use it in subjects for which they do not have expertise. Overuse may “hinder the development of new skills or even lead to the loss of important skills,” the paper noted.
Tracy Deacker, a graduate student studying artificial intelligence and language technology at Háskólinn í Reykjavík University, in Iceland, encourages college administrators and professors not to put off engaging with the technology. Students need help understanding its limitations and preparing for an AI-infused workplace, Deacker wrote in an email. But such efforts can also center humans.
“We need human-to-human interaction to learn something to the core,” Deacker wrote. “It’s in our DNA.”
For professors who feel overwhelmed by the technology, some suggest an old-school on-ramp.
“Talk to students!” Maha Bali, professor of practice at the Center for Learning and Teaching at the American University in Cairo, wrote in an email. “Understand how they are thinking about this … Consider building trust and asking them to be transparent about their use with you.”
“The arrival of GPT-4 feels like a gift no one asked for,” Marc Watkins, lecturer in composition and rhetoric at the University of Mississippi, wrote on his blog. “[Faculty] need training, but what is the point of such training when systems keep changing?” Amid the disruption and uncertainty, faculty might focus on how the tech is changing and what it means to work and learn, Watkins concludes.
Some remind academics not to get nostalgic about an imagined past.
“It’s not like we’re starting from ‘things are great, and technology is going to take us away from that,’” said Kumar Garg, vice president of partnerships at Schmidt Futures, a philanthropic initiative focused on solving problems in science and society. Garg spoke at SXSW EDU in Austin, Tex., this month. “We’re starting from a messy middle where some things are working out for some and not for others.”
J. Harold Pardue, interim dean of the School of Computing at the University of South Alabama, was caught by surprise at a recent advisory board meeting that his school hosted with industry leaders. (Pardue is also dean of the graduate school and associate vice president for academic affairs.) Board members, who hail from local and national companies, advise the school on curriculum matters and help place students in internships and jobs. They wanted answers.
“We were assaulted by questions about large language models,” Pardue said, adding that one board member mentioned that their company was recently purchased by Microsoft, which is planning wide integration of natural language models in its products. “I was asked point-blank … ‘What are you doing in your curriculum? When are you going to put this in your curriculum?’” (Meanwhile, Microsoft laid off its entire team responsible for ethical AI development last week.)
Pardue responded, “Nothing so far, but it’s on our radar.” As a former philosophy student, Pardue is wondering whether large language models may be able to help scale teaching with the Socratic method.
As academics engage in conversations about the impact of natural language models in teaching and learning, many are seizing the moment to offer reminders of humanness.
“As much as we are academics and want to be rational, we’re humans first,” said Steve Johnson, senior vice president for innovation at National University. “The sooner you can start to get engaged with what’s happening, the sooner you can work through your emotions and get to rational thinking.”