

I recently spent an hour trying to respond to a review of a paper that my lab submitted to one of the top machine learning (ML) conferences. These are considered the most prestigious places to publish in artificial intelligence (AI), yet the review quality at these conferences has long been shockingly poor.

AI is the only field I know of that relies on inexperienced reviewers, many of whom have never authored a scientific paper themselves. The use of low-quality reviewers is pervasive across the AI/ML fields, and there is no recourse when a low-quality review causes a paper to be rejected. Reviews are often only a few lines long, stating, without evidence, that the work lacks "novelty" or that the experiments are "flawed." Such critiques are difficult to argue with because the reviewers often have no idea what the paper is about, so they resort to generic criticism. The reviewers often come from foreign countries and do not have the same ethical standards that one might expect within the United States.

Case in point: I was working on a response to a particularly challenging review, and the criticism was completely generic. The reviewer asked for "mathematical theory" but didn't say what the theory should accomplish, for "more comparisons to other methods" but didn't name any methods, for "user studies" but didn't say what those studies should show, and for "a comparison to adversarial learning" even though the paper had nothing to do with that topic. As I stumbled over my words trying to respond, my student coauthor sent me an email: he had run the review through a chatbot checker, and it turned out that the two free-response sections of the review were most likely written by a chatbot.

This revelation says a lot.

First, I had not recognized that I was responding to a chatbot, though in retrospect it was obvious. The field's review quality is generally so low that I wasn't expecting more than what I received. I'm no beginner: I have published well over 100 peer-reviewed papers, and it was more than a little disrespectful that the reviewer made me waste an hour responding to a chatbot.

Second, my lab's unpublished (confidential) paper may now be part of the training set for a chatbot whose identity we do not know. We have no way to find out more, because reviewers' identities are always shielded from authors. This means some chatbot somewhere could replicate text from our manuscript without our knowledge or permission. Use of our manuscript by the chatbot is not covered by "fair use," because the manuscript is unpublished and was shared in confidence. In other words, using chatbots in the scientific review process creates a risk that authors' work will be stolen.

To be clear, reviewers are not allowed to do this—it’s against the policy of the conference. But the damage is done, and we have no recourse currently.

This case is not unique, and it illustrates the future of scientific publishing with AI. We already know that some authors use large language models to write their manuscripts, but that is less of a concern because authors still must sign their names to their manuscripts.

Reviewing with chatbots is much more problematic. Being a reviewer is a chore: a way to add a line to one's CV while doing as little work as possible. The same is true in other computer science fields, and in other scientific fields as well. If a task is one that no one wants to do, people will be tempted to hand it to a chatbot. Indeed, after this experience, my students and I noticed that two additional papers we submitted had received reviews that a chatbot checker flagged as chatbot-written. Other colleagues have reported similar observations. It's clear that the problem is getting worse, fast. Many thousands of academic papers each year, in computer science and beyond, will likely be fed to chatbots without the authors' knowledge. Chatbot companies will be awash in stolen academic data.

There are possible fixes to this problem, but none are easy. First, the AI/ML communities need to clean up their act and actually fix their review-quality problem. Some will argue vehemently that doing so will slow down the publication process. There are ways to keep review turnaround relatively fast while achieving high review quality, but that requires a level of organization we currently do not have.

Additionally, our professional organizations should track reviewers, and those with little experience shouldn't be given this important job. All reviewers should be linked to their academic publication profiles so that editors know how much experience they have, and anyone caught using a chatbot to review should, at minimum, be banned from publishing and reviewing in all AI venues for several years.

Chatbot companies should have policies for removing stolen data from their training sets and should be required to follow them. Finally, professional organizations and publishers need to use AI tools of their own to find patterns in the use of chatbots in reviewing.

AI/ML has a serious integrity problem already. It’s time to start doing something about it.

Cynthia Rudin is a computer scientist at Duke University, where she directs the Interpretable Machine Learning Lab.
