Is Google Good for History?

At gathering of historians, both critics and fans of book digitization project see great benefits and significant flaws. Some see culture clash between search engine giant and scholars.
January 8, 2010

SAN DIEGO -- At a discussion of "Is Google Good for History?" here Thursday, there weren't really any firm "No" answers. Even the harshest critic here of Google's historic book digitization project confessed to using it for his research and making valuable finds with the tool.

But that doesn't mean Google Books wasn't criticized. In a discussion at the annual meeting of the American Historical Association, scholars questioned the way Google has organized the books project and whether it was doing enough in quality control. At the same time, though, many comments suggested deep appreciation for the company's efforts. And some suggested that Google has become something of an unfair target for academics who pay little attention as other companies charge college and university libraries high fees for their materials. Over the course of the discussion, not only did Google take a few hits, but so did librarians and professors (although the Google representative left it to the academics to criticize themselves).

Dan Cohen, director of the Center for History and New Media, at George Mason University, kicked off the discussion with a strong defense of Google's book digitization efforts.

"Is Google good for history? Of course it is," he said. "We historians are searchers and sifters of evidence. Google is probably the most powerful tool in human history for doing just that. It has constructed a deceptively simple way to scan billions of documents instantaneously, and it has spent hundreds of millions of dollars of its own money to allow us to read millions of books in our pajamas. Good? How about great?"

Cohen argued that Google is both expanding access and improving the quality of research. He noted that while he was trained at universities whose libraries had "Google Books scale," most institutions' libraries don't. "I’m now at an institution that is far more typical of higher ed, with a mere million volumes and few rare works. At places like Mason, Google Books is a savior, enabling research that could once only be done if you got into the right places," Cohen said. He reported that he regularly has students "discover new topics to study and write about through searches on Google Books."

From a research perspective as well, he said, the advances are significant. The vastness of the Google project will fight the "widespread problem of anecdotal history," in which scholars lack the points of comparison to determine the real significance of an event, text or person. "As more documents are scanned and go online, many works of historical scholarship will be exposed as flimsy and haphazard," Cohen said. "The existence of modern search technology should push us to improve historical research. It should tell us that our analog, necessarily partial methods have hidden from us the potential of taking a more comprehensive view."

Cohen stressed that he was under no illusions that Google is perfect. He is among those who -- before everyone was doing it -- shared his discovery of a Google-scanned book in which a human hand was visible when it shouldn't have been. And he admitted -- anticipating the criticism that would follow -- that there are numerous mistakes in Google Books, in titles and categories (especially in the metadata used to classify books for search purposes).

But he said errors are inevitable and was more critical of Google for not releasing more of the tools it has created to classify books so that scholars could better understand them and use them. He said Google was uncharacteristically secretive about the digitization project, although he acknowledged that this is no doubt in part because of all the litigation over it.

Generally, Cohen said, academics are too quick to attack Google or any large corporation. Historians "can find fault with virtually anything," he said, and "this disposition is unsurprisingly exacerbated when a large company, consisting mostly of better-paid graduates from the other side of campus, muscles into our turf." Cohen said that "had Google spent hundreds of millions of dollars to build the Widener Library at Harvard, surely we would have complained about all those steps up to the front entrance."

And he questioned why so many academics are so angry at Google. Noting that "it seems that an obsessive book about Google comes out every other week," he asked where the volumes were about other "large information companies that serve the academic market in troubling ways," arguing that "these companies, which also provide search services and digital scans, charge universities exorbitant amounts for the privilege of access. They leech money out of library budgets every year that could be going to other, more productive uses." (Cohen posted the text of his remarks on his blog.)

Paul Duguid, adjunct professor at the School of Information at the University of California at Berkeley and a professorial research fellow at the University of London, argued that in fact it's difficult to criticize Google or its various projects without being accused of being a Luddite or otherwise old-fashioned.

Duguid argued that the misclassification of works is too widespread not to be treated as a huge flaw. He noted that when the Google Books Blog recently boasted about new tools to use illustrations for new book covers, he found errors in the books used as examples. For instance, he said that Studies of American Fungi had been classified as a cooking book. And he talked about how Google had once located King Lear in upstate New York (due to the Duke of Albany), and that Google had given Duguid credit for writing a book that appeared in 1879. (A podcast with Duguid focuses heavily on his concerns about errors in Google Books.)

Details like dates should matter to historians, Duguid said. "If you mess up dates, you guys, you haven't a lot left."

Once, when he published an essay critical of Google, Duguid said, a scholar wrote to him that he loved Google digitization because he didn't need to go to the library anymore. Duguid called that tragic, and said if that is the argument being used by scholars (and some librarians) to defend digitization, they should be ashamed. (Generally, he said librarians have been too quick to embrace Google.)

Digitization done right, Duguid suggested, could in fact be a great advance. But he said that Google has completely taken over the space. And he said that libraries with important holdings have called off digitization projects on the assumption that "Google will do it." Scholars "with expertise in what they are doing" are being told to stand aside for those without it, he added.

Despite all of those concerns, Duguid said he worried as well about the possibility that Google might abandon its effort. What if Google should end the project, realizing that it took on more than it could handle, he asked. The result would be that no one would ever do digitization right. "This is probably a once and for all scanning," he said. "Nobody is ever likely to take this task on again." (In the question and answer session, the same concern was raised by a historian who is much more enamored of the Google digitization than is Duguid, with this scholar saying he feared the day when "the suits" take over Google and could eliminate the program.)

Brandon Badger, project manager for Google Books, said that the scholars need not worry. He said that there is "passion" for book digitization throughout the company.

Badger didn't directly engage most of the criticisms, but he repeatedly talked about Google's desire to help scholars. When one historian talked about how easy the Web makes digital piracy, Badger said he saw Google Books creating the means to sell serious books to far more people in digital versions. He compared the idea to iTunes, in which the availability of music you can purchase in an instant created an alternative to downloading pirated versions.

On the topic of errors, Badger said Google was committed to improvements that would speed corrections. He said that he envisioned a system down the road where, if two scholars point out an error, it would be automatically corrected. But he also said that some errors (such as photos of the hands of those scanning books) were inevitable and were a cost of moving ahead on the project. Holding up a book he read on his flight to the conference, he said that Google could focus entirely on making perfect scans of every page of every book, with classifications that couldn't be disputed and perfect images without anyone's hands visible. But he said it would take 100 years "and we'd all be dead."

The book that Badger held up was later cited in the question period as an example of the culture clash between Google and academe. Google is in fact proud of a non-corporate culture, and Badger was the least formally dressed person on the panel. But although Badger could have passed for a graduate student, the book he held up -- tips on golf -- wasn't what an aspiring grad student would have read on a flight where future department colleagues might spot him.

The professor who cited the book said that when Badger held it up, "you could hear people's eyes roll." And the professor expressed fear that Google and academics might not be engaged in the conversation they need because of a culture clash. He asked Badger whether Google ever considers hiring academics or people who think like academics to handle such discussions and to contribute to the creation of projects.

Badger said that in fact he viewed the historians in the audience as "power users" and came to meetings such as this to learn from them. He said that Google doesn't want to produce products only for "geeky engineers." Badger joked that he would post something on Craigslist right away to seek out more academic advice. The professor, noting the culture clash once again, suggested that H-NET might be a better place than Craigslist to seek advice from academics.

