2 Models for Digitizing Collections

Google announces major expansion of its library project, bringing on 12 universities; Emory takes another approach.
June 7, 2007

Google's Library Project, which is in the process of digitizing millions of books at top university libraries around the world, announced a major expansion Wednesday: The 12 universities that make up the Committee on Institutional Cooperation have agreed to let Google digitize up to 10 million of their collective volumes -- generally those from the most distinctive parts of their collections.

The announcement brings to 25 the number of universities involved in the Google project, which is being hailed by some scholars for the way it will assure online access to volumes that have been largely available only in a few locations and that are in danger of decomposition. The project will involve both books in the public domain and copyrighted materials -- and the latter have been controversial. Groups of authors and publishers are suing Google over the Library Project, charging that it is infringing on copyrights, and those suing indicated that they would expect any eventual settlement in the case (should Google lose) to be applied to the additional works being added under the new agreement.

On the same day Google and the 12 universities made their announcement, Emory University announced a plan to digitize major portions of its collection -- independent of Google and using an intentionally different model.

The Google Expansion

The promise of the Google Library Project has always been its ability to offer an unmatched collection of digitized materials. Major university libraries such as those of Harvard, Princeton and Stanford Universities are already involved, as are key academic libraries abroad, such as those at the University of Oxford and Ghent University. Two CIC members already participate in the project: the University of Michigan and the University of Wisconsin at Madison.

The new collections involved will come from those two and the 10 other members of CIC: Indiana, Michigan State, Northwestern, Ohio State, Pennsylvania State and Purdue Universities; and the Universities of Chicago, Illinois, Iowa, and Minnesota.

The idea of the Google expansion is to take the portions of these collections that are unique and that would thus add the most to the project. While final lists of collections are still being set, they are expected to include Northwestern's Africana collection, Chicago's South Asia collection, Minnesota's Scandinavia collection, and agriculture and food science collections at the land grant institutions in the consortium. Many of the 300 languages found in the university libraries will be represented.

The works will join the Google project and will also make up a common digital storage system so that each of the universities involved will gain immediate access to many more materials. The universities will not be paid, but Google will cover the costs, which are expected to be significant, given estimates of up to $100 per book to digitize.

Books in the public domain and copyrighted works alike will be included, but for the latter, the Google book search process will yield only background information, summaries and information on where to locate the book. For books in the public domain, full searching and reading will be provided to users.

Mark Sandler, director of the CIC's Center for Library Initiatives, said at a press briefing Wednesday that he saw the project as a significant way for libraries to fulfill their missions. "Society trusts libraries to organize and preserve our cultural heritage," he said, and libraries have historically taken a "long term view" of what works to include.

But books -- especially older works -- are threatened by deterioration, which could destroy them or force libraries to restrict access. In addition, with more and more people doing research online and not in the stacks, there is a danger that books not in digital format will be "squeezed into a smaller and smaller social space."

This project, he said, is designed to keep "generations of ideas alive."

Sandler acknowledged that universities' professors have a range of perspectives on the copyright issues involved, but he said that he sees more and more faculty attracted to the ability to share knowledge broadly and to have instant access to more materials.

Allan Adler, vice president for legal and government affairs of the Association of American Publishers, one of the groups suing Google, said that while his group would not sue libraries, he didn't want people to think that the addition of new members of the Library Project meant that the copyright issues had been resolved. The lawsuit is in discovery right now, and Adler said it is very much alive.

"Either the court cases will work themselves out or there will be a settlement in which additional libraries will be addressed in same manner," he said. "If there is a legal decision in favor of the plaintiffs, that will certainly necessitate an unraveling of these agreements."

Another Model

The same day as the Google announcement, Emory announced another model for digitizing collections. Emory is planning to digitize about 200,000 of its volumes that are in the public domain and to make the materials available free online or for purchase as inexpensive print-on-demand volumes through Amazon.com. While people would pay for the print-on-demand books, Emory officials said that pricing would be designed just to cover costs, not to earn a profit for the university.

An early focus of the project will be Emory's extensive collections in Southern history and culture.

Martin Halbert, director for digital programs and systems at Emory's Robert W. Woodruff Library, said that his institution agreed with Google about the importance of digitizing works, especially older works in danger of deterioration. But he said that the university's effort was intentionally different from the Google project. No copyrighted materials will be involved. And the university -- not an outside entity -- will have full control over the digital product.

"We saw that as a critical thing," Halbert said. "We needed to retain our role as stewards of these assets for Emory and the public."

