University-led efforts to aggregate library collections into a single digital repository inevitably toil in the shadow of Google, which has scanned more than 13 million books since 2004 as part of its controversial Google Books Project, and plans to scan a hundred million more before it’s done.
HathiTrust, a cooperative based at the University of Michigan, has taken steps to make sure that Google is not the only one putting together a comprehensive digital archive of academic library content. After only two and a half years, HathiTrust now counts more than 50 member libraries and a collection of 8.2 million digitized works, including 4.5 million books. According to a new study, by 2014 HathiTrust’s digital archive will mirror 60 percent of works currently held in print by the major U.S. research libraries.
But those who see either of these mass digitization projects as a sign that the “library of the future” -- a place where aisles of bookshelves have been replaced by study spaces, comfy chairs, and computer terminals from which students can instantly access catalogs of digital content larger than any physical collection a library could realistically hope for -- is just over the horizon are getting ahead of themselves, says Constance Malpas, a program officer for the Online Computer Library Center.
In a new paper, Malpas acknowledges that all-digital libraries may be the future. But that future is not here yet, and unless libraries come up with a more efficient system for sharing and preserving printed books in the meantime, the space and money pressures currently facing university libraries will worsen, Malpas says.
Malpas spent the last year studying HathiTrust’s archive, along with the print storage repository ReCAP and the catalog aggregator WorldCat. Her conclusion? HathiTrust serves as crucial insurance against the inevitable decay of printed books (and may serve as a comfort to academics who prefer that a Silicon Valley corporation not become, by default, the sole custodian of aggregated digital library content).
But in the current climate of U.S. copyright law and the lack of a clear way forward on licensing access to digitized books, HathiTrust’s main practical function is to insure libraries’ print collections with digital backups, says Malpas. Old-fashioned book lending, she says, is going to be around for a while yet. (She’s not the only one who thinks so, either: Stanford researchers have predicted that in certain disciplines it could take two generations, or 50 years, before demand passes the tipping point from print to digital.)
That said, libraries would do well to come up with newer, better ways to share the duties of lending and preserving printed books, she says. “We need to ensure a print supply chain, but at a much lower price point than in the past,” she told Inside Higher Ed in an interview. The current alliances are generally small, informal, and money-losing — often little more than “gentlemen’s agreements” among libraries in a region.
“If the status quo were preserved, we would see a number of institutions facing dramatic space pressures, moving print collections around and in some cases jettisoning parts of them,” says Malpas. Here, again, there is little collective accounting, she says; two college libraries that are theoretically working together might end up getting rid of the same book, leaving at a loss any student or researcher at either institution who might want to borrow it in the future.
On the other hand, if all the major research libraries were able to pull together a “robust shared print offer” from their collections, Malpas says each library could reclaim hundreds of thousands of dollars' — perhaps even millions' — worth of space per year.
She acknowledges that the extent to which individual institutions would in fact reclaim space and improve their service would vary according to their unique needs, but in a nutshell here is how it would work: 50 different libraries might each hold a copy of a book that, while underused, is still being kept around, either in the stacks or in remote storage. Instead of having all 50 libraries continue to stock that book, a subset of, say, five would agree to keep it, lending it to the others when necessary. Then all 50 would pool the costs for maintaining those five copies.
The result, then, is that 45 institutions would have one book fewer to shelve, and the cost to every institution of maintaining an available copy of that book would go down by a factor of 10.
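For readers who want to check the arithmetic, the pooling scenario can be sketched in a few lines of Python. This is only an illustration, not a model from Malpas's paper: the function name and the unit cost of 1.0 per copy are assumptions, and the sketch supposes the pooled cost is split evenly and ignores any lending or transport costs.

```python
def per_library_cost(libraries: int, copies_kept: int, cost_per_copy: float) -> float:
    """Annual cost each library bears when the upkeep of the retained
    copies is pooled evenly across all participating libraries."""
    if libraries <= 0 or not 0 < copies_kept <= libraries:
        raise ValueError("copies_kept must be between 1 and the number of libraries")
    return copies_kept * cost_per_copy / libraries


# Scenario from the article: 50 libraries, 5 retained copies.
LIBRARIES = 50
COPIES_KEPT = 5
UNIT_COST = 1.0  # assumed placeholder: annual cost of keeping one copy

before = per_library_cost(LIBRARIES, LIBRARIES, UNIT_COST)   # every library keeps its own copy
after = per_library_cost(LIBRARIES, COPIES_KEPT, UNIT_COST)  # five copies, costs pooled
libraries_freed = LIBRARIES - COPIES_KEPT                    # libraries that no longer shelve the book

print(f"cost per library before: {before}, after: {after}")
print(f"libraries with one book fewer to shelve: {libraries_freed}")
```

Under these assumptions, each library's share falls from the full cost of one copy to one-tenth of it, which is the factor of 10 described above.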
A study last year by Paul Courant, the university librarian at Michigan, and Matthew (Buzzy) Nielsen, an Oregon library economist, found that “open stack” (read: browsable) storage costs libraries $4.26 per book, per year. “High-density” storage, where books are held in temperature-controlled units, usually off-campus, costs about $0.86 per book, per year. Granted, a book in “high density” storage might be closer at hand than one held on another campus -- but then again, maybe not. A year ago, the Syracuse University librarian Suzanne Thorin announced that Syracuse would begin storing its low-use books in a facility in Patterson, N.Y., 240 miles away. (However, Thorin had to backpedal when 200 students and professors showed up to protest at a public meeting about the issue.)
“In a shared service context, there is some risk that the concentration of demand from multiple institutions will result in increased retrievals and higher operating costs,” Malpas acknowledges in her paper. However, she says that membership in shared digital repositories such as HathiTrust -- which does provide access to more than 2.1 million “public domain” works (books, papers, and serials for which copyright has expired or been relinquished by the holder) -- "might bridge the gap between a well-documented decline in the use of academic print collections and the anticipated shift toward scholarly reliance on full-text resources."
HathiTrust, founded in 2008, is probably the closest thing academe has to a Google Books Project of its own -- although it might not exist had Google not come calling several years earlier. In 2004, as a condition of allowing the information company to scan its print collection, Michigan stipulated that Google would leave a copy of each scanned volume with the library. That condition became standard as Google began scanning books at other academic libraries, says Jeremy York, a project librarian at HathiTrust. The libraries eventually pooled their duplicate copies, along with the fruits of other digitization projects, to create the trust, which supports a combined archive through member dues.
Still, the value of HathiTrust is, for now, primarily as a vehicle for preservation. ("Hathi" is Hindi for "elephant," an animal known for its exceptional memory.) Members cannot access digitized versions of texts they have not bought in hard copy, and there is not yet a framework in place for licensing access to digitized materials. As it stands, the members of the trust are now paying the trust to insure their catalogs with digital backups.
Despite the various risks and unknowns, Malpas maintains that more sharing of the duties of lending and preservation, especially in the print realm, is crucial as libraries try to keep costs down while navigating the awkward limbo between the print and digital eras. To that end, HathiTrust is currently developing a strategy for coordinating the storage and distribution of print monographs among its members, according to York. Such alliances aimed at improving the supply chain for bound books might seem counterintuitive in the supposed dawning of the digital age, but it would be a costly mistake to forsake the infrastructure for providing and maintaining print volumes prematurely, says Malpas.
“The organizational change required to achieve these gains is likely to be substantial and challenging to implement,” she says. “Yet the opportunity costs of inaction may prove even greater than the risks of enacting shared print management regimes.”