Commentary on the Digital Public Library of America

The carnage and manhunt in Boston last week obliged the Digital Public Library of America to postpone its grand opening festivities at the Boston Public Library until sometime this fall. So sudden a change of plans could only create a logistical nightmare. The roster of museums, archives, and libraries participating in DPLA runs into the hundreds, and the two-day event (Thursday and Friday) was booked to capacity, with scores of people on the standby list. But the finish line for the marathon was just outside the library, and rescheduling was unavoidable.

The delay applied only to the gala, not to DPLA itself: the site launched on Thursday at noon, E.S.T., right on schedule. The response online has been, for the most part, enthusiasm just short of euphoria. The collection contains not quite 2.4 million digital “objects,” including books, manuscripts, photographs, recorded sound, and film/video. More impressive than the quantity of material, though, is how much thought has gone into how it’s made available. 

That’s true even of the site’s address: DP.LA. I’ve seen at least one grumble about how anomalous this looks. Which it does, but in a good way. Even if you forget the address, it takes no effort to reconstruct. The brevity of the URL makes it convenient to type on a cellphone; when you do, the site’s homepage is readily navigable on the small screen. That demonstrates an awareness of how a good many visitors will actually use the site – more so than is often the case with library catalogs online.

DPLA is the work of people who understand that design is not just icing on the digital cake, but a significant (even decisive) factor in how we engage with content in the first place. They have made available an application programming interface (API) for the site, which is a very useful thing indeed, according to my source in the geek community. With the API, users can create new tools for sorting and presenting the library’s materials. Combine it with a geolocation API, for example, and you could put together an application displaying the available photographs of the street you are on, organized decade by decade.
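As a rough illustration of the street-photograph idea, here is a minimal Python sketch. It does not call the DPLA API or any geolocation service. The record fields (`title`, `year`, `lat`, `lon`), the function names, and the sample data are all hypothetical stand-ins for whatever those services actually return.

```python
import math
from collections import defaultdict

EARTH_RADIUS_KM = 6371.0


def distance_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, using the haversine formula."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def photos_by_decade(records, here, radius_km=0.5):
    """Keep records within radius_km of `here` and group their titles by decade."""
    lat, lon = here
    decades = defaultdict(list)
    for rec in records:
        if distance_km(lat, lon, rec["lat"], rec["lon"]) <= radius_km:
            decades[rec["year"] // 10 * 10].append(rec["title"])
    # Sort so the decades come out oldest first.
    return dict(sorted(decades.items()))


# Hypothetical records; real catalog metadata would need cleaning first.
SAMPLE = [
    {"title": "Library steps", "year": 1912, "lat": 42.3494, "lon": -71.0780},
    {"title": "Marathon crowd", "year": 1968, "lat": 42.3497, "lon": -71.0786},
    {"title": "Harbor view", "year": 1915, "lat": 42.3601, "lon": -71.0500},
]

if __name__ == "__main__":
    print(photos_by_decade(SAMPLE, here=(42.3495, -71.0782)))
```

In practice the hard part would likely be the metadata rather than the arithmetic: a record can only be placed on a street if someone recorded where and when the photograph was taken.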

The library’s potential for assembling and integrating an incredible range of documents and knowledge is almost unimaginable. Excitement seems appropriate. But in describing my own impressions of DPLA, I want to be a little more qualified about the enthusiasm it inspires. Things are not nearly as far along as some comments have implied. This isn’t just naysaying. The site is currently in its beta version, and many of my points will probably be nullified in due course. But it’s better to be aware of some of the limitations beforehand than to visit the site expecting a digital Library of Alexandria.  

One thing to keep in mind is that DPLA is not so much a library as an enormous card catalog, with the “shelves” of books, photographs, and so forth being the digital collections of libraries and historical societies, large and small, all over the country. The range of material offered through the Digital Public Library of America reflects what people running the local collections have decided to digitize and make available. What DPLA gathers and makes searchable is the metadata: descriptions of what a document contains (its subject, origins, copyright status, and so on) and of its characteristics as a digital object (size and file type).

The DPLA “card” gives the available information about an item, often accompanied by a thumbnail image of the book cover, manuscript, etc. – along with a link taking you to the digital repository in which it appears. DPLA puts the metadata into a standard format. But much of the content-description will inevitably be done by local librarians and archivists, making for a considerable range in detail. Often the DPLA entry will provide a bare minimum of description, though some entries run to a paragraph or two.

But the entry is only as strong as its link. It seemed appropriate to make one of my earliest searches at the Digital Public Library for the quintessential American poet Walt Whitman. There were 52 hits, with 9 of the top 10 being manuscripts of his letters in the Department of Justice collection at the National Archives. Not one of the links for the letters worked. By contrast, I had no trouble getting access to photographs of the poet held by the Smithsonian Institution.

This proved par for the course. Most links worked -- but out of two dozen entries for items in the National Archives, only one did. It’s hardly surprising (gremlins have a strong work ethic), but it shows the need for troubleshooting. Users of the library can be expected to point out such glitches, if encouraged to do so. It might be worth adding a widget that would appear in each record allowing users to flag an inoperative link, a typographical error, or some problem with the content description. It's true that the site has a contact page, but people are more likely to report errors if they are encouraged to do so.

Continued thumbing through the catalog demonstrated how early a stage DPLA is in accumulating its collection – and how much fine-tuning its search engine may need.

Entering “Benjamin Franklin,” you get more than 1,400 results. Out of the first 30, all but 3 are documents (usually death certificates) for people named after the inventor and statesman. A toolbar on the left allows the user to refine the search in various ways – but the most useful filter, by subject, is at the very bottom and easy to overlook.

It was encouraging to get 17 results when searching for Phyllis Wheatley, the first published African-American poet, but 15 of them led to records from the 1940 census, by which point she had been dead the better part of 150 years. Only one of the other two was at all germane to her as a historical figure. The other concerned an Atlanta branch of the Young Women’s Christian Association named in her honor.

I expected to locate just a few things about the Southern Tenant Farmers Union of the 1930s, but in fact got no hits at all. At the other extreme, DPLA has records for more than 90 items pertaining to the Ku Klux Klan – photographs, handbills, and cartoons, both pro- and anti-. Quite likely these were among the most striking and attention-grabbing items in various collections, and were digitized for use in print publications and online. It's concrete evidence that the Digital Public Library of America's offerings will be only as representative as the decisions made by the contributing institutions.

A number of foundations and government agencies have lent their support to DPLA, and its progress towards incorporation as a 501(c)(3) organization should make it an even more appealing destination for the big philanthropic bucks. But important as funding certainly is for the library’s future, what will ultimately be decisive for its success is a massive infusion of intellectual capital. Some of it will come from code writers hacking out new applications using the library's metadata and API. More than that, though, DPLA will need to encourage the participation and the expertise of people using the site. It's an impressive foundation and scaffold, but it's up to scholars, librarians, and other knowledgeable citizens to build the library, from the ground up.



Thomas Friedman is wrong about MOOCs (essay)

There’s a legendary story about Anne Sexton’s learning how to write a sonnet by watching I.A. Richards’s educational-television series in the late fifties. I’ve thought about that fairly often while reading the daily stories on MOOCs. In the Sexton/Richards instance, there was a fortuitous electronic meeting of an excellent teacher who saw possibilities in the then “new” technology of television and a motivated student who was ready to write as if -- and according to her this was indeed the case -- her life depended on it.

That hyperbolic tone of the last sentence above -- a tone that readers of Sexton’s later poems and interviews are already familiar with -- is also the tone of a good many declarations about MOOCs.

Thomas Friedman’s latest column “The Professors’ Big Stage” is a case in point. His piece on “the MOOCs revolution” is riddled with contradictions, shallow thinking -- and an error in basic arithmetic.

Friedman begins by excitedly informing us that he’s just returned from a “great conference” sponsored by M.I.T. and Harvard on “Online Learning and the Future of Residential Education.” He doesn’t explain why he had to attend in person, or question why the conference wasn’t online, but he adds his own title, “How can colleges charge $50,000 a year if my kid can learn it all free from massive open online courses?” That premise, it soon becomes clear, is moot.

As Friedman goes on to extol the virtues of using MOOCs as supplements for traditional courses and programs, MOOCs then become an example of preliminary programmed learning -- the sort of thing that community colleges have been doing in terms of remedial aid for quite a while. Publishers like Bedford/St. Martin’s have offered online drills for years. And if the MOOC is tied to an accredited college’s course, then Junior and his dad are still paying for Junior’s education.

According to Friedman, students enrolled in a hybrid course at San Jose State, which combines M.I.T.’s introductory online Circuits and Electronics course with traditional in-seat class time, have done quite well: “Preliminary numbers indicate that those passing the class went from nearly 60 percent to about 90 percent.” There’s even better news for the students involved in that course than Friedman’s assessment: he sees the improvement as one-third; in fact, a jump from 60 percent to 90 percent means the number of students passing the class increased by one-half, or 50 percent.
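The arithmetic behind that correction is easy to check. The sketch below separates the change in percentage points from the relative change. It assumes the two pass rates apply to classes of the same size, which the column does not state, and the function name is only illustrative.

```python
def pass_rate_change(before, after):
    """Return (percentage-point change, relative change) for two rates in percent."""
    points = after - before
    relative = points / before
    return points, relative


points, relative = pass_rate_change(60, 90)
print(points)    # 30 percentage points
print(relative)  # 0.5, i.e. one-half more students passing
```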

We should note that this is an argument for remedial preparation and/or immersion in a subject -- not necessarily an argument for online versus in-seat instruction.

And that, of course, is just one class. Friedman sees MOOCs as going far “beyond the current system of information and delivery -- the professorial ‘sage on the stage’ and students taking notes, followed by a superficial assessment.” This description not only fails to describe adequately the current system but also ironically illuminates some of the biggest problems with MOOCs. Given the scale of MOOC courses, the only kinds of student assessment that can be accomplished are superficial. And we will have to hope that some enrolled students, unlike Friedman, still believe in note taking. The MOOC lecture system, however, puts that sage right back on the stage -- as Friedman’s very title for his op-ed indicates.

Moreover, his discussion of Michael Sandel, the Harvard professor whose Justice course will have its American debut on March 12 as the first humanities offering on the M.I.T./Harvard edX online learning platform, focuses not on aspects of the course but on Sandel’s old-fashioned appearances on the lecture circuit. 

Sandel, whose course has been translated into Korean and shown on national South Korean television, recently traveled to Seoul (again, why?), where he lectured “in an outdoor amphitheater to 14,000 people, with audience participation.” There was no indication as to how long the Q&A session ran.

Academicians often fall prey to magical thinking; at my former college, each time we hired a new provost (10 in my 16 years), we were certain that this was the one who would be our savior.  

Each time we created a new central curriculum (three in my 16 years; the final stage just before I left was to exempt adult students from completion of the college’s core requirements), we were certain that this was the answer. Smaller, struggling colleges may see offering licensed supersized online courses as cost-saving -- an escape from the situation they currently find themselves in, in which every small school worries about going online or bust.

Many of these colleges turned to creating their own individual online courses -- already being referred to as “traditional online courses” -- as a solution, only to find that the expenses have outweighed the successes: they are costly in terms of faculty training, serve very small audiences (often sitting only a building or two away), and put severe strain on IT departments.

Online consortiums in which struggling schools have banded together have also proved to be problematic; I am thinking in particular of one class that I was asked to review for my former college, which was a member of such a consortium: an accelerated multi-genre writing class, which asked students to write one poem, one short story, and one essay over a period of five weeks. The "final project" consisted of one additional work, in the students' choice of genre. It was thus possible to fulfill 50 percent of the course requirements with two haiku.

MOOCs, of course, have their ur-versions, which include not only Henry Ford’s production line and the rise of fast food but also massive online delivery experiments in the mid-1990s, online remedial drills, large introductory-course in-seat lectures, Sunrise Semester, the Great Lecture Series, and the 19th-century lecture. And possibly there was someone who asked Harvard for credit for attending Thoreau’s lecture on “Society” -- or for attending a lecture by P. T. Barnum.

Friedman does note, near the end of his exhortatory column, that “We still need more research on what works.”

Indeed. Along with the return of the sage on the stage, this newest educational/industrialized development has brought along with it -- no surprise to anyone who has taught a traditional online class, a class with online components, or a traditional in-seat class -- some old concerns: problems with technology; problems with underprepared and unmotivated students; problems with class participation in discussions (one sage walked off the stage); and concerns about retention and plagiarism.

Assessment will continue to be one of the biggest concerns: both assessment of the overall course and assessment of any student work that goes beyond the level of a drill. Financial issues will come into play, as will work force issues. Hierarchical divides among students, faculty members, and institutions will not disappear.

Finally, there is a dynamic in a traditional classroom that MOOCs simply can’t provide. In small, in-seat courses and workshops, students discover that they are part of a community, in which each person has a responsibility to contribute and the reward of personal interaction. Such courses allow for flexibility, Socratic questioning, and serendipity. Face-to-face meetings and small-group dynamics are important parts of education and socialization. And they provide an essential break for students from their hours of online gaming, posting and browsing.

One other analogy that comes up in discussions of MOOCs is “correspondence course.” It’s considered a dirty term, and yet, it may be an accurate description as thousands of students and piecework adjuncts labor at their solitary tasks.

And there may be something to be learned from a fictional account of a correspondence school: J. D. Salinger’s “De Daumier-Smith’s Blue Period.” The alienated protagonist concludes that “We are all nuns” -- working silently, separately, seeking salvation.

Carolyn Foster Segal is a professor emeritus of English at Cedar Crest College. She currently teaches at Muhlenberg College.

Colleges try to beat textbook costs with book reserves

To lessen the impact of rising textbook costs, three institutions have created programs that allow students to borrow course materials.

JSTOR to offer limited free access to content from 1,200 journals

After a successful pilot, JSTOR is launching its Register & Read program, which lets anyone read up to three articles from 1,200 of its journals every two weeks in exchange for demographic information.

Survey suggests students feel satisfied but not ecstatic about library services

Survey says students are generally satisfied with campus libraries, although a significant minority view them as irrelevant to academic success.

Oxford debates the role of its librarians and libraries

"Ask me" badges distributed at U. of Oxford raise hackles and draw questions as academics debate future of its library system.

At Educause, a call for digital preservation that will outlast individual institutions and companies

At Educause, a pitch for a digital preservation project that will outlast individual institutions and companies.

Don't let your personal flaws swamp your career (essay)

Blisters may not be career-endangering, at first. Notice such vulnerabilities before they get out of control, Maria Shine Stewart advises.

Essay on usage statistics and the research library

Intellectual Affairs

In a passage surely written with tongue in cheek, Friedrich Nietzsche states that a scholar of his era would consult 200 volumes in the course of a working day. That’s far too many, he suggests: a symptom of erudition’s decay into feeble bookwormery and derivative non-thinking. “During the time that I am deeply absorbed in my work,” he says, “no books are found within my reach; it would never occur to me to allow anyone to speak or even to think in my presence.” A noble modus vivendi, if not quite an admirable one, somehow.

Imagine what the philosopher would make of the 21st century, when you can carry the equivalent of the library of Alexandria in a flash drive on your keychain. Nietzsche presents the figure of 200 books a day as “a modest assessment” – almost as if someone ought to do an empirical study and nail the figure down. But we’re way past that now, as one learns from the most recent number of Against the Grain.

ATG is a magazine written by and for research librarians and the publishers and vendors that market to them. In the new issue, 10 articles appear in a section called “Perspectives on Usage Statistics Across the Information Industry.” The table of contents also lists a poem called “Fireworks” as part of the symposium, though that is probably a mistake. (The poem is, in fact, about fireworks, unless I am really missing something.)

Some of the articles are a popularization -- relatively speaking -- of discussions that have been taking place in venues with titles like the Journal of Interlibrary Loan, Document Delivery & Electronic Reserves and Collections Management. Chances are the non-librarians among you have never read these publications, or even seen them at a great distance, no matter how interdisciplinary you seek to be. For that matter, discussing the ATG articles at any length in this column would risk losing too many readers. They are peer communications. But the developments they address are worth knowing about, because they will undoubtedly affect everyone’s research, sooner or later, often in ways that will escape most scholars’ notice. 

Most of us are aware that the prominence and influence of scholarly publications can be quantified, more or less. The Social Sciences Citation Index, first appearing in 1956, is an almost self-explanatory case.

As an annual list of the journal articles where a given paper or book has been cited, SSCI provides a bibliographical service. Counting the citations then yields bibliometric data, of a pretty straightforward kind. The metric involved is simplicity itself. The number of references to a scholarly text in the subsequent literature, over a given period of time, is a rough and ready indicator of that text’s influence and prominence during said period. The reputation of an author can be similarly quantified, hashmark style.

A blunt bibliometric instrument, to be sure. The journal impact factor is a more focused device, measuring how often articles a journal published over a two-year period have been cited, relative to the total number of articles the journal published over the same period. The index was first calculated in the 1970s by what is now Thomson Reuters, also the publisher of SSCI. But the term “journal impact factor” is generic. It applies to the IDEAS website’s statistical assessment of the impact of economic journals, which is published by the Research Division of the Federal Reserve Bank of St. Louis. And there's the European Reference Index for the Humanities, sponsored by the European Science Foundation, which emerged in response to dissatisfaction with “existing bibliographic/bibliometric indices” for being “all USA-based with a stress on the experimental and exact sciences and their methodologies and with a marked bias towards English-language publication.”
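In its commonly cited two-year form, a journal's impact factor for a given year is the number of citations received that year by articles the journal published in the previous two years, divided by the number of articles it published in those two years. The sketch below works one example. The journal and its figures are invented for illustration, and the function name is hypothetical.

```python
def two_year_impact_factor(citations_this_year, items_published):
    """Citations received this year to the previous two years' articles,
    divided by the number of articles the journal published in those years."""
    if items_published == 0:
        raise ValueError("journal published no citable items in the window")
    return citations_this_year / items_published


# Invented journal: 120 articles over two years, cited 300 times this year.
print(two_year_impact_factor(300, 120))  # 2.5
```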

As the example of ERIH may suggest, bibliometric indices are not just a statistical matter. What gets counted, and how, is debatable. So is the effect of journal impact factors on the fields of research to which they apply – not to mention the people working in those fields. And publication in high-impact journals can be a career-deciding thing. A biologist and a classicist on a tenure committee will have no way of gauging how good the candidate’s work on astrophysics is, as such. But if the publications are mostly in high-impact journals, that’s something to go by.

The metrics discussed in the latest Against the Grain are newer and finer-grained than the sort of thing just described. They have been created to help research libraries track what in their collections is being used, and how often and intensively. And that, in turn, is helpful in deciding what to acquire, given the budget. (Or what not to acquire, often enough, given what’s left of the budget.)

One contributor, Elizabeth R. Lorbeer, associate director for content management for the medical library at the University of Alabama at Birmingham, says that the old way to gauge which journals were being used was to look at the wear and tear on the bound print volumes. Later, comparing journal-impact factors became one way to choose which subscriptions to keep and which to cancel. But it was the wrong tool in some cases. Lorbeer writes that she considered it “an inadequate metric to use in the decision-making process because sub-discipline and newer niche areas of research were often published in journals with a lower impact factor.”

From the bibliometric literature she learned of another statistical tool: the immediacy index, which measures not how often a journal is cited, but how quickly. In some cases, a journal with a low impact factor might have a higher immediacy index, as would be appropriate for work in cutting-edge fields.

She also mentions consulting the “half-life” index for journals – a metric as peculiar, on first encounter, as the old “count the footnote citations” method was obvious. It measures “the number of publication years from the current year which account for 50 percent of current citations received” of articles from a given journal. This was useful for determining which journals had a long-enough shelf life to make archiving them worthwhile.
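The quoted definition is easier to follow with numbers. The sketch below counts back from the current year until half of this year's citations to a journal are accounted for. It is a whole-year simplification that ignores any finer interpolation a published index may use, and the citation counts are invented.

```python
def cited_half_life(citations_by_age):
    """Number of publication years, counting back from the current year,
    needed to account for at least half of this year's citations.

    citations_by_age[0] holds citations to articles published this year,
    citations_by_age[1] to articles published last year, and so on.
    """
    total = sum(citations_by_age)
    if total == 0:
        raise ValueError("journal received no citations this year")
    running = 0
    for years, count in enumerate(citations_by_age, start=1):
        running += count
        if 2 * running >= total:
            return years


# Invented journals: one cited mostly for recent work, one for older work.
print(cited_half_life([10, 40, 30, 10, 5, 5]))   # 2
print(cited_half_life([5, 10, 15, 20, 20, 30]))  # 4
```

On these made-up numbers, the second journal is still drawing half its citations from articles four publication years back, which is the kind of shelf life that would argue for archiving it.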

Google Scholar is providing a number of metrics – the h-index, the h-core, and the h-median – which I shall mention, and point out, without professing to understand their usefulness. Lorbeer refers to a development also covered by Inside Higher Ed earlier this year: a metric based on Twitter references, to determine the real-time impact of scholarly work.
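For what it is worth, the three Google Scholar measures are simple to compute, whatever one makes of their usefulness. As Google Scholar describes them, the h-index of a publication is the largest number h such that at least h of its articles have been cited at least h times each, the h-core is that set of h most-cited articles, and the h-median is the median citation count within the h-core. The function name below is hypothetical.

```python
from statistics import median


def h_metrics(citation_counts):
    """Return (h-index, h-core citation counts, h-median) for a list of counts."""
    ranked = sorted(citation_counts, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    h_core = ranked[:h]
    h_median = median(h_core) if h_core else 0
    return h, h_core, h_median


# Five articles cited 17, 9, 6, 3, and 2 times.
print(h_metrics([17, 9, 6, 3, 2]))  # (3, [17, 9, 6], 9)
```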

One day a Nietzsche specialist is going to be praised for writing a high-twimpact paper, whereupon the universe will end.

Other contributions to the ATG symposium paint a picture of today’s research library as a mechanism incessantly gathering information as well as making it available to its patrons – indeed, doing both at the same time. Monitoring the flow of bound volumes in and out of the library makes it relatively easy to gauge demand according to subject heading. And with digital archives, it’s possible to track which ones are proving especially useful to students and faculty.

A survey of 272 practicing librarians in ATG’s subscriber base, conducted in June of this year, shows that 80 percent “are analyzing [usage of] at least a portion of their online journal holdings,” with nearly half of them doing so for 75 to 100 percent of those holdings. It’s interesting to see that the same figure – 80 percent – indicated that “faculty recommendations and/or input” was used in making decisions about journal acquisitions. With book-length publications entering library holdings in digital form, the same tools and trends are bound to influence monograph acquisition. Some of the articles in the symposium indicate that it’s already happening.

Carbon-based life forms are still making the actual decisions about how to build the collections. But it’s not hard to imagine someone creating an algorithm that would render the whole process cybernetic. Utopia or nightmare? I don't know, but we're probably halfway there.
