Delivering on the Promise of Digital Data
The boom in digital technology has been a boon for research, resulting in a remarkable explosion in the number and quality of data collections and a marked expansion in their availability to a broad range of interested parties. But to ensure that the databases continue to be viable and are used to the fullest extent, the federal government, led by the National Science Foundation, must craft a clearer strategy for managing and financially supporting them, a National Science Board task force said in a report Tuesday.
The report, "Long-Lived Data Collections: Enabling Research and Education in the 21st Century," was prepared by a special committee of the science board's Committee on Programs and Plans.
The board created the panel out of a sense that as new technologies have driven a welcome proliferation in digital collections of various kinds -- many of which the science foundation has financed -- NSF "strategies and policies governing long-lived data collections ... have been developed incrementally and have not been considered collectively."
The report says the board is concerned about the situation, and calls the need to address it "urgent."
In many ways, this is the kind of "problem" that policy makers tend to like: one that has created significant new opportunities, in this case for better and more freely available scientific and other research. The report focuses on what it calls "long-lived digital data collections," which it defines as databases or systems that collect text, images or other information in digital form, are made available over the Internet, and are likely to exist long enough that they could be affected by changes in technology. And the report virtually gushes about the impact that digital technologies have had on the research and education enterprises.
Such digital collections "enable analysis at unprecedented levels of accuracy and sophistication and provide novel insights through innovative information integration," the report says. "Through their very size and complexity, such digital collections provide new phenomena for study. At the same time, such collections are a powerful force for inclusion, removing barriers to participation at all ages and levels of education."
To keep them that way, and to ensure that they contribute as fully as possible to the "democratization" of science and education, the science board's report recommends a series of steps aimed at supporting digital collections financially and technologically and at crafting policies that encourage their creation and use.
- The National Science Foundation should coordinate its investments in digital data collections so that they are in sync with the foundation's investments in the research and education that use those collections.
- The NSF should work with organizations that manage digital collections to help them make decisions on how to collect and share the data -- decisions they are making on behalf of current and future users.
- The foundation should ensure that training in how to use digital data collections is widely available, to "broaden participation in digitally enabled research."
- The NSF, working with universities and the research community generally, "should act to develop and mature the career path for data scientists and to ensure that the research enterprise includes a sufficient number of high-quality data scientists."