The Digital Public Library of America @ ALA
6/30. Late afternoon. 3:30. The ALA President’s Program on the Digital Public Library of America (DPLA) begins…
If I’d kept a daily journal during the conference, one entry for a Sunday would’ve read as above. Alas, I didn’t keep a journal. But I did take notes and noted mentally the strange appropriateness of the session on the DPLA occurring on the traditional day of the Christian Sabbath. The mastermind behind the DPLA, Mr. Dan Cohen, conceives of his and the DPLA’s mission as no less than “to make the cultural and scientific heritage of humanity available, free of charge, to all.” In other words, to build an über-collection of all of America’s smaller archival and library (unter-(?))collections.
It’s a task of mythic proportions but perhaps not that bold a task to undertake in the 2010s (in a practical sense, that is; it certainly is bold in a psychological sense). We have, in this decade, Web 2.0, Google, Cray supercomputers, WorldCat, warehouses of servers, mass digitization (and then discarding) of physical historical documents, and a glut of satellites around the earth, enough to make our planet look like a twin of ringed Saturn from outer space.
The possibility of the interconnectedness of all the country’s major and minor libraries, museums, universities, and other cultural institutions—which is the grand dream of the DPLA—through a central portal (i.e. the DPLA’s website, simply, dp.la), seems realistic, if only someone with the initiative, like Cohen, would oversee its enactment.
Well, he’s seen the DPLA into existence after several preliminary meetings in 2010, and the project is still growing, a work in progress that aims to be the portal to thousands of national, regional, and local library records across the United States, so that no researcher in the land will fail to find a piece of sought information if he or she wants it.
What the DPLA does is bring together voluntarily contributed access to digitized material from archives and libraries (academic, public, special, and other) from across America and then organize all that contributed info according to the metadata standard created by Europeana, the DPLA’s European counterpart, which was founded earlier, in 2008 (see www.europeana.eu). Thus, says Cohen, as an incidental plus, with this common standard in place, searchers will, in the future, when a partnership between the DPLA and Europeana is formed, be able to simultaneously search the inconceivably large joint database of info hosted by not just one or the other but both aggregator sites. (gasp!)
The prospect—hypothetical though it is, as yet—of so much of the world’s “cultural and scientific heritage” being primarily accessed by researchers and the general public through a single online archive overseen by Cohen seems disconcerting at first. Does—and will—the DPLA accept any information from any institution? If not, then what is the process by which Cohen and his team pick and choose what to allow into the DPLA? Can one entity handle so much responsibility: technical, ethical, legal, and so on? (Is this desirable?)
The answer to that last non-parenthetical question is a provisional “yes.” For the other questions, one must look at how the DPLA is structured. The DPLA, in its own words, is an aggregator of data. It is also, at the same time, an aggregator of other aggregators of data.
Here a distinction is made between so-called Content Hubs and Service Hubs. Large individual institutions from which the DPLA aggregates data directly are called Content Hubs. Service Hubs, on the other hand, are “state or regional digital libraries” that gather info from a network of smaller individual institutions, each too small to host its own digital library.
Here’s more on what this means. Content Hubs are large enough to “provide more than 250,000 unique metadata records,” each record corresponding to a digitized item that the DPLA can make available on its website. The DPLA’s current Content Hub partners are ARTstor, the Biodiversity Heritage Library, the David Rumsey Map Collection, the Harvard Library, HathiTrust Digital Library, the National Archives & Records Administration, the Smithsonian, the NY Public Library, and the University Libraries of Virginia, Southern California, and Illinois at Urbana-Champaign.
Service Hubs are what might be called “umbrella” digital libraries, which are not unique entities in their own right as the above Content Hub partners are, but rather simply aggregate data from, as stated, a network of many small unique entities (those with fewer than 250,000 metadata records of their own). These smaller entities provide their existing records and/or materials to a Service Hub, which then digitizes the materials and creates a metadata record for each digitized item in accordance with the DPLA’s metadata standard. Then the Service Hub provides this aggregation of data to the DPLA, which includes the data on its site. Current Service Hub partners with the DPLA include Digital Commonwealth (Massachusetts Collections Online), Mountain West Digital Library, and the state Digital Libraries of Georgia, Kentucky, Minnesota, and South Carolina.
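For readers who think more easily in code, the two-tier arrangement described above can be sketched roughly as follows. This is only my own illustration, not anything the DPLA publishes: the class names, the function names, and the institutions and record counts in the usage note are all hypothetical, and the only figure taken from the session is the 250,000-record threshold.

```python
from dataclasses import dataclass, field
from typing import List

# The "more than 250,000 unique metadata records" figure quoted above.
CONTENT_HUB_THRESHOLD = 250_000


@dataclass
class Institution:
    """A single library, archive, or museum (hypothetical model)."""
    name: str
    record_count: int


@dataclass
class ServiceHub:
    """An umbrella digital library that aggregates small institutions."""
    name: str
    members: List[Institution] = field(default_factory=list)

    def record_count(self) -> int:
        # A Service Hub's contribution is the sum of its members' records.
        return sum(member.record_count for member in self.members)


def qualifies_as_content_hub(institution: Institution) -> bool:
    """True if the institution is large enough to feed the DPLA directly."""
    return institution.record_count > CONTENT_HUB_THRESHOLD


def total_records(content_hubs: List[Institution],
                  service_hubs: List[ServiceHub]) -> int:
    """Records the aggregator would hold: direct feeds plus hub feeds."""
    direct = sum(hub.record_count for hub in content_hubs)
    via_hubs = sum(hub.record_count() for hub in service_hubs)
    return direct + via_hubs
```

With made-up numbers, a "Hypothetical Large Library" holding 300,000 records would qualify as a Content Hub, while a "Hypothetical Town Archive" with 1,200 records would not and would instead join a Service Hub, whose combined total is what gets passed along to the DPLA.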
Collecting all this data and metadata on one website is, again, a task of mythic proportions but doable, and it seems, given the complex interweaving of the DPLA, the Content Hubs, and the Service Hubs, that the project will remain, at least for the time being, unavoidably egalitarian and truly open to all.
However, whether such a project will make a positive, negative, or indifferent impact on researchers’ or the public’s lives remains to be seen. Those who are driven to research are already aware of the online presence of open digital collections of various museums, universities, and libraries, which are all accessible without the DPLA, and there already exist such encyclopedic portals of information as Wikipedia, the online Stanford Encyclopedia of Philosophy, and Google Scholar, just to name a few.