Volume 14 Number 1
[Editor's note: SUNYConnect looks to the SUNY university centers for leadership. Edward's article is a great example of such leadership related to an essential part of a successful digital library future.]
Late in 2010, Binghamton University Libraries embarked on a project to implement a Digital Preservation System. Binghamton has been interested in preserving digital objects for quite a while and needed to find a system that met our needs. The Digital Preservation System that Binghamton chose to implement is Rosetta, developed by Ex Libris. Before making the final decision, Binghamton had created pilot projects using OCLC's CONTENTdm and Innovative Interfaces' Content Pro digital repository software. While both CONTENTdm and Content Pro have their strengths, neither of them had a built-in digital preservation component.
What is Digital Preservation?
According to the Digital Preservation Coalition (2009; references found in the Linkable Links section), digital preservation is a "series of managed activities necessary to ensure continued access to digital materials for as long as necessary [...] beyond the limits of media failure or technological change." These digital materials may be either born-digital or the result of digitization projects. An important part of this definition is that these actions need to go "beyond the limits of media failure." In other words, simply backing up data files is not digital preservation. Backups are an important part of any digital preservation strategy, but they are not a complete strategy by themselves. Most people know someone who saved old files, such as an early WordPerfect document, but no longer has software that can read them. Simply preserving the bits does not necessarily ensure access.
At the time of Binghamton's decision, Rosetta was the only platform with a built-in preservation system. While some universities are working to add preservation components to open source digital library systems such as DSpace and Fedora, these projects were not yet ready to be used by a library, especially one without software programmers it could dedicate to developing and maintaining the software. Since our decision, the Florida Center for Library Automation announced in November 2011 that it had released its DAITSS (Dark Archive in the Sunshine State) digital preservation software under a GPL v3 license (Caplan, 2011). While Binghamton did not evaluate this software, libraries that are currently looking for an open source solution may want to investigate DAITSS.
Rosetta is based on the Open Archival Information System (OAIS) model and is designed to conform to the requirements for a Trusted Digital Repository. Rosetta also "supports preservation metadata, including PREMIS (PREservation Metadata: Implementation Strategies) objects" (Library of Congress, 2010). The PREMIS data dictionary came out of a project that "was charged to define a set of metadata elements that are implementation independent, practically oriented, applicable to all types of materials, and likely to be needed by most preservation repositories" (Caplan and Guenther, 2005).
Using these open standards for metadata is important. When systems adhere to open and internationally accepted standards, differing systems can communicate with each other or use crosswalks between the open formats. It is unreasonable to believe that any software system will last forever, and Digital Preservation Systems are not exempt. At some point a library will most likely need to migrate its data to another system. If both systems rely on internationally accepted standards, it is less likely that important metadata will be lost in the process.
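To make the idea of a crosswalk concrete, here is a minimal sketch in Python. It is not how Rosetta or any particular system performs migration; the target field names are a hypothetical schema invented for illustration, and the sample record is made up. The point it illustrates is that a crosswalk should report any field it cannot map, since those are the fields at risk of being lost in a migration.

```python
# Illustrative, simplified mapping from Dublin Core element names to
# field names in a HYPOTHETICAL target schema. Not an official crosswalk.
DC_TO_TARGET = {
    "title": "titleInfo.title",
    "creator": "name.namePart",
    "subject": "subject.topic",
    "date": "originInfo.dateIssued",
    "rights": "accessCondition",
}


def crosswalk(dc_record, mapping=DC_TO_TARGET):
    """Translate a Dublin Core record and list fields with no mapping.

    dc_record maps each Dublin Core element name to a list of values,
    because Dublin Core elements are repeatable.
    """
    converted, unmapped = {}, []
    for field, values in dc_record.items():
        target = mapping.get(field)
        if target is None:
            # Nothing to map to: flag the field instead of dropping it silently.
            unmapped.append(field)
        else:
            converted[target] = list(values)
    return converted, unmapped


# Invented sample record, for illustration only.
sample_record = {
    "title": ["Sample campus photograph"],
    "date": ["1950"],
    "coverage": ["Binghamton, N.Y."],
}

converted, unmapped = crosswalk(sample_record)
print(converted)
print("Fields needing a mapping decision:", unmapped)
```

In this sketch the "coverage" field has no entry in the mapping, so it is reported rather than discarded, which gives staff a chance to decide where that information should go before the migration runs.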
Why Digital Preservation?
Rosetta at Binghamton
Each object (or set of objects) is deposited into Rosetta as an Intellectual Entity (IE). An IE contains descriptive metadata (in Dublin Core format), the object itself, and access rights forms (who can view the content). While coming up with the appropriate descriptive metadata fields proved to be more involved than first thought, Binghamton felt it was important to use standard Dublin Core metadata fields and to use them consistently across collections. Dublin Core is very flexible, which is both a strength and a weakness: because the standard leaves so much open, it was time-consuming to create metadata best practices. Those best practices were important not only to make the objects accessible in Primo but also to allow for easier migration in the future. It was time well spent.
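As a rough illustration of what "using Dublin Core fields consistently" can mean in practice, the following Python sketch builds a small Dublin Core record and checks it against a list of required fields. The required-field list, the wrapping "record" element, and the sample values are all assumptions made for this example; they are not Binghamton's actual best practices or Rosetta's deposit format.

```python
import xml.etree.ElementTree as ET

# Namespace of the 15 simple Dublin Core elements.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)

# HYPOTHETICAL local rule: fields every record must carry.
REQUIRED_FIELDS = ("title", "creator", "date", "rights")


def build_dc_record(fields):
    """Build a <record> element with one dc:* child per (name, value) pair."""
    record = ET.Element("record")
    for name, value in fields:
        child = ET.SubElement(record, f"{{{DC_NS}}}{name}")
        child.text = value
    return record


def missing_fields(record, required=REQUIRED_FIELDS):
    """Return the required Dublin Core fields that are absent or empty."""
    missing = []
    for name in required:
        element = record.find(f"{{{DC_NS}}}{name}")
        if element is None or not (element.text or "").strip():
            missing.append(name)
    return missing


# Invented sample values, for illustration only.
record = build_dc_record([
    ("title", "Sample campus photograph"),
    ("creator", "Unknown"),
    ("date", "1950"),
])

print(ET.tostring(record, encoding="unicode"))
print("Missing required fields:", missing_fields(record))
```

A check of this kind, run before deposit, is one way to keep collections consistent with each other even when different staff members create the metadata.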
Rosetta and Discovery
The Rosetta portion of our discovery layer is made possible because Rosetta has a digital publishing system. Sets of records can be made available for harvest by any system that can ingest records using the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH). OAI-PMH is a low-barrier mechanism designed for repository interoperability. Binghamton was the first institution to work with Primo and Rosetta together, so there was a learning curve for both Ex Libris and Binghamton, but it has proven to be a good choice.
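For readers unfamiliar with OAI-PMH, the sketch below shows in Python what a harvest request and response look like at the protocol level. The verb ListRecords, the metadataPrefix oai_dc, and the XML namespaces come from the OAI-PMH specification; the base URL, the set name, and the sample response are invented for illustration and do not describe Binghamton's actual Rosetta or Primo configuration. The example parses a canned response rather than contacting a server.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

OAI_NS = "http://www.openarchives.org/OAI/2.0/"
DC_NS = "http://purl.org/dc/elements/1.1/"


def list_records_url(base_url, set_spec, metadata_prefix="oai_dc"):
    """Build an OAI-PMH ListRecords request URL for one published set."""
    query = urlencode({
        "verb": "ListRecords",
        "metadataPrefix": metadata_prefix,
        "set": set_spec,
    })
    return f"{base_url}?{query}"


def titles_from_response(xml_text):
    """Extract (identifier, title) pairs from a ListRecords response."""
    root = ET.fromstring(xml_text)
    results = []
    for record in root.iter(f"{{{OAI_NS}}}record"):
        identifier = record.findtext(
            f"{{{OAI_NS}}}header/{{{OAI_NS}}}identifier"
        )
        title = record.findtext(f".//{{{DC_NS}}}title")
        results.append((identifier, title))
    return results


# Invented sample response, trimmed to the parts used here.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:example.org:IE1001</identifier>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Sample campus photograph</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

# Hypothetical endpoint and set name.
print(list_records_url("https://repository.example.edu/oai", "photographs"))
print(titles_from_response(SAMPLE_RESPONSE))
```

Because the request is an ordinary URL and the response is plain XML carrying Dublin Core, any discovery system that speaks OAI-PMH can harvest a published set without knowing anything else about the repository behind it.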
Each collection in Rosetta is unique. However, because the processes have many similarities, Binghamton has been able to handle projects with the few staff involved. Metadata Librarians create metadata forms and provide guidelines on how to use them. In some cases they create the metadata themselves, but more often other staff with collection-specific knowledge do so. Binghamton has used staff from Special Collections, a Subject Librarian, and staff from other departments on campus as appropriate. The Metadata Librarians train the staff who create the metadata and determine the controlled vocabulary terms. In many respects, the Metadata Librarians can be seen as project managers for Binghamton's Digital Preservation System.
Currently, Ex Libris does not offer a hosted version of Rosetta, an option that might be of interest to smaller libraries. Instead, the company has suggested that libraries without the necessary staff or expertise look to consortia and other partners for digital preservation services via Rosetta. Belgium's K.U. Leuven, the oldest Catholic university in the world, has added Rosetta to its Leuven Integrated Archive System (LIAS), which provides services to other libraries, museums, and archives throughout Flanders via the LIBIS network. This is a model that SUNY may wish to investigate in the future, in which one (or more) of the campuses and/or the Office of Library and Information Services (OLIS) offers digital preservation services to other campuses on a cost-recovery basis.