SUNYergy Archive: Access to All Issues January 2012
Volume 14 Number 1
Page 1
Implementing Rosetta at Binghamton University Libraries
by Edward Corrado, Binghamton University

Cover Story

Binghamton and Rosetta

Features

Focus on SUNY and SUNY Libraries

SCLD Taskforces

Discovery Systems

SUNYConnect Updates

How to Contact Us

Linkable Links

[Editor's note: SUNYConnect looks to the SUNY university centers for leadership. Edward's article is a great example of such leadership related to an essential part of a successful digital library future.]

Late in 2010, Binghamton University Libraries embarked on a project to implement a Digital Preservation System. Binghamton has been interested in preserving digital objects for quite a while and needed to find a system that met our needs. The Digital Preservation System that Binghamton chose to implement is Rosetta, developed by Ex Libris. Before making the final decision, Binghamton had created pilot projects using OCLC's CONTENTdm and Innovative Interfaces' Content Pro digital repository software. While both CONTENTdm and Content Pro have their strengths, neither of them had a built-in digital preservation component.

What is Digital Preservation?
Most digital repository software is designed to provide access and some level of content management, not to ensure long-term digital preservation. Long-term preservation is something that libraries, especially research libraries, have been doing with print volumes, so why should digital objects be treated differently?

According to the Digital Preservation Coalition (2009; references found in the Linkable Links section), digital preservation is a "series of managed activities necessary to ensure continued access to digital materials for as long as necessary [...] beyond the limits of media failure or technological change." These digital materials may be either born-digital or the result of digitization projects. An important part of this definition is that these actions need to go "beyond the limits of media failure." In other words, simply backing up data files is not digital preservation. While backups are an important part of any digital preservation strategy, they are not by themselves a complete strategy. Most people can probably relate to having old files, such as an early WordPerfect document, that they saved but can no longer open because they lack software that can read them. Simply preserving the bits does not necessarily ensure access.

At the time of Binghamton's decision, Rosetta was the only platform with a built-in preservation system. While some universities are working to add preservation components to open source digital library systems such as DSpace and Fedora, these projects were still not ready to be used by a library, especially one without software programmers it could dedicate to developing and maintaining the software. Since our decision, the Florida Center for Library Automation announced in November 2011 that it has released its DAITSS (Dark Archive in the Sunshine State) digital preservation software under a GPL v3 license (Caplan, 2011). While Binghamton did not evaluate this software, libraries that are currently looking for an open source solution may want to investigate DAITSS.

Open Standards
When talking about long-term digital preservation, the importance of open standards and open formats should not be underestimated. If data is locked into proprietary formats, it is more likely to become inaccessible in the future. While the rapid growth of digital technologies has made more information more accessible, it has also led to issues involving obsolescence and dependency (Corrado, 2005). The use of open standards is important in both file formats and metadata. File formats that are created from open standards are less likely to become obsolete (Vilbrandt et al., 2004). If a file format based on an open standard does become obsolete, it is likely that files in that format can be migrated to a newer format.

Rosetta is based on the Open Archival Information System (OAIS) model and is designed to conform with the requirements to be considered a Trusted Digital Repository. Rosetta also "supports preservation metadata, including PREMIS (PREservation Metadata: Implementation Strategies) objects" (Library of Congress, 2010). The PREMIS data dictionary came out of a project that "was charged to define a set of metadata elements that are implementation independent, practically oriented, applicable to all types of materials, and likely to be needed by most preservation repositories" (Caplan and Guenther, 2005).
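
To make the PREMIS reference more concrete, the following is a minimal sketch of the kind of technical metadata a PREMIS object can carry for a single file: an identifier, a fixity checksum, a size, and a format name. The element names follow the PREMIS data dictionary, but the identifier scheme, the sample bytes, and the helper names are hypothetical, the output is not validated against the PREMIS schema, and nothing here reflects how Rosetta itself generates or stores this metadata.

```python
import hashlib
import xml.etree.ElementTree as ET

# PREMIS version 3 namespace; earlier versions use a different URI.
PREMIS_NS = "http://www.loc.gov/premis/v3"
ET.register_namespace("premis", PREMIS_NS)


def premis_el(parent, name, text=None):
    """Append a PREMIS-namespaced child element and return it."""
    child = ET.SubElement(parent, f"{{{PREMIS_NS}}}{name}")
    if text is not None:
        child.text = text
    return child


def build_premis_object(identifier, content, format_name):
    """Build a simplified PREMIS-style description of one file (hypothetical helper)."""
    obj = ET.Element(f"{{{PREMIS_NS}}}object")

    ident = premis_el(obj, "objectIdentifier")
    premis_el(ident, "objectIdentifierType", "local")  # assumed identifier scheme
    premis_el(ident, "objectIdentifierValue", identifier)

    chars = premis_el(obj, "objectCharacteristics")
    fixity = premis_el(chars, "fixity")
    premis_el(fixity, "messageDigestAlgorithm", "SHA-256")
    premis_el(fixity, "messageDigest", hashlib.sha256(content).hexdigest())
    premis_el(chars, "size", str(len(content)))

    fmt = premis_el(chars, "format")
    designation = premis_el(fmt, "formatDesignation")
    premis_el(designation, "formatName", format_name)
    return obj


sample_bytes = b"example file content"  # stand-in for a real digital object
premis_object = build_premis_object("example-0001", sample_bytes, "text/plain")
print(ET.tostring(premis_object, encoding="unicode"))
```

Recording the checksum and format alongside the object is what lets a later system verify that the bits are unchanged and decide whether the format needs migration.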

Using these open standards for metadata is important. When systems adhere to open and internationally accepted standards, they can communicate with each other directly or use crosswalks between the open formats. It is unreasonable to believe that any software system will last forever, and Digital Preservation Systems are not exempt. At some point a library will most likely need to migrate its data to another system. If both systems rely on internationally accepted standards, it is less likely that important metadata will be lost in the process.
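
The crosswalk idea can be sketched in a few lines. The example below maps a record from a made-up local schema onto Dublin Core element names; the local field names, the mapping table, and the sample record are all hypothetical and not drawn from Binghamton's configuration. It also reports fields that have no mapping, since those are where metadata is most likely to be lost during a migration.

```python
# Hypothetical crosswalk from an invented local schema to Dublin Core element names.
LOCAL_TO_DC = {
    "item_title": "title",
    "photographer": "creator",
    "date_taken": "date",
    "keywords": "subject",
    "usage_terms": "rights",
}


def crosswalk(record, mapping):
    """Return (mapped_record, unmapped_fields) for one source record."""
    mapped = {}
    unmapped = []
    for field, value in record.items():
        target = mapping.get(field)
        if target is None:
            unmapped.append(field)  # flag the field rather than silently drop it
        else:
            # Dublin Core elements are repeatable, so collect values in lists.
            mapped.setdefault(target, []).append(value)
    return mapped, unmapped


source_record = {
    "item_title": "Campus view, winter",
    "photographer": "Unknown",
    "date_taken": "1975",
    "shelf_location": "Box 12",  # no Dublin Core equivalent in this mapping
}

dc_record, lost_fields = crosswalk(source_record, LOCAL_TO_DC)
print(dc_record)
print("Unmapped fields:", lost_fields)
```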

Why Digital Preservation?
There are various reasons why libraries, and in particular academic or research libraries, may wish to embark on a digital preservation project. Libraries have always been entrusted to preserve physical objects, so it is only natural that they should continue this role in the digital domain. Preserving local content is not a new idea; what is new is the formats involved, and preserving them extends the library's existing role into the digital world. Additionally, with projects such as HathiTrust and Google Books having the potential to replace a large portion of a library's physical collection in the future, local, specialized content is likely to become an increasingly important component for the library in the academy.

Rosetta at Binghamton
Binghamton University Libraries started implementing Rosetta in December 2010, when Ex Libris provided initial overview training and a server was configured for the Rosetta installation. The training proved helpful because it covered decisions that need to be made when configuring the Rosetta system. The installation of the software took place in January 2011, and shortly thereafter Ex Libris provided the necessary functional training. By the end of March, the first collection, the Edwin A. Link Jr. Collection, was migrated from CONTENTdm to Rosetta and Binghamton went live with Rosetta. Since then other collections have been migrated and new collections have been created. Projects in Rosetta include the Saeedpour Kurdish collection from our own Special Collections, newsletters and photographs from the Office of Communications and Marketing, and material related to a blood serum archive in conjunction with a teaching faculty member. There are also plans to work with electronic theses and dissertations, research data, and other scholarly content.

Each object (or set of objects) is deposited into Rosetta as an Intellectual Entity (IE). An IE contains descriptive metadata (in Dublin Core format), the object itself, and access rights (who can view the content). While coming up with the appropriate descriptive metadata fields proved to be more involved than first thought, Binghamton felt it was important to use standard Dublin Core metadata fields and to use them consistently across collections. Dublin Core's flexibility is a strength, but it is also a weakness: because the standard leaves so much open, it was time-consuming to create metadata best practices. This work was important not only to make the objects accessible in Primo but also to allow for easier migration in the future. It was time well spent.
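
As an illustration of applying Dublin Core consistently across collections, the sketch below builds a simple Dublin Core record and checks it against a list of required elements. The required-element list, the sample values, and the function names are assumptions made for this example; they are not Binghamton's actual best practices, and the record wrapper is not Rosetta's deposit format.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)

# Assumed local rule: every record must carry these Dublin Core elements.
REQUIRED_ELEMENTS = ("title", "creator", "date", "rights")


def missing_elements(metadata):
    """List required elements that are absent or empty in a metadata dict."""
    return [name for name in REQUIRED_ELEMENTS if not metadata.get(name)]


def build_dc_record(metadata):
    """Serialize a dict of Dublin Core values as a flat XML record."""
    missing = missing_elements(metadata)
    if missing:
        raise ValueError(f"missing required elements: {', '.join(missing)}")
    record = ET.Element("record")  # generic wrapper element, chosen for this sketch
    for name, values in metadata.items():
        for value in values:
            ET.SubElement(record, f"{{{DC_NS}}}{name}").text = value
    return record


# Invented sample values for illustration only.
sample_metadata = {
    "title": ["Alumni newsletter, spring issue"],
    "creator": ["Office of Communications and Marketing"],
    "date": ["2011"],
    "rights": ["Open to all users"],
    "subject": ["Alumni", "Newsletters"],
}

dc_xml = ET.tostring(build_dc_record(sample_metadata), encoding="unicode")
print(dc_xml)
```

Checking every record against the same short list of required elements is one simple way to enforce consistency before deposit, whatever the actual local rules turn out to be.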

Rosetta and Discovery
The Rosetta software does not include a public discovery layer. Binghamton University Libraries decided to implement Ex Libris' Primo as the discovery layer, judging it the best available choice for a single interface that can search Rosetta, Aleph, Primo Central, and future library systems. Primo allows the Libraries to offer one-stop shopping to patrons, who can discover various types of content, including local physical, local digital, and remote e-content, in one place.

The Rosetta portion of our discovery layer is made possible because Rosetta has a digital publishing system. Sets of records can be made available for harvest by any system that can ingest records using the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH). OAI-PMH is a low-barrier mechanism designed for repository interoperability. Binghamton was the first institution to work with Primo and Rosetta together, so there was a learning curve for both Ex Libris and Binghamton, but it has proven to be a good choice.
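
To illustrate what harvesting over OAI-PMH involves, the sketch below builds a ListRecords request URL and extracts titles from a response. The verb, the metadataPrefix and set arguments, and the XML namespaces come from the OAI-PMH specification; the base URL, the set name, and the sample response are placeholders invented for this example, and the code does not contact any real server or represent how Primo performs its harvest.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

NAMESPACES = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, set_spec, metadata_prefix="oai_dc"):
    """Build an OAI-PMH ListRecords request URL."""
    query = urlencode(
        {"verb": "ListRecords", "metadataPrefix": metadata_prefix, "set": set_spec}
    )
    return f"{base_url}?{query}"


def extract_titles(response_xml):
    """Return (identifier, title) pairs from a ListRecords response."""
    root = ET.fromstring(response_xml)
    results = []
    for record in root.findall("oai:ListRecords/oai:record", NAMESPACES):
        identifier = record.findtext("oai:header/oai:identifier", namespaces=NAMESPACES)
        title = record.findtext("oai:metadata//dc:title", namespaces=NAMESPACES)
        results.append((identifier, title))
    return results


# Trimmed, invented response; a real one also carries responseDate, request, etc.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:repository.example.edu:IE1001</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Sample photograph</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

url = list_records_url("https://repository.example.edu/oai", "sample_set")
print(url)
print(extract_titles(SAMPLE_RESPONSE))
```

A production harvester would also need to follow the protocol's resumptionToken mechanism to page through large result sets, which this sketch omits.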

Staffing
Binghamton University Libraries have implemented Rosetta with four people playing significant roles. The Director of Library Technology has done most of the system configuration, while three librarians from the Cataloging Department have worked with the metadata. Each of these librarians has other responsibilities, and it is estimated that a total of 1.5 FTE has gone into implementing and managing Rosetta. Other library staff have played additional roles with Rosetta. For example, Special Collections, Preservation and University Archives have worked to identify, digitize, and describe materials to be deposited in Rosetta, and the Web Services Librarian helped with the design of the Primo interface to Rosetta.

Each collection in Rosetta is unique, but because there are many similarities in the processes, Binghamton has been able to handle projects with the few staff involved. Metadata Librarians create metadata forms and provide guidelines on how to use them. While in some cases they create the metadata themselves, it is often other staff with more collection-specific knowledge who do so. Binghamton has used staff from Special Collections, a Subject Librarian, and staff from other departments on campus as appropriate. The Metadata Librarians train the staff creating the metadata and determine controlled vocabulary terms. In many respects, the Metadata Librarians can be seen as project managers for Binghamton's Digital Preservation System.

Conclusions
Implementing Rosetta at Binghamton has proven to be a challenging but exciting project. It has allowed us to learn about Rosetta, Primo, and digital preservation during the past year. We currently have about a dozen digital preservation projects in various stages from planning to completion. In the coming years Binghamton hopes to move forward with additional digital preservation projects, working with other departments across campus and possibly entities from off-campus.

Currently, Ex Libris does not offer a hosted version of Rosetta, an option that might be of interest to smaller libraries. Instead, they have suggested that libraries that do not have the necessary staff or expertise look to consortia and other partners for digital preservation services via Rosetta. Belgium's K.U. Leuven, the oldest Catholic university in the world, has added Rosetta to its Leuven Integrated Archive System (LIAS), which provides services to other libraries, museums, and archives throughout Flanders via the LIBIS network. This is a model that SUNY may wish to investigate in the future, in which one (or more) of the campuses and/or the Office of Library and Information Services (OLIS) offers digital preservation services to other campuses on a cost-recovery basis.

 
