Standards and other e-preservation initiatives: Electronic Records in Colleges and Universities

October 26, 2019 By Stanley Isaacs


In this next section of the New York State Archives’ Webinar on Preserving Electronic Records in Colleges and Universities, we will discuss some of the existing standards that can assist in your preservation efforts, as well as some examples of other institutions’ electronic preservation initiatives. People often ask what standards exist and what other organizations are doing to preserve electronic records. While many organizations are engaged in electronic records preservation work to varying degrees, we have to keep in mind that all of this work is relatively new and is still developing.
One of the most commonly cited standards is the Open Archival Information System, or OAIS. Our intent is not to go too in depth into the OAIS model, but to provide a general overview so that you are aware of it and of where it fits. You will find additional information regarding OAIS in the supplemental handouts included with this workshop. According to the OCLC web site, OAIS is an ISO conceptual framework for an archival system dedicated to preserving and maintaining access to digital information over the long term. Basically, OAIS has three main purposes: 1. To increase awareness and understanding of the archival concepts needed for long-term digital preservation. 2. To create a framework to guide the identification and development of standards. 3. To provide a common language for everyone to use.
This slide depicts a very simplified diagram of the OAIS environment. On the left side of the slide is the producer of the content. Producers supply the information, or records, that the archive preserves. On the right side of the slide are the consumers. Consumers access and view the preserved information. The large box in the center is the archive itself, where the records are stored. Management is the entity responsible for establishing the broad policy objectives of the archive, such as determining what types of information are to be archived, identifying funding sources, and other activities.
Management in this sense should not be confused with the day-to-day administration of the archive, which is performed within the archive entity itself. Again, additional information can be found in the supplemental handout.
Another standard, as well as an initiative, that you should be aware of is the Dublin Core Metadata Initiative. The Dublin Core metadata element set is designed to promote discovery of electronic resources and is popular in the library, computer science, and museum fields. It is intended for cross-domain information resource description and defines conventions for describing things online in ways that make them easier to find. Dublin Core is widely used to describe digital materials such as video, sound, images, text, and compound media like web pages. It should also be noted that Dublin Core is an ISO standard (ISO 15836).
Some other examples of preservation initiatives are listed on this slide. LOCKSS, which stands for “Lots of Copies Keep Stuff Safe,” is an international community initiative based at Stanford University Libraries. LOCKSS provides libraries with tools for digital preservation so that they can collect and preserve their own copies of electronic content inexpensively. The program provides libraries with open-source software and support to preserve authorized web-published materials for tomorrow.
The Chronopolis project is a partnership designed to leverage the storage capabilities at the San Diego Supercomputer Center, the UC San Diego Libraries, the National Center for Atmospheric Research, and the University of Maryland’s Institute for Advanced Computer Studies. This project provides a preservation data grid spanning dissimilar and highly redundant storage systems. It is focused on digital collections related to the Library of Congress’ National Digital Information Infrastructure and Preservation Program (NDIIPP).
The Chronopolis partnership will also develop best practices for data packaging and transmission among different systems for the Library of Congress’ preservation program.
Although the Collaborative Electronic Records Project of the Rockefeller Archive Center and the Smithsonian Institution Archives has ended, having completed its original objective of developing tools to assist in email preservation, it is important to note that the efforts are continuing under EMCAP, which stands for the E-Mail Collection and Preservation initiative. EMCAP is an NHPRC grant-funded collaboration among the state archives of North Carolina, Pennsylvania, and Kentucky. Among the work produced is an XML-based parsing and preservation tool. Additional information is included in the supplemental handouts.
This slide lists some other electronic records preservation initiatives that you are encouraged to look at to learn more about the work they have performed. The intent is not to go into detail on each of the institutions mentioned here, but to recognize some of the more experienced programs. NARA’s electronic records initiative is a project aimed at handling the federal government’s massive volume of electronic records. Too often we only look within the US for best practices. The National Archives of Australia has done significant work in electronic records preservation that has led to the development of many international standards and tools. The National Archives of the Netherlands’ Digital Longevity Testbed is a three-year research project focused on digital preservation as part of its Digital Longevity Program. The University of Florida’s Dark Archive in the Sunshine State, or DAITSS, is another program that has done some great work that you should look into.
The mission of the Library of Congress’ National Digital Information Infrastructure and Preservation Program is to develop a national strategy to collect, preserve, and make available significant digital content for current and future generations. You should also look into the MetaArchive Cooperative, which is leveraging the LOCKSS work being done at Stanford University.
This slide shows a small sampling of other state governments that have an active electronic records program. By no means is this meant to be an exhaustive listing of all state government operations, and my apologies to those not listed that have active electronic records programs. I encourage you to visit the web site of each of these programs, or to contact them directly, to learn more about their individual efforts.
Being from New York State, we would obviously be remiss in not discussing the fine work being done within the New York State Archives’ Electronic Records Unit. Although this is a relatively new initiative, the group has made good progress, especially considering the limited budget and available staff. In essence, the Electronic Records Unit has adopted an electronic records preservation model developed by the National Archives of Australia. This model is designed around three separate processes or stations: a quarantine station, a preservation station, and a storage station. We will discuss these in more detail on the following slides.
The first workstation is the quarantine station. This workstation is the receiving point for files sent to the Electronic Records Unit. Files are matched against what was supposed to be sent by the agencies and checked to make sure there has not been any corruption or other change to the files during the transfer process. Quarantine also isolates the files so that they can be checked for viruses before they can contaminate the repository. Once they are checked and verified, the files are moved to the preservation station.
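As a rough illustration of the quarantine checks described above, the following is a minimal Python sketch of matching received files against what an agency said it would send. It is not the Electronic Records Unit's actual tooling. The manifest layout (a mapping of relative file paths to SHA-256 checksums), the function names, and the report categories are all assumptions made for the example, and virus scanning is left out.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_transfer(transfer_dir: Path, manifest: dict[str, str]) -> dict[str, list[str]]:
    """Compare received files against a manifest of {relative path: expected digest}.

    The manifest layout is hypothetical; a real transfer agreement
    would define its own format.
    """
    # Index every file that actually arrived, keyed by its relative path.
    received = {
        p.relative_to(transfer_dir).as_posix(): p
        for p in transfer_dir.rglob("*")
        if p.is_file()
    }
    report = {"verified": [], "missing": [], "unexpected": [], "corrupted": []}
    for name, expected in manifest.items():
        if name not in received:
            report["missing"].append(name)      # listed but never arrived
        elif sha256_of(received[name]) != expected.lower():
            report["corrupted"].append(name)    # arrived but content changed
        else:
            report["verified"].append(name)     # arrived intact
    # Files that arrived but were not listed by the sending agency.
    report["unexpected"] = sorted(set(received) - set(manifest))
    return report
```

In a sketch like this, only the files reported as verified would move on to the next station, while anything missing, unexpected, or corrupted would be taken up with the sending agency.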
At the preservation station, files could in the future be converted to a normalized format such as PDF/A or XML. The last station is the storage station, where both the original bit stream files and the normalized files are retained. Backup copies of both the original files and the normalized files are made and taken off site for further protection.
Although the Electronic Records Unit has done some wonderful work, it is still a relatively small unit, with a very limited staff and budget. We are all hoping that further investment in their work is made possible in future state budgets. The Electronic Records Unit is continuing to research best practices and investigate the electronic records preservation work performed by other institutions. As with your own efforts, they have the challenge of creating awareness of their work among the various agencies. But this has to be balanced against the size of their current staff and their internal capabilities. As always, they are continuing to revisit best practices as holdings grow and technologies change. For more information, contact the New York State Archives Electronic Records Unit at the phone number listed on the slide.
You have probably seen or heard about operational examples from other institutions. I encourage you to look into these further and to revisit them regularly, since each institution has probably adapted to new environments and new technological challenges in recent years.
This concludes Section Four. You are encouraged to continue on to Section Five of this webinar on Preserving Electronic Records in Colleges and Universities, where we will offer some strategies for preserving your institution’s electronic records.