Educating archivists and record keepers is the first step in developing a digital program. Recently, members of the Processing and Collections Management teams at the Rockefeller Archive Center attended a two-day workshop titled “Appraisal, Accessioning, and Ingest of Digital Records” offered by SAA and presented by Erin Faulder, Digital Archivist at Cornell University Library’s Division of Rare and Manuscript Collections. The Digital Archives Specialist (DAS) course delved into the challenges of preserving and managing electronic records and offered strategies for institutions both unfamiliar with and well versed in digital archiving.

Here are the reflections of four archivists who participated in the DAS workshop.

Emeline Swanson, Archival Assistant, Collections Management

My experience as a first-time DAS course attendee, in conjunction with RAC’s Project Electron training, introduced me to new concepts and even reacquainted me with some familiar notions. As an archival assistant primarily responsible for accessioning, I found that the workshop provided valuable insight into best practices for appraisal, accessioning, management, and storage of digital records. I gained a better understanding of the OAIS Reference Model and of strategies for seamless acquisition, accessioning, and digital record management, which I inevitably compared with my experience accessioning and maintaining physical records. However, my most important realization was that digital preservation is not an easy endeavor. It takes considerable manpower and resources to ensure the longevity of electronic records, whether that means manual fixity checks, digital forensics, software development on a departmental scale, or choosing the right tools from an extensive list to fit the institution’s goals for preservation.

Technological obsolescence, dependency on outdated and dying hardware, and physical media deterioration are all long-term struggles that don’t apply to paper materials. Erin Faulder stated in her presentation that “digital records do not survive by accident.” These records require data integrity and authenticity checks, as well as proper storage to house multiple copies of each record (three copies seems to be the agreed-upon rule of thumb). Depending on an institution’s needs and budget, the available storage types (e.g. network or cloud), their associated costs, and the types of files slated for storage are all key considerations. For instance, audiovisual materials require more space than MS Word-type documents, so determining the file formats of sought-after records might be a crucial facet of the appraisal process to narrow down storage options, depending on the institution’s collecting policy.

Although finding efficient long-term space for physical materials in our vaults is always a challenge, Erin Faulder’s remarks on digital storage sparked a deeper understanding of the factors that repositories must consider when choosing storage plans. In an attempt to visualize what one digital accession might look like in the vault, Faulder estimated that 50 gigabytes of general office files (i.e. spreadsheets, images, PowerPoints, Word files, audio, and video) roughly translates to 75-170 record carton boxes, or about 125 boxes on average. This is a loose calculation based on one accession Faulder worked with and is in no way representative of every situation (which almost always varies). However, it is still fascinating to imagine all of those boxes on one flash drive despite not seeing the records physically on the shelves. Electronic records are not “out of sight, out of mind” and require a solid storage and preservation plan to ensure stability and accessibility.

Katie Martin, Assistant Archivist, Processing

I approached this workshop from a digital processing perspective because I am involved in an RAC project to rescue records from aging legacy media, like optical and floppy disks. The ultimate goal is to train processing staff to inventory, create a disk image (a preservation copy of the content and file structure), and run a virus scan on any digital media they come across in the course of their work. I thought there were many interesting points made in the SAA course that are relevant to our developing workflow. The importance of documenting all aspects of transferring, accessioning, and appraising digital media was repeated throughout the course. Any time an action is taken on a digital record, documentation is required. For the digital media project, we utilize a digital media log to record the unique digital media number, format, transfer status, date of transfer, and collection information. Another great point made in the course is to ensure justifiable technical appraisal, which means weighing the ease of capturing data against the time required to capture the intended data. Although there is no real equivalent to this kind of appraisal in the paper world, sometimes it is simply not worth spending hours trying to get a perfect disk image from uncooperative legacy media. Another major theme throughout the workshop was to just take the plunge and start acting with regard to digital records. Ignoring legacy media and born-digital records will not make them go away. As a call to action, the instructor reinforced the obsolescence and fragility of digital media time and time again. In the instructor’s own experience, only 80% of disk image attempts are successful and optical disks can become unreadable within five years. In the course of the digital media project, we have experienced days where all materials seem to be unreadable. While that can be frustrating, it is all part of working toward preserving all that we can, while we can.

The course reinforced that there are currently no perfect solutions to the accession and ingest of digital materials, but all archivists are in this together. The RAC, like all institutions, can only strive to maintain best practices while actively working to save fragile digital records.
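
As a rough illustration of the digital media log described above, the sketch below models one log entry and writes entries out as CSV. It is only a sketch under stated assumptions: the field names, the status values, and the sample entries are hypothetical and are not taken from the RAC's actual log.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields
from datetime import date
from typing import Iterable, Optional, TextIO


@dataclass
class MediaLogEntry:
    """One row of a digital media log (field names are hypothetical)."""
    media_number: str               # unique digital media number
    media_format: str               # e.g. floppy disk, CD-R
    transfer_status: str            # e.g. "imaged", "unreadable"
    transfer_date: Optional[date]   # None if no transfer has succeeded yet
    collection: str                 # collection the media belongs to


def write_log(entries: Iterable[MediaLogEntry], stream: TextIO) -> None:
    """Write log entries to a text stream as CSV with a header row."""
    writer = csv.DictWriter(
        stream, fieldnames=[f.name for f in fields(MediaLogEntry)]
    )
    writer.writeheader()
    for entry in entries:
        row = asdict(entry)
        # Store dates in ISO 8601 form; leave the cell empty when unknown.
        row["transfer_date"] = (
            entry.transfer_date.isoformat() if entry.transfer_date else ""
        )
        writer.writerow(row)


if __name__ == "__main__":
    # Invented sample entries, for illustration only.
    sample = [
        MediaLogEntry("DM-0001", "3.5-inch floppy disk", "imaged",
                      date(2018, 3, 1), "Example Collection"),
        MediaLogEntry("DM-0002", "CD-R", "unreadable",
                      None, "Example Collection"),
    ]
    buffer = io.StringIO()
    write_log(sample, buffer)
    print(buffer.getvalue())
```

Keeping the log in a structured form like this means every action taken on a piece of media leaves a dated, machine-readable trace, which is the documentation habit the course stressed.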

Erich Chang, Archival Assistant, Collections Management

This was my first time taking a DAS course, which introduced me to a plethora of different professionals and their experience with digital records. This course helped me bring new perspectives to creating an accessioning workflow for born-digital material. Erin Faulder talked about the transfer process, which made me think about different approaches and methods to consider, such as requiring more than one workflow if the transfer media is distinct. Accessioning is the point where you must determine the level of control you want. Another great point was the importance of creating workflows that are scalable, so that material can be handled regardless of its size or format. Workflows are also important because they help streamline your tasks and identify and remove unnecessary steps and processes. I’m also currently taking a project management course, and it reinforces the importance of workflows. Since I took on managing accessioning at the RAC, learning how to accession electronic records has been difficult, but after taking this course I was able to build a solid foundation for understanding electronic records.

Darren Young, Assistant Archivist, Processing

When preparing for the workshop, I hoped to learn what kind of information would be collected and produced during appraisal, accessioning, and ingest of digital records that would influence my work as a processing archivist. More importantly, I wanted to find out how this information would be obtained, generated, and managed. I imagined that this sort of data would mostly pertain to file formats, various kinds of dates regarding file creation or modification, and technical information related to systems and software that helped produce or provide access to content. By describing how each of these pieces of information helps preserve and verify the authenticity and integrity of a digital object as well as by discussing the tools that could be used to gather and protect them, the workshop more than satisfied my original expectations. Presenter Erin Faulder used the term significant properties to refer to these different information components, and she explained that they are responsible for making an electronic record accessible and meaningful over time. The workshop also discussed types of information related to accessioning and ingest of electronic records that I had not previously considered. For example, I learned how archivists can use checksums to determine if a digital object has changed after a period of time has passed or after a specific action has been applied.
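
To make the checksum idea concrete, here is a minimal sketch of a fixity check. It assumes SHA-256 as the algorithm (the workshop summary above does not name one), and the function names and the sample file are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read fixed-size chunks so large files never sit fully in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_unchanged(path: Path, recorded_checksum: str) -> bool:
    """Compare a file's current checksum with one recorded earlier."""
    return sha256_of(path) == recorded_checksum


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        record = Path(workdir) / "report.txt"
        record.write_bytes(b"annual report, 1998\n")

        # Record the checksum at accession time.
        recorded = sha256_of(record)
        print(is_unchanged(record, recorded))  # True

        # Simulate an unintended change to the file.
        record.write_bytes(b"annual report, 1999\n")
        print(is_unchanged(record, recorded))  # False
```

The checksum recorded at accession acts as a fingerprint: if a later check produces a different value, the file's bytes have changed since it was recorded.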

Nevertheless, for all that I learned about the kinds of data archivists must receive, collect, generate, store, and transfer during appraisal, accessioning, and ingest of digital records, I believe that my most valuable takeaway from the workshop concerns the relationships between these different properties and the potential opportunities for using those connections to more effectively manage and utilize archival data. When looking at accessioning information, I often think of it merely in terms of the document that contains it. It is an inventory on an Excel spreadsheet, an accession record in ArchivesSpace, or a PDF donor agreement. Yes, each document has data that I copy and export, edit, and enhance when I create archival description during processing, but I still struggle to think of the data as independent of the document because I am comfortable reading and searching each individual resource for the particular information I need. Listening to Faulder discuss the ways archivists can receive electronic records and submission information from information producers, I was reminded how data can potentially be more useful if it is not only human-readable but can also be understood, validated, and processed by a machine. For instance, Faulder described how archives can establish submission workflows so that they receive electronic record content and transfer information in the form of METS files. Some of the key benefits of the METS metadata standard arise from its flexibility and extensibility. METS files can wrap content and the metadata describing the content’s significant properties together, or they can point to the file locations where digital records may be stored. They can be expanded to include other metadata standards that are focused on describing a certain type of information, such as PREMIS for preservation data, and they can make the information packages they encode interoperable with preservation tools and systems.
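
The sketch below shows, in a very reduced form, what the "pointing" style of METS file mentioned above might look like when generated by a script. It is an illustration only, not Faulder's workflow or a complete METS profile: the function name, file identifiers, locations, and checksum value are hypothetical, and the output has not been validated against the METS schema.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"

# Use conventional prefixes when the tree is serialized.
ET.register_namespace("mets", METS_NS)
ET.register_namespace("xlink", XLINK_NS)


def mets_tag(name: str) -> str:
    """Qualify an element name with the METS namespace."""
    return f"{{{METS_NS}}}{name}"


def build_mets(files: list) -> ET.Element:
    """Build a minimal METS tree that points to files instead of wrapping them.

    Each item in `files` is a dict with the hypothetical keys
    "id", "href", "mimetype", and "checksum".
    """
    root = ET.Element(mets_tag("mets"))
    file_sec = ET.SubElement(root, mets_tag("fileSec"))
    file_grp = ET.SubElement(file_sec, mets_tag("fileGrp"), {"USE": "original"})
    struct_map = ET.SubElement(root, mets_tag("structMap"), {"TYPE": "physical"})
    top_div = ET.SubElement(struct_map, mets_tag("div"), {"LABEL": "transfer"})

    for item in files:
        file_el = ET.SubElement(file_grp, mets_tag("file"), {
            "ID": item["id"],
            "MIMETYPE": item["mimetype"],
            "CHECKSUM": item["checksum"],
            "CHECKSUMTYPE": "SHA-256",
        })
        # FLocat records where the file lives rather than embedding it.
        ET.SubElement(file_el, mets_tag("FLocat"), {
            "LOCTYPE": "URL",
            f"{{{XLINK_NS}}}href": item["href"],
        })
        # The structural map refers back to the file by its ID.
        ET.SubElement(top_div, mets_tag("fptr"), {"FILEID": item["id"]})
    return root


if __name__ == "__main__":
    # Invented sample data, for illustration only.
    tree = build_mets([{
        "id": "file-001",
        "href": "objects/minutes_1998.pdf",
        "mimetype": "application/pdf",
        "checksum": "0" * 64,  # placeholder, not a real digest
    }])
    print(ET.tostring(tree, encoding="unicode"))
```

Because the result is structured XML rather than a spreadsheet or PDF, the same file list, locations, and checksums can be read and checked by software as well as by a person.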

Before participating in the workshop, I think I had a good grasp of the benefits of utilizing a data-focused perspective when creating and sharing archival information that affects the access and discovery of content. However, I now believe that I can more meaningfully apply that perspective to how archivists themselves manage and use information collected and created during the pre-processing steps of appraisal, accessioning, and ingest.