CurateGear 2013

A couple weeks ago I attended CurateGear, a day-long symposium on digital curation tools held by the University of North Carolina at Chapel Hill. The day was packed full of presentations and break-out sessions.

One presentation of interest was from a developer (Mark Evans) at Tessella, touting their new-ish product, Preservica. Tessella came to my attention last SAA as the sponsor listed on everyone’s conference badges (advertising: it works!). I call Preservica a new-ish product because it’s based on a previous offering, something Tessella calls the Safety Deposit Box, the difference being that Preservica stores AIPs in the cloud rather than on a local hardware setup. Overall, Preservica seemed pretty easy to use, reasonably priced, and easy to migrate to and from. Although we are content with our relationship with Artefactual and Archivematica, we may contact Tessella to explore Preservica. It’s vital to keep on top of this field, as changes in preservation systems and their functionality are to be expected. A few other items from my notes on the product are listed below:

No fixed metadata schema: you can bring in your own standards (like PREMIS or METS) or use a custom one. Metadata is indexed and searchable. File content is also searchable.
Reporting features are available
Web harvesting and normalization are available
There is a built-in ability to connect to SharePoint installations and pull content from them
Integrity checking can be scheduled
Similar to Archivematica, ingests run through a series of microservices, with decision points at each step
Role-based access control
Can attach ingest to a collection - manage by collection rather than SIP
Can attach security tags to define whether materials are open or closed, etc.
AIPs are constructed logically: content and metadata are physically separate. Metadata may change even though the object should not. This also helps performance (searching, etc.)
Unlike Archivematica, previews of ingested materials are viewable from within the system. Browsing through materials and metadata hierarchies (based on collection descriptions) is much easier.
Preservation planning, an essential function within OAIS, is supported through risk-scoring and identification of format types. Migration strategies can be built out based on this info.
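
For anyone wondering what scheduled integrity checking amounts to in practice, here is a minimal sketch. It is my own illustration, not how Preservica or Archivematica implement it. It assumes a hypothetical manifest that maps relative file paths to SHA-256 checksums recorded at ingest; the function names `sha256_of` and `check_fixity` are made up for this example.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_fixity(manifest: dict[str, str], root: Path) -> dict[str, str]:
    """Compare checksums stored at ingest against the files on disk.

    `manifest` is a hypothetical mapping of relative path -> expected
    SHA-256 digest. Returns relative path -> "ok", "changed", or "missing".
    """
    results = {}
    for relative_path, expected in manifest.items():
        target = root / relative_path
        if not target.is_file():
            results[relative_path] = "missing"
        elif sha256_of(target) == expected:
            results[relative_path] = "ok"
        else:
            results[relative_path] = "changed"
    return results
```

A scheduler (cron, for instance) could run something like this on a regular basis and flag anything that does not come back "ok".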

Another presentation that caught my interest was by David Pearson from the National Library of Australia on Preservation Intent Statements. He describes these in more detail in a recent D-Lib article. He talked about the need to break from a systems view, in which digital materials are seen as a collection of formats, and instead talk with curators about the characteristics of different types of documents and which of those characteristics are essential to preserve.

There were a couple other presentations of note. Leslie Johnston from the Library of Congress discussed a few in-house tools they use for tracking the digitization-to-preservation workflow, including one they use to QC digitized files. These systems could potentially be extremely useful here; however, they haven’t been released as open source software yet. Kam Woods showed off some fancy new BitCurator features. I’ll be discussing this program more in a future post. Last but definitely not least was Doug Reside and Don Mennerich’s review of what they’ve been working on at NYPL. Don detailed some fascinating forensics work he did on pulling WordStar files off floppies formatted by a CP/M machine. He was even able to retrieve data from discs that had previously been listed as corrupted but were thankfully not discarded.

All in all, a great day, chock-full of great information that really represented the state of the field. If there’s a 2014 CurateGear, I will be pretty tempted to attend.

#conferences #CurateGear

Digital Preservation