Hatching ArchivesSnake

In the four years since we migrated to ArchivesSpace, the RAC has found many uses for the ArchivesSpace API. It has transformed our work in countless ways, helped us enable organizational change, and fostered greater comfort with technology. However, with every tool we worked on and every script we wrote, we found ourselves writing the same portions of code over and over again. It became clear that we needed a simpler way to do the routine and repetitive API work.

If we were running into this issue, we assumed that other institutions must be as well. While we hadn’t performed a formal survey, we knew that others working with ArchivesSpace were struggling with the same sorts of data issues that we were, and would probably welcome a solution to repetitive work. To that end, in mid-2017, I asked the ArchivesSpace community whether there was any interest in a Python library that would help those looking to work with the API. We specified Python because we generally worked in Python 2.7, and we knew that a lot of library and archives coders worked in Python too.

After the initial query, there was general support for the idea, but we didn’t have anyone to handle the deep technical side until the irreplaceable Dave Mayo volunteered to help with code development. Thus, ArchivesSnake was born. From the outset we imagined ArchivesSnake as a comprehensive client library that would reduce duplication of effort in coding and simplify scripting against the ArchivesSpace backend API. Without Dave, none of this would be possible. I can’t speak too much about Dave’s process of writing the library, since a lot of what he did goes a bit over my head, but I can talk about the process of helping with a community-driven open source project from the start. Before we could start writing code, we had some administrative work to undertake.

I created the ArchivesSnake repository on August 8, 2017, and our first step was to create an open channel in a larger Slack group for archivists so that we’d have an easy and open way to communicate about the upcoming work. We had representatives from about twenty different institutions interested in helping with the project, so it was necessary to provide a space where we could get together and talk about design decisions and issues. We then immediately asked ArchivesSpace community members for examples of the Python scripts they used with the backend API for bulk tasks. We kept the call for examples open for a couple of months, and got scripts from the University of Denver, Duke University, Harvard and Smith, Johns Hopkins University, Penn State University, the University at Albany, and Yale University.

With these examples in hand, we could get a better picture of the different ways community members interacted with the backend API, and also of what they were interested in doing with it. The examples gave us a vital foundation for creating the building blocks of the library. Additionally, while gathering the script examples, we noticed that many institutions were switching their code from Python 2 to Python 3. While the RAC was still primarily working in Python 2, we could see the writing on the wall: adopting Python 3 would be the smart, future-proof decision for ArchivesSnake. Dave and I spoke with the rest of the ArchivesSnake volunteers, who overwhelmingly agreed that a Python 3 library made the most sense moving forward. We were finally ready to start writing code.

Two people have really pushed ArchivesSnake development forward: Dave Mayo and Greg Wiedeman. Dave and Greg did most of the code creation, while I tried to act as a de facto community manager when I could. I’d also try to review some pull requests, add comments about certain features, and test changes whenever possible. Check the commits: it’s almost all Dave and Greg, and they really did the heavy lifting.

What we ended up with (not really, since we’re still developing and improving it) is a fully featured Python library that lets you interact with ArchivesSpace through both a low-level API, which lets users easily fetch JSON and save it to a variable, and a higher-level abstraction layer, which lets you ignore some of the lower-level details of the ArchivesSpace API. It’s a one-stop shop for all of your ArchivesSpace API scripting needs, and it even comes packaged with improved logging and configuration settings to help you get started.
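To make those two layers a bit more concrete, here is a minimal sketch of the kind of boilerplate a low-level client takes off your hands (logging in, carrying the session token, decoding JSON), with a tiny higher-level wrapper on top. This is not ArchivesSnake's actual code or API: TinyClient, TinyRepositories, and http_transport are hypothetical names invented for illustration, and the base URL is whatever your own backend uses. The only ArchivesSpace details it leans on are the backend login route and the X-ArchivesSpace-Session header.

```python
import json
from urllib import parse, request


def http_transport(method, url, headers, data=None):
    """Send one HTTP request with the standard library; return the body as text."""
    req = request.Request(url, data=data, headers=headers, method=method)
    with request.urlopen(req) as resp:
        return resp.read().decode("utf-8")


class TinyClient:
    """Hypothetical low-level client: logs in once, then returns parsed JSON."""

    def __init__(self, baseurl, username, password, transport=http_transport):
        self.baseurl = baseurl.rstrip("/")
        self.username = username
        self.password = password
        self.transport = transport  # swappable, so scripts can be tested offline
        self.session = None

    def authorize(self):
        # The backend login route answers with JSON that includes a "session" token.
        url = f"{self.baseurl}/users/{parse.quote(self.username)}/login"
        body = parse.urlencode({"password": self.password}).encode("utf-8")
        reply = self.transport("POST", url, {}, body)
        self.session = json.loads(reply)["session"]

    def get(self, path):
        # Log in lazily, then send the token along with every request.
        if self.session is None:
            self.authorize()
        headers = {"X-ArchivesSpace-Session": self.session}
        url = f"{self.baseurl}/{path.lstrip('/')}"
        return json.loads(self.transport("GET", url, headers))


class TinyRepositories:
    """Hypothetical higher-level layer: iterate over repositories, no URLs needed."""

    def __init__(self, client):
        self.client = client

    def __iter__(self):
        return iter(self.client.get("/repositories"))
```

With something like this in place, a script shrinks to a loop such as "for repo in TinyRepositories(client): print(repo['name'])". The real library goes much further than this sketch, so its own documentation is the place to look for the actual classes and methods.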

The coding work that Dave and Greg have done is fantastic, but I can’t commend them and Scott Carlson enough for the work they did in creating clear and comprehensive documentation. We knew that we wanted it to be easy for librarians and archivists new to coding to pick up and use ArchivesSnake, so proper documentation was vital to the success of the project. One of the ArchivesSnake team’s biggest complaints about other open source projects was their lack of sufficient documentation, which made it really difficult to understand how the system worked, how to use it, or even how to install it. ArchivesSnake has fully detailed API docs, installation instructions, a usage guide, and a GitHub wiki that’s packed full of information, including a guide to getting started. Dave also used the wiki to create a technical planning page, which provides a roadmap for future development. This emphasis on good documentation makes the project far more approachable for those who want to get involved or start using it; I helped create the project and I still find myself referring to the wiki pages for examples of how to do what I want with a new script.

I’ve learned a lot throughout this process. This post talks a lot about the beginning of ArchivesSnake and how great it is, but it’s also a little bit about my first experience being part of a larger, community-driven open source project from the start. Working on ArchivesSnake has taught me how difficult it is to help shepherd a project from conception to a relative release (ArchivesSnake is still being developed, but it is also working and fully functional). This project would never have seen the light of day without the support of the dedicated colleagues who helped throughout the process. ArchivesSnake really drove home how difficult it is to work on an open source project, even one as small as this one. This has been a learning experience for me, and I really wanted to take the opportunity to shout out all of the hard work of the ArchivesSnake team.

I’ve already seen the benefits of the ArchivesSnake library in my work helping a processing archivist, Katie Martin, learn to code with Python 3. Apart from some technical issues in getting Python installed on Windows, ArchivesSnake has made it much easier for beginners like Katie to get their hands dirty and dive right in. The documentation, along with the simplified code, has proved an invaluable teaching asset. Building tools like ArchivesSnake helps us empower users to feel in control of the systems they work with. Work continues on ArchivesSnake, but I’ve already seen its power firsthand. I’m looking forward to helping out however I can. Thank you to everyone who has helped out, and I hope you stop in and take a look at the project. We’re always looking for more people to help us out.

#APIs #ArchivesSpace #open source #ArchivesSnake #code