Sustainability Measures with the Archives Unleashed Team

Samantha Fritz
Published in Archives Unleashed
May 31, 2018

By: Samantha Fritz and Ian Milligan

Among OSS (Open Source Software) and DH (Digital Humanities) communities there is a recognized need for sustainability efforts. For many scholars and researchers, sustainability models have become a standard element on grant applications, but more importantly, they are an essential process to ensure a project’s survival and continued efforts once the grant cycle has ended.

Sustainability can give off an “elephant in the room” vibe — everyone knows it’s there, but it can be an intimidating situation to tackle. The question most projects face is, “how do we plan for and ensure a sustainable future?” Early on it’s important to recognize that this question is multifaceted and an ever-evolving process.

So if everyone is in the same (sustainability) boat, surely we can look at various success cases and create a “how to” recipe book? Yes and no. When looking at the breadth of sustainability case studies, materials, resources, and tools, it becomes apparent that there is no one-size-fits-all model. This is primarily because every project has different needs, which means the definition of sustainability and the resulting roadmap are unique to each community.

If by now you’re a little overwhelmed, just know there is hope! Sustainability planning is a core objective for the Archives Unleashed Project, and our team, currently in its first of three years of funding, has begun to dig into sustainability research and planning. Over the past few weeks, we have reviewed grey literature sources and reports, connected with other OSS/DH projects, and conducted a preliminary analysis of Archives Unleashed Toolkit (AUT) technical processes. A primary objective of our research is to outline the available sustainability models so that we can make an informed decision. We also look forward to documenting our processes so we can share our experience and resources with the broader community.

Lessons About Sustainability (what the literature says)

We have predominantly looked at sustainability efforts within the context of digital humanities projects and open source tools. One of the most valuable resources to inform our sustainability efforts is “It Takes a Village: Open Source Software Sustainability.” This report is the closest thing to a “how to” guide and provides some great case studies.

In a nutshell, these are the seven most significant takeaways for our sustainability planning:

  1. Sustainability, a word that makes you acutely aware of a million unanswered questions;
  2. There is no one-size-fits-all model or step-by-step process because every project is unique, and so is its sustainability model;
  3. Sustainability must be defined, so there can be direction when creating a framework to include objectives, outputs, and tasks;
  4. Teams should plan with four sectors in mind: governance, technology, resources, and community engagement;
  5. Planning is essential to implementation and execution;
  6. There are two sides of sustainability: success and struggle;
  7. “Sustainability is not a static process, but always changing and evolving” (Gemmill Arp, L., et al. (2018). It Takes a Village: Open Source Software Sustainability. Lyrasis)

Archives Unleashed Sustainability Efforts

The AU team has been working on a number of different components that tie into sustainability planning. Time for a tour!

1. Defining Sustainability

Through research and discussions with a variety of groups and individuals, we recognize that to create a roadmap for sustainability, our team needs to answer “what does sustainability mean to Archives Unleashed?” In teasing out a definition, we have explored the question as a team, reflected on community discussions, and referred back to our grant for guidance.

Ultimately, sustainability means the survival and continuation of the Archives Unleashed Toolkit once our formal funding from the Andrew W. Mellon Foundation ends in 2020. More specifically, the AU team is working towards building a sustainability model to ensure the Archives Unleashed project will:

  • Remain an open-source project,
  • Persist as a toolkit and service for web archive analysis, and
  • Provide support for structural maintenance and advancement.

2. Performance Logs Analysis and Cost Analysis

In our grant application, the AU Team discussed specific activities to help ensure the sustainability of AUT beyond our grant funding. These activities include building and developing a diverse user-base and community through local datathons and developing a cost-recovery model for ongoing operations.

Our Mellon funding and in-kind contributions cover the initial infrastructure, development, and operation of both the Archives Unleashed Toolkit and Cloud. As such, we are in the preliminary stages of investigating the financial contribution needed from users to cover the real costs of storing and analyzing WARCs after the funding cycle ends.

Two sets of preliminary analysis were completed to inform our cost-recovery model:

  • We took a deep dive into AUT performance logs to understand the time intervals for generating derivatives (GEXF, text, etc.) and overall processing time.
  • We estimated the cost of working with WARCs via Amazon Web Services (AWS) using comparable computing resources. (Watch for a separate blog post that uncovers the cost of a WARC.)

Our performance logs consist of 60 collections from six Canadian universities, ranging from 41MB to 43TB in size.

Conclusions from our analysis:

  • We saw an expected correlation between collection size and processing duration. The larger the collection, the longer it took to process and provide the standard set of derivatives.
  • Outliers from this pattern were generally due to systems interruptions.
  • Preliminary cost estimates revealed that storage tends to cost more than processing, simply because you tend to hold on to data for longer than it takes to analyze it. For instance, you may store data for 20 days, but only spend 15 hours analyzing it.
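
To make the storage-versus-processing comparison concrete, here is a minimal Python sketch using the 20-day and 15-hour figures from the example above. The rates, the collection size, and all names in the code are hypothetical placeholders chosen for illustration; they are not AWS prices and not figures from our analysis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CostAssumptions:
    """Hypothetical rates for illustration only (not real AWS pricing)."""
    storage_per_gb_month: float  # assumed USD per GB per 30-day month
    compute_per_hour: float      # assumed USD per hour of processing


def storage_cost(size_gb: float, days: float, rates: CostAssumptions) -> float:
    """Cost of holding size_gb of data for the given number of days."""
    return size_gb * rates.storage_per_gb_month * (days / 30)


def processing_cost(hours: float, rates: CostAssumptions) -> float:
    """Cost of running analysis for the given number of hours."""
    return hours * rates.compute_per_hour


# Assumed values: a 1,000 GB collection and made-up rates.
rates = CostAssumptions(storage_per_gb_month=0.02, compute_per_hour=0.50)
size_gb = 1000

storage = storage_cost(size_gb, days=20, rates=rates)   # stored for 20 days
processing = processing_cost(hours=15, rates=rates)     # analyzed for 15 hours

print(f"Storage:    ${storage:.2f}")
print(f"Processing: ${processing:.2f}")
```

Under these assumed numbers storage comes out ahead of processing, but the balance depends entirely on the rates and collection size plugged in, so treat this as a template for the comparison rather than a result.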

3. Incorporation

Incorporation is one of the steps necessary to facilitate continued operations, support fiscal responsibility, and further regulate organizational structures and decision-making processes. While there are a variety of options available for Canadian incorporation, our team is currently researching the process of incorporating as a Canadian soliciting not-for-profit. Presently, we are comparing incorporation options to ensure we make an informed decision that best fits the needs of our community and provides the infrastructure necessary for the Archives Unleashed Toolkit to grow. As transparency is important to our team and project, we will continue to inform the community of our efforts and include their input in our decisions.

4. Connection and Collaboration

We are fortunate to be part of a network that is open to collaboration and sharing resources and ideas. The AU team has had a chance to connect with individuals and organizations that have experience with, or are in the middle of, sustainability planning. Discussions with these groups have been invaluable, as they’ve brought a unique perspective to our own sustainability brainstorm, helped us to understand trends among OSS projects, and demonstrated potential growth and collaborative opportunities.

One of the biggest lessons learned so far has been that sustainability planning is not an A-Z process. There are many components that we will work through independently, but that will invariably overlap. For the foodie readers out there, it’s like being in the MasterChef kitchen, with multiple pots cooking, all of which contribute to one final dish.

Challenges

Even at such an early stage, our team has come across some challenges, but hopefully nothing a can-do attitude and a gallon of coffee can’t help us overcome. Here are some of the struggles we’ve come across so far:

  • There is so much information! I would never have thought that having information at the touch of my fingertips would be a challenge. Let’s just say that when doing sustainability research we may have gone down a few late night rabbit holes. The key here has been to focus our attention and searches on sustainability within an OSS and DH context, as well as direct tasks that contribute to our sustainability plans, like incorporation.
  • Don’t reinvent the wheel! My favourite part when starting a new project is the initial brainstorm! Getting out the Post-It notes and markers, and taking over a room filled with floor-to-ceiling whiteboards. Our initial sustainability brainstorm consisted of figuring out the unique characteristics of the Archives Unleashed project, and assessing what communities we share similarities with. This exercise illustrated that although every community is different, there are similar elements in all sustainability models that we can draw from. Instead of trying to reinvent the wheel, we have assessed the information currently available to us and identified ourselves within multiple communities.
  • A happy medium. When it comes to establishing a cost-recovery model, our biggest challenge has been estimating real costs on the data processing side while ensuring that our financial model supports future operations. Our goal is to strike a fair medium in our cost-recovery model, to ensure we uphold our belief in accessibility and usability in a transparent and sustainable manner. These developments are still in the early stages and we are currently conducting tests and experiments to make sure we hit that happy medium for our users. Discussions with our advisory board and community, along with examining trends, have been invaluable in shaping our direction.
  • Chicken and egg syndrome. As was mentioned before, sustainability is not a streamlined process. Throughout the initial stages of research and planning, we found that multiple components correlated with and informed one another. As the project manager, I had an opportunity to be the first one to take a deep dive into the material. The biggest challenge I faced was what I like to refer to as the chicken and egg syndrome: the sense of not knowing where to start because nothing seemed like a true beginning.

So after doing what felt like a 150-foot free fall into the ocean, there were four methods that helped initiate AU sustainability planning:

  1. Brainstorms! You may have noticed my enthusiasm for them! This was an effective method because it allowed our team to see the bigger picture, while at the same time outlining tasks and questions we could slowly chip away at by building a sustainability resource portfolio.
  2. Feedback from the AU team has been instrumental in being able to juggle multiple areas at once. Each team member brings a unique perspective and experiences to the project, which in turn has helped to answer questions in areas I was unfamiliar with. Having others as sounding boards really brings out a team’s collaborative spirit.
  3. Dive In! Some people might refer to this as ripping off the Band-Aid, but ultimately the best way to start is by diving straight into the material. This meant a lot of searching, reading, and summarizing, but also reaching out to individuals with specific expertise.
  4. 80s/90s music. Okay, so this isn’t an essential ingredient for sustainability planning, but these jams helped me to get into the zone. And let’s be honest, everyone has a quirk or two that helps them get into the right headspace; mine just happens to be lyrics from Counting Crows, Goo Goo Dolls, Def Leppard, and Bon Jovi.

The Way Forward

So what can you expect from the Archives Unleashed team as we continue to develop our plans for sustainability?

  • Consulting with our Advisory Board and our community for input and advice
  • Outlining the processes and responsibilities of incorporating as a soliciting not-for-profit organization in the province of Ontario
  • Maintaining transparency as we develop our sustainability and cost-recovery model
  • Continuing to connect with and build our network to inform our sustainability efforts
  • Sharing blog posts, updates, and resources related to AU sustainability measures

Get Involved & Stay in Touch

If you are interested in learning more about our sustainability efforts or want to get involved with the Archives Unleashed Project and Toolkit:
