Reflecting on Collaborations and Community at Archives Unleashed Datathons

Samantha Fritz
Published in Archives Unleashed
Jul 30, 2020

Washington Datathon, George Washington University — Photo by Samantha Fritz

The word “datathon” combines “data” and “marathon”. Datathons are generally short, intensive events where participants work with datasets to answer questions and investigate problems.

Our team adapted the hackathon/datathon model to give individuals an opportunity to explore web archives through hands-on experience with Archives Unleashed tools and external platforms that exist within the web archiving ecosystem. Before the Archives Unleashed project, some of our investigators worked with other colleagues to run a series of earlier “Archives Unleashed” datathons from March 2016 to June 2017; we carried the lessons learned from those events into our current Andrew W. Mellon Foundation-funded project.

Datathons allowed our team to build community and foster a sense of belonging and collaboration among web archiving enthusiasts. Who are these enthusiasts? They are tool builders, digital content access providers, and researchers from a variety of disciplines.

As the project nears its final weeks, we want to reflect on the moments and individuals that contributed to the four successful datathons held as part of the Archives Unleashed Project.

About Archives Unleashed Datathons

Over the past three years, the Archives Unleashed team has partnered with academic research institutions across Canada and the United States to host four datathons, which support the scholarly exploration of web archives at scale. These were:

  • (Apr 2018) Toronto — co-hosted with the University of Toronto Libraries and Nich Worby at the Robarts Library
  • (Nov 2018) Vancouver — many thanks to Simon Fraser University and Key (SFU’s Big Data initiative) for their support, and our co-organizer Rebecca Dowson
  • (Mar 2019) Washington — we were pleased to co-host this event with colleagues Laura Wrubel, Dan Kerchner, Rachel Trent, and Robin Delaloye at George Washington University’s Butler Library
  • (Mar 2020) New York — special thanks to our co-organizers Pamela Graham and Alexander Thurman from Columbia University and Samantha Abrams from the Ivy Plus Libraries Confederation; our final datathon was held online due to the COVID pandemic

Overview of Archives Unleashed Datathons

The datathons opened with welcoming remarks from the organizing team, followed by short introductory presentations on the importance of web archiving and the challenges faced by researchers, and an overview of the Archives Unleashed Toolkit and Cloud. Participants were given a preview of the datasets they would have access to. Access was a combined effort between the hosting institutions, which shared their web archive collections, and our project Co-PI Nick Ruest, who processed the collections into ready-to-use derivatives. This has been one of the greatest lessons learned by our team: access to pre-computed derivatives is critical to time efficiency and team success. Our recent article explores this theme and is available as a preprint; the paper will be presented at the 2020 JCDL Conference.

You can explore and use over twenty dataset derivatives from our New York datathon, available through our Web Archives for Historical Research Groups on Zenodo and Dataverse. For anyone who would like to use these derivatives in their research, a citable DOI is available for each collection. For example, check out the Global Webcomics Web Archive collection.

How are teams formed? Through a sticky note exercise, which is a favourite activity among organizers and participants! The exercise draws on methods from the field of participatory design.

(Group formation activity using sticky notes at Washington Datathon, George Washington University | Photo by Samantha Fritz)

Most of the time at each datathon was dedicated to teamwork and to exploring web archives through different tools and visualization platforms. During the event wrap-up, all teams came together to showcase their results; these presentations were a great way to share strategies and approaches with peers.

For those wanting an inside look at the full datathon experience, please check out the trip reports written by some of our participants.

Datathon Accomplishments

The Archives Unleashed Datathons have brought together a rich, diverse, and interdisciplinary community of colleagues. We are so thankful to all of the institutions that supported the running of these events, and to the participants for bringing their curiosity and enthusiasm for web archiving!

These datathons have been a significant component of community development, and March 2020 marked a major milestone as we hosted our final event. Let’s take a look at the datathon summary.

Over three years, we’ve been able to host 73 participants, and with support from the Andrew W. Mellon Foundation, our project provided 36 travel grants (usually C$1,000 to C$1,300 per attendee) to help support attendance and participation.

Our attendees brought interdisciplinary perspectives and experiences from across libraries, archives, digital humanities, social sciences, computer science, public libraries, and international and non-profit organizations.

The eighteen collaborative projects tackled technical challenges with creative ingenuity and highlighted various methods, datasets, and visualization techniques. Congratulations to all of our teams! You can check out their presentation slides by following the links below to our event pages.

Toronto Datathon

  • Team BC Teachers’ Labor Dispute
  • Team Make Tweets Great Again
  • Team Pipeline
  • Team Spamlinks

Vancouver Datathon

  • Team IDG
  • Team BC 2017 Politics
  • Team Wildfyre
  • Team Seeds of Anarchy

Washington, DC Datathon

  • Team #metoo Group 1
  • Team #metoo Group 2
  • Team DC Punk Media Unleashed
  • Team Kompromat
  • Team Psyarchives
  • Team Punkavists

New York Datathon

  • Team Latin American and Caribbean Contemporary Art Web Archive
  • Team Contemporary Composers Web Archives
  • Team Global Web Comics Web Archives
  • Team Stonewall

Datathon participants made these events a truly international experience, as there was representation from 52 unique institutions and 7 countries, with our furthest traveled participant coming from New Zealand.

The overarching goal of running regional datathons was to build community. Our team feels grateful to have witnessed community formation on a number of levels: among individuals working together in teams at the events, with academic institutions sharing and highlighting their web archive collections, and through continued collaboration and discussion beyond the events.

Many Thanks

We would sincerely like to thank all of our participants for bringing their curiosity and collaborative spirit to the datathons. Our project would also like to extend our gratitude to our colleagues and their institutions for hosting our events, without whom we wouldn’t have been able to offer an opportunity to gather in such inspiring locations. We would also like to say thank you to the Andrew W. Mellon Foundation, whose support facilitated the running of datathons and the attendance of scholars from across the globe. Additional thanks are due to Compute Canada, the University of Waterloo, and York University.
