New data paper and datasets from crowdsourcing on Living with Machines

After lots of hard work by me, Nilo Pedrazzini, Miguel V., Arianna Ciula and Barbara McGillivray, we have a data paper in the Journal of Open Humanities Data: Language of Mechanisation Crowdsourcing Datasets from the Living with Machines Project.

And huge thanks to the thousands of Zooniverse volunteers who annotated 19th-century newspaper articles to create the datasets we've published alongside the data paper!

Abstract: We present the ‘Language of Mechanisation’ datasets with examples of re-use in visualisations and analysis. These reusable CSV files, published on the British Library’s Research Repository, contain automatically-transcribed text from 19th century British newspaper articles. Volunteers on the Zooniverse crowdsourcing platform took part in tasks that asked ‘How did the word x change over time and place?’ They annotated articles with pre-selected meanings (senses) for the words coach, car, trolley and bike.

The datasets can support scholarship on a range of historical and linguistic research areas, including research on crowdsourcing and online volunteering behaviours, data processing and data visualisations methodologies.

The two datasets described are at:

Keynote video 'Evolutionary Innovations: Collections as Data in the AI era' for Making Meaning 2024

Making Meaning 2024: Mia Ridge Keynote

My slides for #SLQMakingMeaning #CollectionsAsData, 'Evolutionary Innovations: Collections as Data in the AI era', are online at https://zenodo.org/records/10795641

‘Collections as data’ describes the movement to publish open data from museum, library and archive collections that began in the noughties. The benefits of machine learning for better discoverability and research with digitised/born digital collections are alluring. And the popularity of generative AI – and an increased awareness of the biases it reinscribes – has focused attention on responsible computational access to collections – but what does this mean in practical terms? Mia will share examples from the British Library and the Living with Machines data science project.

2022: an overview(ish)

A work-in-progress post about what I got up to last year.

The biggest thing I did in 2022 was co-curate an exhibition at Leeds City Museum for the British Library and Living with Machines project.

In November I was invited to the Archives nationales de France conference 'Crowdsourcing et patrimoine culturel écrit', where I spoke on 'Crowdsourcing as connection: a constant star over a sea of change' ('Établir des connexions : un invariant des projets de crowdsourcing').

In December I gave an online keynote on 'Citizen Science as Public History?' for the conference 'When publics co-produce history in museums: skills, methodologies and impact of participation' at The Luxembourg Centre for Contemporary and Digital History (C²DH), University of Luxembourg.

Crowdsourcing workshop activities: ideation and elaboration

I've been working on structures for online workshops for people working on crowdsourcing and other digital participation projects for museums, libraries and archives for over a decade now, learning from each institution I work with. I thought I'd share one of the slide decks I'm currently using.

The deck is labelled 'Coming up with and developing crowdsourcing ideas'. In a workshop or class on crowdsourcing it usually comes after sessions that explain the whats and whys of crowdsourcing in cultural heritage. It's designed to get people quickly working on practical ideas, anticipating issues and ensuring that their projects will fit into their specific institutional context.

The prompts currently include: What does success look like? Which audiences are interested? Why? What could you learn from trying this? Which collections are involved? Links to mission? Pros? Cons? How could you ensure data quality? Costs (staff, tech)? Dependencies / assumptions? What problem does it address? Questions, concerns? What volunteer skills, experience needed? What will they learn? What tech, data is needed?

You can develop your own prompts based on the attributes that are important to you. The Collective Wisdom Handbook is a useful guide to figuring out what's important to you, from data quality to integration with existing workflows.

I most recently used this for a Europeana-funded workshop for the Estonian War Museum – General Laidoner Museum in March 2022.

The museums have shared some lessons from the workshop in a post for Europeana. Their report 'Estonian museums' experience in the field of crowdsourcing' not only provides some background on volunteering and crowdsourcing in Estonian museums, it also shows how they applied the prompts.

Crowdsourcing workshop activities: ideation and elaboration by Mia Ridge is licensed under CC BY-SA 4.0

Living with Machines exhibition launched

For the past year I've been co-curating the Living with Machines exhibition with John McGoldrick and working intensively with many others at Leeds Museums and Galleries and the British Library. It's inspired by the Living with Machines research project, and very much shaped by my interactions with volunteers on our Zooniverse crowdsourcing projects.

I've written a blog post, 'These are a few of my favourite things… in the Living with Machines exhibition', for the Living with Machines blog that explains something of the challenge.

I've also deposited our interpretation for wall panels and object labels for 'Living with Machines: human stories from the industrial age' at the British Library research repository.

The Living with Machines exhibition was open until January 2023 at Leeds City Museum.

Spot me and an exhibition object on the front page of the weekend Yorkshire Post
LwM team absorbed in the loom demonstration - Living with Machines exhibition opening
My Flickr album of installation and opening night photos

Official photography from Leeds Museums and Galleries

Me and co-curator John McGoldrick