I presented on 'Living with Machines: Crowdsourcing transcriptions for digitised historical collections of the British industrial revolution'. The video from the seminar is below.
Here's some of what I had to say: 'The British Library’s Ridge suggests people play around with AI to understand what might be coming.
“AI literacy is an important part of good governance,” she says. “People need a solid understanding of where biases are likely to appear, how to review and contest decisions made by algorithms and where sharing data might have privacy or legal implications, so that they can make good decisions about the products they buy or implement. It also helps people plan so that AI tools enhance jobs, rather than attempting to replace them.”'
And huge thanks to the thousands of Zooniverse volunteers who annotated 19th century newspaper articles to create the datasets we've published alongside the data paper!
Abstract: We present the ‘Language of Mechanisation’ datasets with examples of re-use in visualisations and analysis. These reusable CSV files, published on the British Library’s Research Repository, contain automatically-transcribed text from 19th century British newspaper articles. Volunteers on the Zooniverse crowdsourcing platform took part in tasks that asked ‘How did the word x change over time and place?’ They annotated articles with pre-selected meanings (senses) for the words coach, car, trolley and bike.
The datasets can support scholarship on a range of historical and linguistic research areas, including research on crowdsourcing and online volunteering behaviours, data processing and data visualisation methodologies.
‘Collections as data’ describes the movement to publish open data from museum, library and archive collections that began in the noughties. The benefits of machine learning for better discoverability and research with digitised/born digital collections are alluring. And the popularity of generative AI – and an increased awareness of the biases it reinscribes – has focused attention on responsible computational access to collections – but what does this mean in practical terms? Mia will share examples from the British Library and the Living with Machines data science project.
In January 2024 I presented with Kaspar Beelen at a virtual Research Colloquium on Digital History at the Humboldt-Universität zu Berlin. I also gave a talk online for the Home Office's Data & Information Week with Karen Tingay (Head of Data and Methods, Office for Statistics Regulation).
I was in Australia (Melbourne, Ballina, Brisbane) in February-March. In February 2024 I took part in a panel on 'The Machines looking back at us' at the Future of Arts, Culture & Technology Symposium (FACT 2024) at ACMI, in Melbourne, Australia.
In April I gave a keynote on 'Machine Learning for Collections' at the University of Cambridge Cultural Heritage Data School, and had a great time talking to the students and staff there. I also spoke at an event for the Association for Manuscripts and Archives in Research Collections (AMARC).
In early June I travelled to Dundee, Scotland, to give one of the keynotes at the CILIPS Annual Conference 2024. A brief immersion in the world of Scottish libraries was a refreshing diversion from the ongoing issues at work.