‘Creating a Digital History Commons through crowdsourcing and participant digitisation’ at Herrenhausen DH Conference

I was awarded a travel grant to attend the Herrenhausen Conference: “(Digital) Humanities Revisited – Challenges and Opportunities in the Digital Age” in Hannover, Germany, over December 5-7, 2013. I’d like to thank the Volkswagen Foundation (VolkswagenStiftung) for funding travel for 37 early career scholars and for the opportunity to present there.

My lightning talk notes, further information and references for ‘Peer production models for academic and amateur historians: challenges and opportunities’ are below. The full reference list for my PhD would be huge, so I’ve selected items that relate specifically to my poster and talk. PDF of my poster on ‘Creating a Digital History Commons through crowdsourcing and participant digitisation’.

My lightning talk

I might as well start with a provocation: the best technology in the world won’t solve a single problem unless it’s accompanied by social solutions.

My poster outlines an approach for uniting collections from memory institutions (museums, libraries, archives) and ‘shoebox’ archives from the public into a shared Commons. Items in this Commons could be enhanced by combining crowdsourcing with the work historians do on their personal research collections of documents. In the tradition of ‘history from below’, my goal is to make material about ordinary people available alongside official archives – if you’ll excuse the pun, it could be a form of ‘open source history’.

In this talk I want to address the challenges rather than the opportunities because successful scholarly technology projects are as much about change management as they are about code.

A lot of my PhD research has investigated crowdsourcing in cultural heritage because I’m interested in attributes like interaction and microtask design that make crowdsourcing immensely more productive than other forms of user-generated content. Can you combine the energy of crowdsourcing with the knowledge historians create while doing their work? There’s no point setting up a Commons if the content can’t be used by historians, so what challenges around authority, reliability, trust, academic credit and authorship need to be addressed?

Trust between historians is usually negotiated over a series of personal encounters. Historians will share data, but they do so in ways that let them maintain control of their material, and in specific contexts – usually personal exchanges with someone who’s distant enough not to be a competitor (for academics), who’s not going to misuse the material (for family historians), or who’s trusted not to rip off material. Information is often shared progressively, and getting access to more information depends on your behaviour after the initial exchange – for people used to this, dumping stuff onto a website would be quite challenging.

Historians already have methods for assessing reliability, so it’s a matter of understanding and supporting them in digital interfaces. Unverified material is winnowed out before publication, even if it provides useful background context. No historian I interviewed would admit to using a source where they hadn’t seen an image of the original document, so a Commons would always need to include images.

Inexperienced or untrained historians don’t yet have the same tacit knowledge about the normal range of data represented by sources and might not think to look for silences and omissions, so representing other historians’ judgement about sources might be useful.

It appears that reliability is not vested in the identity of the digitiser but in the source itself. Content found on online sites is tested against a set of finely-tuned ideas about the normal range of documents rather than the authority of the poster. So for a participatory Commons, authority matters less than I’d expected – obviously it’s different for scholarly publications.

Authorship and academic credit for collaborative resources are tricky. Putting materials into context is often an interpretive process, but at what point does it become an authorial act? And how should credit be assigned for these acts? (Particularly at a time when scholars are still fighting to have digital projects recognised alongside traditional publications.)

Most academic historians are wary about sharing data from current research projects. One solution might be social change that normalises sharing data collected for major publications when that publication is launched. Another solution might be to disregard academic credit and focus on the benefit the historian receives after sharing their data collections – they can get help cataloguing and digitising their research collections through crowdsourced tagging and transcription.

In addition to the usual questions about the circumstances through which a historic document came to survive, there’s an invisible context behind any item that ends up in a digital Commons from an unofficial source like a personal research collection or shoebox archive. Information about who collected or digitised the content and why; what wasn’t digitised, transcribed, or noted; not just what percentage of the whole was collected but what shaped the collection; points where information might have been partially, incorrectly or just hurriedly transcribed… The shape of the shadow archive that did not survive or was not digitised should also be represented in such a Commons. As digital humanists apply more methods from ‘big data’ it’s particularly important to problematise the visual representations of official collections by hinting at the absences in the archive.

Finally, a plea to archives, libraries and museums – if you haven’t digitised everything in your archive yet, then please let people take photos of documents!

This paper was based on my PhD research and various publications. If you’re curious about projects that are putting parts of this into practice, start with the Imperial War Museum’s ‘Appeal for war stories to create biggest ever digital history archive’, public contributions to Trove Australia and the UK National Archives’ ‘Capturing Academic Expertise: How and Why?’.

And what problem does this all solve? As Tim Hitchcock said when discussing digital history, ‘Until we get around to including the non-canonical, the non-Western, the non-textual and the non-elite, we are unlikely to be very surprised’.

‘Abstract of the project you want to present’ (from my application)

Historians have happily adopted some digital research tools – for example, using online catalogues to find and view digital images of documents or trying out the names of historical figures they’re researching in Google to see what it can find – but are they missing the full potential of new methodological approaches? Researchers often repeat the same tasks on historical materials – for example, recording information about a document from the archives and transcribing sections of the text – but the results sit in private databases and folders rather than becoming a resource for the next researcher interested in the same material. Could open source and peer production methods provide models for ‘participant digitisation’, creating wider benefits from the activities that historians are already undertaking for their own purposes? How might this disrupt current notions of authority, reliability, trust, academic credit and authorship?

This project is part of my final year PhD research and the results presented would be based on my latest findings from my in-depth interviews with academic and ‘amateur’ family or local historians, and contextualised through current literature on crowdsourcing, online collaboration, peer production, archival research practices and digital content lifecycles.

‘What specific value does your project add to the digital humanities framework?’ (from my application)

My project is motivated by a vision of scholarly collaboration (or ‘peer production’) as a method for creating repositories of historical documents for use by both academic and ‘amateur’ researchers. Humanities crowdsourcing projects are generally open to participants who are judged on the quality of their work, not their qualifications, and the productivity of those projects is seen as a solution to the digitisation backlog – but are the resources created usable by historians? Understanding how people assess the quality of resources, and the potential barriers to their decision to contribute to and use such repositories, is vital if the repositories are to be widely used. More broadly, my PhD aims to understand the impact of digitality on humanities scholarship by comparing the practices and attitudes of academic and ‘amateur’ family or local historians regarding evaluating, using and contributing to scholarly crowdsourcing.

