Crowdsourcing, also known as “cultural heritage co-creation” or “commons-based peer production,” is the practice of organizations inviting volunteers from diverse backgrounds to engage with collections such as digitized manuscript images, camera trap data, or microscopy images, in order to create new datasets to aid researchers and patrons of many kinds. Ideally, crowdsourcing projects further research and widen access to information, while also giving participants the opportunity to gain or sharpen their own research skills and deepen their interest in the subject area. Academics and cultural heritage practitioners can leverage crowdsourcing to democratize access to knowledge and to skills such as digital literacy and palaeography (the ability to interpret handwriting and transcribe documents).
Transcription radically improves access to collections by making them word-searchable, as well as legible to screen readers used by visually or cognitively impaired people. Crowdsourced transcription projects have proliferated in the last fifteen years, and are frequently led by cultural heritage organizations keen to increase access to their holdings. This talk will explore two different methods for online crowdsourced text transcription developed by Van Hyning and her colleagues at Zooniverse.org (Oxford University) and at the Library of Congress, along with their different but overlapping methods of community engagement.
While at Zooniverse, Dr Van Hyning led an interdisciplinary team to develop several line-by-line transcription approaches that attempt to lower barriers to participation while also improving the accuracy of transcriptions. Each line of text is transcribed by multiple independent volunteers, and the resulting transcriptions are then compared using various algorithms. At the Library of Congress she worked with an interdisciplinary team to develop a different transcription approach for By the People (https://crowd.loc.gov).
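
For readers curious what comparing several independent transcriptions of one line might look like in practice, here is a minimal sketch in Python. It is not the Zooniverse method, which the abstract describes only as "various algorithms"; it simply picks the transcription that is, on average, most similar to all the others. The function names `normalize` and `consensus_line` and the sample sentences are hypothetical and chosen for illustration only.

```python
from difflib import SequenceMatcher


def normalize(text: str) -> str:
    """Collapse runs of whitespace so spacing differences are ignored."""
    return " ".join(text.split())


def consensus_line(transcriptions: list[str]) -> tuple[str, float]:
    """Pick the transcription most similar, on average, to all the others.

    Returns the chosen text and its mean similarity score (0.0 to 1.0).
    This is an illustrative assumption, not a documented project algorithm.
    """
    cleaned = [normalize(t) for t in transcriptions]
    if not cleaned:
        raise ValueError("need at least one transcription")
    if len(cleaned) == 1:
        return cleaned[0], 1.0

    def mean_similarity(i: int) -> float:
        # Compare candidate i against every other volunteer's version.
        scores = [
            SequenceMatcher(None, cleaned[i], cleaned[j]).ratio()
            for j in range(len(cleaned))
            if j != i
        ]
        return sum(scores) / len(scores)

    best = max(range(len(cleaned)), key=mean_similarity)
    return cleaned[best], mean_similarity(best)


if __name__ == "__main__":
    # Three hypothetical volunteers transcribe the same manuscript line.
    volunteers = [
        "Dear Sir, I write in haste",
        "Dear Sir,  I write in haste",
        "Dear Sir, I wrote in haste",
    ]
    text, score = consensus_line(volunteers)
    print(text, round(score, 2))
```

A low score from a sketch like this could be used to flag a line for further volunteer or expert review, though how any real project handles disagreement is beyond what the abstract states.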