A platform for hosting, accessing and sharing transcripts of non-digitised historical manuscripts

At ETH Library Lab, Yvonne Fuchs and Dominic Weber are developing an online platform for transcriptions of historical manuscripts. This infrastructure would enable researchers and hobbyists to collect and share their text sources, and make them publicly available.


Every day, the research community creates many transcriptions of historical manuscripts from a broad variety of collections and archives. These transcriptions are important raw data, but they are very time-consuming to create. As they are rarely published and there is no platform for sharing them, they remain inaccessible to the public, existing only in isolated data silos on local computers. Consequently, other researchers must begin at the same starting point as their predecessors, mining data by creating their own transcription of the same document. Sometimes it is only after doing so that they discover the document is not closely related to their actual topic of interest.

Empowering the community
Our project provides the infrastructure to collect and share crowdsourced transcriptions. Our goal is to reduce redundant work and improve the transcription process. Everybody will be able to upload and download transcriptions. We exercise little control over the content of the platform and instead pursue a crowd-driven approach: whatever content is of interest to users can be added and shared.

The jigsaw of exploration – piece by piece
Our independence from digitized sources (people can upload digital copies or photos, but do not have to) allows for new ways of thinking about transcriptions. In this way, smaller archives that do not have the funds to digitize their collections can still be well represented.
During the uploading process, users are asked to add metadata for the source they have transcribed in order to provide a rich set of additional information for each record. This allows collections to be explored in further depth and sources to be evaluated for relevance to a particular use. We also want to improve the network within the research community by building bridges between historians, data scientists, students, archives and citizen scientists.

Acting in concert
We believe that working together with other platforms and institutions is essential, for example for community building, sharing and attracting new users. Additionally, as our transcriptions can be linked with digitized manuscripts, we can provide valuable training data for data scientists interested in developing Handwritten Text Recognition (HTR) models, which can automatically transcribe other sources by the same scribe or scriptorium.


Project Duration

1 January 2020 – 30 June 2020

Project Owner

Yvonne Fuchs

Master Student at University of Basel, Alumna Innovator Fellowship Program

Dominic Weber

Master Student at University of Basel, Alumnus Innovator Fellowship Program


Professor of Late Medieval and Renaissance History at the University of Basel

Prof. Dr. Lucas Burkart

Academic Supervisor

Assistant Professor of Digital Humanities at the University of Bern

Prof. Dr. Tobias Hodel

Academic Supervisor
