This documentation is aimed at Digital Humanities specialists who want to transcribe handwritten or printed documents using our Arkindex platform.
In this section, you will learn how to run a full project using Arkindex and Callico.
We will cover the following topics:
- Import openly available datasets into Arkindex,
- Create ground truth annotations for segmentation using Callico,
- Train a new segmentation model with your own annotated data,
- Create ground truth annotations for transcription using Callico,
- Train a new transcription model with your own annotated data,
- Run your own models on any image you wish to segment and transcribe,
- Export the segmentation and transcription results from Arkindex, including in PAGE XML format,
- Conclusion and limitations.
The tutorial overview, which introduces this section's content, is a good place to start.
This tutorial is a work in progress, and we are working on new topics that will cover page classification.