Import a Transkribus collection

You can import Transkribus collections into Arkindex. The import preserves the folder organisation of the collection and brings in its images and the annotations you have produced.

In order to import documents from Transkribus to Arkindex, you must first export them from Transkribus, and then import the downloaded ZIP files to Arkindex.

Warning

File imports to Arkindex are limited to 2 GB. If your files are larger, you may need to ask an instance administrator to do the import for you.
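If you have several archives to import, a quick size check before uploading can save a failed attempt. The following is a minimal sketch, not part of Arkindex: the function name `oversized_archives` is hypothetical, and the limit is assumed to mean 2 × 1000³ bytes, the stricter reading of "2GB". Adjust the constant if your instance documents a different value.

```python
from pathlib import Path

# Assumption: the documented limit is read as 2 * 1000**3 bytes.
# The exact byte count enforced by your instance may differ.
MAX_IMPORT_BYTES = 2 * 1000**3


def oversized_archives(paths, limit=MAX_IMPORT_BYTES):
    """Return the paths whose size on disk exceeds the limit, in input order."""
    return [Path(p) for p in paths if Path(p).stat().st_size > limit]
```

For example, `oversized_archives(Path("downloads").glob("*.zip"))` would list the archives in a hypothetical `downloads` folder that need to be handed to an administrator.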

Export documents from Transkribus

Using the web interface

Once logged into the Transkribus web interface, you can export documents from any of the collections you have access to. Open that collection from the Home page or the Collections tab, and select all or only some of the documents/folders in it. Then click on the icon next to Train Model in the actions bar above the documents, and click on Export.

Export Transkribus documents

This opens a configuration modal. In order for your collection to be correctly imported into Arkindex, you need to select the following options:

  • Standard export in the select menu at the top;
  • Images and Page XML as the formats to be included in the export.

Open the “Expand more formats” menu and make sure that the Alto XML checkbox is unchecked.

Export configuration

You can then click the Start export button. After a short delay that depends on the number of images you are exporting, the configuration modal closes and a notification informs you that your export has started. You will receive an email containing a link to download the exported archive when it is ready.

Export start notification

Info

The export generates one ZIP file per exported document/folder, so you will receive as many emails and end up with as many archives as the number of exported documents.
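Since each export arrives as a separate archive, it can help to inspect every downloaded ZIP before importing it. The sketch below uses only Python's standard `zipfile` module to count files by extension and list the folder names found inside an archive. The function name `summarize_export` is hypothetical, and the internal layout of a Transkribus export is not described here, so the function reports what it finds rather than enforcing a structure.

```python
import zipfile
from collections import Counter
from pathlib import PurePosixPath


def summarize_export(zip_path):
    """Count archive members by extension and collect the folder names used."""
    extensions = Counter()
    folders = set()
    with zipfile.ZipFile(zip_path) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue  # skip explicit directory entries
            member = PurePosixPath(info.filename)
            extensions[member.suffix.lower()] += 1
            # Every path component except the file name is a folder.
            folders.update(part.lower() for part in member.parts[:-1])
    return extensions, folders
```

With an export configured as above, you would expect to see image extensions and `.xml` files in the counts. A folder whose name mentions ALTO would suggest that the Alto XML option was left checked, although that folder naming is an assumption to verify against your own archives.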

Using the desktop application

Warning

Transkribus have announced they will no longer be maintaining the desktop client, which means these instructions may cease to be valid in the future, if something changes or is disabled in Transkribus. You can export your documents from the web interface instead.

You can export collections from the Transkribus desktop client. Once logged in, you can export any of the collections you have access to.

First, open one of the documents from that collection. Then, click on the Export document button in the toolbar, or use Main Menu > Document > Export document.

The Export document button

In the dialog box, in the Server export tab, click on Current collection to export the whole collection. If you want to restrict the export to a few documents, click on Choose documents to export to select them.

Make sure that the export is configured as follows:

  • Only the Transkribus Document export format is selected;
  • The Mets tab is selected;
  • The Image type field is set to Original;
  • Export Page and Export Image are checked;
  • Use word layer and Do blackening are unchecked.

Other options are not relevant for Arkindex imports.

Example of a properly configured export

Once your configuration is complete, click on OK to start the export. You will receive an email with a link to a single ZIP archive that contains all your exported documents.

Import ZIP files into Arkindex

Once you have downloaded your ZIP files, you can import them to Arkindex using the file import.