
Arkindex: a document processing platform

Arkindex is an open-source platform developed by Teklia for processing large-scale collections of digitized cultural, historical, and scientific content — from textual documents to photographs or cultural artefacts.

Built for institutions, researchers, and data professionals, Arkindex enables large-scale analysis of heterogeneous documents using customizable machine learning pipelines and distributed computing infrastructure.

Arkindex makes it possible to apply multiple automatic processes to collections of digitised documents in order to extract all types of information.

Arkindex is based on multi-level data modelling, allowing users to represent and organise complex digitised collections, from high-level archival structures down to the detailed layout of individual documents. It integrates open-source machine learning algorithms and models for tasks such as segmentation, transcription and information extraction. Designed for flexibility, Arkindex allows full customisation through its REST API, command-line tools and Python library, giving advanced users and developers complete control over processing workflows, model integration and data pipelines.
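To make the idea of multi-level data modelling concrete, here is a minimal Python sketch of a nested hierarchy running from an archival collection down to individual text lines. It is an illustration only: the Element class, its fields, the type names ("folder", "page", "text_line") and the sample transcriptions are all hypothetical, and none of them comes from Arkindex's actual schema, REST API or Python library.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional, Tuple


@dataclass
class Element:
    """Hypothetical node in a multi-level document hierarchy (not Arkindex's schema)."""
    name: str
    kind: str  # illustrative type names such as "folder", "page", "text_line"
    transcription: Optional[str] = None
    children: List["Element"] = field(default_factory=list)

    def add(self, child: "Element") -> "Element":
        """Attach a child element and return it so calls can be chained."""
        self.children.append(child)
        return child

    def walk(self, depth: int = 0) -> Iterator[Tuple[int, "Element"]]:
        """Yield (depth, element) pairs in depth-first, parent-before-child order."""
        yield depth, self
        for child in self.children:
            yield from child.walk(depth + 1)


def build_example() -> Element:
    """Build a small made-up collection: folder > folder > page > text lines."""
    collection = Element("Parish registers", "folder")
    volume = collection.add(Element("Volume 1", "folder"))
    page = volume.add(Element("Page 1", "page"))
    page.add(Element("Line 1", "text_line", transcription="first sample line"))
    page.add(Element("Line 2", "text_line", transcription="second sample line"))
    return collection


def transcriptions(root: Element) -> List[str]:
    """Collect every transcription found anywhere below the root."""
    return [el.transcription for _, el in root.walk() if el.transcription is not None]


if __name__ == "__main__":
    # Print the hierarchy with indentation reflecting each element's level.
    for depth, element in build_example().walk():
        print("  " * depth + f"{element.kind}: {element.name}")
```

In this sketch the archival structure and the page layout share a single recursive type, so a process that works at one level (for example, collecting transcriptions) can be run from any node of the tree.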

Arkindex is open-source and freely available. The Enterprise Edition adds advanced user management, scalability features, and dedicated support → Learn more about licensing options


Get Started