Arkindex: a document processing platform
Arkindex is an open-source platform developed by Teklia for processing large-scale collections of digitised cultural, historical, and scientific content, ranging from textual documents to photographs and cultural artefacts.
Built for institutions, researchers, and data professionals, Arkindex enables large-scale analysis of heterogeneous documents using customisable machine learning pipelines and distributed computing infrastructure.

Arkindex is based on multi-level data modelling, which lets users represent and organise complex digitised collections, from high-level archival structures down to the detailed layout of individual documents. It integrates open-source machine learning algorithms and models for tasks such as segmentation, transcription and information extraction. Designed for flexibility, Arkindex allows full customisation through its REST API, command-line tools and Python library, giving advanced users and developers complete control over processing workflows, model integration and data pipelines.
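To make the idea of multi-level data modelling more concrete, here is a minimal sketch in plain Python. It does not use the Arkindex API or its Python library, and the `Element` class, the type names (`folder`, `page`, `text_line`) and the sample content are all hypothetical, chosen only to illustrate how a collection might be nested from archival structure down to individual lines.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional, Tuple


@dataclass
class Element:
    """Hypothetical node in a document hierarchy (not the Arkindex schema)."""

    type: str                             # illustrative types: "folder", "page", "text_line"
    name: str
    transcription: Optional[str] = None   # text attached to this element, if any
    children: List["Element"] = field(default_factory=list)

    def add(self, child: "Element") -> "Element":
        """Attach a child element and return it, so calls can be chained."""
        self.children.append(child)
        return child

    def walk(self, depth: int = 0) -> Iterator[Tuple[int, "Element"]]:
        """Yield (depth, element) pairs in depth-first order."""
        yield depth, self
        for child in self.children:
            yield from child.walk(depth + 1)


def build_example() -> Element:
    """Build a small invented collection: folder > folder > page > text lines."""
    collection = Element("folder", "Parish registers")
    volume = collection.add(Element("folder", "Volume 1"))
    page = volume.add(Element("page", "Page 1"))
    page.add(Element("text_line", "Line 1", transcription="Baptised on the 3rd of May"))
    page.add(Element("text_line", "Line 2", transcription="in the presence of witnesses"))
    return collection


def transcriptions(root: Element) -> List[str]:
    """Collect every transcription found below the given element."""
    return [e.transcription for _, e in root.walk() if e.transcription is not None]


if __name__ == "__main__":
    # Print the hierarchy as an indented outline.
    for depth, element in build_example().walk():
        print("  " * depth + f"{element.type}: {element.name}")
```

The same tree can be queried at any level: calling `transcriptions` on the top folder or on a single page returns the text of the lines beneath it, which is the kind of navigation a multi-level model is meant to support.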
Arkindex is open-source and freely available. The Enterprise Edition adds advanced user management, scalability features, and dedicated support. → Learn more about licensing options
Get Started
- 👉 Get into Arkindex in 4 steps
- 🌐 Try our live demo: demo.arkindex.org
- 🧑‍💻 Developers: explore the API documentation and CLI tools
- 📦 Need to self-host Arkindex? Follow the deployment guide
- ❓ Need help? Visit support.teklia.com