The whole content is assigned to a Project

    A Project is a collection of elements and Machine Learning results on those elements.

    The Project is a purely organisational object: it lets us group Elements into separate units and manage access rights for each unit.



    In the API, a Project is referred to as a Corpus, which was its original name.

    For example, corpus_id in the API is the ID of a Project.

    You'll only see mentions of projects in the web interface (no corpus).

    We plan on removing every mention of corpus and corpus_id in Q3 2021.


    This section explains in detail the different parts of content management in Arkindex:

    • Elements are the base unit to represent any document.
    • Metadata allow you to describe the elements more precisely.
    • Classifications are used to apply classes to an element.
    • Transcriptions represent the extracted text from your documents.
    • Entities are named and known entities linked to your elements.
    • Exports can be generated to retrieve all of the project's content at once. To learn how to export, see How to export a project.

    Web interface

    A project administrator can edit all project properties from the web interface (More information on roles). To access the management page of a project, use the Actions dropdown on the project's main page.

    Management link from the main page of a project

    This menu is also accessible from the projects list by clicking on the users value.

    Quick link to management page from the projects list

    All project properties are editable from the management page.

    API endpoints

    The most useful endpoint is ListCorpus, which returns a full, non-paginated list of all the projects you have access to, including their associated types and access rights.
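
    As a rough illustration of how a ListCorpus result might be used, the sketch below works on an already-decoded response and leaves the HTTP call itself out. The sample data and the field names ("id", "name", "types", "slug", "rights") are assumptions made for this example, not a confirmed schema; the helper functions are hypothetical as well. Since the list is described as non-paginated, the sketch treats the response as a plain list of projects.

```python
# Hypothetical, hand-written stand-in for a decoded ListCorpus response.
# The field names used here are assumptions for illustration only.
SAMPLE_RESPONSE = [
    {
        "id": "00000000-0000-0000-0000-000000000001",
        "name": "Registers",
        "types": [{"slug": "page"}, {"slug": "text_line"}],
        "rights": ["read", "write", "admin"],
    },
    {
        "id": "00000000-0000-0000-0000-000000000002",
        "name": "Letters",
        "types": [{"slug": "page"}],
        "rights": ["read"],
    },
]


def summarize_projects(corpora):
    """Map each project name to its corpus_id, type slugs and rights."""
    summary = {}
    for corpus in corpora:
        summary[corpus["name"]] = {
            # In the API, the ID of a Project is called corpus_id.
            "corpus_id": corpus["id"],
            "types": [t["slug"] for t in corpus.get("types", [])],
            "rights": list(corpus.get("rights", [])),
        }
    return summary


def administered_projects(corpora):
    """Return the names of the projects whose rights include "admin"."""
    return [c["name"] for c in corpora if "admin" in c.get("rights", [])]
```

    Calling summarize_projects(SAMPLE_RESPONSE) gives one entry per project, keyed by name, which makes it easy to look up the corpus_id that other API calls expect. administered_projects(SAMPLE_RESPONSE) keeps only the projects on which the sample user has the assumed "admin" right.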