Data concepts

    This section introduces the main data concepts we use to build Arkindex and process millions of documents. For a deeper dive, see the Projects section.

    Images

    Arkindex uses images stored on IIIF servers. IIIF is a standard defining APIs for image sharing and manipulation.

    Relying on this standard allows us to use images shared by public organisations (museums, digital humanities researchers, libraries, ...) as well as by private enterprises (through VPNs or dedicated instances).

    Arkindex also embeds its own IIIF server to make its users' images accessible.
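
    To illustrate what an IIIF request looks like, here is a minimal sketch that builds an image URL following the path template of the IIIF Image API (identifier, region, size, rotation, quality, format). The function name `iiif_image_url`, the server `iiif.example.org` and the image identifier are assumptions made for this example; they are not part of Arkindex.

```python
from urllib.parse import quote


def iiif_image_url(base_url, identifier, region="full", size="max",
                   rotation="0", quality="default", fmt="jpg"):
    """Build an IIIF Image API request URL (hypothetical helper).

    The path follows the template defined by the IIIF Image API:
    {base}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}
    """
    # The identifier is percent-encoded, including any slashes it contains.
    encoded_id = quote(identifier, safe="")
    base = base_url.rstrip("/")
    return f"{base}/{encoded_id}/{region}/{size}/{rotation}/{quality}.{fmt}"


# Hypothetical server and identifier, for illustration only.
# Note: size "max" exists in Image API 2.1 and 3.0; 2.0 servers expect "full".
full = iiif_image_url("https://iiif.example.org/iiif", "books/page_001.jpg")

# Crop a 300x400 pixel region starting at (100, 200), scaled to 150 pixels wide.
crop = iiif_image_url("https://iiif.example.org/iiif", "books/page_001.jpg",
                      region="100,200,300,400", size="150,")

print(full)
print(crop)
```

    Because cropping and resizing are expressed in the URL itself, the same request works against any compliant IIIF server, whether it is external or embedded.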

    Elements

    Elements are the core part of Arkindex. They can represent anything with a structural meaning: a shelf, a book, a chapter, a page, a single word, and anything in between.

    Elements are grouped together in projects. Projects handle access control and define the available element types. Because each project defines its own types, structures can be adapted to the needs of any project.
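
    The relationship between projects, element types and elements can be sketched as follows. This is an illustrative model only: the class names, fields and the `add_element` method are assumptions made for this example and do not reflect the actual Arkindex schema or API.

```python
from dataclasses import dataclass, field


@dataclass
class Element:
    name: str
    type: str  # slug of an element type defined by the owning project


@dataclass
class Project:
    name: str
    element_types: set = field(default_factory=set)
    elements: list = field(default_factory=list)

    def add_element(self, name, type_slug):
        # Only types declared by this project are accepted.
        if type_slug not in self.element_types:
            raise ValueError(
                f"unknown element type {type_slug!r} in project {self.name!r}"
            )
        element = Element(name=name, type=type_slug)
        self.elements.append(element)
        return element


# Each project declares the types that make sense for its own documents.
archive = Project("Archive", element_types={"shelf", "book", "page"})
page = archive.add_element("Page 1", "page")

try:
    archive.add_element("Line 1", "text_line")  # not declared in this project
    rejected = False
except ValueError:
    rejected = True
```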

    If you want to know more about Elements, please read the dedicated section.

    Polygons

    An element can optionally be linked to a polygon on an image.

    Images are handled entirely separately from projects and elements, to make images from external IIIF image servers easier to manage.
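
    The sketch below shows one way to model this optional link: an element may carry a zone, which pairs an image with a polygon. The `Image`, `Zone` and `Element` classes, their fields, and the example URL are hypothetical and only serve to illustrate the idea. The bounding box of the polygon is formatted as an `x,y,w,h` string, which is the pixel region syntax used by the IIIF Image API.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Image:
    # Images live on their own, independently of projects and elements.
    url: str
    width: int
    height: int


@dataclass
class Zone:
    image: Image
    polygon: List[Tuple[int, int]]  # (x, y) points in image pixel coordinates

    def bounding_box(self):
        """Return (x, y, width, height) of the smallest box around the polygon."""
        xs = [x for x, _ in self.polygon]
        ys = [y for _, y in self.polygon]
        return min(xs), min(ys), max(xs) - min(xs), max(ys) - min(ys)

    def iiif_region(self):
        """Format the bounding box as an IIIF pixel region string."""
        x, y, w, h = self.bounding_box()
        return f"{x},{y},{w},{h}"


@dataclass
class Element:
    name: str
    type: str
    zone: Optional[Zone] = None  # the link to an image is optional


# Hypothetical image URL, for illustration only.
scan = Image("https://iiif.example.org/iiif/page_001", width=2000, height=3000)

line = Element("Line 1", "text_line",
               Zone(scan, [(100, 200), (900, 200), (900, 260), (100, 260)]))
book = Element("Book", "book")  # a book has no polygon of its own

print(line.zone.iiif_region())
```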

    Graph structure

    Most of the time, elements will be organized in a tree structure: a book holds several pages, a page holds paragraphs, a paragraph has lines.

    Sometimes, however, a richer structure is needed: a chapter could hold several paragraphs spanning multiple pages, and some pages could have paragraphs belonging to multiple chapters at once. An element of type Topic could hold several pages related to a given topic across many books.

    To provide the required flexibility for these structures, Arkindex structures elements in a graph. An element can have multiple parents and children.
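
    As a minimal sketch of such a graph, the example below stores parent and child links in two dictionaries so that an element can have several parents. The `ElementGraph` class and the element names are assumptions made for illustration, not the way Arkindex stores its data.

```python
from collections import defaultdict


class ElementGraph:
    """Directed graph of elements where a child may have several parents."""

    def __init__(self):
        self._parents = defaultdict(list)
        self._children = defaultdict(list)

    def link(self, parent, child):
        # Record the relation in both directions for quick lookups.
        self._children[parent].append(child)
        self._parents[child].append(parent)

    def parents(self, element):
        return list(self._parents.get(element, []))

    def children(self, element):
        return list(self._children.get(element, []))


graph = ElementGraph()
graph.link("Book", "Page 1")
graph.link("Book", "Chapter 1")
graph.link("Page 1", "Paragraph A")
graph.link("Chapter 1", "Paragraph A")    # same paragraph, second parent
graph.link("Topic: Weddings", "Page 1")   # a page can also belong to a topic

print(graph.parents("Paragraph A"))
print(graph.parents("Page 1"))
```

    A strict tree would force the paragraph to choose between its page and its chapter; with a graph, both relations are kept.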

    The frontend is able to display all the possible parent paths of an element as well as its neighboring elements (previous and next elements in this particular parent path). This allows browsing pages by topic, then flipping to the next page of a book, then switching to the pages of another topic.
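
    The navigation described above can be sketched with two small functions: one that lists every parent path of an element, and one that finds its previous and next siblings within a chosen parent. The data and the function names are hypothetical, and the sketch assumes the graph contains no cycles; it does not describe how the Arkindex frontend or backend is implemented.

```python
# Ordered children of each parent (hypothetical data).
children = {
    "Library": ["Book A"],
    "Book A": ["Page A1", "Page A2", "Page A3"],
    "Book B": ["Page B7"],
    "Topic: Weddings": ["Page A2", "Page B7"],
}


def parents_of(element):
    return [parent for parent, kids in children.items() if element in kids]


def parent_paths(element):
    """Return every path from a top-level element down to a direct parent.

    An element without parents has a single, empty path.
    Assumes the graph is acyclic.
    """
    parents = parents_of(element)
    if not parents:
        return [[]]
    paths = []
    for parent in parents:
        for path in parent_paths(parent):
            paths.append(path + [parent])
    return paths


def neighbors(element, parent):
    """Return the (previous, next) siblings of an element within one parent."""
    siblings = children[parent]
    index = siblings.index(element)
    previous = siblings[index - 1] if index > 0 else None
    following = siblings[index + 1] if index + 1 < len(siblings) else None
    return previous, following


print(parent_paths("Page A2"))
print(neighbors("Page A2", "Book A"))
print(neighbors("Page A2", "Topic: Weddings"))
```

    The same page has different neighbors depending on the parent path used to reach it, which is what makes it possible to switch from browsing a topic to flipping through a book.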