
Element

An Element is the base unit used to describe any type of document. At its core, it only has a name and a type; all other fields are optional.

Structure

To represent a document in Arkindex, we’ll need to:

  1. Apply a Type to each element. Types are managed at the project level and can be thought of as categories for elements.
  2. Link elements together to build a hierarchy that represents the document; this is where Paths are used.
  3. Link images to some elements; each Element can use part of an Image by specifying the image and the polygon to use.

Figure: Links between Element and structural components

Types

For example, a project with historical books could have the following simple structure:

Figure: Example of Element structure for a historical project

In this example, we have 4 different Element types:

  • Volume is a folder used to group several elements of type Page.
  • Page represents a single page of a Volume. These elements are linked directly to a full image provided by the client.
  • Paragraph is created either by a human annotator or by a Machine Learning tool.
  • Line is generally generated by a Machine Learning tool. It can be linked directly to a Page, to a Paragraph, or to both.

Arkindex does not assume any structure for a Project’s types; the Project administrator is free to create as many types as needed. To learn more about Element types, please read the next page, dedicated to Types.
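
The relationship between a project, its types and its elements can be sketched in plain Python. This is only an illustration under stated assumptions, not the Arkindex API or data model: the Project and Element classes, the create_element helper, the lowercase type names and the file name are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class Project:
    name: str
    # Types are managed per project; the administrator may add as many as needed.
    types: Set[str] = field(default_factory=set)


@dataclass
class Element:
    name: str                    # required
    type: str                    # required, one of the project's types
    image: Optional[str] = None  # optional, folder-like elements have none


def create_element(project: Project, name: str, type_name: str,
                   image: Optional[str] = None) -> Element:
    # Assumption for this sketch: an element can only use a type
    # that has been declared on its project.
    if type_name not in project.types:
        raise ValueError(f"unknown type {type_name!r} in project {project.name!r}")
    return Element(name=name, type=type_name, image=image)


# Hypothetical project mirroring the historical books example.
project = Project("Historical books", {"volume", "page", "paragraph", "line"})
volume = create_element(project, "Volume 1", "volume")          # folder, no image
page = create_element(project, "Page 1", "page", "page-1.jpg")  # full image
```

Nothing in this sketch relates one type to another: a "line" is not declared as belonging to a "paragraph". That matches the text above, where types act only as categories.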

Hierarchy

The hierarchy between elements is not enforced at the Type level (there is no graph between different types), but at the Element level.

A user can create any hierarchy between Elements using multiple Paths. A Path is simply a link between elements, linking a Parent element with Children elements. Using the example above, we can see that different instances of Page are all linked to a single Volume.

We would then have two paths to represent that hierarchy:

  1. From Page 1 to Volume 1
  2. From Page 2 to Volume 1
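
As a rough illustration, the two paths above can be written as child/parent pairs and indexed by parent. The Path tuple and the children_of helper are hypothetical names chosen for this sketch; they are not how Arkindex stores or exposes paths.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple


class Path(NamedTuple):
    child: str   # the element being attached
    parent: str  # the element it is attached to


# The two paths from the example: each page points to the same volume.
paths = [
    Path(child="Page 1", parent="Volume 1"),
    Path(child="Page 2", parent="Volume 1"),
]

# Index the links by parent so the children of an element can be listed.
children_by_parent: Dict[str, List[str]] = defaultdict(list)
for path in paths:
    children_by_parent[path.parent].append(path.child)


def children_of(parent: str) -> List[str]:
    """Return the children of a parent, in the order the paths were added."""
    return list(children_by_parent.get(parent, []))
```

Here children_of("Volume 1") lists both pages, while children_of("Page 1") returns an empty list because no path uses that page as a parent.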

To learn more about Element paths, please read the page dedicated to Paths.

Associate to an image

An Element may have a link towards an Image, but this is not mandatory. Some elements are purely present for organisation purposes (like a folder on a file system).

In the example above, different elements would be linked to parts of images:

  • Page elements would be linked to full size images directly provided by the client,
  • Paragraph elements would be linked to a large portion of an image,
  • Line elements would be linked to a thinner portion of an image.

Elements can be linked to images with a polygon. A polygon lists three or more points that specify which part of the image is represented by an element.
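
A minimal sketch of such a link, assuming a hypothetical Zone class that pairs an image identifier with a polygon given as (x, y) pixel coordinates. The class name, the coordinate convention and the file name are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Tuple

Point = Tuple[int, int]


@dataclass(frozen=True)
class Zone:
    image: str                 # hypothetical image identifier
    polygon: Tuple[Point, ...]

    def __post_init__(self) -> None:
        # A polygon needs three or more points to delimit an area.
        if len(self.polygon) < 3:
            raise ValueError("a polygon needs at least three points")

    def bounding_box(self) -> Tuple[int, int, int, int]:
        """Return (min_x, min_y, max_x, max_y) of the polygon."""
        xs = [x for x, _ in self.polygon]
        ys = [y for _, y in self.polygon]
        return (min(xs), min(ys), max(xs), max(ys))


# A Page covers the full image; a Line covers a thin strip of the same image.
page_zone = Zone("page-1.jpg", ((0, 0), (2000, 0), (2000, 3000), (0, 3000)))
line_zone = Zone("page-1.jpg", ((100, 200), (1900, 200), (1900, 260), (100, 260)))
```

Both zones refer to the same image; only the polygon differs, which is how elements of different sizes can share one image.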

Graph structure

Most of the time, elements will be organized in a tree structure: a book holds several pages, a page holds paragraphs, a paragraph has lines.

Sometimes, however, more levels of structure are needed: a chapter could hold several paragraphs spanning multiple pages, and some pages could have paragraphs belonging to multiple chapters at once. An element of type Topic could hold several pages related to a given topic across many books.

To provide the required flexibility for these structures, Arkindex structures elements in a graph. An element can have multiple parents and children.

The frontend is able to display all the possible parent paths of an element as well as its neighboring elements (previous and next elements in this particular parent path). This allows browsing pages by topic, then flipping to the next page of a book, then switching to the pages of another topic.
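
The browsing behaviour described above can be sketched with a small in-memory graph. This is not the frontend's or the API's implementation: the ElementGraph class and its link, parent_paths and neighbors methods are hypothetical, and the sketch assumes the graph has no cycles.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple


class ElementGraph:
    def __init__(self) -> None:
        self.children: Dict[str, List[str]] = defaultdict(list)  # parent -> ordered children
        self.parents: Dict[str, List[str]] = defaultdict(list)   # child -> parents

    def link(self, parent: str, child: str) -> None:
        self.children[parent].append(child)
        self.parents[child].append(parent)

    def parent_paths(self, element: str) -> List[List[str]]:
        """List every chain of ancestors, from the topmost one down to the direct parent."""
        direct = self.parents.get(element)
        if not direct:
            return [[]]  # top-level element: a single, empty chain
        chains = []
        for parent in direct:
            for prefix in self.parent_paths(parent):
                chains.append(prefix + [parent])
        return chains

    def neighbors(self, parent: str, element: str) -> Tuple[Optional[str], Optional[str]]:
        """Return the previous and next siblings of an element under one given parent."""
        siblings = self.children.get(parent, [])
        i = siblings.index(element)  # raises ValueError if not a child of parent
        previous = siblings[i - 1] if i > 0 else None
        following = siblings[i + 1] if i + 1 < len(siblings) else None
        return previous, following


graph = ElementGraph()
for page in ("Page 1", "Page 2", "Page 3"):
    graph.link("Book A", page)
graph.link("Book B", "Page 9")
# A topic groups pages taken from different books.
graph.link("Topic", "Page 2")
graph.link("Topic", "Page 9")
```

In this example "Page 2" has two parent paths, one through "Book A" and one through "Topic". Its neighbors depend on the path chosen: "Page 1" and "Page 3" within the book, but "Page 9" as the next element within the topic.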

API Endpoints

These endpoints are the most useful to handle Element: