Element
An Element is the base unit used to describe any type of document. At its core, it only has a name and a type; other fields are optional.
Structure¶
To represent a document in Arkindex, we’ll need to:
- Apply a Type to each element. Types are managed at the project level and can be thought of as categories for elements.
- Link elements together to build a hierarchy that represents the document: this is where the Path is used.
- Link images to some elements: each Element can use part of an Image by specifying the image and the polygon to use.
Types¶
For example, a project with historical books could have the following simple structure:
In this example, we have 4 different Element types:
- Volume is a folder used to group several elements of type Page.
- Page represents a single page of a Volume. These elements will be directly linked to a full image provided by the client.
- Paragraph will be created either by a human annotator, or by a Machine Learning tool.
- Line will generally be generated by a Machine Learning tool. In this case, it could be directly linked to a Page, to a Paragraph, or to both.
Arkindex does not assume any structure for a Project’s types: the Project administrator is free to create as many types as needed. To learn more about Element types, please read the next page, dedicated to Types.
Hierarchy¶
The hierarchy between elements is not enforced at the Type level (there is no graph between different types), but at the Element level.
A user can create any hierarchy between Elements using multiple Paths. A Path is simply a link between elements, linking a Parent element with its Children elements. Using the example above, we can see that different instances of Page are all linked to a single Volume.
We would then have two paths to represent that hierarchy:
- From Page 1 to Volume 1
- From Page 2 to Volume 1
To learn more about Element paths, please read the page dedicated to Paths.
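The two paths above can be sketched in a few lines of Python. This is only an illustrative in-memory model, not the Arkindex data model or API: the `Element` and `Path` classes, their field names, and the `children_of` / `parents_of` helpers are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    """Hypothetical minimal element: just a name and a type."""
    name: str
    type: str  # e.g. "volume" or "page"; types are defined per project


@dataclass(frozen=True)
class Path:
    """Hypothetical link between one parent element and one child element."""
    parent: Element
    child: Element


volume_1 = Element("Volume 1", "volume")
page_1 = Element("Page 1", "page")
page_2 = Element("Page 2", "page")

# One path per page, both pointing to the same volume.
paths = [
    Path(parent=volume_1, child=page_1),
    Path(parent=volume_1, child=page_2),
]


def children_of(element, paths):
    """Return the children of an element, in the order the paths are listed."""
    return [p.child for p in paths if p.parent == element]


def parents_of(element, paths):
    """Return the parents of an element, in the order the paths are listed."""
    return [p.parent for p in paths if p.child == element]


print([e.name for e in children_of(volume_1, paths)])
print([e.name for e in parents_of(page_1, paths)])
```

Note that nothing in this sketch ties a `Path` to the element types: the hierarchy exists only between individual elements, as described above.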
Associate to an image¶
An Element may have a link towards an Image, but this is not mandatory. Some elements are purely present for organisation purposes (like a folder on a file system).
In the example above, different elements would be linked to parts of images:
- Page elements would be linked to full-size images directly provided by the client,
- Paragraph elements would be linked to a large portion of an image,
- Line elements would be linked to a thinner portion of an image.
Elements can be linked to images with a polygon. A polygon lists three or more points that specify which part of the image is represented by an element.
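As a rough illustration of the polygon rule, the sketch below models the link between an element and an image region. The `Zone` class, its field names, the image identifier, and the pixel coordinates are hypothetical and chosen only for this example; they do not reflect the actual Arkindex schema.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[int, int]  # (x, y) in image pixels, an assumption of this sketch


@dataclass
class Zone:
    """Hypothetical link between an element and a region of an image."""
    image_id: str
    polygon: List[Point]

    def __post_init__(self):
        # A polygon lists three or more points.
        if len(self.polygon) < 3:
            raise ValueError("a polygon needs at least three points")

    def bounding_box(self):
        """Return (min_x, min_y, max_x, max_y) of the polygon."""
        xs = [x for x, _ in self.polygon]
        ys = [y for _, y in self.polygon]
        return min(xs), min(ys), max(xs), max(ys)


# A Page covering a full 2000x3000 image, and a Line covering a thin strip of it.
page_zone = Zone("page-1.jpg", [(0, 0), (2000, 0), (2000, 3000), (0, 3000)])
line_zone = Zone("page-1.jpg", [(100, 400), (1900, 400), (1900, 460), (100, 460)])

print(page_zone.bounding_box())
print(line_zone.bounding_box())
```

An element used purely for organisation, such as a Volume, would simply have no zone at all.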
Graph structure¶
Most of the time, elements will be organized in a tree structure: a book holds several pages, a page holds paragraphs, a paragraph has lines.
But sometimes, more complex structures are needed: a chapter could hold several paragraphs spanning multiple pages, and some pages could have paragraphs belonging to multiple chapters at once. An element of type Topic could hold several pages related to a given topic across many books.
To provide the required flexibility for these structures, Arkindex structures elements in a graph. An element can have multiple parents and children.
The frontend is able to display all the possible parent paths of an element as well as its neighboring elements (previous and next elements in this particular parent path). This allows browsing pages by topic, then flipping to the next page of a book, then switching to the pages of another topic.
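The browsing scenario above can be sketched with a small in-memory graph. This is not how Arkindex stores or queries elements: the dictionaries, the `link` and `neighbors` helpers, the element names, and the assumption that children are ordered within each parent are all made up for illustration.

```python
from collections import defaultdict

# parent name -> ordered list of child names (ordering is an assumption here)
children = defaultdict(list)
# child name -> list of parent names
parents = defaultdict(list)


def link(parent, child):
    """Record that `child` belongs to `parent`; a child may have many parents."""
    children[parent].append(child)
    parents[child].append(parent)


def neighbors(element, parent):
    """Return (previous, next) siblings of `element` under one given parent."""
    siblings = children[parent]
    i = siblings.index(element)
    previous = siblings[i - 1] if i > 0 else None
    following = siblings[i + 1] if i + 1 < len(siblings) else None
    return previous, following


# Two books, plus a topic that groups pages from both of them.
for page in ("Page A1", "Page A2", "Page A3"):
    link("Book A", page)
for page in ("Page B1", "Page B2"):
    link("Book B", page)
link("Topic", "Page A2")
link("Topic", "Page B1")

print(parents["Page A2"])               # every parent of the page
print(neighbors("Page A2", "Book A"))   # neighbors when browsing the book
print(neighbors("Page A2", "Topic"))    # neighbors when browsing the topic
```

The same page yields different neighbors depending on which parent it is reached through, which is what makes it possible to move from a topic to a book and back.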
API Endpoints¶
These endpoints are the most useful for working with an Element: