Concepts overview

This page offers an overview of what workers and worker versions are in Arkindex, and what they can be used for. For more details on specific features, see the advanced worker documentation.

Workers and worker versions

In Arkindex, a Worker is a high-level object that represents an algorithm and its accompanying code, wrapped in a Docker image, that can retrieve data from Arkindex and do something with this data. For example, you can have a worker called “Text Line Segmenter” which identifies text lines in an image. This worker retrieves images from Arkindex, detects text lines, and publishes these text lines as elements on Arkindex.
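The retrieve, process, publish cycle of such a worker can be sketched in a few lines of Python. This is only an illustration, not the Arkindex API: `FakeArkindex`, `Element`, `detect_text_lines`, and `run_worker` are hypothetical stand-ins, and the detector returns hard-coded results instead of running a real algorithm.

```python
from dataclasses import dataclass, field


@dataclass
class Element:
    """Hypothetical stand-in for an Arkindex element."""
    name: str
    type: str
    children: list = field(default_factory=list)


class FakeArkindex:
    """Hypothetical in-memory replacement for an Arkindex instance."""

    def __init__(self, pages):
        self.pages = pages

    def list_elements(self):
        return list(self.pages)

    def create_child(self, parent, name, type):
        child = Element(name=name, type=type)
        parent.children.append(child)
        return child


def detect_text_lines(page):
    # Placeholder for the real algorithm: pretend every page has two lines.
    return [f"{page.name} line {i}" for i in (1, 2)]


def run_worker(arkindex):
    for page in arkindex.list_elements():          # 1. retrieve data
        for line_name in detect_text_lines(page):  # 2. do something with it
            arkindex.create_child(page, line_name, "text_line")  # 3. publish


instance = FakeArkindex([Element("page 1", "page")])
run_worker(instance)
print([child.name for child in instance.pages[0].children])
```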

In Arkindex, you can execute a worker on elements by creating a Process. You can read the process documentation to find out more about what processes are and how to configure them.

However, you never actually use the worker itself: the worker is only the parent object, and what you execute are Worker versions. Worker versions exist so that the workers can evolve and improve: when a worker’s code is updated, a new version is published on Arkindex, so that the latest version is available.

For example, perhaps version 1 of the “Text Line Segmenter” worker only published text_line elements with no additional information, but version 2 also publishes a confidence score as metadata on these elements. The core principle is still the same, but there’s been an improvement. Worker versions can also be published to follow changes in Arkindex’s API, package updates, etc.

See the worker version documentation for more details.

Inference workers and training workers

Arkindex supports two types of workers:

Inference workers

An inference worker is a worker that takes Arkindex elements as input, and outputs objects that are published on Arkindex (as elements, classifications, metadata, etc.).

If you want to develop your own worker to use in Arkindex, you can follow the worker developer documentation.

Inference workers can be based on very simple or very complex algorithms. Complex workers based on machine learning technologies may use Models: the same code can handle results from different machine learning models. For example, you could have a generic document layout analysis worker, and depending on the model it uses, the objects that are predicted and published on Arkindex would be different. See the models documentation for more information on models in Arkindex.
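The idea that the same worker code can serve several models can be illustrated as follows. The names here (`LayoutWorker`, the two toy models, and the class labels) are invented for the example and do not come from Arkindex; a real model would be a trained artifact rather than a lookup table.

```python
class LayoutWorker:
    """Hypothetical generic layout worker: the code is fixed, the model is a parameter."""

    def __init__(self, model):
        # The "model" is a toy mapping from a page region to a predicted class.
        self.model = model

    def predict(self, regions):
        # Only regions that the loaded model knows about produce an object.
        return [self.model[region] for region in regions if region in self.model]


# Two toy "models" usable with the same worker code.
newspaper_model = {"top": "headline", "body": "article"}
letter_model = {"top": "letterhead", "body": "paragraph", "bottom": "signature"}

regions = ["top", "body", "bottom"]
print(LayoutWorker(newspaper_model).predict(regions))
print(LayoutWorker(letter_model).predict(regions))
```

The worker class never changes between the two runs; only the model passed to it does, and so do the objects it would publish.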

Workers can use existing models in inference processes, but workers can also be used to train new models.

Training workers

You can use Arkindex to train machine learning models. In order to do so, you must use a training worker. The difference between a training worker and an inference worker only lies in what it does and how it is used. Instead of predicting and publishing objects on Arkindex, a training worker uses data from Arkindex to train a machine learning algorithm, and at the end of the training process creates a new model version. This model version can then be used by inference workers in processes.
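The hand-off between the two kinds of workers can be sketched with a toy example. Everything here is hypothetical (`ModelVersion`, `training_worker`, `inference_worker`, and the sample data), and the "training" step simply learns the most frequent class in the data; the point is only that training outputs a model version and inference consumes it.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelVersion:
    """Hypothetical stand-in for a model version produced by training."""
    default_class: str


def training_worker(labelled_elements):
    # Toy "training": learn the most frequent class in the labelled data.
    counts = Counter(label for _, label in labelled_elements)
    return ModelVersion(default_class=counts.most_common(1)[0][0])


def inference_worker(model_version, elements):
    # Predict a class for each element using the trained model version.
    return {element: model_version.default_class for element in elements}


data = [("page 1", "letter"), ("page 2", "letter"), ("page 3", "invoice")]
model_version = training_worker(data)       # training creates a model version
predictions = inference_worker(model_version, ["page 4"])  # inference uses it
print(predictions)
```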

If you want to develop your own training worker to use in Arkindex, you can follow the worker developer documentation.

Workers on Arkindex

You can see a list of the workers you have access to on an Arkindex instance by clicking Workers in the user e-mail dropdown menu in the top-right corner.

You can see the details of a worker and its description by clicking on its name.

Recommended worker version

When viewing the details of a worker on Arkindex, a recommended worker version is highlighted in the top-right corner. The recommended worker version is either:

The latest available version from a main or master branch, or
The latest available version with a tag that starts with a number and is not an unstable version according to the Python version specifiers specification.

The recommended worker version is the latest stable version, and the one most users should use in their processes.

If no recommended worker version is found for a worker, an error message is shown.
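The selection rule above can be sketched in Python. This is an illustration under stated assumptions, not Arkindex's implementation: the `WorkerVersion` fields are hypothetical, "latest" is assumed to mean the most recently created version among all those matching either rule, and the regular expression is only a simplified approximation of the pre-release and development markers defined by the Python version specifiers specification.

```python
import re
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

# Simplified approximation of pre-release / development markers (a, b, rc, dev, ...).
UNSTABLE = re.compile(
    r"\d[-_.]?(a|b|c|rc|alpha|beta|pre|preview|dev)[-_.]?\d*(?![a-z])",
    re.IGNORECASE,
)


@dataclass
class WorkerVersion:
    """Hypothetical record; the field names are assumptions, not the Arkindex schema."""
    created: datetime
    branch: Optional[str] = None
    tag: Optional[str] = None


def is_stable_tag(tag):
    # The tag must start with a number and carry no unstable marker.
    return bool(tag) and tag[0].isdigit() and not UNSTABLE.search(tag)


def is_candidate(version):
    return version.branch in ("main", "master") or is_stable_tag(version.tag)


def recommended_version(versions):
    candidates = [v for v in versions if is_candidate(v)]
    # Assumption: "latest" means most recently created among all candidates.
    # None stands for the case where an error message is shown.
    return max(candidates, key=lambda v: v.created, default=None)


versions = [
    WorkerVersion(datetime(2024, 1, 1), tag="1.0.0"),
    WorkerVersion(datetime(2024, 2, 1), tag="1.1.0rc1"),
    WorkerVersion(datetime(2024, 3, 1), branch="feature-x"),
]
print(recommended_version(versions))
```

For a check that follows the specification more closely, the `packaging` library's `Version(tag).is_prerelease` property covers both pre-releases and development releases.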

All other worker versions are available under the Versions tab.

For workers that are linked to a Git repository, only the versions from the main / master branches, or tagged versions, are displayed in the Versions tab. To see all versions regardless of branch or tag, you can click the Display all versions toggle.

The "Display all versions" toggle in the worker versions list
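The filtering behaviour of the Versions tab for Git-linked workers can be summarised in a small sketch. The `WorkerVersion` fields and the `visible_versions` function are hypothetical and only mirror the rule described above; they are not part of Arkindex.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class WorkerVersion:
    """Hypothetical record; the field names are assumptions."""
    name: str
    branch: Optional[str] = None
    tag: Optional[str] = None


def visible_versions(versions, display_all=False):
    # With the toggle on, every version is listed regardless of branch or tag.
    if display_all:
        return list(versions)
    # Otherwise, keep only versions from main / master or with a tag.
    return [v for v in versions if v.branch in ("main", "master") or v.tag]


versions = [
    WorkerVersion("a", branch="main"),
    WorkerVersion("b", branch="feature-x"),
    WorkerVersion("c", branch="feature-x", tag="1.0.0"),
]
print([v.name for v in visible_versions(versions)])
print([v.name for v in visible_versions(versions, display_all=True)])
```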

Creating workers and worker versions

You can create workers and worker versions from the Arkindex frontend.