Advanced documentation
Process templates¶
Process templates are preconfigured sets of worker runs that can be applied to new processes. This is useful, for example, when you need to process a large number of documents: if they are split into distinct folders, you can apply the same template to several processes, one per folder, instead of running a single large process on all the data, which would take a long time to complete.
When creating an inference process, instead of building your own combination of workers and models, you can use existing process templates. These templates are available by clicking the Select template button in the process configuration view.
When you select a template and apply it to your process, the template's worker runs are copied to the process and replace any worker runs already configured on it. Once you have applied a template, you can still update your process freely by adding or removing worker runs, changing models or configurations, etc. This will not impact the template, only your current process.
Templates can also be created from the process configuration view: once you have added and configured all your worker runs, you can click the Create template button to save them as a new process template. You will then be able to reuse this template in future processes.
Info
Process templates are processes, and as such they are attached to the project they were created on. In order to apply a template to a process outside of its original project, you must have:
- contributor access to the template’s project;
- execution access to all its workers.
Delete worker results¶
You can delete all the results (elements, transcriptions, classifications, etc.) produced by a given worker run by using the Delete worker results action in the “Process” menu.
In the default mode of the modal, each row represents a worker run (a worker version, a model version and a configuration). Clicking the delete button, then confirming in a second modal, launches a deletion task that runs in the background until it is finished.
Warning
Deleting worker results is irreversible and the deletion is recursive: if you delete elements produced by a specific worker run, all their child elements (and classifications, transcriptions, etc.) will be deleted as well, whether or not they were created by that worker run.
By toggling the Advanced mode in the worker results deletion modal, you can directly input the ID of the worker run whose results you want to delete. You can also use this advanced mode to delete worker results not from specific worker runs but from specific worker versions, model versions or worker configurations.
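The recursive behaviour described in this warning can be illustrated with a minimal Python sketch. It is not Arkindex code: the `Element` class, the `delete_worker_results` function and the run identifiers are hypothetical names chosen for illustration, and the element tree is held in memory instead of being stored in a database.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Element:
    """Hypothetical in-memory element; worker_run_id is None for manual elements."""
    name: str
    worker_run_id: Optional[str] = None
    children: List["Element"] = field(default_factory=list)


def delete_worker_results(elements: List[Element], worker_run_id: str) -> List[Element]:
    """Return the elements that remain after deleting one worker run's results."""
    kept = []
    for element in elements:
        if element.worker_run_id == worker_run_id:
            # The element was produced by the deleted worker run: it is dropped
            # together with its whole subtree, whoever created the children.
            continue
        # Otherwise keep the element and apply the same rule to its children.
        element.children = delete_worker_results(element.children, worker_run_id)
        kept.append(element)
    return kept


# A manual page with two lines; "word" was created by another worker run.
word = Element("word", "run-b")
line1 = Element("line1", "run-a", [word])
line2 = Element("line2", "run-b")
page = Element("page", None, [line1, line2])

remaining = delete_worker_results([page], "run-a")
print([child.name for child in remaining[0].children])
```

In this sketch, deleting the results of `run-a` removes `line1` and also `word`, even though `word` was created by `run-b`.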
Deleting the worker results produced by a worker run also deletes the corresponding worker activities, allowing for the elements to be processed again with the same combination of worker version, model version and worker configuration.
Worker activity¶
Worker activities are enabled on inference processes, and allow for the tracking of an element’s processing by a worker run. They are used to ensure that the same combination of worker version, model version and worker configuration does not process the same element twice.
This is useful when some elements in a process failed to be processed, for example because there was a temporary error when downloading the images. You can retry the process, and the worker runs will only process the elements that have not been marked as already processed.
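As a rough illustration of this idea, the following Python sketch tracks which elements have already been processed by a given combination of worker version, model version and worker configuration. The `ActivityTracker` class and its method names are assumptions made for this example, not part of Arkindex.

```python
from typing import Optional, Set, Tuple

# (element ID, worker version, model version, worker configuration)
ActivityKey = Tuple[str, str, Optional[str], Optional[str]]


class ActivityTracker:
    """Hypothetical tracker of elements already processed by a worker run."""

    def __init__(self) -> None:
        self._processed: Set[ActivityKey] = set()

    def mark_processed(self, element_id: str, worker_version: str,
                       model_version: Optional[str] = None,
                       configuration: Optional[str] = None) -> None:
        self._processed.add((element_id, worker_version, model_version, configuration))

    def needs_processing(self, element_id: str, worker_version: str,
                         model_version: Optional[str] = None,
                         configuration: Optional[str] = None) -> bool:
        # An element is skipped only if this exact combination already processed it.
        key = (element_id, worker_version, model_version, configuration)
        return key not in self._processed


tracker = ActivityTracker()
tracker.mark_processed("page-1", "worker-v1", "model-v3", "config-a")

# On retry, page-1 is skipped for the same combination, page-2 is still processed.
print(tracker.needs_processing("page-1", "worker-v1", "model-v3", "config-a"))
print(tracker.needs_processing("page-2", "worker-v1", "model-v3", "config-a"))
```

Changing any part of the combination, for example the model version, makes the element eligible for processing again in this sketch.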
Worker activity states¶
Through worker activity monitoring, elements can be assigned four different processing states:
- queued: the element is waiting to be processed.
- started: the element is currently being processed.
- processed: the element has been successfully processed.
- error: something went wrong when processing the element.
If everything goes well, an element in a process goes from queued to started to processed. If something goes wrong during its processing, its final state will be error instead. However, failed elements can be processed again when the worker encountered errors or some tasks timed out, so other worker activity state transitions are possible, as shown in the graph below.
stateDiagram
state queued
queued --> started
error --> started
error --> queued
started --> error
started --> started
started --> processed
The transition from started to started activity state is only possible in the case of elements stuck in the started state for which processing is retried after the worker activity timeout has expired.
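The transitions shown in the state diagram can be written down as a small lookup table. This is only a sketch of the diagram in Python; the `ALLOWED_TRANSITIONS` and `can_transition` names are hypothetical and do not come from Arkindex.

```python
# Allowed worker activity state transitions, copied from the state diagram.
ALLOWED_TRANSITIONS = {
    "queued": {"started"},
    "started": {"started", "processed", "error"},
    "error": {"started", "queued"},
    "processed": set(),  # no outgoing transition in the diagram
}


def can_transition(current: str, target: str) -> bool:
    """Return True if the diagram allows moving from `current` to `target`."""
    return target in ALLOWED_TRANSITIONS.get(current, set())


print(can_transition("queued", "started"))
print(can_transition("started", "started"))
print(can_transition("processed", "queued"))
```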
Process progress monitoring¶
Worker activities also allow you to track the progress of your processes: from the process status view, you can click on Worker activities in the Actions menu. This takes you to a page showing the progress of each worker run, with a count of the number of elements in each worker activity state, and an estimated completion time while the process is still running.
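The per-worker-run counts shown on that page amount to grouping activities by worker run and by state. The Python sketch below illustrates this under the assumption that activities are available as `(worker_run_id, state)` pairs; the `count_states` function and the sample identifiers are hypothetical, and the estimated completion time is left out because the way it is computed is not described here.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple

STATES = ("queued", "started", "processed", "error")


def count_states(activities: Iterable[Tuple[str, str]]) -> Dict[str, Dict[str, int]]:
    """Count elements in each worker activity state, for each worker run."""
    counters = defaultdict(Counter)
    for worker_run_id, state in activities:
        counters[worker_run_id][state] += 1
    # Report every state, including those with no element, for each worker run.
    return {
        run_id: {state: counter[state] for state in STATES}
        for run_id, counter in counters.items()
    }


activities = [
    ("run-a", "processed"),
    ("run-a", "processed"),
    ("run-a", "error"),
    ("run-b", "queued"),
]
for run_id, counts in count_states(activities).items():
    print(run_id, counts)
```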
Process retrying¶
If a worker encountered errors while processing some elements, these elements are assigned the error activity state. Some elements may also get stuck in the queued or started state. Elements in the started state can be processed again after the worker activity timeout expires. Elements in the queued state can be processed again when the process is retried.
If you Retry the process (the “retry” action is available both from the processes list and from the process status view), all elements except those in the processed worker activity state will be processed again, for each worker run. Elements in the started activity state will only be processed again if the worker activity timeout has expired. The same is true if you create a new process with the same worker runs on these elements.
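The retry rule can be summarised as a small decision function. The following Python sketch is an illustration only: `should_reprocess` is a hypothetical name, and the one-hour `ACTIVITY_TIMEOUT` is an arbitrary value picked for the example, not the actual worker activity timeout.

```python
from datetime import datetime, timedelta
from typing import Optional

# Arbitrary value for illustration; the real timeout depends on the instance.
ACTIVITY_TIMEOUT = timedelta(hours=1)


def should_reprocess(state: str, started_at: Optional[datetime], now: datetime,
                     timeout: timedelta = ACTIVITY_TIMEOUT) -> bool:
    """Decide whether a retried process handles an element again."""
    if state == "processed":
        # Successfully processed elements are never processed again.
        return False
    if state == "started":
        # Elements stuck in "started" are only retried once the timeout expired.
        return started_at is not None and now - started_at >= timeout
    # Elements in the "queued" or "error" state are processed again.
    return True


now = datetime(2024, 1, 1, 12, 0)
print(should_reprocess("error", None, now))
print(should_reprocess("started", now - timedelta(minutes=10), now))
print(should_reprocess("started", now - timedelta(hours=2), now))
```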
Two additional actions are available from the Worker activities monitoring page:
- The Select all failed elements button adds all the elements in an error state to your selection;
- The Create process button creates a new blank process from all the elements in an error state, which you can then configure and run.