This page is intended for system administrators, to document common tasks and tips needed to efficiently manage an Arkindex instance.
Arkindex requires the following S3-API buckets to store various files:
| Configuration | Default name | Description |
|---|---|---|
| `s3.staging_bucket` | `staging` | Staging area where user-uploaded files are stored before validation |
| `s3.export_bucket` | `export` | SQLite databases generated by the project export task |
| `s3.thumbnails_bucket` | `thumbnails` | Folder thumbnails generated by the thumbnails generation worker |
| `s3.training_bucket` | `training` | Machine Learning models |
| `s3.ponos_artifacts_bucket` | `ponos-artifacts` | Artifact files produced by processes |
| `s3.ponos_logs_bucket` | `ponos-logs` | Log files for all tasks of all processes |
| N/A | `iiif-cache` | (Optional) Bucket used by the Cantaloupe IIIF server to cache image renderings |
| N/A | `uploads` | (Optional) Bucket used by the Cantaloupe IIIF server to expose locally uploaded images |
You can create these buckets on MinIO or any other S3-API compatible service by using the MinIO Client (open-source):
# Log in to your provider by creating an alias
mc alias set arkindex-s3 <URL> <LOGIN> <PASSWORD>
# Create required buckets by prefixing them with the alias name
mc mb arkindex-s3/staging
mc mb arkindex-s3/export
...
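To avoid typing one `mc mb` command per bucket, the creation step can be scripted. The following is a minimal sketch, not an official Arkindex tool: it assumes the `arkindex-s3` alias from the example above and the default bucket names from the table, and the `make_buckets` function and `APPLY` variable are hypothetical names introduced for illustration. By default it only prints the commands it would run; set `APPLY=1` to execute them.

```shell
#!/bin/sh
# Alias created beforehand with `mc alias set` (name taken from the example above).
ALIAS="${ALIAS:-arkindex-s3}"

# Default bucket names from the table; adjust them if your configuration overrides them.
REQUIRED_BUCKETS="staging export thumbnails training ponos-artifacts ponos-logs"
# Optional buckets, only needed when Cantaloupe uses S3 for its cache and uploads.
OPTIONAL_BUCKETS="iiif-cache uploads"

# Hypothetical helper: print one `mc mb` command per bucket,
# and actually run it only when APPLY=1.
make_buckets() {
  for bucket in "$@"; do
    if [ "${APPLY:-0}" = "1" ]; then
      # --ignore-existing makes the command succeed when the bucket already exists
      mc mb --ignore-existing "$ALIAS/$bucket"
    else
      echo "mc mb --ignore-existing $ALIAS/$bucket"
    fi
  done
}

# Word splitting on the list is intentional: one argument per bucket name.
make_buckets $REQUIRED_BUCKETS
# Uncomment to also handle the optional Cantaloupe buckets:
# make_buckets $OPTIONAL_BUCKETS
```

Running the script once in its default preview mode lets you review the target names before anything is created on the storage provider.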
As an administrator, you can use the administration interface on your Arkindex instance, available at https://<INSTANCE_URL>/admin/ (for example, if your instance lives at ark.localhost, the interface is available at https://ark.localhost/admin/).
Beware of your actions on this interface: you can delete most items in the Arkindex database, without any rights checks. You are an administrator: with great power comes great responsibility.
By clicking on the Users > Users link from the main page, you'll reach the user management. From there you can:
By clicking on the Documents > Corpora link from the main page, you'll reach the project management (a project is internally named a corpus). From there you can:
The asynchronous tasks page is not linked from the main administration page and is only reachable through the direct link https://<INSTANCE_URL>/rq/
From that page, you can view all the asynchronous queues, current tasks and active workers.
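Since the `/rq/` page is not linked anywhere, it can help to derive both administration URLs from the instance host and check that they respond. The sketch below assumes `curl` is installed and that the instance is served over HTTPS; `admin_urls` and `check_admin_pages` are hypothetical helper names, not part of Arkindex. Without an authenticated session the server will probably answer with a redirect to a login page rather than `200`, which is an assumption about the deployment and not something this page documents.

```shell
#!/bin/sh
# Hypothetical helper: print the two administration URLs for a given instance host.
admin_urls() {
  host="$1"
  echo "https://$host/admin/"
  echo "https://$host/rq/"
}

# Hypothetical helper: print the HTTP status code returned by each page.
check_admin_pages() {
  for url in $(admin_urls "$1"); do
    # -s silences progress output, -o discards the body, -w prints only the status code
    status="$(curl -s -o /dev/null -w '%{http_code}' "$url")"
    echo "$url -> $status"
  done
}

# Usage (replace the host with your own instance):
# check_admin_pages ark.localhost
```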