This repo contains prototyping work for creating an OPTIMADE API for searching and accessing structures from the Inorganic Crystal Structure Database (ICSD).
The structures are accessed via the ICSD REST API and cast to the OPTIMADE format; the optimade-maker and optimade-python-tools are then used to launch a local OPTIMADE API.
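As a rough illustration of what "casting to the OPTIMADE format" involves, the sketch below builds a subset of the standard OPTIMADE structure attributes from a plain list of sites. The function name `to_optimade_structure` and its inputs are hypothetical and are not taken from this package. A complete entry would need further fields required by the OPTIMADE specification, such as `species` and `dimension_types`.

```python
from collections import Counter
from functools import reduce
from math import gcd


def to_optimade_structure(entry_id, species_at_sites, cartesian_site_positions, lattice_vectors):
    """Build a partial OPTIMADE-style structure entry (hypothetical helper)."""
    counts = Counter(species_at_sites)
    # OPTIMADE expects element symbols in alphabetical order.
    elements = sorted(counts)
    nsites = len(species_at_sites)
    # Reduce the composition by the greatest common divisor of the counts.
    divisor = reduce(gcd, counts.values())
    formula = ""
    for el in elements:
        n = counts[el] // divisor
        formula += el if n == 1 else f"{el}{n}"
    return {
        "id": entry_id,
        "type": "structures",
        "attributes": {
            "elements": elements,
            "nelements": len(elements),
            "elements_ratios": [counts[el] / nsites for el in elements],
            "chemical_formula_reduced": formula,
            "nsites": nsites,
            "species_at_sites": list(species_at_sites),
            "cartesian_site_positions": cartesian_site_positions,
            "lattice_vectors": lattice_vectors,
        },
    }
```

The alphabetical ordering and the reduced formula without explicit "1" coefficients follow the OPTIMADE definitions of `elements` and `chemical_formula_reduced`. How the actual package maps ICSD records onto these fields is not shown here.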
After cloning this repository and creating a virtual environment (uv is the current recommendation), this package can be installed with
or
Important
Any attempts to use ICSD data will additionally require a valid ICSD license with login details provided at runtime.
The ICSD can be ingested into the OPTIMADE format using the icsd-ingest entrypoint:
This will use multiple processes (controlled by --num-processes) to ingest the local copy of the ICSD in chunks of size --chunk-size until the target --num-structures has been reached (defaults to the entire ICSD). Each batch is written to its own OPTIMADE JSONLines file; on completion, the batches are combined into a single JSONLines file named <--run-name>-optimade.jsonl.
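The chunk-then-combine pattern described above can be sketched roughly as follows. This is not the package's implementation: the function names (`chunk_ranges`, `write_batch`, `combine_batches`, `ingest`) and the batch file naming are hypothetical, and the entries written are placeholders, not real OPTIMADE JSONLines content. The `executor` argument is any `concurrent.futures` executor, so a `ProcessPoolExecutor(max_workers=...)` would play the role of --num-processes.

```python
import json
from functools import partial
from pathlib import Path


def chunk_ranges(num_structures, chunk_size):
    """Split indices 0..num_structures-1 into consecutive chunks."""
    return [
        range(start, min(start + chunk_size, num_structures))
        for start in range(0, num_structures, chunk_size)
    ]


def write_batch(indices, out_dir):
    """Write one placeholder entry per index to a JSON Lines batch file."""
    path = Path(out_dir) / f"batch-{indices.start:08d}.jsonl"
    with path.open("w") as f:
        for i in indices:
            # Placeholder record; a real worker would convert a structure here.
            f.write(json.dumps({"id": str(i), "type": "structures"}) + "\n")
    return path


def combine_batches(batch_paths, combined_path):
    """Concatenate batch files in the given order; return the line count."""
    count = 0
    with Path(combined_path).open("w") as out:
        for path in batch_paths:
            with Path(path).open() as f:
                for line in f:
                    out.write(line)
                    count += 1
    return count


def ingest(num_structures, chunk_size, out_dir, run_name, executor):
    """Process each chunk on the executor, then merge the batch files."""
    worker = partial(write_batch, out_dir=out_dir)
    # executor.map returns results in submission order, so batches stay sorted.
    batches = list(executor.map(worker, chunk_ranges(num_structures, chunk_size)))
    combined = Path(out_dir) / f"{run_name}-optimade.jsonl"
    combine_batches(batches, combined)
    return combined
```

Writing each batch to its own file means workers never contend for a shared output handle, and a failed run leaves the completed batches on disk for inspection.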
The icsd-serve entrypoint is a thin wrapper around the optimade-maker tool that bundles the configuration required to launch a local OPTIMADE API. By default it uses a simple in-memory database; if --mongo-uri is provided, a real MongoDB backend is used instead. Just provide the path to your combined OPTIMADE JSONLines file:
You should now be able to try out some queries locally, either in the browser or with a tool like curl:
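As an alternative to curl, a query can also be issued from Python using only the standard library. The sketch below assumes the API is listening at `http://localhost:5000`; that address is an assumption, so substitute whatever host and port your server reports at startup. The `/v1/structures` endpoint and the `filter` and `page_limit` query parameters come from the OPTIMADE specification, while the helper names `structures_url` and `fetch_structure_ids` are illustrative only.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen


def structures_url(base_url, optimade_filter, page_limit=5):
    """Build a URL for the OPTIMADE structures endpoint with an encoded filter."""
    query = urlencode(
        {"filter": optimade_filter, "page_limit": page_limit},
        quote_via=quote,  # encode spaces as %20 instead of '+'
    )
    return f"{base_url.rstrip('/')}/v1/structures?{query}"


def fetch_structure_ids(base_url, optimade_filter, page_limit=5):
    """Query a running OPTIMADE API and return the IDs of the matching entries."""
    with urlopen(structures_url(base_url, optimade_filter, page_limit)) as response:
        payload = json.load(response)
    return [entry["id"] for entry in payload["data"]]
```

With a server running, `fetch_structure_ids("http://localhost:5000", 'elements HAS "Si"')` would return the IDs from the first page of structures containing silicon.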
For ease of deployment, a containerised version of the ingestion pipeline is available.
Important
You should verify that your license agreement allows for any kind of deployment outside of your private network; it likely does not.
For development, you may prefer to use the bake definitions in docker-bake.hcl to build and tag the relevant build stages.
All development of this package (bug reports, suggestions, feedback and pull requests) occurs in the csd-optimade GitHub repository. Contribution guidelines and tips for getting help can be found in the contributing notes.
This project was developed by datalab industries ltd., on behalf of the UK's Physical Sciences Data Infrastructure (PSDI).