This repository contains a baseline version of the AMR Protein Explorer sample application. It consists of a distributed in-memory data grid (Hazelcast), a Java SE data ingestion job, and a MicroProfile API service, all deployed on Kubernetes.
This baseline version focuses on a minimal deployment for testing in Minikube. It does not include any optimisations in infrastructure or application code.
This project is inspired by the EMBL-EBI course Using UniProt to explore proteins involved in AMR.
Antimicrobial resistance (AMR) develops when disease-causing microorganisms are no longer affected by medicines such as antibiotics, antivirals, antifungals, and antiparasitics. Gene mutations or acquisitions, as well as the intrinsic resistance of some microorganisms, are expressed as proteins with characteristics that confer resistance.
UniProt is a curated database of proteins. This resource can be used to analyse data on proteins linked to antimicrobial resistance.
Retrieve a small sample (100 entries) of AMR-linked entries from UniProt and store them in an in-memory data grid with a one-off data import job. Then implement an API to query entries by ID and by keyword (keyword search is a work in progress).
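As a rough illustration of the retrieval step, the sketch below builds a UniProtKB search URL for the public REST API at rest.uniprot.org and shows how a 100-entry JSON sample could be downloaded. The helper name, the query `keyword:KW-0046` (the UniProt keyword "Antibiotic resistance"), and the output file name are assumptions for illustration; the ingestion job in this repository may select its entries differently.

```shell
# Print a UniProtKB search URL that returns JSON.
# $1: query string, $2: number of entries to return (default 100)
uniprot_search_url() {
  printf 'https://rest.uniprot.org/uniprotkb/search?query=%s&format=json&size=%s\n' \
    "$1" "${2:-100}"
}

# Usage (requires network access; query and file name are assumptions):
#   curl -sS "$(uniprot_search_url 'keyword:KW-0046' 100)" -o amr-sample.json
```

The query is passed through unencoded, so this helper only suits simple queries without spaces or reserved characters.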
This sample application consists of:
- A 3-node Hazelcast cluster, deployed as a Kubernetes StatefulSet
  - Distributed in-memory key/value data store and stream processor
  - Stores UniProt entries related to AMR
  - Provides access for APIs and processing pipelines
  - User Code Deployment enabled
  - Requires Jackson 3.x libraries, installed via an initContainer
- A Kubernetes Job that loads AMR-related UniProt JSON data into Hazelcast
  - Uses the Hazelcast discovery mechanism
- A Java MicroProfile API running on Open Liberty
  - Exposes Hazelcast-managed data
  - Packaged as a Kubernetes Deployment and exposed as a ClusterIP Service
  - Ingress configured to route external traffic (api.local -> api-service:9080)
k8s/manifests/
├── api-ingress.yaml
├── api-service.yaml
├── hz-hazelcast.yaml
├── ingestor-job.yaml
└── pv.yaml
The Hazelcast nodes mount a PersistentVolume at /opt/hazelcast/bin/user-lib, which is added to the CLASSPATH automatically.
The Jackson 3.x libraries required for processing JSON data are retrieved by an initContainer running BusyBox.
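A minimal sketch of what such an initContainer command could look like, assuming the JARs are fetched from Maven Central with BusyBox wget and that the volume is mounted at the same path inside the initContainer. The helper name, the version 3.0.0, and the artifact list are assumptions; hz-hazelcast.yaml remains the source of truth. Jackson 3.x publishes jackson-core and jackson-databind under the group ID tools.jackson.core.

```shell
# Print the Maven Central download URL for a JAR.
# $1: groupId, $2: artifactId, $3: version
maven_jar_url() {
  # Maven repositories lay out group IDs as directories: a.b.c -> a/b/c
  group_path=$(printf '%s' "$1" | tr '.' '/')
  printf 'https://repo1.maven.org/maven2/%s/%s/%s/%s-%s.jar\n' \
    "$group_path" "$2" "$3" "$2" "$3"
}

# Usage inside the initContainer (version and artifacts are assumptions;
# needs network access and a wget build with HTTPS support):
#   for artifact in jackson-core jackson-databind; do
#     wget -P /opt/hazelcast/bin/user-lib \
#       "$(maven_jar_url tools.jackson.core "$artifact" 3.0.0)"
#   done
```

jackson-databind also depends on jackson-annotations, which is published under a different group ID, so a real download list would probably need that JAR as well.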
A Helm chart definition is available under k8s/chart (in development).
- Minikube (version 1.37.0, tested with the vfkit driver on macOS 13)
- Java 21
- Maven 3
- Docker CLI (engine not required)
Build Java artifacts
mvn clean package
Build images
eval $(minikube -p minikube docker-env)
docker build -t amr-ingestor:1.0 -f ingestor-service/Containerfile ingestor-service/
docker build -t amr-api:1.0 -f api-service/Containerfile api-service/
Examples use kubectl from Minikube
1. Create storage
minikube kubectl -- apply -f k8s/manifests/pv.yaml
2. Apply Hazelcast RBAC rules
minikube kubectl -- apply -f https://raw.githubusercontent.com/hazelcast/hazelcast/master/kubernetes-rbac.yaml
3. Deploy Hazelcast cluster
minikube kubectl -- apply -f k8s/manifests/hz-hazelcast.yaml
Check logs
minikube kubectl -- logs hz-hazelcast-0
4. Run ingestion job
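Before applying the ingestion Job, it helps to wait until all Hazelcast members are ready, since the Job needs a cluster to connect to. A minimal sketch, assuming the StatefulSet is named hz-hazelcast (inferred from the pod name hz-hazelcast-0); the helper name and default timeout are illustrative.

```shell
# Block until the Hazelcast StatefulSet has rolled out, or the timeout expires.
# Assumption: the StatefulSet is named hz-hazelcast.
# $1: optional timeout (default 180s)
wait_for_hazelcast() {
  minikube kubectl -- rollout status statefulset/hz-hazelcast --timeout="${1:-180s}"
}
```

For example, `wait_for_hazelcast 300s` returns a non-zero exit status if the rollout has not finished within five minutes, which makes it easy to stop a deployment script early.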
minikube kubectl -- apply -f k8s/manifests/ingestor-job.yaml
5. Deploy API service
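The API only has data to serve once the ingestion Job has finished, so it can be worth confirming completion before deploying it. A hedged sketch: the helper name and default timeout are illustrative, and the Job name is a parameter because it is defined in ingestor-job.yaml rather than in this document.

```shell
# Block until a Job reports the Complete condition, then print its logs.
# $1: Job name (metadata.name in ingestor-job.yaml)
# $2: optional timeout (default 120s)
wait_for_job() {
  minikube kubectl -- wait --for=condition=complete "job/$1" --timeout="${2:-120s}" &&
    minikube kubectl -- logs "job/$1"
}
```

If the Job fails or does not complete within the timeout, the wait command exits non-zero and the logs are not printed.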
minikube kubectl -- apply -f k8s/manifests/api-service.yaml
6. Expose API via Ingress
minikube kubectl -- apply -f k8s/manifests/api-ingress.yaml
echo "$(minikube ip) api.local" | sudo tee -a /etc/hosts
7. Call the API
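The last path segment of the request is a UniProt accession such as P0DX93. As a sketch, a small wrapper can reject malformed IDs before any request is sent. The function names are hypothetical, and the pattern follows the accession format published by UniProt (6 or 10 uppercase alphanumeric characters).

```shell
# Return success if $1 looks like a UniProtKB accession.
is_uniprot_accession() {
  printf '%s\n' "$1" | LC_ALL=C grep -Eq \
    '^([OPQ][0-9][A-Z0-9]{3}[0-9]|[A-NR-Z][0-9]([A-Z][A-Z0-9]{2}[0-9]){1,2})$'
}

# Fetch one protein by accession; malformed IDs fail without calling the API.
get_protein() {
  if ! is_uniprot_accession "$1"; then
    echo "not a UniProt accession: $1" >&2
    return 1
  fi
  curl -sS -H "Accept: application/json" "http://api.local/amr/proteins/$1"
}
```

With these definitions, `get_protein P0DX93` issues the same request as the plain curl command.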
curl -H "Accept: application/json" http://api.local/amr/proteins/P0DX93
Please file an issue to suggest new features or improvements.
Apache 2.0