feat: fix Prometheus metrics + add health probes and log aggregation (observability) #377

@NotYuSheng

Description

The observability stack is incomplete and partly broken. Identified in the production readiness assessment (#366, docs/production-readiness.md, findings P1-1, P1-2, P1-6).

1. Prometheus metrics do not work

application.yml:81-93 exposes a prometheus actuator endpoint and enables the Prometheus registry, but backend/pom.xml declares only spring-boot-starter-actuator. The micrometer-registry-prometheus dependency is missing, so /actuator/prometheus returns 404 and nothing can be scraped. In addition, nginx does not proxy /actuator (nginx/nginx.conf.template), so actuator is reachable only on the backend container's :8080.

2. No health probes

The backend and nginx containers define no Docker healthcheck (only PostgreSQL and MinIO do), and there are no liveness/readiness probes. Because management.endpoint.health.show-details is set to when-authorized and no security is configured, health details are never shown.
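
A minimal sketch of what Compose healthchecks for the two containers might look like. The service names `backend` and `nginx`, the timing values, the nginx listen port, and the availability of `curl` and `wget` inside the images are all assumptions, not taken from the repository.

```yaml
# docker-compose.yml fragment (hypothetical service names and timings)
services:
  backend:
    healthcheck:
      # Assumes curl is installed in the backend image.
      # curl -f exits non-zero on HTTP 4xx/5xx, so a DOWN status (503) fails the check.
      test: ["CMD", "curl", "-f", "http://127.0.0.1:8080/actuator/health"]
      interval: 30s
      timeout: 5s
      retries: 3
      start_period: 60s
  nginx:
    healthcheck:
      # Assumes the image ships wget and nginx listens on port 80.
      test: ["CMD", "wget", "-q", "--spider", "http://127.0.0.1:80/"]
      interval: 30s
      timeout: 5s
      retries: 3
```

If the images do not include these tools, the check command would need to be swapped for whatever is available in them.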

3. No log aggregation

All services log to stdout only, where Docker captures the output. The prod profile would write a rotating log file to /var/log/tracepcap/, but no volume is mounted for it and the profile is inactive (see #366 P0-3).
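
If the file-logging path is kept rather than dropped, mounting a volume for it could look roughly like the fragment below. Only the /var/log/tracepcap/ path comes from the description above; the service name `backend` and the volume name `backend-logs` are hypothetical.

```yaml
# docker-compose.yml fragment (hypothetical service and volume names)
services:
  backend:
    volumes:
      # Persist the rotating log file the prod profile would write
      - backend-logs:/var/log/tracepcap

volumes:
  backend-logs:
```

This has an effect only once the prod profile is actually active, which is tracked separately in #366 P0-3.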

Acceptance Criteria

  • Add micrometer-registry-prometheus; /actuator/prometheus returns metrics.
  • Expose/scrape metrics in the target deployment (Kubernetes ServiceMonitor).
  • Add liveness/readiness probes hitting /actuator/health for backend (and a healthcheck for nginx).
  • Ship container stdout to a log aggregation stack (e.g. Loki/ELK); drop the unmounted file-logging path or mount a volume for it.
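
The first three criteria could be sketched for Kubernetes as follows. This assumes the Prometheus Operator is installed (ServiceMonitor is its custom resource) and that a Service exposes the backend on a port named `http`. The resource names, labels, and timing values are hypothetical.

```yaml
# Container section of the backend Deployment (fragment, hypothetical names)
containers:
  - name: backend
    ports:
      - name: http
        containerPort: 8080
    livenessProbe:
      httpGet:
        path: /actuator/health
        port: http
      initialDelaySeconds: 30
      periodSeconds: 10
    readinessProbe:
      httpGet:
        path: /actuator/health
        port: http
      periodSeconds: 10
---
# Requires the Prometheus Operator CRDs
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: tracepcap-backend        # hypothetical
spec:
  selector:
    matchLabels:
      app: tracepcap-backend     # must match the labels on the backend Service
  endpoints:
    - port: http                 # name of the Service port, not the number
      path: /actuator/prometheus
      interval: 30s
```

Recent Spring Boot versions can also expose separate /actuator/health/liveness and /actuator/health/readiness groups when management.endpoint.health.probes.enabled is true. Pointing each probe at its own group keeps a failing downstream dependency from triggering container restarts.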

Labels: devops (Deployment and operations)
