feat: expose Prometheus metrics endpoint #418
Open
egwujiohaifesinachiperpetual-max wants to merge 1 commit into
Hey @egwujiohaifesinachiperpetual-max! 👋 It looks like this PR isn't linked to any issue. If this PR is for one of the issues assigned to you as part of a Wave, please link it to ensure your contribution is tracked properly. You can do this by adding a keyword to the PR description (e.g.,
This PR relates to #349.
Implements a secure, production-ready Prometheus /metrics endpoint that aggregates:

| Source | Exposed via |
| --- | --- |
| Job queue (depth, failures, retries) | BackgroundJobSystem.getMetrics() |
| DB pool health & slow-query stats | getDBHealthMetrics(pool) |
| Horizon listener lag | checkListenerLag (exposed as a gauge) |

The endpoint is protected by the existing admin guard and a new rate limiter, and ships with tests (>95% coverage of the new files) and documentation.
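To make the output format concrete, here is a minimal sketch of how gauge values from the three sources could be rendered in the Prometheus text exposition format. The PR itself uses prom-client for this; the sketch hand-rolls the formatting so it stays dependency-free. The `GaugeSample` shape, the `renderGauges` helper, and the metric names are all hypothetical and do not come from the PR.

```typescript
// Hypothetical shape for one gauge sample; the PR's real types are not shown.
interface GaugeSample {
  metric: string; // metric name, e.g. "job_queue_depth" (assumed name)
  help: string; // one-line description, assumed to contain no newlines
  value: number;
  labels?: Record<string, string>;
}

// Escape label values as the text format requires: backslash, double quote, newline.
function escapeLabelValue(v: string): string {
  return v.replace(/\\/g, "\\\\").replace(/"/g, '\\"').replace(/\n/g, "\\n");
}

// Render samples as exposition text. HELP/TYPE are emitted once per metric name,
// so samples that share a name should be adjacent in the input.
function renderGauges(samples: GaugeSample[]): string {
  const seen = new Set<string>();
  const lines: string[] = [];
  for (const s of samples) {
    if (!seen.has(s.metric)) {
      seen.add(s.metric);
      lines.push(`# HELP ${s.metric} ${s.help}`);
      lines.push(`# TYPE ${s.metric} gauge`);
    }
    const labels = s.labels || {};
    const pairs = Object.keys(labels)
      .map((k) => `${k}="${escapeLabelValue(labels[k])}"`)
      .join(",");
    const labelPart = pairs ? `{${pairs}}` : "";
    lines.push(`${s.metric}${labelPart} ${s.value}`);
  }
  return lines.join("\n") + "\n";
}

// Example with made-up values for the three metric groups.
const exampleExposition = renderGauges([
  { metric: "job_queue_depth", help: "Jobs waiting in the queue", value: 12 },
  { metric: "db_pool_connections", help: "DB pool connections", value: 3, labels: { state: "idle" } },
  { metric: "horizon_listener_lag_seconds", help: "Horizon listener lag", value: 1.5 },
]);
```

In the actual route, prom-client's registry would produce this text, so a helper like this is only useful for understanding what a scrape returns.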
Motivation
Operational visibility – enables Prometheus to scrape key internal signals in a single endpoint.
Consolidates otherwise scattered metrics (src/jobs/system.ts, src/services/dbMetrics.ts, src/services/monitor.ts).
Improves monitoring, alerting, and capacity planning for the job processor, DB pool, and Horizon listener.
Major Changes

| Category | Files Modified / Added |
| --- | --- |
| New route | src/routes/metrics.ts – defines an Express router that collects the three metric groups, registers them with prom-client, and serves the Prometheus exposition format. |
| Tests | src/tests/metrics.test.ts – validates authentication, rate limiting, and correct Prometheus output (regex-based). |
| Rate limiting | src/middleware/rateLimiter.ts – new metricsRateLimiter (10 req/min per IP). |
| App wiring | src/app.ts – imports metricsRouter, requireAdmin, and metricsRateLimiter; mounts app.use('/api/metrics', requireAdmin, metricsRateLimiter, metricsRouter). |
| Dependency | package.json – adds "prom-client": "^15.0.0" (the de facto Prometheus client for Node). |
| Documentation | README.md (or docs) – adds a "Prometheus Metrics" section describing the endpoint, required auth, and example curl usage. |
| Minor | Added import for requireAdmin (already present) and ensured the new rate-limiter file is exported if needed. |

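The description does not show how metricsRateLimiter is implemented. As an illustration of the stated policy (10 requests per minute per IP), the following is a dependency-free fixed-window sketch. The `FixedWindowLimiter` class is hypothetical and is not the PR's code; the real limiter may use a library or a different algorithm.

```typescript
// Hypothetical fixed-window limiter; NOT the PR's metricsRateLimiter.
class FixedWindowLimiter {
  // Per-key window state. This sketch never evicts old keys.
  private windows = new Map<string, { start: number; count: number }>();

  constructor(private limit: number, private windowMs: number) {}

  // Returns true if the request identified by `key` (e.g. the client IP)
  // is allowed at time `now` (milliseconds).
  allow(key: string, now: number = Date.now()): boolean {
    const w = this.windows.get(key);
    if (!w || now - w.start >= this.windowMs) {
      // First request for this key, or the previous window has expired.
      this.windows.set(key, { start: now, count: 1 });
      return true;
    }
    if (w.count < this.limit) {
      w.count += 1;
      return true;
    }
    return false; // a caller would typically respond with HTTP 429 here
  }
}

// 10 requests per 60 seconds, matching the limit stated in the table above.
const metricsLimiterSketch = new FixedWindowLimiter(10, 60 * 1000);
```

A fixed window is the simplest option but allows short bursts around window boundaries; a sliding window or token bucket would smooth that out at the cost of more state.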
New Files
src/routes/metrics.ts – metric collection & exposition.
src/tests/metrics.test.ts – unit & integration tests for the endpoint.
src/middleware/rateLimiter.ts – reusable rate‑limiter configuration (exported metricsRateLimiter).
Modified Files
src/app.ts – registers the /api/metrics route with admin guard and rate‑limiter.
package.json – adds prom-client dependency.
src/middleware/rbac.js – (if needed) ensures requireAdmin is exported.
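The mount order in src/app.ts (requireAdmin, then metricsRateLimiter, then metricsRouter) matters: a request rejected by the admin guard never reaches the rate limiter. The sketch below models that ordering with minimal stand-in types instead of Express so it runs without dependencies. All names in it (`SketchReq`, `runChain`, `adminGuardSketch`, and so on) are hypothetical, and the `role` field is an assumption about how admin status might be represented.

```typescript
// Minimal stand-ins for Express-style request/response; for illustration only.
interface SketchReq { role: string; ip: string }
interface SketchRes { statusCode: number; body: string }
type SketchMiddleware = (req: SketchReq, res: SketchRes, next: () => void) => void;

// Run middleware in order; each one decides whether to call next().
function runChain(chain: SketchMiddleware[], req: SketchReq): SketchRes {
  const res: SketchRes = { statusCode: 200, body: "" };
  const step = (i: number): void => {
    const mw = chain[i];
    if (mw) mw(req, res, () => step(i + 1));
  };
  step(0);
  return res;
}

let limiterCalls = 0; // counts how often the limiter stage is reached

// Stand-in for requireAdmin: stop the chain with 403 for non-admins.
const adminGuardSketch: SketchMiddleware = (req, res, next) => {
  if (req.role !== "admin") {
    res.statusCode = 403;
    res.body = "Forbidden";
    return;
  }
  next();
};

// Stand-in for metricsRateLimiter: only records that it was reached.
const limiterSketch: SketchMiddleware = (_req, _res, next) => {
  limiterCalls += 1;
  next();
};

// Stand-in for metricsRouter: writes a placeholder body.
const metricsHandlerSketch: SketchMiddleware = (_req, res) => {
  res.body = "# metrics output would go here\n";
};

const metricsChainSketch = [adminGuardSketch, limiterSketch, metricsHandlerSketch];
```

One consequence of this order is that rejected non-admin requests do not consume rate-limit budget, and also are not throttled by this limiter; if throttling unauthenticated traffic is a goal, the limiter would need to come first or be complemented elsewhere.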
Testing
Automated – npm test runs the new Jest test suite (metrics.test.ts) alongside the existing suite; coverage for the new files is >95 %.
Manual – curl -H "Authorization: Bearer <admin‑token>" http://localhost:3000/api/metrics returns plain‑text Prometheus metrics; non‑admin requests receive 403.
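The regex-based output validation mentioned for metrics.test.ts is not shown in the description. A simplified check along those lines might look like the sketch below. The `looksLikePrometheusText` helper and both patterns are assumptions, and they deliberately ignore parts of the format such as timestamps, `NaN`/`Inf` values, and label values that contain `}`.

```typescript
// Sample line: metric name, optional {labels}, a space, then a numeric value.
const SAMPLE_LINE = /^[a-zA-Z_:][a-zA-Z0-9_:]*(\{[^}]*\})? -?\d+(\.\d+)?([eE][+-]?\d+)?$/;
// Metadata line: "# HELP <name> <text>" or "# TYPE <name> <type>".
const COMMENT_LINE = /^# (HELP|TYPE) [a-zA-Z_:][a-zA-Z0-9_:]* .+$/;

// Returns true when every non-empty line is a sample or HELP/TYPE line.
function looksLikePrometheusText(body: string): boolean {
  const lines = body.split("\n").filter((l) => l.length > 0);
  return lines.length > 0 && lines.every((l) => SAMPLE_LINE.test(l) || COMMENT_LINE.test(l));
}
```

In a Jest test, this could be applied to the response body of an authenticated request, alongside assertions that non-admin requests receive 403.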
Documentation
Added a Prometheus Metrics subsection to the README with:
Endpoint URL and required admin JWT.
Sample curl request and example output.
Explanation of each metric group (queue, DB pool, listener lag).
Guidance on adding further custom metrics.
Release Notes (to be added to CHANGELOG)
New /api/metrics endpoint exposing Prometheus‑format metrics for job queue, DB pool health, and Horizon listener lag.
Security: Admin‑only access with rate limiting (10 req/min).
Dependencies: prom-client added.
Tests: >95% coverage of the new files, covering authentication, rate limiting, and output format.
Final Steps
Set upstream branch (before merge) – git push -u origin feat/prometheus-metrics.
Deploy (after merge) – Verify the endpoint is reachable behind the API gateway and Prometheus is scraping it.
Monitor (after merge) – Observe metric values in Grafana/Prometheus to confirm correct collection.