feat(serve): export Prometheus metrics via /metrics endpoint by ilyaters · Pull Request #553 · lightseekorg/tokenspeed

ilyaters · 2026-06-29T11:34:53Z

Problem

The ts serve orchestrator (SMG path) collected runtime metrics via EngineMetrics/RequestMetrics in the engine subprocess but never exposed them to users. The build_server() function in http_server.py lacked the add_prometheus_middleware() call, causing the /metrics endpoint on the control server (port + 1) to return 404 or empty responses.

This made --enable-metrics a no-op for the standard ts serve launch path — metrics were collected into PROMETHEUS_MULTIPROC_DIR but had no HTTP endpoint for scraping.

Solution

http_server.py: Extended build_server() and start() with an enable_metrics parameter. When enabled, mounts the /metrics Prometheus endpoint via add_prometheus_middleware() + enable_func_timer(). Added _metrics_route_mounted() helper for idempotent mounting.
serve_smg.py: Wired enable_metrics through run_smg() → _start_control_server() → build_server(). When enabled, calls set_prometheus_multiproc_dir() before spawning the engine subprocess, ensuring the engine and control server share the same multiprocess directory. This allows MultiProcessCollector to aggregate the engine's tokenspeed:* metrics. Added _engine_args_enable_metrics() to detect --enable-metrics in the engine argv and pass it through run_smg_from_args().
Tests: Added unit tests for _engine_args_enable_metrics() flag parsing (bare flag, =true/1/yes, =false/0, absent, mixed with other flags) and for /metrics route mounting behavior (absent by default, mounted when enabled, skipped when disabled).
Docs: Updated docs/configuration/server.md Observability section with a Prometheus Metrics subsection documenting the /metrics endpoint on the control port, --control-port parameter, and a prometheus.yml scrape config example.
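
The ordering requirement in the serve_smg.py item (set the multiprocess directory before spawning the engine) can be illustrated with a minimal stdlib-only sketch. It is not the PR's code: ensure_multiproc_dir() is a hypothetical stand-in for set_prometheus_multiproc_dir(), whose real behavior (directory choice, cleanup) may differ. The only point shown is that a child started after the variable is set inherits the same directory.

```python
import os
import subprocess
import sys
import tempfile


def ensure_multiproc_dir() -> str:
    """Point PROMETHEUS_MULTIPROC_DIR at a directory, creating one if unset.

    Hypothetical stand-in for set_prometheus_multiproc_dir(); the real
    helper may pick and clean up its directory differently.
    """
    path = os.environ.get("PROMETHEUS_MULTIPROC_DIR")
    if not path:
        path = tempfile.mkdtemp(prefix="prom_multiproc_")
        os.environ["PROMETHEUS_MULTIPROC_DIR"] = path
    return path


def child_multiproc_dir() -> str:
    """Spawn a child process and return the directory it inherited."""
    code = "import os; print(os.environ.get('PROMETHEUS_MULTIPROC_DIR', ''))"
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    parent_dir = ensure_multiproc_dir()
    # The child is spawned after the variable is set, so both processes
    # would write and read metric files in the same directory.
    print(parent_dir == child_multiproc_dir())
```

If the variable were set only after the child had started, the child would not see it, which is why the call has to precede the engine spawn.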

Verification

tokenspeed serve <model> --port 8000 --enable-metrics --tensor-parallel-size 1

# Runtime metrics (tokenspeed:*) available on control port = serve port + 1
curl http://localhost:8001/metrics

Expected output includes tokenspeed:* metrics for request latency, token throughput, and engine performance counters.
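
For a scripted version of the curl check, the following sketch lists the metric names found in a scrape. It assumes the standard Prometheus text exposition format and the control port 8001 from the command above; fetch_text() and metric_names() are illustrative helpers, not part of tokenspeed.

```python
import urllib.request


def fetch_text(url: str, timeout: float = 5.0) -> str:
    """Fetch a URL and return the response body decoded as UTF-8."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def metric_names(exposition_text: str, prefix: str) -> list[str]:
    """Return the sorted, unique sample names that start with prefix.

    Comment lines (# HELP, # TYPE) and blank lines are skipped.
    """
    names = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # A sample line looks like "name{labels} value" or "name value".
        name = line.split("{", 1)[0].split(" ", 1)[0]
        if name.startswith(prefix):
            names.add(name)
    return sorted(names)
```

With the server running, `metric_names(fetch_text("http://localhost:8001/metrics"), "tokenspeed")` should return a non-empty list; an empty list would suggest the engine's metrics are not reaching the control server.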

Breaking Changes

None. The /metrics endpoint remains disabled by default; existing behavior is preserved unless --enable-metrics is explicitly passed.

Related: # (add issue number if applicable)

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: da6f7b2840

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

chatgpt-codex-connector · 2026-06-29T11:38:19Z

+        if token == "--enable-metrics":
+            return True
+        if token.startswith("--enable-metrics="):
+            return token.split("=", 1)[1].lower() in ("1", "true", "yes")


Normalize metrics flag values before detection

When this helper is reached from run_smg_from_args, split_argv() has already normalized --enable-metrics=false or --enable-metrics=true into ['--enable-metrics', 'false'/'true'], so the bare-token branch returns True before the value can be inspected. That makes the newly documented/tested false form enable the control /metrics setup anyway, and the child engine then receives a stray value for a store_true flag. Either reject equals/value forms in the splitter or detect metrics from the raw argv before normalization.
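
The failure mode described above can be reproduced in isolation. In this sketch, normalize() is an assumed simplification of what split_argv() does to --flag=value tokens, and enable_metrics_requested() mirrors the logic in the quoted diff; neither is the project's actual code.

```python
def normalize(argv):
    """Rewrite '--flag=value' into ['--flag', 'value'] (assumed splitter behavior)."""
    out = []
    for token in argv:
        if token.startswith("--") and "=" in token:
            flag, value = token.split("=", 1)
            out.extend([flag, value])
        else:
            out.append(token)
    return out


def enable_metrics_requested(argv):
    """Detection logic from the diff: bare flag, or an =value form."""
    for token in argv:
        if token == "--enable-metrics":
            return True
        if token.startswith("--enable-metrics="):
            return token.split("=", 1)[1].lower() in ("1", "true", "yes")
    return False


if __name__ == "__main__":
    raw = ["--port", "8000", "--enable-metrics=false"]
    # On the raw argv the value is inspected and the flag is off.
    print(enable_metrics_requested(raw))             # False
    # After normalization the bare-token branch matches first.
    print(enable_metrics_requested(normalize(raw)))  # True
```

Running the same detector on the raw argv, before normalization, is one of the two remedies the review suggests.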

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 04d4099408

chatgpt-codex-connector · 2026-06-29T11:50:42Z

+    if enable_metrics:
+        from tokenspeed.runtime.utils.common import set_prometheus_multiproc_dir
+
+        set_prometheus_multiproc_dir()


Avoid leaking PROMETHEUS_MULTIPROC_DIR to smg

When ts serve --enable-metrics is used, this call mutates the parent os.environ before both children are launched; spawn_gateway() is invoked later without a custom env, so the SMG gateway inherits the same runtime/control multiprocess directory even though it has its own Prometheus exporter. In deployments scraping both endpoints, prometheus_client switches behavior based on this env var, so gateway metrics can be written into the runtime scrape directory or disappear from the gateway exporter; scope this env to the engine/control path or scrub it for the gateway subprocess.
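
One way to scope the variable, sketched with the standard library only: build a copy of the environment without PROMETHEUS_MULTIPROC_DIR and pass it to the subprocess that should not see it. env_without() and child_sees() are hypothetical helpers; how spawn_gateway() would accept a custom env is not shown in this PR and is left as an assumption.

```python
import os
import subprocess
import sys


def env_without(name, base=None):
    """Return a copy of the environment with one variable removed.

    The base mapping (os.environ by default) is left untouched.
    """
    env = dict(os.environ if base is None else base)
    env.pop(name, None)
    return env


def child_sees(name, env):
    """Spawn a child with the given env and return its value for name."""
    code = "import os, sys; print(os.environ.get(sys.argv[1], ''))"
    result = subprocess.run(
        [sys.executable, "-c", code, name],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    var = "PROMETHEUS_MULTIPROC_DIR"
    os.environ.setdefault(var, "/tmp/prom_multiproc_example")
    # The engine-like child inherits the variable; the gateway-like child does not.
    print(child_sees(var, dict(os.environ)) != "")
    print(child_sees(var, env_without(var)) == "")
```

Copying the environment instead of mutating os.environ keeps the engine and control server on the shared directory while the gateway's own exporter stays in single-process mode.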

ilyaters added 2 commits June 29, 2026 12:04

feat(metrics): mount /metrics endpoint on control server in ts serve …

6c56b92

…path The ts serve orchestrator (SMG path) collected runtime metrics via Signed-off-by: Ilya Tarasevichia <newilya8@gmail.com>

Merge branch 'feat/metrics-endpoint-serve' into main

da6f7b2

ilyaters requested a review from a team as a code owner June 29, 2026 11:34

chatgpt-codex-connector Bot reviewed Jun 29, 2026

fix(metrics): detect --enable-metrics from raw argv before split_argv…

04d4099

… normalization split_argv._normalize() rewrites --flag=value into ['--flag', 'value'], Signed-off-by: Ilya Tarasevichia <newilya8@gmail.com>

chatgpt-codex-connector Bot reviewed Jun 29, 2026

ilyaters wants to merge 3 commits into lightseekorg:main from ilyaters:main
