Refs #406: Add public discovery routes by ifanatics-media · Pull Request #552 · ramimbo/mergework

ifanatics-media · 2026-05-28T00:54:03Z

Summary

Add bounded public discovery routes for /robots.txt, /sitemap.xml, and /favicon.ico.
Link the base template to /favicon.ico so browsers stop probing a missing icon path.
Keep the sitemap conservative: stable public entry points only, using MERGEWORK_PUBLIC_BASE_URL rather than the request/test host.
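
For readers who want a concrete picture of the sitemap behavior described above, here is a minimal standalone sketch, not the code in app/main.py. It assumes the sitemap is built from a fixed list of paths joined onto the configured public base URL. The function name `build_sitemap` and the path list are hypothetical; only the names `PUBLIC_SITEMAP_PATHS` and `xml_escape` appear in the review walkthrough.

```python
from xml.sax.saxutils import escape as xml_escape

# Hypothetical path list for illustration; the PR's real list is not shown in this thread.
PUBLIC_SITEMAP_PATHS = ["/", "/docs", "/bounties", "/ledger", "/wallet", "/status"]


def build_sitemap(public_base_url: str, paths: list[str]) -> str:
    """Render a sitemap from the configured origin, never from the request host."""
    # Strip a trailing slash so "https://host/" + "/docs" cannot yield "//docs".
    base = public_base_url.rstrip("/")
    entries = "".join(
        f"  <url><loc>{xml_escape(base + path)}</loc></url>\n" for path in paths
    )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n'
        f"{entries}"
        "</urlset>\n"
    )


if __name__ == "__main__":
    print(build_sitemap("https://mrwk.example.test/", PUBLIC_SITEMAP_PATHS))
```

Passing the base URL in as an argument, rather than reading the request, is what keeps a test or proxy host out of the emitted `<loc>` values.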

Evidence

Live production smoke before the fix, using unauthenticated public requests only:

GET https://mrwk.ltclab.site/robots.txt -> HTTP 404 JSON {"detail":"Not Found"}
GET https://mrwk.ltclab.site/sitemap.xml -> HTTP 404 JSON {"detail":"Not Found"}
GET https://mrwk.ltclab.site/favicon.ico -> HTTP 404 JSON {"detail":"Not Found"}
API-host discovery routes also returned bounded JSON 404s.

This is a small browser/crawler polish fix: standard discovery URLs should either exist intentionally or fail closed. The current behavior is bounded, but it creates routine browser 404 noise and gives crawlers/agents no sitemap entry point for public docs, bounties, ledger, wallet, and status pages.

Live bounty preflight for #406 / internal bounty 66 showed status=open, awards_remaining=15, and no active attempts.

Validation

.\.venv\Scripts\python.exe -m pytest tests\test_api_mcp.py::test_public_discovery_routes_are_bounded_and_use_public_origin tests\test_api_mcp.py::test_head_requests_match_get_routes_without_body -q -> 2 passed
.\.venv\Scripts\python.exe -m pytest tests\test_api_mcp.py tests\test_hub.py -q -> 81 passed
.\.venv\Scripts\python.exe -m pytest -q -> 415 passed
.\.venv\Scripts\python.exe -m ruff check . -> passed
.\.venv\Scripts\python.exe -m ruff format --check . -> 79 files already formatted
.\.venv\Scripts\python.exe -m mypy app -> success
.\.venv\Scripts\python.exe scripts\docs_smoke.py -> docs smoke ok
git diff --check -> clean

No secrets, wallet material, private keys, tokens, cookies, OAuth state, private data, production mutation, price claims, liquidity claims, exchange claims, bridge promises, or private security details are included.

Summary by CodeRabbit

New Features
- Added /sitemap.xml, /robots.txt, and /favicon.ico endpoints to improve SEO and site discoverability
- Added favicon link in site header
- Discovery endpoints use a configured public base URL and normalize it to avoid malformed or double-slash URLs
Tests
- Added tests verifying discovery endpoints return correct content and headers, respond to HEAD with empty bodies, are excluded from the API schema, and respect base URL normalization

coderabbitai · 2026-05-28T00:54:12Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro Plus

Run ID: f82189fb-8e43-4643-8f81-160e248d3554

📥 Commits

Reviewing files that changed from the base of the PR and between 2a1de8a and d405c15.

📒 Files selected for processing (2)

app/main.py
tests/test_api_mcp.py

📝 Walkthrough

Adds three SEO discovery endpoints (/robots.txt, /sitemap.xml, /favicon.ico) to FastAPI with inline favicon content and sitemap configuration, links the favicon in the base template, and tests that all routes return expected content using the public base URL.

Changes

Public discovery routes

Layer / File(s)	Summary
SEO route implementation and constants `app/main.py`	Imports `xml_escape`, defines `PUBLIC_SITEMAP_PATHS` list and `FAVICON_SVG` content, and registers three non-schema routes that serve robots directives, dynamically generated XML sitemap, and SVG favicon using the configured public base URL.
Template favicon link `app/templates/base.html`	Adds `<link rel="icon">` tag referencing `/favicon.ico` with SVG image type in the template head.
Discovery route test coverage `tests/test_api_mcp.py`	Tests all three endpoints for correct status codes, content types, exact/partial body matches (including public URLs and absence of test-host), normalization of configured public base URL, omission from OpenAPI `paths`, and validates HEAD requests return empty bodies.

🚥 Pre-merge checks | ✅ 6

✅ Passed checks (6 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title 'Refs `#406`: Add public discovery routes' clearly names the changed surface (discovery routes) and is directly related to the main changeset.
Description check	✅ Passed	The description includes all required template sections with substantive content: Summary (features added), Evidence (production behavior and bounty preflight), and Test Evidence (all checks marked and detailed validation results provided).
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Mergework Public Artifact Hygiene	✅ Passed	No investment claims, price claims, cash-out/off-ramp claims, fabricated payouts, or private security details found in PR files (main.py, base.html, test_api_mcp.py) or description.
Bounty Pr Focus	✅ Passed	PR `#552` (Refs `#406`) correctly implements discovery routes with focused scope: three routes, one template change, two tests. No unrelated changes detected.

coderabbitai

Actionable comments posted: 1

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro Plus

Run ID: f12fc1fa-de83-4ee4-8420-87ea4d8ec7a8

📥 Commits

Reviewing files that changed from the base of the PR and between d8532d4 and 33f56be.

📒 Files selected for processing (3)

app/main.py
app/templates/base.html
tests/test_api_mcp.py

Baijack-star

Requesting changes for a narrow test-coverage gap before merge.

The route implementation itself looks correct in the slice I checked: /robots.txt, /sitemap.xml, and /favicon.ico return bounded 200 responses; the sitemap uses MERGEWORK_PUBLIC_BASE_URL rather than the test/request host; the base template links /favicon.ico; and the routes are currently absent from /openapi.json because they are registered with include_in_schema=False.

The missing piece is regression coverage for that last contract. Since these are public browser/crawler discovery routes and intentionally hidden from the API schema, please add assertions to test_public_discovery_routes_are_bounded_and_use_public_origin (or a focused companion test) that /robots.txt, /sitemap.xml, and /favicon.ico are not present in client.get("/openapi.json").json()["paths"]. This matches the CodeRabbit pre-merge warning and prevents a future refactor from accidentally exposing these non-API routes in OpenAPI.
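
One possible shape for that assertion, sketched without FastAPI so it runs on its own. The helper name `exposed_discovery_paths` and the sample schema are hypothetical; in the real test the schema would come from client.get("/openapi.json").json().

```python
DISCOVERY_PATHS = ("/robots.txt", "/sitemap.xml", "/favicon.ico")


def exposed_discovery_paths(openapi_schema: dict) -> list[str]:
    """Return the discovery paths that leaked into the schema's `paths` mapping."""
    paths = openapi_schema.get("paths", {})
    return [path for path in DISCOVERY_PATHS if path in paths]


# Stand-in for the parsed /openapi.json document; the API path is illustrative only.
sample_schema = {"openapi": "3.1.0", "paths": {"/api/example": {}}}

# The regression contract: none of the discovery routes appear in the schema.
assert exposed_discovery_paths(sample_schema) == []
```

Returning the offending paths, instead of a bare boolean, makes a failing assertion name the route that was exposed.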

Validation I ran on head 33f56be6bffbaa16110d3af5b9b7cec51c537a62:

focused discovery/HEAD tests -> 2 passed
tests/test_api_mcp.py tests/test_hub.py -> 81 passed
route smoke: all three discovery routes returned 200 with expected content types and / rendered the favicon link
OpenAPI probe: /robots.txt, /sitemap.xml, and /favicon.ico are currently absent from paths
scoped Ruff check/format on Python files passed
mypy app/main.py passed
docs smoke passed
git diff --check origin/main...HEAD clean

ifanatics-media · 2026-05-28T01:01:29Z

Addressed in 2a1de8a by extending test_public_discovery_routes_are_bounded_and_use_public_origin to assert /robots.txt, /sitemap.xml, and /favicon.ico stay absent from /openapi.json paths.

Validation after the update:

.\.venv\Scripts\python.exe -m pytest tests\test_api_mcp.py::test_public_discovery_routes_are_bounded_and_use_public_origin tests\test_api_mcp.py::test_head_requests_match_get_routes_without_body -q -> 2 passed
.\.venv\Scripts\python.exe -m pytest tests\test_api_mcp.py tests\test_hub.py -q -> 81 passed
.\.venv\Scripts\python.exe -m pytest -q -> 415 passed
.\.venv\Scripts\python.exe -m ruff check . -> passed
.\.venv\Scripts\python.exe -m ruff format --check . -> 79 files already formatted
.\.venv\Scripts\python.exe -m mypy app -> success
.\.venv\Scripts\python.exe scripts\docs_smoke.py -> docs smoke ok
git diff --check -> clean

yunrongy424-oss

Requesting changes for one small URL-normalization edge case.

/sitemap.xml already normalizes settings.public_base_url with rstrip("/"), but /robots.txt builds the sitemap URL from the raw setting. Deploy validation allows MERGEWORK_PUBLIC_BASE_URL with path /, so https://mrwk.example.test/ is a valid origin-style setting. With that setting, the current route returns:

Sitemap: https://mrwk.example.test//sitemap.xml

That is avoidable crawler/discovery noise and inconsistent with the sitemap route's normalized origin. Please reuse a stripped base URL in robots_txt() and add a regression assertion for the trailing-slash setting.
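
A minimal sketch of the requested normalization, assuming the robots body is assembled with plain string formatting. The function name `robots_txt_body` and the `User-agent` / `Allow` lines are assumptions for illustration; only the `Sitemap:` line is taken from the output quoted above.

```python
def robots_txt_body(public_base_url: str) -> str:
    """Build a robots.txt body whose Sitemap line uses a normalized origin."""
    # Same normalization the sitemap route applies: drop any trailing slash.
    base = public_base_url.rstrip("/")
    # The User-agent and Allow directives are placeholders, not the PR's actual content.
    return f"User-agent: *\nAllow: /\nSitemap: {base}/sitemap.xml\n"


if __name__ == "__main__":
    raw = "https://mrwk.example.test/"
    print(f"Sitemap: {raw}/sitemap.xml")  # unnormalized form with the doubled slash
    print(robots_txt_body(raw))  # normalized form
```

With this shape, a base URL with or without the trailing slash produces the same Sitemap line.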

Evidence on head 33f56be6bffbaa16110d3af5b9b7cec51c537a62:

focused discovery/HEAD tests -> 2 passed
scoped Ruff check/format on app/main.py and tests/test_api_mcp.py -> passed
mypy app/main.py -> passed
docs smoke -> ok
git diff --check origin/main...HEAD -> clean
extra probe with MERGEWORK_PUBLIC_BASE_URL=https://mrwk.example.test/ reproduced the doubled sitemap slash above

No secrets, wallet material, private deployment values, private vulnerability details, live mutation, price claims, liquidity claims, or off-ramp claims were used.

ifanatics-media · 2026-05-28T01:04:24Z

Addressed the trailing-slash base URL edge in d405c15 by normalizing the base URL in robots_txt() before appending /sitemap.xml, matching the sitemap route. I also added test_public_discovery_routes_normalize_public_base_url, which sets MERGEWORK_PUBLIC_BASE_URL=https://mrwk.example.test/ and verifies neither robots nor sitemap emits doubled slashes.

Validation after this update:

.\.venv\Scripts\python.exe -m pytest tests\test_api_mcp.py::test_public_discovery_routes_are_bounded_and_use_public_origin tests\test_api_mcp.py::test_public_discovery_routes_normalize_public_base_url tests\test_api_mcp.py::test_head_requests_match_get_routes_without_body -q -> 3 passed
.\.venv\Scripts\python.exe -m pytest tests\test_api_mcp.py tests\test_hub.py -q -> 82 passed
.\.venv\Scripts\python.exe -m pytest -q -> 416 passed
.\.venv\Scripts\python.exe -m ruff check . -> passed
.\.venv\Scripts\python.exe -m ruff format --check . -> 79 files already formatted
.\.venv\Scripts\python.exe -m mypy app -> success
.\.venv\Scripts\python.exe scripts\docs_smoke.py -> docs smoke ok
git diff --check -> clean

yunrongy424-oss

Re-reviewed current head d405c15ab50b962fa4c76346bab5da87ef8bf13d after the follow-up fix.

The trailing-slash edge I flagged is resolved: robots_txt() now normalizes settings.public_base_url before appending /sitemap.xml, and the new regression test covers MERGEWORK_PUBLIC_BASE_URL=https://mrwk.example.test/ without emitting doubled slashes in either robots or sitemap output.

Validation on the current head:

focused discovery/normalization/HEAD tests -> 3 passed
Ruff check/format on app/main.py and tests/test_api_mcp.py -> passed
docs smoke -> ok

No remaining blocker in my reviewed slice.

Baijack-star

Re-reviewed current head d405c15ab50b962fa4c76346bab5da87ef8bf13d after the two follow-up commits.

My previous blocker is resolved: test_public_discovery_routes_are_bounded_and_use_public_origin now asserts /robots.txt, /sitemap.xml, and /favicon.ico stay absent from /openapi.json, so these public browser/crawler routes remain outside the API schema contract.

I also checked the later trailing-slash fix from this head. robots_txt() now normalizes settings.public_base_url before appending /sitemap.xml, matching the sitemap route, and the new regression covers MERGEWORK_PUBLIC_BASE_URL=https://mrwk.example.test/ without doubled slashes in robots or sitemap output.

Validation run locally on current head:

PYTEST_DISABLE_PLUGIN_AUTOLOAD=1 ./.venv/bin/python -m pytest tests/test_api_mcp.py::test_public_discovery_routes_are_bounded_and_use_public_origin tests/test_api_mcp.py::test_public_discovery_routes_normalize_public_base_url tests/test_api_mcp.py::test_head_requests_match_get_routes_without_body -q -> 3 passed
PYTEST_DISABLE_PLUGIN_AUTOLOAD=1 ./.venv/bin/python -m pytest tests/test_api_mcp.py tests/test_hub.py -q -> 82 passed
PYTEST_DISABLE_PLUGIN_AUTOLOAD=1 ./.venv/bin/python scripts/docs_smoke.py -> docs smoke ok
./.venv/bin/python -m mypy app/main.py -> success
./.venv/bin/python -m ruff check app/main.py tests/test_api_mcp.py -> passed
./.venv/bin/python -m ruff format --check app/main.py tests/test_api_mcp.py -> already formatted
git diff --check origin/main...HEAD -> clean
ad hoc TestClient smoke with MERGEWORK_PUBLIC_BASE_URL=https://mrwk.example.test/ confirmed all three discovery routes return 200, no doubled sitemap URL is emitted, and all three remain absent from OpenAPI.

No remaining blocker in my reviewed slice.

tinyopsstudio · 2026-05-28T01:16:32Z

Reviewed PR #552 at d405c15ab50b962fa4c76346bab5da87ef8bf13d for the public discovery routes.

Evidence checked:

inspected app/main.py, app/templates/base.html, and tests/test_api_mcp.py;
confirmed /robots.txt, /sitemap.xml, and /favicon.ico are registered with include_in_schema=False, so they stay out of the OpenAPI contract;
confirmed robots and sitemap use settings.public_base_url.rstrip("/"), avoiding test-host leakage and doubled slashes when MERGEWORK_PUBLIC_BASE_URL has a trailing slash;
confirmed the base template points browsers at /favicon.ico, and the HEAD middleware leaves discovery-route bodies empty on HEAD requests.
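
The HEAD behavior in the last item can be pictured with a small ASGI wrapper. This is a hypothetical sketch written against the plain ASGI message format, not the middleware in app/main.py; the class name `HeadAsGetMiddleware`, the demo app, and the `call` helper are all invented for illustration.

```python
import asyncio


class HeadAsGetMiddleware:
    """Hypothetical wrapper: answer HEAD with the GET handler's headers and no body."""

    def __init__(self, app):
        self.app = app

    async def __call__(self, scope, receive, send):
        if scope["type"] != "http" or scope["method"] != "HEAD":
            await self.app(scope, receive, send)
            return

        async def send_without_body(message):
            # Keep the status and headers, but blank every body chunk.
            if message["type"] == "http.response.body":
                message = {**message, "body": b""}
            await send(message)

        # Route the request as GET so the same handler produces the headers.
        await self.app({**scope, "method": "GET"}, receive, send_without_body)


async def demo_app(scope, receive, send):
    """Stand-in for a discovery route that always returns a small text body."""
    headers = [(b"content-type", b"text/plain; charset=utf-8")]
    await send({"type": "http.response.start", "status": 200, "headers": headers})
    await send({"type": "http.response.body", "body": b"User-agent: *\n"})


def call(method: str) -> list[dict]:
    """Drive the wrapped app once and collect the ASGI messages it sends."""
    sent: list[dict] = []

    async def receive():
        return {"type": "http.request", "body": b"", "more_body": False}

    async def send(message):
        sent.append(message)

    scope = {"type": "http", "method": method, "path": "/robots.txt"}
    asyncio.run(HeadAsGetMiddleware(demo_app)(scope, receive, send))
    return sent
```

Calling `call("HEAD")` and `call("GET")` side by side shows matching status and headers, with only the GET response carrying a body.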

Validation:

PYTEST_DISABLE_PLUGIN_AUTOLOAD=1 uv run --extra dev python -m pytest tests/test_api_mcp.py::test_public_discovery_routes_are_bounded_and_use_public_origin tests/test_api_mcp.py::test_public_discovery_routes_normalize_public_base_url tests/test_api_mcp.py::test_head_requests_match_get_routes_without_body -q -> 3 passed
PYTEST_DISABLE_PLUGIN_AUTOLOAD=1 uv run --extra dev python -m pytest tests/test_api_mcp.py tests/test_hub.py -q -> 82 passed
PYTEST_DISABLE_PLUGIN_AUTOLOAD=1 uv run --extra dev python -m pytest -q -> 416 passed
uv run --extra dev ruff check app/main.py tests/test_api_mcp.py -> passed
uv run --extra dev ruff format --check app/main.py tests/test_api_mcp.py -> 2 files already formatted
git diff --check origin/main...HEAD -> clean

Assessment: no blocker found. The change is bounded to browser/crawler discovery surfaces and keeps the API schema unchanged.

Add public discovery routes

33f56be

ifanatics-media mentioned this pull request May 28, 2026

MRWK bounty: useful bug reports and small fixes, round 5 #406

Open

coderabbitai Bot reviewed May 28, 2026

Comment thread tests/test_api_mcp.py

Baijack-star suggested changes May 28, 2026

Baijack-star mentioned this pull request May 28, 2026

MRWK bounty: review open MergeWork PRs with evidence, round 12 #447

Open

Cover discovery routes outside OpenAPI

2a1de8a

yunrongy424-oss suggested changes May 28, 2026

Normalize discovery route public origin

d405c15

yunrongy424-oss approved these changes May 28, 2026

Baijack-star approved these changes May 28, 2026

ifanatics-media wants to merge 3 commits into ramimbo:main from ifanatics-media:codex/406-discovery-routes

4 participants
