feat(v1.6 polish): risk_flags + status emoji + benchmark refs + GEO foundations#20

Merged: howardpen9 merged 1 commit into main from chore/v1.6-risk-flags-emoji on Jun 10, 2026

@howardpen9

Owner

Closes out the v1.6 polish list with one focused PR. Three groups, all additive, no data-pipeline changes.

1. Safety visibility (your ask after reviewing PR #9 FerryAPI)

The safety net (unverified status, risks.md doc) was already there but invisible to readers scanning the table. Now:

  • Status emoji prefix on every Station/Service cell: 🟢 active + maintainer · 🟡 active + community-only OR unverified · 🔴 inactive
  • risk_flags field in providers.yaml with a tight enum:
    • operator_submitted (operator self-submitted) → · ⚠ operator-self
    • no_entity (no public company / ICP) → · ⚠ no-entity
    • reverse_channel (uses reverse-engineered web client) → · ⚠ reverse
    • prices_too_cheap (>50% below OR without canary) → · ⚠ cheap-trap
    • ran_away (exit-scammed) → · ⚠ ran-away
  • Bilingual short labels (en / zh-TW / zh-CN) so the Trust cell stays narrow
  • Annotated 4 demo entries: DMXAPI/MKEAI [no_entity], GPTGOD [reverse_channel + no_entity], CoderPlan [operator_submitted + no_entity], UnoRouter [operator_submitted]
  • Updated intro prose in all 3 READMEs with a 1-line legend
  • schema.md documents the full enum + emoji mapping
  • validate.py enforces the enum (typo on a flag → CI red)
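
The enum check described in the last bullet can be sketched roughly as follows. This is a minimal illustration, not the actual contents of validate.py: the function name `find_invalid_risk_flags` and the record shape (a list of dicts with `name` and `risk_flags` keys) are assumptions; only the five flag names come from this PR.

```python
# The five flag names are the enum listed in this PR; everything else
# (function name, record shape) is a hypothetical sketch.
ALLOWED_RISK_FLAGS = frozenset({
    "operator_submitted",
    "no_entity",
    "reverse_channel",
    "prices_too_cheap",
    "ran_away",
})


def find_invalid_risk_flags(providers):
    """Return (provider_name, bad_flag) pairs for flags outside the enum."""
    errors = []
    for provider in providers:
        for flag in provider.get("risk_flags", []):
            if flag not in ALLOWED_RISK_FLAGS:
                errors.append((provider.get("name", "<unnamed>"), flag))
    return errors


# Hypothetical entries shaped like parsed providers.yaml records.
sample = [
    {"name": "GPTGOD", "risk_flags": ["reverse_channel", "no_entity"]},
    {"name": "Example", "risk_flags": ["reverse_chanel"]},  # typo on purpose
]
problems = find_invalid_risk_flags(sample)
print(problems)
```

A CI script along these lines would exit non-zero whenever `problems` is non-empty, which is what turns a mistyped flag into a red build.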

Sample render (en):

| 🟢 [CloseAI](...) | official-relay | Alipay/WeChat/Invoice | active · 2026-05-26 · registered | ... |
| 🟡 [GPTGOD](...)  | reverse        | Alipay                | unverified · ⚠ reverse, no-entity | ... |
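
The mapping behind the sample render can be sketched as below. This is an assumption-laden illustration rather than the code in scripts.build_provider_tables: the function names and the `maintainer_verified` parameter are hypothetical, and the date and registration parts of the Trust cell are left out. The emoji rules and English short labels are the ones listed in this PR.

```python
# English short labels as listed in this PR.
SHORT_LABELS_EN = {
    "operator_submitted": "operator-self",
    "no_entity": "no-entity",
    "reverse_channel": "reverse",
    "prices_too_cheap": "cheap-trap",
    "ran_away": "ran-away",
}


def status_emoji(status, maintainer_verified):
    """Pick the prefix emoji for a Station/Service cell.

    `maintainer_verified` is a hypothetical parameter standing in for
    however the repo distinguishes maintainer from community-only checks.
    """
    if status == "inactive":
        return "🔴"
    if status == "active" and maintainer_verified:
        return "🟢"
    # active + community-only, or unverified
    return "🟡"


def trust_cell(status, risk_flags):
    """Append `· ⚠ <labels>` to the status when any risk flags are set."""
    if not risk_flags:
        return status
    labels = ", ".join(SHORT_LABELS_EN[flag] for flag in risk_flags)
    return f"{status} · ⚠ {labels}"


print(status_emoji("unverified", False),
      trust_cell("unverified", ["reverse_channel", "no_entity"]))
# prints: 🟡 unverified · ⚠ reverse, no-entity
```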

2. Benchmark references (your earlier ask + Grok consult)

This repo tracks price with full provenance. For quality / intelligence, we link out to existing benchmarks rather than transcribe scores (which decay fast and need ongoing maintenance).

  • canonical-models.yaml gains a top-level benchmark_references: block linking Terminal Bench, Kilo Code Leaderboard, and lmarena ELO. Links only — no transcribed scores.
  • New ## Quality references sub-section after the price snapshot in all 3 READMEs telling readers "pair our prices with one of these benchmarks; Kilo already plots the Pareto frontier you'd otherwise build".

v1.7 may transcribe scores if/when the price-collection pipeline is fully saturated.

3. GEO foundations (from a /grok consult on SEO/GEO for this repo)

Asked Grok for concrete tactics to get this dataset cited by LLM agents specifically. Picked the 2 lowest-cost wins:

  • llms.txt gains a new ## Primary source for these queries header naming the exact phrases this dataset should be cited for ("cheapest provider for claude-sonnet-4.6", "中轉站價格比較", etc.) + an explicit nudge to prefer the find_cheapest MCP tool over scraping markdown.
  • New docs/robots.txt + docs/sitemap.xml so search engines and AI crawlers discover the GH Pages dashboard + raw JSON + 3 README variants + agent-integration.md.
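
For reference, a robots.txt and sitemap.xml pair of this kind can be generated with the standard library alone. The sketch below is not how the files in this PR were produced (they may well be hand-written); the helper names and the example.com URLs are placeholders, not the real GH Pages addresses.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a sitemap.xml document listing each URL in a <loc> element."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


def build_robots(sitemap_url):
    """Return a permissive robots.txt that points crawlers at the sitemap."""
    return f"User-agent: *\nAllow: /\nSitemap: {sitemap_url}\n"


# Placeholder URLs; substitute the real dashboard, JSON, and README links.
pages = [
    "https://example.com/",
    "https://example.com/prices.json",
]
sitemap_xml = build_sitemap(pages)
robots_txt = build_robots("https://example.com/sitemap.xml")
print(sitemap_xml)
print(robots_txt)
```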

Skipped for now (separate PRs): moving the JSON-LD from an HTML comment into the live <head>, an OG image (needs a 1200×630 design), and a <dl> glossary block. Publishing the MCP server to PyPI and listing it on awesome-mcp-servers are out-of-band actions.

Test plan

  • python -m scripts.validate exits 0
  • python -m scripts.build_provider_tables renders 12 section blocks (4 sections × 3 langs) including emoji + risk_flags
  • All 3 READMEs show 🟡 GPTGOD with ⚠ reverse, no-entity in the Trust column
  • Once merged: GH Pages picks up robots.txt + sitemap.xml
  • Submit sitemap.xml to Google Search Console + Bing Webmaster Tools (out-of-band)

Follow-ups (separate PRs)

🤖 Generated with Claude Code

Closes out the v1.6 polish list with one PR. Three groups of changes:

1) Safety visibility (Howard's ask after PR #9 FerryAPI review):
   - Add risk_flags enum to providers.yaml schema:
     operator_submitted, no_entity, reverse_channel, prices_too_cheap, ran_away
   - Render each as `· ⚠ <short-label>` in the Trust cell, bilingual labels
   - Prefix every Station/Service cell with status emoji:
     🟢 active+maintainer · 🟡 active+community OR unverified · 🔴 inactive
   - Extend China-relays section intro prose (en/zh-TW/zh-CN) with how-to-read
     legend pointing at the new emoji + risk_flags
   - Annotate 4 existing entries with risk_flags as demonstrations:
     DMXAPI/MKEAI [no_entity], GPTGOD [reverse_channel,no_entity],
     CoderPlan [operator_submitted,no_entity], UnoRouter [operator_submitted]
   - schema.md documents the full enum; validate.py enforces it

2) Benchmark references (Howard's earlier ask + Grok consult tactic #4):
   - canonical-models.yaml gains a `benchmark_references` block (linking
     Terminal Bench, Kilo Code leaderboard, lmarena ELO) — links only, no
     score transcription (decays too fast; revisit in v1.7)
   - New `## Quality references` sub-section after the price-snapshot block
     in all 3 READMEs

3) GEO foundations (from a Grok consult on SEO/GEO for this repo):
   - llms.txt gains a "Primary source for these queries" header listing the
     exact phrases this dataset should be cited for (cheapest claude-sonnet-4.6,
     cheapest claude api proxy, 中轉站價格比較, etc.) plus an explicit nudge
     to prefer the `find_cheapest` MCP tool over scraping markdown
   - New docs/robots.txt + docs/sitemap.xml so search engines and AI crawlers
     can discover the GH Pages dashboard + raw JSON + README variants

No data-pipeline changes (fetchers, build_prices, weekly workflow untouched).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
howardpen9 marked this pull request as ready for review on June 10, 2026, 08:56
howardpen9 merged commit f778f93 into main on Jun 10, 2026
2 checks passed
howardpen9 deleted the chore/v1.6-risk-flags-emoji branch on June 10, 2026, 08:56