Add relay directory crawler by alokdangre · Pull Request #86 · saiy2k/nostr-components

alokdangre · 2026-06-15T12:10:15Z

Summary

The relay crawler:

- queries relays for kind:10011 and kind:0
- extracts verifiable Twitter identity claims
- verifies proof tweets
- optionally checks NIP-57 zappability
- computes lightweight Web-of-Trust signals
- writes a structured JSON output for directory review

This PR:

- adds an end-to-end relay directory crawler for X/Twitter -> Nostr mappings
- crawls NIP-39 kind:10011 plus legacy kind:0 identity tags
- verifies proof tweets, checks NIP-57 zappability, and adds WoT/risk scoring as a ranking signal only
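
For readers unfamiliar with relay queries, here is a rough sketch of the subscription message such a crawl might send. It assumes one NIP-01 filter per kind, each with its own limit; `buildDirectoryReq` and its parameter names are hypothetical and not taken from the PR's script.

```javascript
// Sketch only: buildDirectoryReq is a hypothetical helper, not part of
// scripts/relay-directory-crawler.mjs.
function buildDirectoryReq(subscriptionId, { kind10011Limit, kind0Limit }) {
  // NIP-01 client message shape: ["REQ", <subscription id>, <filter>, ...]
  return JSON.stringify([
    'REQ',
    subscriptionId,
    { kinds: [10011], limit: kind10011Limit }, // NIP-39 identity events
    { kinds: [0], limit: kind0Limit },         // profile metadata (legacy identity tags)
  ]);
}

const reqMessage = buildDirectoryReq('directory-crawl', {
  kind10011Limit: 500,
  kind0Limit: 500,
});
console.log(reqMessage);
```

The string would be sent over an open WebSocket to each relay; the relay answers with `EVENT` messages followed by `EOSE`.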

Trust model

- NIP-39 proof verification is the only identity verification path.
- Claimed-only profile links are never auto-zapped.
- WoT/NIP-85 data ranks and flags records, but never proves X account ownership.
- autoZapAllowed requires a verified identity and an LNURL endpoint with allowsNostr.
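
The auto-zap rule above can be read as a simple conjunction. The sketch below is illustrative only: the record shape (`verification.status`, `zapSupport.lnurlEndpoint`, `zapSupport.allowsNostr`, `wot.score`) is an assumption made for this example and may not match the crawler's actual output fields.

```javascript
// Sketch only: the field names on `record` are hypothetical.
function isAutoZapAllowed(record) {
  const identityVerified = record.verification?.status === 'verified';
  const zapReady =
    Boolean(record.zapSupport?.lnurlEndpoint) &&
    record.zapSupport?.allowsNostr === true;
  // The WoT score is deliberately not consulted: it ranks, it does not prove.
  return identityVerified && zapReady;
}

const verifiedRecord = {
  verification: { status: 'verified' },
  zapSupport: { lnurlEndpoint: 'https://example.com/.well-known/lnurlp/alice', allowsNostr: true },
  wot: { score: 0 },
};
const claimedOnlyRecord = {
  verification: { status: 'claimed' },
  zapSupport: { lnurlEndpoint: 'https://example.com/.well-known/lnurlp/bob', allowsNostr: true },
  wot: { score: 100 },
};
console.log(isAutoZapAllowed(verifiedRecord), isAutoZapAllowed(claimedOnlyRecord));
```

Note that the claimed-only record is rejected even with a high WoT score, which is the point of keeping WoT as a ranking signal only.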

Tests

npm test -- scripts/relay-directory-crawler.test.mjs

Summary by CodeRabbit

New Features
- Added a directory crawler tool (crawl:directory) for building a verified mapping of X/Twitter handles to Nostr identities with proof verification, Lightning zap support detection, and Web-of-Trust scoring.
Tests
- Added test coverage for the directory crawler functionality.

coderabbitai · 2026-06-15T12:10:29Z

📝 Walkthrough

A new Node.js script scripts/relay-directory-crawler.mjs is introduced that crawls Nostr relays via WebSocket to build a verified Twitter/X handle → Nostr pubkey directory. It includes CLI argument parsing, multi-strategy tweet proof verification, NIP-57 zap support checking, WoT/risk scoring, and JSON output. A companion Vitest test suite and a crawl:directory npm script are also added.
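
As an illustration of the handle normalization and tweet ID extraction mentioned in the walkthrough, here is a minimal sketch. The accepted input forms (bare handle, `@handle`, profile URL, status URL, bare numeric ID) are assumptions; the PR's `normalizeTwitterHandle` and `extractTweetId` may behave differently, so the functions carry a `Sketch` suffix.

```javascript
// Sketch only: these are hypothetical stand-ins, not the PR's implementations.
function normalizeHandleSketch(input) {
  const trimmed = String(input ?? '').trim();
  // Accept profile URLs such as https://x.com/Alice or https://twitter.com/@alice
  const fromUrl = trimmed.match(
    /^https?:\/\/(?:www\.)?(?:twitter|x)\.com\/@?([A-Za-z0-9_]{1,15})\/?$/i
  );
  const bare = fromUrl ? fromUrl[1] : trimmed.replace(/^@/, '');
  // X/Twitter handles are 1-15 characters: letters, digits, underscore.
  return /^[A-Za-z0-9_]{1,15}$/.test(bare) ? bare.toLowerCase() : null;
}

function extractTweetIdSketch(input) {
  const trimmed = String(input ?? '').trim();
  if (/^\d+$/.test(trimmed)) return trimmed; // already a bare numeric ID
  const match = trimmed.match(/\/status(?:es)?\/(\d+)/);
  return match ? match[1] : null;
}

console.log(normalizeHandleSketch('@Alice'));
console.log(extractTweetIdSketch('https://x.com/alice/status/1234567890'));
```

Tweet IDs are kept as strings here because they can exceed JavaScript's safe integer range.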

Changes

Nostr Relay Directory Crawler

| Layer | File(s) | Summary |
| --- | --- | --- |
| CLI scaffolding and npm script entry | `package.json`, `scripts/relay-directory-crawler.mjs` | Registers the `crawl:directory` npm script and implements CLI argument parsing with validation, default relay/output constants, and a help/usage printer. |
| Relay querying and identity/event utilities | `scripts/relay-directory-crawler.mjs`, `scripts/relay-directory-crawler.test.mjs` | WebSocket relay querying with timeout/bounded collection, pool de-duplication, latest-replaceable-event selection, signed event validation, Twitter/X handle normalization, tweet ID extraction, pubkey-to-npub conversion, and `extractDirectoryInputs`; tests for `normalizeTwitterHandle`, `extractTweetId`, and `extractDirectoryInputs`. |
| Tweet fetching, proof verification, and zap support | `scripts/relay-directory-crawler.mjs`, `scripts/relay-directory-crawler.test.mjs` | Multi-strategy `fetchTweet` (X API → syndication → oEmbed), `verifyCandidate` (handle + npub checks), `checkZapSupport` (lud16→LNURLp + NIP-57 evaluation), `lightningAddressToLnurlp` helper, and its test. |
| WoT/risk scoring | `scripts/relay-directory-crawler.mjs`, `scripts/relay-directory-crawler.test.mjs` | Aggregates follower/report/assertion metrics from Nostr kinds 3, 1984, and 30382 into a capped wot score per directory record; includes a scenario test asserting identity fields are preserved. |
| `runCrawler` end-to-end orchestration and entry point | `scripts/relay-directory-crawler.mjs` | Orchestrates the full crawl pipeline: relay queries, candidate extraction, optional proof verification, kind:0 backfill, zap support, WoT scoring, auto-zap eligibility, structured JSON output, and the main-script entry point guard with error reporting and helper exports. |
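
To make the lud16 → LNURLp step in the table concrete, here is a hedged sketch of the conversion as described by LUD-16, together with the NIP-57 check on the LNURL-pay response. The function names carry a `Sketch` suffix because they are illustrative and not copied from the PR; the `.onion` case (which LUD-16 serves over `http`) is left out for brevity.

```javascript
// Sketch only: hypothetical stand-ins for lightningAddressToLnurlp / checkZapSupport.
function lightningAddressToLnurlpSketch(address) {
  const parts = String(address ?? '').trim().toLowerCase().split('@');
  if (parts.length !== 2 || !parts[0] || !parts[1]) return null;
  const [name, domain] = parts;
  // LUD-16: name@domain maps to https://<domain>/.well-known/lnurlp/<name>
  return `https://${domain}/.well-known/lnurlp/${encodeURIComponent(name)}`;
}

function supportsNostrZapsSketch(lnurlPayResponse) {
  // NIP-57: the endpoint is zappable when allowsNostr is true and
  // nostrPubkey is a valid hex-encoded public key.
  return (
    lnurlPayResponse?.allowsNostr === true &&
    /^[0-9a-f]{64}$/.test(lnurlPayResponse?.nostrPubkey ?? '')
  );
}

console.log(lightningAddressToLnurlpSketch('Alice@Example.com'));
console.log(supportsNostrZapsSketch({ allowsNostr: true, nostrPubkey: 'ab'.repeat(32) }));
```

In a real crawl the second function would be applied to the JSON body fetched from the URL returned by the first.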

Sequence Diagram(s)

sequenceDiagram
  participant CLI
  participant runCrawler
  participant queryRelayPool
  participant fetchTweet
  participant checkZapSupport
  participant computeWotScores
  participant FileSystem

  CLI->>runCrawler: parsed args
  runCrawler->>queryRelayPool: REQ kinds 10011 + 0
  queryRelayPool-->>runCrawler: de-duplicated events
  runCrawler->>runCrawler: extractDirectoryInputs → candidates, claimed
  loop each candidate
    runCrawler->>fetchTweet: proofTweetId (X API / syndication / oEmbed)
    fetchTweet-->>runCrawler: tweet text + author handle
    runCrawler->>runCrawler: verifyCandidate → verified / rejected
  end
  runCrawler->>queryRelayPool: backfill kind:0 for verified authors
  loop each verified record
    runCrawler->>checkZapSupport: lud16 → LNURLp → allowsNostr
    checkZapSupport-->>runCrawler: zapSupport result
  end
  runCrawler->>computeWotScores: directory records + all events
  computeWotScores-->>runCrawler: records with wot scores
  runCrawler->>FileSystem: write JSON output
  runCrawler-->>CLI: stdout summary
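
The "de-duplicated events" step in the diagram depends on choosing one event per author for replaceable kinds such as 0 and 10011. A minimal sketch of that selection, following the NIP-01 rule (newest `created_at` wins, ties go to the lexically lowest `id`), might look like this; `selectLatestReplaceable` is a hypothetical name and the PR's helper may differ.

```javascript
// Sketch only: hypothetical helper illustrating latest-replaceable-event selection.
function selectLatestReplaceable(events) {
  const latest = new Map();
  for (const event of events) {
    const key = `${event.kind}:${event.pubkey}`;
    const current = latest.get(key);
    const isNewer =
      !current ||
      event.created_at > current.created_at ||
      (event.created_at === current.created_at && event.id < current.id);
    if (isNewer) latest.set(key, event);
  }
  return [...latest.values()];
}

const sampleEvents = [
  { id: 'bb', kind: 0, pubkey: 'p1', created_at: 100 },
  { id: 'aa', kind: 0, pubkey: 'p1', created_at: 200 },
  { id: 'cc', kind: 10011, pubkey: 'p1', created_at: 150 },
];
const selected = selectLatestReplaceable(sampleEvents);
console.log(selected.map((event) => event.id));
```

The same event arriving from several relays collapses naturally, since it shares the key and never compares as newer than itself.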

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

🐇 Hoppity-hop through relays galore,
Sniffing out npubs and handles in store.
A tweet says "it's me!" — the rabbit checks twice,
With zaps and WoT scores to make things precise.
JSON lands softly, the burrow is complete,
The directory's verified, tidy, and neat! 🌟

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title 'Add relay directory crawler' directly and accurately describes the main change: a new relay directory crawler implementation being added to the codebase. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |


coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (1)

scripts/relay-directory-crawler.test.mjs (1)
102-117: 💤 Low value

Test event has invalid signature and will be skipped by computeWotScores.

The mock event uses sig: 'invalid-for-test' but computeWotScores calls isValidSignedEvent(event) on line 493 and skips events that fail signature verification. This means the follow event is never counted, so wot.followerGraphMentions will be 0.

The test still passes because it only asserts that identity fields are preserved and wot.note exists, but it doesn't actually test that follower counting works.

Consider one of:

- Removing the signature validation in computeWotScores (WoT is advisory, not security-critical)
- Adding a test with properly signed events (more complex)
- Accepting that this test only verifies field preservation, not scoring logic
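
To illustrate why the mock event contributes nothing, here is a simplified sketch of follower counting with the signature check passed in as a parameter. This is not how the PR's `computeWotScores` is written (the review only says it calls `isValidSignedEvent` directly); the injected validator and the `countFollowerMentions` name are assumptions made for the example.

```javascript
// Sketch only: a simplified, hypothetical stand-in for the follower-counting
// part of computeWotScores, with the validator injected for illustration.
function countFollowerMentions(events, targetPubkey, isValidSignedEvent) {
  let count = 0;
  for (const event of events) {
    if (event.kind !== 3) continue; // only contact lists (follows)
    if (!isValidSignedEvent(event)) continue; // invalid signature: event is skipped
    if (event.tags.some(([name, value]) => name === 'p' && value === targetPubkey)) {
      count += 1;
    }
  }
  return count;
}

const target = 'a'.repeat(64);
const followEvent = {
  kind: 3,
  pubkey: 'b'.repeat(64),
  tags: [['p', target]],
  sig: 'invalid-for-test',
};

// A strict validator rejects the mock signature, so nothing is counted.
const strictCount = countFollowerMentions([followEvent], target, () => false);
// A permissive validator (test-only) lets the follow through.
const lenientCount = countFollowerMentions([followEvent], target, () => true);
console.log(strictCount, lenientCount); // 0 1
```

A test that asserts only on identity fields passes in both cases, which is why it does not exercise the scoring path.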
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@scripts/relay-directory-crawler.test.mjs` around lines 102 - 117, The mock
follow event in the test uses an invalid signature ('invalid-for-test'), so
computeWotScores skips the event when isValidSignedEvent() rejects it on line
493, and the follow is never counted in the WoT scoring. To fix this, either
remove the signature validation check in computeWotScores (WoT scoring is
advisory, not security-critical) or provide a properly signed event in the
test. Choose the first option if events with invalid signatures are acceptable
input for advisory scoring, or the second if signature validation must remain
and the test needs to demonstrate actual scoring logic with a valid signature.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@scripts/relay-directory-crawler.mjs`:
- Around line 53-54: The argument parsing for `--kind10011-limit` and
`--kind0-limit` converts the input values using `Number()` but does not validate
them, allowing `NaN` values to propagate into the relay query logic. Add
validation checks after parsing these arguments (similar to how `timeoutMs` and
`maxProofs` are validated) to ensure the converted numbers are valid and non-NaN
before assigning them to `args.kind10011Limit` and `args.kind0Limit`. If
validation fails, either log an error message or set a sensible default value to
prevent invalid `limit` values from being used in relay queries.

---

Nitpick comments:
In `@scripts/relay-directory-crawler.test.mjs`:
- Around line 102-117: The mock follow event in the test uses an invalid
signature ('invalid-for-test'), so computeWotScores skips the event when
isValidSignedEvent() rejects it on line 493, and the follow is never counted in
the WoT scoring. To fix this, either remove the signature validation check in
computeWotScores (WoT scoring is advisory, not security-critical) or provide a
properly signed event in the test. Choose the first option if events with
invalid signatures are acceptable input for advisory scoring, or the second if
signature validation must remain and the test needs to demonstrate actual
scoring logic with a valid signature.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: ac3557d2-3269-4cae-9df8-6143f6b05aeb

📥 Commits

Reviewing files that changed from the base of the PR and between dfc22e5 and 17dd08b.

📒 Files selected for processing (3)

package.json
scripts/relay-directory-crawler.mjs
scripts/relay-directory-crawler.test.mjs

coderabbitai · 2026-06-15T12:17:33Z

+    else if (arg === '--kind10011-limit') args.kind10011Limit = Number(next());
+    else if (arg === '--kind0-limit') args.kind0Limit = Number(next());


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Missing validation for --kind10011-limit and --kind0-limit.

These numeric arguments are parsed with Number() but not validated like timeoutMs and maxProofs. If invalid values are passed, they become NaN and propagate to the relay query's limit field.

Proposed fix

  if (!args.relays.length) throw new Error('At least one relay is required.');
  if (!Number.isFinite(args.timeoutMs) || args.timeoutMs <= 0) throw new Error('--timeout-ms must be positive.');
  if (!Number.isFinite(args.maxProofs) || args.maxProofs < 0) throw new Error('--max-proofs must be >= 0.');
+ if (!Number.isFinite(args.kind10011Limit) || args.kind10011Limit <= 0) throw new Error('--kind10011-limit must be positive.');
+ if (!Number.isFinite(args.kind0Limit) || args.kind0Limit <= 0) throw new Error('--kind0-limit must be positive.');
  return args;
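
To show what is at stake, the sketch below demonstrates how an unvalidated value ends up in a serialized filter, and one possible validation helper. `parsePositiveLimit` is a hypothetical name; unlike the proposed fix, which checks `Number.isFinite`, this variant also rejects fractional values via `Number.isInteger`, which is a stricter choice and not something the review asks for.

```javascript
// Sketch only: parsePositiveLimit is a hypothetical helper, not part of the PR.
function parsePositiveLimit(flag, raw) {
  const value = Number(raw);
  if (!Number.isInteger(value) || value <= 0) {
    throw new Error(`${flag} must be a positive integer.`);
  }
  return value;
}

// Without validation, Number('abc') is NaN, and JSON.stringify turns NaN into null.
const unvalidatedFilter = JSON.stringify({ kinds: [0], limit: Number('abc') });
console.log(unvalidatedFilter); // {"kinds":[0],"limit":null}

const validLimit = parsePositiveLimit('--kind0-limit', '250');
console.log(validLimit); // 250

let rejectedMessage = null;
try {
  parsePositiveLimit('--kind0-limit', 'abc');
} catch (error) {
  rejectedMessage = error.message;
}
console.log(rejectedMessage); // --kind0-limit must be a positive integer.
```

Failing fast at argument parsing keeps the malformed `limit` from ever reaching a relay.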

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@scripts/relay-directory-crawler.mjs` around lines 53 - 54, The argument parsing for `--kind10011-limit` and `--kind0-limit` converts the input values using `Number()` but does not validate them, allowing `NaN` values to propagate into the relay query logic. Add validation checks after parsing these arguments (similar to how `timeoutMs` and `maxProofs` are validated) to ensure the converted numbers are valid and non-NaN before assigning them to `args.kind10011Limit` and `args.kind0Limit`. If validation fails, either log an error message or set a sensible default value to prevent invalid `limit` values from being used in relay queries.

alokdangre added 2 commits June 15, 2026 12:08

feat: add relay directory crawler

c3816f6

fix: harden relay directory crawler

17dd08b

coderabbitai Bot reviewed Jun 15, 2026

alokdangre wants to merge 2 commits into saiy2k:main from alokdangre:relay-directory-crawler
