Add time-native audio cognition research memo #43

Draft
Fearvox wants to merge 1 commit into main from claude/audio-cognition-memo-kqdT9

@Fearvox Fearvox commented May 25, 2026

Owner

Summary

Research-direction memo on time-native audio cognition, produced as a head-to-head benchmark against ChatGPT Deep Research and Gemini on the same brief. It competes on per-claim quality, citation accuracy, and epistemic discipline rather than on breadth.

Two files:

  • audio-cognition-memo.md — the 10-section memo.
  • scratchpad-audio-cognition.md — working notes: search log, contested-claims tracker (both sides named), confirmed dead-ends/absences, and a citation reliability ledger.

What the memo argues

  • Reframes the brief's own thesis. "Spectrograms are a category error" is too strong; the defensible, task-stratified version is "the field optimized representations + objectives for the one task family (lexical recognition) that needs time least." Backbone: Saddler & McDermott 2024 (Nat Comms) — phase-locking is required for localization + voice ID, useless for word recognition in noise.
  • §1 distinguishes preserving, generating, and using phase (Vocos generates phase from magnitude-only input; anti-spoofing shows phase is highly discriminative: MRP cuts EER from 1.88% to 0.013%).
  • §4 corrects the "untrained-baseline critique" framing: robust in language/vision but partly pre-rebutted for audio (Tuckute's permuted controls underperformed); the real audio puncture is Feather's metamers.
  • §10 gives three small-lab experiments, including a non-obvious arrow-of-time probe isolating temporal-directionality content in learned representations.
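
For reviewers less familiar with the anti-spoofing metric cited in §1: EER (equal error rate) is the operating point where the false-accept rate equals the false-reject rate. The sketch below shows one simple way to estimate it from detector scores. The function name `equal_error_rate` and the toy scores are illustrative assumptions and are not taken from the memo or from any cited paper.

```python
def equal_error_rate(genuine_scores, spoof_scores):
    """Estimate EER by sweeping thresholds over the observed scores.

    Assumed convention: a trial is accepted as genuine when score >= threshold.
    Returns (eer, threshold) at the point where FAR and FRR are closest.
    """
    thresholds = sorted(set(genuine_scores) | set(spoof_scores))
    best = None  # (gap, eer, threshold)
    for t in thresholds:
        # False-accept rate: spoofed trials that pass the threshold.
        far = sum(s >= t for s in spoof_scores) / len(spoof_scores)
        # False-reject rate: genuine trials that fail the threshold.
        frr = sum(s < t for s in genuine_scores) / len(genuine_scores)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


# Toy scores, invented purely for illustration.
toy_genuine = [0.9, 0.8, 0.7, 0.35]
toy_spoof = [0.1, 0.2, 0.3, 0.6]
eer, threshold = equal_error_rate(toy_genuine, toy_spoof)
print(f"EER = {eer:.2%} at threshold {threshold}")
```

On real evaluation sets the score lists are large and toolkits typically interpolate between thresholds. This exhaustive sweep only illustrates the definition.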

Citation discipline

No fabricated citations. Sources marked [fetched] / [abstract-confirmed] / [listing-only] / [ID-unverified]; two arXiv IDs deliberately omitted rather than guessed. See the ledger at the end of the memo.

Test plan

  • Human review of memo claims against the cited primary sources
  • Side-by-side comparison vs ChatGPT DR and Gemini outputs on the same brief

https://claude.ai/code/session_016eF73F2qUYpEtE4DFJDs2K


Generated by Claude Code

Research-direction memo across 10 sections (phase-aware representation,
multi-scale temporal architecture, predictive-coding objectives,
brain-aligned models, music benchmarks, biological frontends, active
listening, cross-modal grounding, open problems, three experiments) plus
the working scratchpad with search log, contested-claims tracker, and a
citation reliability ledger marking verified vs listing-only sources.

https://claude.ai/code/session_016eF73F2qUYpEtE4DFJDs2K
