Vision: Transform Aurus into the ultimate voice-first productivity companion
Chain agents together for complex workflows
flowchart LR
V[Voice Input] --> T[Transcript]
T --> BD[Brain Dump]
BD --> |Tasks| AI[Action Items]
BD --> |Ideas| MM[Mental Mirror]
AI --> CAL[Calendar Integration]
MM --> EMAIL[Email to Self]
Use Case: Record a brain dump, automatically add the extracted tasks to the calendar, and email the reflections to yourself.
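The fan-out in the flowchart could be sketched roughly as below. This is only an illustration: `BrainDump`, `route_brain_dump`, and the `"calendar"`/`"email"` destination keys are hypothetical names, and the sketch assumes the Brain Dump agent has already separated tasks from ideas.

```python
from dataclasses import dataclass, field


@dataclass
class BrainDump:
    """Hypothetical output of the Brain Dump agent: items already categorized."""
    tasks: list[str] = field(default_factory=list)
    ideas: list[str] = field(default_factory=list)


def route_brain_dump(dump: BrainDump) -> dict[str, list[str]]:
    """Fan a brain dump out to downstream destinations, mirroring the flowchart:
    tasks -> Action Items -> calendar, ideas -> Mental Mirror -> email to self."""
    return {
        "calendar": [f"TODO: {task}" for task in dump.tasks],
        "email": [f"Reflection: {idea}" for idea in dump.ideas],
    }


dump = BrainDump(tasks=["Review PR"], ideas=["Try a weekly retro"])
routed = route_brain_dump(dump)
print(routed)
```

Keeping each hop a plain function that takes and returns data would make it easy to reorder or extend the chain later.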
Hands-free agent activation
"Hey Aurus, shift this to professional"
"Aurus, translate to German"
"Create action items"
Implementation:
- Local hotword detection (Porcupine/Snowboy)
- Command parsing with lightweight NLU
- No cloud dependency for activation
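A minimal sketch of the command-parsing step, assuming hotword detection has already fired and the utterance arrives as text. The grammar, the agent names (`tone_shifter`, `translator`, `action_items`), and `parse_command` are all hypothetical; a real NLU layer would likely be more forgiving than these regular expressions.

```python
import re

# Hypothetical command grammar: each pattern maps an utterance to an agent.
COMMANDS = [
    (re.compile(r"shift (?:this )?to (?P<tone>\w+)"), "tone_shifter"),
    (re.compile(r"translate to (?P<language>\w+)"), "translator"),
    (re.compile(r"create action items"), "action_items"),
]
# The wake word may still be present in the transcribed utterance.
WAKE_WORD = re.compile(r"^(?:hey )?aurus[,!]?\s+")


def parse_command(utterance: str):
    """Return (agent, params) for a recognized command, or None."""
    text = WAKE_WORD.sub("", utterance.strip().lower())
    for pattern, agent in COMMANDS:
        match = pattern.search(text)
        if match:
            return agent, match.groupdict()
    return None


print(parse_command("Hey Aurus, shift this to professional"))
print(parse_command("Aurus, translate to German"))
```

Everything here runs locally, which is in line with the goal of avoiding a cloud dependency for activation.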
Integrate with calendar and participants
sequenceDiagram
participant C as Calendar
participant A as Aurus
participant AI as AI Agent
C->>A: Meeting starting: "Sprint Planning"
C->>A: Participants: John, Sarah, Mike
A->>AI: Context: Sprint planning with team
Note over A: Record meeting
A->>AI: Extract action items with assignees
AI->>A: "John: Review PR #142 by Friday"
A->>C: Create calendar reminders
Features:
- Auto-detect meeting context from calendar
- Pre-populate participant names for assignment
- Post-meeting summary email to attendees
Privacy-first AI processing
| Model | Use Case | Size | Speed |
|---|---|---|---|
| Llama 3.2 3B | Action Items | 2GB | Fast |
| Mistral 7B | Tone Shifting | 4GB | Medium |
| Phi-3 Mini | Quick Tasks | 1.5GB | Fastest |
Benefits:
- Zero cloud dependency
- Simpler HIPAA/GDPR compliance, since audio and transcripts never leave the device
- Works offline
- No API costs
Implementation: Integrate llama.cpp or candle for Rust-native inference.
Team features for shared voice notes
graph TB
subgraph Team["Team Workspace"]
U1[User 1] --> |Records| WS[Shared Workspace]
U2[User 2] --> |Records| WS
U3[User 3] --> |Records| WS
end
WS --> AGG[Aggregated Notes]
AGG --> AI[AI Summary]
AI --> TEAM[Team Dashboard]
Features:
- Shared transcript library
- Team action item tracking
- Meeting insights across sessions
- Role-based access control
Persistent transcript library with search
┌─────────────────────────────────────────┐
│ 🔍 Search transcripts... │
├─────────────────────────────────────────┤
│ Today │
│ ├─ Sprint Planning (45 min) 🎯 5 tasks │
│ ├─ 1:1 with Sarah (30 min) 💭 2 ideas │
│ └─ Quick note (2 min) │
│ │
│ Yesterday │
│ ├─ Client call (1 hr) 📋 8 tasks │
│ └─ Brain dump (15 min) 🧠 categorized │
└─────────────────────────────────────────┘
Features:
- SQLite local database
- Full-text search
- Filter by agent type, date, duration
- Export to Markdown/JSON
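One way the local search could work is SQLite's FTS5 extension. The sketch below is an assumption-laden illustration: the `transcripts` table, its columns, and the `search` helper are hypothetical, and it requires an SQLite build with FTS5 enabled.

```python
import sqlite3
from typing import Optional

conn = sqlite3.connect(":memory:")  # a real app would use a file on disk
# FTS5 virtual table; requires an SQLite build with FTS5 enabled.
conn.execute("CREATE VIRTUAL TABLE transcripts USING fts5(title, agent, body)")
conn.executemany(
    "INSERT INTO transcripts (title, agent, body) VALUES (?, ?, ?)",
    [
        ("Sprint Planning", "action-items", "John will review the pull request by Friday"),
        ("Brain dump", "brain-dump", "Ideas about caching and the audio pipeline"),
    ],
)


def search(query: str, agent: Optional[str] = None) -> list[str]:
    """Full-text search over transcripts, optionally filtered by agent type."""
    sql = "SELECT title FROM transcripts WHERE transcripts MATCH ?"
    params = [query]
    if agent is not None:
        sql += " AND agent = ?"
        params.append(agent)
    sql += " ORDER BY rank"  # best matches first
    return [row[0] for row in conn.execute(sql, params)]


print(search("caching"))
print(search("review", agent="action-items"))
```

Date and duration filters would need extra columns; they are left out to keep the sketch short.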
Drag-and-drop agent arrangement
┌──────────┬──────────┬──────────┐
│ Action │ Tone │ Dev-Log │
│ Items │ Shifter │ │
├──────────┼──────────┼──────────┤
│ Brain │ Mental │ Custom │
│ Dump │ Mirror │ Agent │
└──────────┴──────────┴──────────┘
↑ Drag to reorder ↑
Features:
- Pin favorite agents
- Hide unused agents
- Custom agent shortcuts
- Theming per agent
System-aware theming
/* Current: Dark only */
/* Planned: Auto-detect + manual toggle */
:root[data-theme="light"] {
  --voice-background: #ffffff;
  --voice-surface: #f5f5f5;
  --voice-primary: #6366f1;
}
Real-time spectrogram display
Current: ▁▂▃▅▇▅▃▂▁ (amplitude bars)
Proposed: ████████████ 8kHz
██████████░░ 4kHz
████████░░░░ 2kHz
██████░░░░░░ 1kHz
████░░░░░░░░ 500Hz
Benefits:
- Frequency visualization can help distinguish speakers
- Visual feedback for audio quality
- Detect background noise issues
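The per-band levels in the proposed display could be computed with the Goertzel algorithm, which measures signal power at a handful of chosen frequencies without a full FFT. This is a sketch, not the planned implementation: the band list comes from the mock-up above, while the 32 kHz sample rate, the 10 ms frame, and the function names are assumptions.

```python
import math

BANDS_HZ = [500, 1000, 2000, 4000, 8000]  # rows of the proposed display


def goertzel_power(samples, sample_rate, freq_hz):
    """Signal power at one frequency, computed with the Goertzel recurrence."""
    coeff = 2.0 * math.cos(2.0 * math.pi * freq_hz / sample_rate)
    s_prev, s_prev2 = 0.0, 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev ** 2 + s_prev2 ** 2 - coeff * s_prev * s_prev2


def band_powers(samples, sample_rate):
    """Power per display band for one audio frame."""
    return {f: goertzel_power(samples, sample_rate, f) for f in BANDS_HZ}


# One 10 ms frame of a synthetic 1 kHz tone sampled at 32 kHz (assumed rate).
RATE = 32000
frame = [math.sin(2 * math.pi * 1000 * n / RATE) for n in range(320)]
powers = band_powers(frame, RATE)
loudest = max(powers, key=powers.get)
print(loudest)
```

With only five bands, Goertzel does less work per frame than a full FFT; a denser spectrogram would favor an FFT instead.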
Bi-directional sync with Google/Outlook
flowchart LR
A[Action Items Agent] --> |Create Event| CAL[Google Calendar]
CAL --> |Meeting Starting| A
A --> |Join Link| MEET[Google Meet]
Features:
- Create tasks as calendar events
- Auto-record scheduled meetings
- Deadline reminders from extracted dates
Direct sync to knowledge bases
Aurus → Notion
├─ Action Items → Notion Tasks Database
├─ Brain Dump → Notion Page (Eisenhower embed)
├─ Dev-Log → Notion Changelog
└─ Mental Mirror → Notion Journal
Obsidian:
- Export as Markdown with YAML frontmatter
- Auto-link related notes
- Tag extraction from content
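A rough sketch of the Obsidian export, assuming tags appear as `#hashtags` in the note body. The frontmatter keys (`title`, `agent`, `tags`) and both function names are hypothetical; auto-linking is not covered here.

```python
import json
import re


def extract_tags(text: str) -> list[str]:
    """Collect #hashtags from the body, de-duplicated, in first-seen order.
    Tags must start with a letter, so references like "#142" are skipped."""
    seen = []
    for tag in re.findall(r"#([A-Za-z][\w-]*)", text):
        if tag.lower() not in seen:
            seen.append(tag.lower())
    return seen


def to_obsidian_note(title: str, agent: str, body: str) -> str:
    """Render a note as Markdown with YAML frontmatter."""
    tags = extract_tags(body)
    frontmatter = [
        "---",
        f"title: {json.dumps(title)}",  # double-quoted string, safe for YAML
        f"agent: {agent}",
        f"tags: [{', '.join(tags)}]",
        "---",
    ]
    return "\n".join(frontmatter) + f"\n\n# {title}\n\n{body}\n"


note = to_obsidian_note(
    "Sprint Planning", "action-items", "Review PR #142 #backend #Backend #q1-goals"
)
print(note)
```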
Share results directly to channels
/aurus summary → Posts meeting summary
/aurus tasks → Posts action items
/aurus standup → Generates standup update from yesterday's notes
Connect to 5000+ apps
{
  "event": "action_items_extracted",
  "payload": {
    "items": [...],
    "transcript_id": "abc123"
  }
}
Triggers:
- New transcript created
- Action items extracted
- Brain dump categorized
- Mental mirror generated
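A small sketch of how a webhook body in the shape shown above might be built. Only `action_items_extracted` appears in the example payload; the other three event names are assumed spellings derived from the trigger list, and the items are plain strings here because the real item shape is not specified.

```python
import json

# Only "action_items_extracted" comes from the example payload; the other
# names are assumptions modelled on it.
EVENTS = {
    "transcript_created",
    "action_items_extracted",
    "brain_dump_categorized",
    "mental_mirror_generated",
}


def build_webhook_body(event: str, transcript_id: str, items: list) -> str:
    """Serialize one webhook event as a JSON string."""
    if event not in EVENTS:
        raise ValueError(f"unknown event: {event}")
    return json.dumps(
        {"event": event, "payload": {"items": items, "transcript_id": transcript_id}}
    )


body = build_webhook_body("action_items_extracted", "abc123", ["Review PR #142"])
print(body)
```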
No-code agent creation
# custom_agent.yaml
name: "PR Reviewer"
model: gpt-4o
system_prompt: |
  You are a code review assistant.
  Extract: bugs, improvements, questions
output_schema:
  bugs: array
  improvements: array
  questions: array
UI:
- Visual prompt builder
- Test with sample transcripts
- Share agents with community
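The "test with sample transcripts" step could include checking an agent's output against its `output_schema`. A minimal sketch, assuming the schema has already been loaded into a dict and that `array` is the only type in use; `validate_output` and `TYPE_CHECKS` are hypothetical names.

```python
# Mirrors output_schema from custom_agent.yaml; only "array" is handled here.
OUTPUT_SCHEMA = {"bugs": "array", "improvements": "array", "questions": "array"}
TYPE_CHECKS = {"array": list}


def validate_output(result: dict, schema: dict = OUTPUT_SCHEMA) -> list[str]:
    """Return a list of problems; an empty list means the output conforms."""
    problems = []
    for key, type_name in schema.items():
        if key not in result:
            problems.append(f"missing field: {key}")
        elif not isinstance(result[key], TYPE_CHECKS[type_name]):
            problems.append(f"{key} should be {type_name}")
    return problems


print(validate_output({"bugs": "none", "improvements": []}))
```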
Cross-session context retention
flowchart TB
S1[Session 1] --> |"Project: Aurus"| MEM[(Memory Store)]
S2[Session 2] --> |Query| MEM
MEM --> |"Recall: Working on audio pipeline"| S2
Features:
- Vector embedding of past sessions
- Semantic search for context
- "Continue where we left off" capability
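Semantic recall over past sessions usually comes down to nearest-neighbor search on embedding vectors. The sketch below uses hand-written 3-dimensional vectors in place of real embeddings, so the numbers are purely illustrative; `MEMORY` and `recall` are hypothetical names.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Toy 3-dimensional vectors stand in for real embeddings of past sessions.
MEMORY = [
    ("Working on audio pipeline", [0.9, 0.1, 0.0]),
    ("Planning the mobile app", [0.1, 0.9, 0.1]),
    ("Drafting the privacy policy", [0.0, 0.2, 0.9]),
]


def recall(query_vector, top_k=1):
    """Return the top_k stored summaries most similar to the query vector."""
    ranked = sorted(MEMORY, key=lambda m: cosine(query_vector, m[1]), reverse=True)
    return [text for text, _ in ranked[:top_k]]


print(recall([1.0, 0.0, 0.0]))
```

A linear scan like this is fine for a personal history; a dedicated vector index only becomes relevant at much larger scale.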
Real-time language detection
Current: German only (language=de)
Proposed: Auto-detect + specify
├─ "Detected: German (93% confidence)"
├─ "Switching to English..."
└─ "Mixed language: DE/EN code-switching"
Who said what?
[Speaker 1 - John]: Let's discuss the roadmap
[Speaker 2 - Sarah]: I think we should prioritize mobile
[Speaker 1 - John]: Agreed, let's add that to the backlog
Implementation:
- Deepgram diarization feature
- Speaker identification training
- Per-speaker action item assignment
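Once a transcript is labeled in the format shown above, grouping utterances by speaker is a first step toward per-speaker action items. This sketch does not call any diarization service; it assumes the labels are already present, and `utterances_by_speaker` is a hypothetical helper.

```python
import re
from collections import defaultdict

# Matches lines such as "[Speaker 1 - John]: Let's discuss the roadmap".
LINE = re.compile(r"^\[Speaker (?P<id>\d+) - (?P<name>[^\]]+)\]: (?P<text>.+)$")

TRANSCRIPT = """\
[Speaker 1 - John]: Let's discuss the roadmap
[Speaker 2 - Sarah]: I think we should prioritize mobile
[Speaker 1 - John]: Agreed, let's add that to the backlog
"""


def utterances_by_speaker(transcript: str) -> dict[str, list[str]]:
    """Group utterance text by speaker name, preserving spoken order."""
    grouped = defaultdict(list)
    for line in transcript.splitlines():
        match = LINE.match(line.strip())
        if match:
            grouped[match["name"]].append(match["text"])
    return dict(grouped)


by_speaker = utterances_by_speaker(TRANSCRIPT)
print(by_speaker)
```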
Zero-knowledge architecture
flowchart LR
A[Audio] --> |Encrypt locally| E[Encrypted Stream]
E --> |Process| AI[AI Service]
AI --> |Encrypted response| D[Decrypt locally]
Features:
- Client-side encryption keys
- Encrypted local storage
- No plaintext on servers
Docker deployment for enterprises
# docker-compose.yml
services:
  aurus-api:
    image: aurus/voice-api:latest
    environment:
      - WHISPER_MODEL=large-v3
      - LLM_ENDPOINT=http://ollama:11434
  ollama:
    image: ollama/ollama:latest
    volumes:
      - ./models:/root/.ollama
Compliance-ready tracking
{
  "timestamp": "2025-01-29T10:00:00Z",
  "action": "transcript_created",
  "user": "user@company.com",
  "data_classification": "confidential",
  "retention_days": 90
}
iOS/Android for on-the-go capture
┌─────────────────────┐
│ 📱 Aurus │
├─────────────────────┤
│ │
│ 🎙️ Recording... │
│ ━━━━━━━━━━━━━ │
│ 02:34 │
│ │
│ [Stop & Process] │
│ │
│ Syncs to desktop │
└─────────────────────┘
Features:
- Record on phone, process on desktop
- Push notifications for action items
- Widget for quick capture
Capture from any web page
Right-click → "Send to Aurus"
├─ Selected text → Tone Shifter
├─ Meeting transcript → Action Items
└─ Article → Brain Dump summary
Terminal-first interface
$ aurus record --duration 5m
Recording... (Ctrl+C to stop)
$ aurus process --agent action-items
Found 3 action items:
1. [HIGH] Review PR #142 - @john - Due: Jan 30
2. [MED] Update documentation - @sarah
3. [LOW] Research caching options
$ aurus export --format markdown > notes.md
Read results in your own voice
- Train personal voice model (5 min sample)
- Natural-sounding playback
- Emotional tone matching
Live interpretation during calls
You (English) → Aurus → Translated audio → Them (German)
Them (German) → Aurus → Translated audio → You (English)
Emotional journey visualization
😊 ────────────────────────────────────────
😐 ──────╲──────────────────────╱──────────
😔 ───────╲────────────────────╱───────────
0:00 1:00 2:00 3:00 4:00 5:00
↑ Discussed challenges ↑ Found solution
Guided reflection sessions
Aurus: "What was the highlight of your day?"
You: [Record response]
Aurus: "What challenged you?"
You: [Record response]
Aurus: "What are you grateful for?"
You: [Record response]
→ Generates comprehensive journal entry
quadrantChart
title Implementation Priority
x-axis Low Effort --> High Effort
y-axis Low Impact --> High Impact
quadrant-1 Plan Carefully
quadrant-2 Do First
quadrant-3 Quick Wins
quadrant-4 Avoid
"Voice Commands": [0.3, 0.8]
"Conversation History": [0.4, 0.7]
"Local LLM": [0.7, 0.9]
"Custom Agents": [0.6, 0.8]
"Calendar Integration": [0.5, 0.7]
"Mobile App": [0.9, 0.6]
"Theme Toggle": [0.2, 0.3]
"Notion Export": [0.3, 0.5]
"Speaker Diarization": [0.4, 0.6]
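The chart coordinates can also be turned into a simple ranking. The scoring rule below (impact minus effort) is an arbitrary assumption chosen for illustration, not something this document prescribes; the coordinates are copied from the chart.

```python
# (effort, impact) pairs copied from the chart; both axes run from 0 to 1.
FEATURES = {
    "Voice Commands": (0.3, 0.8),
    "Conversation History": (0.4, 0.7),
    "Local LLM": (0.7, 0.9),
    "Custom Agents": (0.6, 0.8),
    "Calendar Integration": (0.5, 0.7),
    "Mobile App": (0.9, 0.6),
    "Theme Toggle": (0.2, 0.3),
    "Notion Export": (0.3, 0.5),
    "Speaker Diarization": (0.4, 0.6),
}


def score(name: str) -> float:
    """Assumed scoring rule: reward impact, penalize effort equally."""
    effort, impact = FEATURES[name]
    return impact - effort


ranked = sorted(FEATURES, key=score, reverse=True)
for name in ranked:
    print(f"{score(name):+.1f}  {name}")
```

Under this rule several features land on nearly the same score, so the ordering in the middle of the list should not be read as meaningful.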
Ideas document for Aurus Voice Intelligence - Updated January 2025