A novel archival system that encodes documents as QR code frames in MP4 video files, enabling universal playback, Git-like version control, and sub-100ms semantic search.
Pixelog transforms any document into a .pixe file - an MP4 video where each frame is a QR code containing chunks of your data. This approach unlocks:
- Universal compatibility: MP4 plays on any device (phones, computers, TVs, browsers)
- Git-like version control: Delta encoding tracks changes (64% space savings)
- Sub-100ms semantic search: Vector embeddings enable meaning-based queries
- Interactive LLM chat: RAG-powered Q&A with 200+ models via OpenRouter
- Streaming architecture: Handle multi-GB files with constant 10MB memory
- Military-grade encryption: AES-256-GCM with tamper detection
- Air-gapped capable: Works completely offline
Technical Specs:
- Format: MP4 with H.264-encoded QR frames
- Density: 2.9KB per frame @ 1080p (87KB/sec)
- Error correction: Reed-Solomon (30% damage tolerance)
- Search: HNSW vector index with cosine similarity
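As a rough sanity check on the density figure, the sketch below estimates how many data frames and how much video time a file needs. It assumes "2.9KB" means 2,900 bytes and a frame rate of 30 fps (inferred from 87KB/sec divided by 2.9KB per frame; the frame rate is not stated in this README). `framesFor` and `durationSeconds` are hypothetical helper names, not part of Pixelog.

```go
package main

import "fmt"

const (
	bytesPerFrame = 2900 // assumption: "2.9KB" read as 2,900 bytes
	framesPerSec  = 30   // assumption: 87KB/sec divided by 2.9KB per frame
)

// framesFor returns the number of data frames needed for n bytes
// (ceiling division; the metadata frame is not counted).
func framesFor(n int64) int64 {
	return (n + bytesPerFrame - 1) / bytesPerFrame
}

// durationSeconds returns the playback time of those data frames.
func durationSeconds(n int64) float64 {
	return float64(framesFor(n)) / framesPerSec
}

func main() {
	const size = 1_000_000 // a 1 MB document
	fmt.Printf("%d frames, %.1f s of video\n", framesFor(size), durationSeconds(size))
	// prints "345 frames, 11.5 s of video"
}
```

Under these assumptions a 1 MB document takes about 11.5 seconds of video, which is a useful way to reason about file length before converting large inputs.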
go install github.com/ArqonAi/Pixelog/cmd/pixe@latest

Or build from source:
git clone https://github.com/ArqonAi/Pixelog.git
cd Pixelog
go build -o pixe ./cmd/pixe

# Convert document to .pixe format
pixe convert document.txt -o doc.pixe
# Build semantic search index
export OPENROUTER_API_KEY=sk-or-v1-xxx
pixe index doc.pixe
# Search by meaning
pixe search doc.pixe "machine learning concepts" --top 5
# Chat with your document
pixe chat doc.pixe

- Convert any file type to .pixe format
- Extract original files from .pixe archives
- Display file metadata and structure
- Integrity checking via SHA-256 hashing
- AES-256-GCM encryption with password
- Build vector embeddings for sub-100ms search
- Meaning-based queries (not just keyword matching)
- Interactive LLM Q&A with automatic context retrieval
- Ranked results by cosine similarity
- Create version snapshots with messages
- List all versions with timestamps
- Compare versions (frame-level changes)
- Time-travel search across historical versions
- Delta encoding (64% average space savings)
- Sub-100ms search with HNSW indexing
- Constant 10MB memory footprint (any file size)
- Streaming support for multi-GB files
- Parallel frame encoding/decoding
- AES-256-GCM authenticated encryption
- PBKDF2 key derivation (600,000 iterations)
- Reed-Solomon error correction (30% damage tolerance)
- SHA-256 frame hashing for tamper detection
- Air-gapped operation (no internet required)
pixe convert <input> -o <output.pixe> # Convert to .pixe
pixe extract <file.pixe> -o <output> # Extract from .pixe
pixe info <file.pixe> # Show file info
pixe verify <file.pixe> # Verify integrity

export OPENROUTER_API_KEY=sk-or-v1-xxx
pixe index <file.pixe> # Build index
pixe search <file.pixe> "query" --top 5 # Search
pixe chat <file.pixe> # Interactive chat
pixe chat <file.pixe> --model openai/gpt-5 # Specific model
pixe chat <file.pixe> --list # Show models

pixe version <file.pixe> -m "message" # Create version
pixe versions <file.pixe> # List versions
pixe diff <file.pixe> <v1> <v2> # Compare versions
pixe query <file.pixe> <version> "query" # Time-travel query

pixe convert file.txt -o file.pixe --encrypt --password mypass
pixe extract file.pixe -o output --password mypass
pixe index file.pixe --password mypass

# Create and index
pixe convert docs/ -o knowledge.pixe
pixe index knowledge.pixe
# Semantic search
pixe search knowledge.pixe "authentication best practices"
# Track changes
pixe version knowledge.pixe -m "Added security section"
pixe diff knowledge.pixe 1 2

# Encrypted archive
pixe convert compliance-docs/ -o audit.pixe --encrypt --password xxx
# Track all changes
pixe versions audit.pixe
# Time-travel query
pixe query audit.pixe 1 "Q1 data retention policy"
# Verify integrity
pixe verify audit.pixe --password xxx

# Index papers
pixe convert papers/ -o research.pixe
pixe index research.pixe
# Semantic citation search
pixe search research.pixe "transformer attention mechanisms"
# Chat with research
pixe chat research.pixe

# Encrypted, air-gapped storage
pixe convert classified/ -o vault.pixe --encrypt --password xxx
pixe verify vault.pixe --password xxx
pixe extract vault.pixe -o restored/ --password xxx

# Streaming for multi-GB codebases
pixe convert monorepo.tar.gz -o codebase.pixe
# Auto-streaming: 2.5 GB with 10MB RAM
# Version control
pixe version codebase.pixe -m "Release v2.0"
# Semantic code search
pixe search codebase.pixe "authentication middleware"

Document → Chunks (2.9KB) → Encryption → QR Codes → MP4 Frames → .pixe File
Each .pixe file is an MP4 video:
- Frame 0: Metadata (file info, encryption params, version history)
- Frame 1+: QR-encoded data chunks
- Audio track: Silent (the MP4 container itself does not require one)
pixelog/
├── cmd/pixe/ # CLI (12 commands)
├── internal/
│ ├── converter/ # Document → .pixe
│ ├── crypto/ # AES-256-GCM
│ ├── qr/ # QR generation
│ ├── video/ # MP4 creation/extraction
│ ├── index/ # Semantic search
│ │ ├── indexer.go # HNSW vector index
│ │ ├── embedder.go # OpenRouter embeddings
│ │ └── delta.go # Version control
│ └── llm/ # LLM client (OpenRouter)
├── pkg/config/ # Configuration
├── docs/ # Documentation
└── examples/ # Usage examples
| Operation | Time | Notes |
|---|---|---|
| Index Build | 136ms | One-time per file |
| Semantic Search | <100ms | With 1000+ frames |
| Frame Extraction | 20ms | Direct FFmpeg seek |
| LLM Chat Response | <200ms | Excl. LLM latency |
| Version Creation | 85ms | Delta calculation |
| Integrity Check | 50ms/frame | Parallel decoding |
- Delta encoding: 64% space savings
- GZIP compression: 75% reduction
- Combined: ~80% smaller than raw storage
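To illustrate where delta savings can come from, here is a minimal sketch of frame-level change detection using SHA-256 frame hashes. It is an illustration under stated assumptions, not Pixelog's actual delta algorithm: `changedFrames` is a hypothetical helper, frames are compared by position only, and frames removed from the end of a version are not reported.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// changedFrames returns the indexes of frames in next that are new or whose
// SHA-256 hash differs from the frame at the same position in prev.
// Only these frames would need to be stored for the new version.
func changedFrames(prev, next [][]byte) []int {
	var changed []int
	for i, frame := range next {
		if i >= len(prev) || sha256.Sum256(frame) != sha256.Sum256(prev[i]) {
			changed = append(changed, i)
		}
	}
	return changed
}

func main() {
	v1 := [][]byte{[]byte("intro"), []byte("setup"), []byte("usage")}
	v2 := [][]byte{[]byte("intro"), []byte("setup v2"), []byte("usage"), []byte("faq")}

	delta := changedFrames(v1, v2)
	reused := 1 - float64(len(delta))/float64(len(v2))
	fmt.Printf("store frames %v only (%.0f%% of frames reused)\n", delta, reused*100)
}
```

In this toy example two of four frames are unchanged, so half of the new version can reference existing frames. Real savings depend on how edits shift chunk boundaries.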
| File Size | Traditional | Pixelog Streaming |
|---|---|---|
| 10 MB | 10 MB RAM | 10 MB RAM |
| 100 MB | 100 MB RAM | 10 MB RAM |
| 1 GB | 1 GB RAM | 10 MB RAM |
| 10 GB | 10 GB RAM | 10 MB RAM |
Streaming auto-enables for files >100MB.
- Algorithm: AES-256-GCM (authenticated encryption)
- Key Derivation: PBKDF2 (600,000 iterations, SHA-256)
- Salt: 32-byte random per file
- Nonce: 12-byte random per operation
- Auth Tag: 16-byte for tamper detection
- Reed-Solomon codes: 30% damage tolerance per frame
- QR Error Correction: Level H (highest)
- Data recovery: Possible even if portions of the video are corrupted
.pixe File (MP4 Container)
├── Video Track (H.264)
│ ├── Frame 0: Metadata
│ └── Frame 1+: [32B salt][12B nonce][encrypted data][16B auth tag]
└── Audio Track (silent)
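The frame payload layout above can be sketched with the Go standard library. This is a hedged illustration, not Pixelog's `internal/crypto` code: `deriveKey`, `seal`, and `open` are hypothetical names, PBKDF2 is written out by hand for a single 32-byte block to stay within the standard library, and a fresh salt is generated per call to match the layout shown (a real implementation that uses one salt per file would derive the key once and reuse it across frames).

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/hmac"
	"crypto/rand"
	"crypto/sha256"
	"errors"
	"fmt"
)

const (
	saltLen    = 32
	nonceLen   = 12
	tagLen     = 16
	iterations = 600_000
)

// deriveKey is PBKDF2-HMAC-SHA256 for one 32-byte output block (RFC 8018).
func deriveKey(password string, salt []byte, iter int) []byte {
	mac := hmac.New(sha256.New, []byte(password))
	mac.Write(salt)
	mac.Write([]byte{0, 0, 0, 1}) // block index 1, big-endian
	u := mac.Sum(nil)
	key := append([]byte(nil), u...)
	for i := 1; i < iter; i++ {
		mac.Reset()
		mac.Write(u)
		u = mac.Sum(u[:0])
		for j := range key {
			key[j] ^= u[j]
		}
	}
	return key
}

func newGCM(key []byte) (cipher.AEAD, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// seal returns [salt][nonce][ciphertext][16B auth tag] for one chunk.
func seal(password string, chunk []byte, iter int) ([]byte, error) {
	header := make([]byte, saltLen+nonceLen)
	if _, err := rand.Read(header); err != nil {
		return nil, err
	}
	salt, nonce := header[:saltLen], header[saltLen:]
	gcm, err := newGCM(deriveKey(password, salt, iter))
	if err != nil {
		return nil, err
	}
	// Seal appends ciphertext followed by the auth tag to header.
	return gcm.Seal(header, nonce, chunk, nil), nil
}

// open reverses seal and fails if the frame was altered or the password is wrong.
func open(password string, frame []byte, iter int) ([]byte, error) {
	if len(frame) < saltLen+nonceLen+tagLen {
		return nil, errors.New("frame too short")
	}
	salt, nonce := frame[:saltLen], frame[saltLen:saltLen+nonceLen]
	gcm, err := newGCM(deriveKey(password, salt, iter))
	if err != nil {
		return nil, err
	}
	return gcm.Open(nil, nonce, frame[saltLen+nonceLen:], nil)
}

func main() {
	frame, err := seal("mypass", []byte("chunk of document data"), iterations)
	if err != nil {
		panic(err)
	}
	plain, err := open("mypass", frame, iterations)
	fmt.Printf("%d-byte frame, decrypted %q, err: %v\n", len(frame), plain, err)

	frame[len(frame)-1] ^= 1 // flip one bit of the auth tag
	_, err = open("mypass", frame, iterations)
	fmt.Println("tampered frame rejected:", err != nil)
}
```

The GCM auth tag is what provides tamper detection here: changing any byte of the ciphertext, nonce, or tag makes `open` return an error instead of corrupted plaintext.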
Pixelog uses OpenRouter for embeddings and LLM chat (200+ models, one API key).
Get free key: https://openrouter.ai/keys
export OPENROUTER_API_KEY=sk-or-v1-xxx

| Rank | Model | Cost | Speed | Description |
|---|---|---|---|---|
| 1 | DeepSeek R1 | $0.14/1M | Fast | Best value (default) |
| 2 | Gemini 2.5 Flash | FREE | Very Fast | Latest Google, free |
| 3 | Gemini 2.5 Pro | $0.50/1M | Medium | Best Gemini |
| 4 | GPT-5 | $2.50/1M | Medium | Latest OpenAI |
| 5 | Claude 4.5 Sonnet | $3.00/1M | Medium | Best reasoning |
| 6 | Grok 3 | $5.00/1M | Fast | Real-time data |
| 7 | Llama 3.3 70B | $0.18/1M | Fast | Open source |
| 8 | Qwen 2.5 72B | $0.18/1M | Fast | Multilingual |
| 9 | Mistral Large | $2.00/1M | Fast | European |
| 10 | GPT-4o | $0.75/1M | Fast | Multimodal |
# List models
pixe chat doc.pixe --list
# Default (DeepSeek R1)
pixe chat doc.pixe
# Free tier (Gemini)
pixe chat doc.pixe --model google/gemini-2.5-flash-latest
# Premium (GPT-5)
pixe chat doc.pixe --model openai/gpt-5

| Operation | Model | Cost | Notes |
|---|---|---|---|
| Embeddings (indexing) | text-embedding-3-large | $0.02/1M | One-time |
| Search queries | text-embedding-3-large | $0.0001/query | Per query |
| Chat (default) | deepseek/deepseek-r1 | $0.14/1M | Best value |
| Chat (free) | gemini-2.5-flash | FREE | Free tier |
Example: Index 1,000 docs ($2) + 10,000 searches ($1) + 1M tokens chat ($0.14 or FREE)
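The example figures can be reproduced with a small calculation. The sketch below uses the prices from the table above and assumes the 1,000 documents average 100,000 tokens each (100M tokens in total), which is what the $2 indexing figure implies at $0.02 per 1M tokens. `estimate` is a hypothetical helper for illustration.

```go
package main

import "fmt"

// Prices in USD, copied from the cost table.
const (
	embedPerMillionTokens = 0.02
	searchPerQuery        = 0.0001
	chatPerMillionTokens  = 0.14 // DeepSeek R1; 0 on the free Gemini tier
)

// estimate returns the total cost in USD for a workload.
func estimate(indexTokens, searches, chatTokens float64) float64 {
	return indexTokens/1e6*embedPerMillionTokens +
		searches*searchPerQuery +
		chatTokens/1e6*chatPerMillionTokens
}

func main() {
	// Assumption: 1,000 docs x 100,000 tokens each = 100M tokens to index.
	total := estimate(1000*100_000, 10_000, 1_000_000)
	fmt.Printf("$%.2f\n", total) // prints "$3.14"
}
```

With the free chat tier the last term drops out, leaving $3.00 for indexing and search.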
- Universal compatibility: MP4 plays everywhere
- Built-in streaming: Progressive loading
- Frame-level access: Direct seek without loading full file
- Visual inspection: See data as scannable QR codes
- Novel use cases: Video-based data transmission
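Frame-level access means a byte offset in the original document maps directly to a frame and a seek timestamp. The sketch below shows that mapping under two assumptions not stated in this README: 2,900 bytes of payload per data frame and 30 fps. `frameFor` and `seekSeconds` are hypothetical names.

```go
package main

import "fmt"

const (
	bytesPerFrame = 2900 // assumption: "2.9KB" read as 2,900 bytes
	framesPerSec  = 30   // assumption: 87KB/sec divided by 2.9KB per frame
)

// frameFor returns the video frame holding the given byte offset.
// Frame 0 is metadata, so data starts at frame 1.
func frameFor(offset int64) int64 {
	return 1 + offset/bytesPerFrame
}

// seekSeconds returns the timestamp at which that frame is displayed.
func seekSeconds(frame int64) float64 {
	return float64(frame) / framesPerSec
}

func main() {
	frame := frameFor(1_000_000)
	fmt.Printf("byte 1000000 is in frame %d at %.2fs\n", frame, seekSeconds(frame))
	// prints "byte 1000000 is in frame 345 at 11.50s"
}
```

A player or extractor can then seek to that timestamp and decode a single frame rather than reading the whole file.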
An OpenRouter API key is optional. Core operations work offline:
- Convert, extract, verify, version control: No API needed
Required for:
- Semantic search (indexing + search)
- LLM chat
Get free key: https://openrouter.ai/keys
Pixelog encrypts with AES-256-GCM. AES-256 is the cipher the NSA approves for protecting classified information up to Top Secret.
- 600,000 PBKDF2 iterations (brute-force protection)
- Authenticated encryption (tamper detection)
- Air-gapped operation (works offline)
- Encryption suitable for HIPAA, SOC 2, and ISO 27001 data-protection requirements
Most features work offline:
- Offline: Convert, extract, encrypt/decrypt, verify, version control
- Online: Semantic search, LLM chat (requires OpenRouter API)
There is no practical file size limit, because large files are streamed:
- Small files (<100MB): Loaded into memory
- Large files (>100MB): Auto-streaming mode
- Memory: Constant 10MB footprint
- Tested: Up to 10GB files
All file types are supported: documents, code, archives, media, databases, binaries. Pixelog is format-agnostic.
Search completes in under 100ms:
- Index build: 136ms (one-time)
- Search query: <100ms (1000+ frames)
- Total: Query → Results in <100ms
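For intuition about what a semantic query does, the sketch below ranks chunk embeddings by cosine similarity with a plain linear scan. It is deliberately not HNSW: an HNSW index returns approximately the same ranking while visiting far fewer vectors. The three-dimensional vectors and the `cosine`, `hit`, and `topK` names are illustrative assumptions; real embeddings have hundreds or thousands of dimensions.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// cosine returns the cosine similarity of two equal-length vectors.
func cosine(a, b []float64) float64 {
	var dot, na, nb float64
	for i := range a {
		dot += a[i] * b[i]
		na += a[i] * a[i]
		nb += b[i] * b[i]
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// hit pairs a chunk index with its similarity score.
type hit struct {
	Chunk int
	Score float64
}

// topK scores every chunk embedding against the query and returns the k best.
func topK(query []float64, chunks [][]float64, k int) []hit {
	hits := make([]hit, len(chunks))
	for i, c := range chunks {
		hits[i] = hit{Chunk: i, Score: cosine(query, c)}
	}
	sort.Slice(hits, func(i, j int) bool { return hits[i].Score > hits[j].Score })
	if k > len(hits) {
		k = len(hits)
	}
	return hits[:k]
}

func main() {
	chunks := [][]float64{{1, 0, 0}, {0.8, 0.6, 0}, {0, 1, 0}}
	for _, h := range topK([]float64{1, 0.1, 0}, chunks, 2) {
		fmt.Printf("chunk %d score %.3f\n", h.Chunk, h.Score)
	}
}
```

The linear scan costs one dot product per stored chunk, which is why an approximate index matters once an archive holds many thousands of frames.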
package main

import (
	"fmt"
	"log"
	"os"
	"github.com/ArqonAi/Pixelog/internal/converter"
	"github.com/ArqonAi/Pixelog/internal/index"
	"github.com/ArqonAi/Pixelog/internal/llm"
)

// check aborts on the first error instead of silently discarding it.
func check(err error) {
	if err != nil {
		log.Fatal(err)
	}
}

func main() {
	apiKey := os.Getenv("OPENROUTER_API_KEY") // needed for indexing, search, and chat

	// Convert
	conv, err := converter.New("./output")
	check(err)
	conv.ConvertFile("doc.txt", &converter.ConvertOptions{
		OutputPath:    "doc.pixe",
		EncryptionKey: "password",
	})

	// Index
	embedder := index.NewSimpleEmbedder("openrouter", apiKey, "auto")
	indexer, err := index.NewIndexer("./indexes", embedder)
	check(err)
	idx, err := indexer.BuildIndex("doc", "doc.pixe")
	check(err)

	// Search
	results, err := indexer.Search(idx, "query", 5)
	check(err)
	fmt.Println(results)

	// Version control
	deltaManager, err := index.NewDeltaManager("./deltas", indexer)
	check(err)
	deltaManager.CreateVersion("doc", "doc.pixe", "Initial", "user")

	// LLM chat
	client := llm.NewClient("deepseek/deepseek-r1", apiKey)
	response, err := client.Chat("Explain main concepts")
	check(err)
	fmt.Println(response)
}

See CONTRIBUTING.md
git checkout -b feature/amazing-feature
./test_e2e.sh
git commit -m "feat: Add amazing feature"
git push origin feature/amazing-feature

Apache License 2.0 - see LICENSE
Made by ArqonAi
Turn documents into videos. Search at the speed of thought. Track changes like Git. Chat with AI.