

DMARC Aggregate Report Processor

Enterprise-Grade DMARC Analytics Platform

A production-ready enterprise platform that ingests, processes, and analyzes DMARC aggregate reports with advanced ML-powered threat detection, distributed task processing, and comprehensive security features.


🚀 Features

Core Functionality

  • 📧 Automated DMARC Report Ingestion - IMAP inbox monitoring with Celery task queue
  • 📤 Bulk File Upload - Drag-and-drop 50-200 reports simultaneously
  • 🔄 Idempotent Processing - SHA256-based duplicate prevention
  • 💾 PostgreSQL Storage - Production-grade relational database
  • 🔐 JWT Authentication - Role-based access control (Admin/Analyst/Viewer)
  • 🚀 RESTful API - FastAPI with auto-generated documentation
  • ✅ Comprehensive Testing - 70%+ code coverage enforced
  • 🐳 Docker Deployment - Single-command orchestration
  • 🔔 Multi-Channel Alerting - Email, Slack, Discord, Microsoft Teams
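The idempotent processing above comes down to hashing the raw report bytes before anything else happens. A minimal sketch (an in-memory set stands in for the platform's actual database-backed duplicate check; function names are illustrative, not the project's):

```python
import hashlib

_seen_fingerprints: set[str] = set()  # stand-in for a DB unique index on the digest

def report_fingerprint(raw: bytes) -> str:
    """SHA256 hex digest of the raw report file, used as the dedup key."""
    return hashlib.sha256(raw).hexdigest()

def ingest_report(raw: bytes) -> bool:
    """Ingest a report exactly once; re-submitting the same bytes is a no-op."""
    fp = report_fingerprint(raw)
    if fp in _seen_fingerprints:
        return False  # duplicate: already processed, skip silently
    _seen_fingerprints.add(fp)
    # ... parse and store the report here ...
    return True
```

Because the key is derived from the file contents, the same report fetched twice (once by IMAP ingestion, once by manual upload) is still only stored once.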

🎯 Enterprise Features (NEW)

Phase 1: Distributed Task Processing

  • ⚡ Celery + Redis Queue - Asynchronous background job processing
  • 📅 Celery Beat Scheduler - Automated periodic tasks
    • Email ingestion every 15 minutes
    • Report processing every 5 minutes
    • Alert checks hourly
    • ML model training weekly
  • 🌸 Flower Dashboard - Real-time task monitoring at :5555
  • 🔄 Retry Logic - Exponential backoff with 3 attempts
  • 📊 Task Tracking - PostgreSQL result backend
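The retry schedule (exponential backoff over 3 attempts) looks like this in miniature; the base delay and cap here are illustrative values, not the project's actual settings, and Celery can produce the same behaviour natively via its `retry_backoff` task option:

```python
def retry_delays(attempts: int = 3, base: float = 2.0, cap: float = 300.0) -> list[float]:
    """Seconds to wait before each retry: base * 2**n, capped at `cap`."""
    return [min(base * (2 ** n), cap) for n in range(attempts)]

print(retry_delays())  # [2.0, 4.0, 8.0]
```

Doubling the delay on each attempt gives a failing downstream service (IMAP server, webhook endpoint) time to recover instead of hammering it at a fixed interval.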

Phase 2: Authentication & Authorization

  • 🔑 JWT Authentication - Access tokens (15 min) + refresh tokens (7 days)
  • 👥 Role-Based Access Control - Admin, Analyst, Viewer roles
  • 🔐 API Key Management - Per-user API keys with SHA256 hashing
  • 🛡️ Password Security - bcrypt hashing (12 rounds)
  • 📝 User Management - Admin-only user creation (no self-registration)
  • 🔄 Token Refresh - Seamless token renewal
  • 📋 Audit Trail - User action tracking
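Per-user API keys with SHA256 hashing typically work as sketched below: the plaintext key is shown once at creation, only its digest is stored, and verification compares digests in constant time. This is an illustrative sketch, not the platform's actual code:

```python
import hashlib
import hmac
import secrets

def generate_api_key() -> tuple[str, str]:
    """Return (plaintext_key, stored_digest); only the digest is persisted."""
    key = secrets.token_urlsafe(32)  # shown to the user exactly once
    return key, hashlib.sha256(key.encode()).hexdigest()

def verify_api_key(presented: str, stored_digest: str) -> bool:
    """Hash the presented key and compare digests in constant time."""
    candidate = hashlib.sha256(presented.encode()).hexdigest()
    return hmac.compare_digest(candidate, stored_digest)
```

Storing only the digest means a database leak does not expose usable credentials, and `hmac.compare_digest` avoids timing side channels during verification.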

Phase 3: Enhanced Alerting

  • 🎯 Alert Lifecycle - Created → Acknowledged → Resolved
  • 🔕 Deduplication - SHA256 fingerprinting with cooldown periods
  • ⏰ Alert Suppressions - Time-based muting for maintenance windows
  • 📊 Alert History - Persistent storage with full lifecycle tracking
  • 📝 Configurable Rules - UI-based threshold management
  • 🔔 Teams Priority - Microsoft Teams notifications sent first
  • 📈 Alert Analytics - Trends, resolution times, acknowledgment rates
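Alert deduplication with SHA256 fingerprinting and a cooldown can be sketched like this (the cooldown length and fingerprint fields are illustrative assumptions):

```python
import hashlib

COOLDOWN_SECONDS = 3600  # illustrative; real rules would make this configurable
_last_sent: dict[str, float] = {}

def alert_fingerprint(rule: str, domain: str) -> str:
    """Stable SHA256 identity for 'the same alert firing again'."""
    return hashlib.sha256(f"{rule}|{domain}".encode()).hexdigest()

def should_notify(rule: str, domain: str, now: float) -> bool:
    """Send at most one notification per fingerprint per cooldown window."""
    fp = alert_fingerprint(rule, domain)
    last = _last_sent.get(fp)
    if last is not None and now - last < COOLDOWN_SECONDS:
        return False  # still cooling down: suppress the duplicate
    _last_sent[fp] = now
    return True
```

A noisy rule that fires every few minutes then produces one notification per hour instead of a flood, while a different rule or domain gets its own fingerprint and fires independently.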

Phase 4: ML Analytics & Geolocation

  • 🤖 Anomaly Detection - Isolation Forest ML model for suspicious IPs
  • 🌍 IP Geolocation - MaxMind GeoLite2 offline mapping
  • 🗺️ Country Heatmaps - Geographic visualization of email sources
  • 📊 Model Management - Training, versioning, deployment
  • 🔄 Automated Training - Weekly ML model updates (Sunday 2 AM)
  • 🎯 Daily Detection - Automatic anomaly scanning (3 AM)
  • 💾 90-Day Caching - Efficient geolocation data caching
  • 📈 Prediction History - ML prediction tracking and analytics
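Isolation Forest anomaly detection on per-IP statistics can be sketched with scikit-learn (already in the stack). The feature choice here — message volume and DMARC failure rate — is a simplified assumption for illustration:

```python
import numpy as np
from sklearn.ensemble import IsolationForest

# Toy per-source-IP features: [daily message volume, DMARC failure rate].
rng = np.random.default_rng(42)
normal_traffic = np.column_stack([
    rng.normal(100, 10, size=500),     # typical daily volume
    rng.normal(0.05, 0.01, size=500),  # typical failure rate
])

model = IsolationForest(contamination=0.01, random_state=42)
model.fit(normal_traffic)

# A spoofing burst: huge volume, almost everything failing DMARC.
suspicious = np.array([[5000.0, 0.95]])
print(model.predict(suspicious))  # -1 means "anomaly"
```

Isolation Forest needs no labeled attack data, which fits DMARC traffic well: it learns what "normal" sending looks like and flags IPs that are cheap to isolate from that baseline.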

Performance & Caching

  • ⚡ Redis Caching - 90%+ hit rate, sub-200ms response times
  • 🔧 Query Optimization - N+1 query elimination, indexed lookups
  • 📈 Auto-Invalidation - Cache clearing on new data
  • 🔄 Connection Pooling - Optimized database and cache connections
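The cache-aside pattern with auto-invalidation reduces to two operations; a plain dict stands in for Redis here, and the key layout is an illustrative assumption:

```python
_cache: dict[str, object] = {}  # stand-in for Redis

def cached_summary(domain: str, compute) -> object:
    """Cache-aside read: return the cached rollup, or compute and store it."""
    key = f"rollup:summary:{domain}"
    if key not in _cache:
        _cache[key] = compute(domain)
    return _cache[key]

def invalidate_domain(domain: str) -> None:
    """Called after new reports are ingested so stale rollups get recomputed."""
    _cache.pop(f"rollup:summary:{domain}", None)
```

Invalidation is scoped per domain, so ingesting a report for one domain never forces every other domain's dashboard to recompute.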

Visualizations

  • 📊 8 Interactive Charts:
    • DMARC results timeline (line chart)
    • Results by domain (bar chart)
    • Top source IPs (bar chart)
    • Disposition breakdown (pie chart)
    • SPF/DKIM alignment breakdown (stacked bar)
    • Policy compliance (doughnut chart)
    • Failure rate trend with moving average (line chart)
    • Top sending organizations (horizontal bar)

Advanced Filtering

  • πŸ” Source IP - Exact match or CIDR ranges
  • πŸ” Authentication - DKIM/SPF pass/fail
  • πŸ“‹ Disposition - None/Quarantine/Reject
  • 🏒 Organization - Sending organization filter
  • πŸ“… Date Range - Custom or preset ranges
  • 🌐 Domain - Multi-domain filtering
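Source-IP filtering with exact matches or CIDR ranges is a one-liner with the standard library's `ipaddress` module; the function name below is illustrative:

```python
import ipaddress

def matches_source_filter(ip: str, source_filter: str) -> bool:
    """True if `ip` equals the filter exactly or falls inside a CIDR range."""
    addr = ipaddress.ip_address(ip)
    if "/" in source_filter:
        # strict=False accepts ranges written with host bits set, e.g. 192.0.2.5/24
        return addr in ipaddress.ip_network(source_filter, strict=False)
    return addr == ipaddress.ip_address(source_filter)
```

Because `ipaddress` handles both IPv4 and IPv6, the same filter logic covers `2001:db8::/32` without extra code.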

Export Capabilities

  • 📄 CSV Exports - Reports, records, sources
  • 📑 PDF Reports - Professional summary with charts
  • 🔒 Rate Limiting - 10/min CSV, 5/min PDF
  • 🛡️ Security - CSV formula injection prevention

🛠️ Tech Stack

Backend

  • Framework: Python 3.11 + FastAPI
  • Task Queue: Celery + Redis
  • ML/Analytics: scikit-learn, NumPy, pandas
  • Geolocation: MaxMind GeoLite2 + geoip2
  • Auth: JWT (PyJWT), bcrypt
  • Database: PostgreSQL 15 + SQLAlchemy 2.0
  • Cache: Redis 7 (Alpine)
  • PDF: ReportLab

Frontend

  • Stack: Vanilla HTML/CSS/JS + Chart.js v4.4.0
  • Charts: Chart.js for visualizations
  • Web Server: Nginx (reverse proxy)

Infrastructure

  • Orchestration: Docker Compose
  • Services: Backend, Celery Worker, Celery Beat, PostgreSQL, Redis, Nginx, Flower
  • Monitoring: Flower dashboard for Celery tasks

📋 Prerequisites

Required

  • Docker & Docker Compose
  • Python 3.8+ (for the setup script)

Optional

  • MaxMind GeoLite2 account (free; for IP geolocation maps)
  • Email account with IMAP access (for automated report ingestion)
  • Microsoft Teams/Slack webhooks (for alerts)

Quick Start

git clone <repo-url>
cd dmarc
make setup

That's it. The interactive setup will:

  1. Generate a .env file with secure secrets
  2. Ask for your admin email and password
  3. Optionally configure email ingestion and geolocation
  4. Start all 7 Docker services
  5. Run database migrations and create your admin account

When it finishes, open http://localhost and log in.

Non-interactive mode (for CI/automation):

ADMIN_EMAIL=admin@co.com ADMIN_PASSWORD=secret SKIP_EMAIL=1 SKIP_GEO=1 make setup

Browser-based setup: If you prefer, just run docker compose up -d and open http://localhost; a setup wizard will guide you through the same steps in your browser.

Services

Service      Port   Description
Dashboard    80     Web UI (Nginx)
Backend API  8000   FastAPI + Swagger docs at /docs
Flower       5555   Celery task monitoring
PostgreSQL   5433   Database
Redis        6379   Cache & message broker

Advanced Setup

If you need more control, see the manual steps:

  1. Copy and edit the environment file: cp .env.example .env
  2. Generate a JWT secret: python -c "import secrets; print(secrets.token_urlsafe(64))"
  3. Start services: docker compose up -d --build
  4. Run migrations: docker compose exec backend alembic upgrade head
  5. Create admin user: bash scripts/init-db.sh --create-admin
  6. Access the dashboard at http://localhost

πŸ” Authentication

The platform uses JWT-based authentication. Access tokens expire after 15 minutes, and refresh tokens last 7 days.

Login (Get JWT Token)

curl -X POST http://localhost:8000/auth/login \
  -H "Content-Type: application/json" \
  -d '{
    "email": "admin@example.com",
    "password": "your-password"
  }'

Response:

{
  "access_token": "eyJ0eXAiOiJKV1QiLCJhbGc...",
  "refresh_token": "eyJ0eXAiOiJKV1QiLCJhbGc...",
  "token_type": "bearer"
}

Use the Bearer Token in API Calls

Include the access token in the Authorization header for all protected endpoints:

curl -H "Authorization: Bearer <access_token>" http://localhost:8000/api/reports

# Example: Get report summary
curl -H "Authorization: Bearer <access_token>" http://localhost:8000/api/rollup/summary

# Example: Upload a report
curl -X POST http://localhost:8000/api/upload \
  -H "Authorization: Bearer <access_token>" \
  -F "files=@report.xml.gz"

Refresh an Expired Token

When your access token expires (after 15 minutes), use the refresh token to obtain a new one without re-entering credentials:

curl -X POST http://localhost:8000/auth/refresh \
  -H "Content-Type: application/json" \
  -d '{
    "refresh_token": "<refresh_token>"
  }'

Response (new access token):

{
  "access_token": "eyJ0eXAiOiJKV1QiLCJhbGc...",
  "token_type": "bearer"
}

API Key Authentication

For automated scripts and integrations, API keys provide long-lived authentication without token refresh.

Generate an API key:

curl -X POST http://localhost:8000/users/api-keys \
  -H "Authorization: Bearer <access_token>" \
  -H "Content-Type: application/json" \
  -d '{"name": "my-script"}'

Use the API key:

curl -H "X-API-Key: <api_key>" http://localhost:8000/api/reports

Logout

Invalidate your tokens:

curl -X POST http://localhost:8000/auth/logout \
  -H "Authorization: Bearer <access_token>"

📑 API Endpoints

Authentication (/auth)

  • POST /auth/login - Login with email/password
  • POST /auth/refresh - Refresh access token
  • POST /auth/logout - Logout (invalidate tokens)

Users (/users)

  • GET /users/me - Get current user profile
  • GET /users - List all users (admin)
  • POST /users - Create user (admin)
  • PATCH /users/{id} - Update user (admin)
  • DELETE /users/{id} - Delete user (admin)
  • POST /users/api-keys - Generate API key

Core DMARC (/api)

  • GET /api/domains - List domains
  • GET /api/reports - List reports (paginated)
  • GET /api/reports/{id} - Get report details
  • POST /api/upload - Bulk file upload

Analytics & Rollup (/api/rollup)

  • GET /api/rollup/summary - Aggregate statistics
  • GET /api/rollup/sources - Top source IPs
  • GET /api/rollup/alignment - DKIM/SPF alignment
  • GET /api/rollup/timeline - Time-series data
  • GET /api/rollup/failure-trend - Failure rate trends

Exports (/api/export)

  • GET /api/export/reports/csv - Export reports CSV
  • GET /api/export/records/csv - Export records CSV
  • GET /api/export/sources/csv - Export sources CSV
  • GET /api/export/report/pdf - Generate PDF summary

Alerts (/alerts)

  • GET /alerts/history - Alert history
  • GET /alerts/rules - Alert rules
  • POST /alerts/rules - Create rule (admin)
  • PATCH /alerts/{id}/acknowledge - Acknowledge alert
  • PATCH /alerts/{id}/resolve - Resolve alert
  • POST /alerts/suppressions - Create suppression

ML Analytics (/analytics)

  • GET /analytics/geolocation/map - Country heatmap
  • GET /analytics/geolocation/lookup/{ip} - IP geolocation
  • GET /analytics/ml/models - List ML models
  • POST /analytics/ml/train - Train model (admin)
  • POST /analytics/ml/deploy - Deploy model (admin)
  • POST /analytics/anomalies/detect - Detect anomalies
  • GET /analytics/anomalies/recent - Recent predictions

Tasks (/tasks)

  • POST /tasks/trigger/email-ingestion - Trigger email fetch
  • POST /tasks/trigger/process-reports - Process pending reports
  • GET /tasks/status/{task_id} - Get task status

🎯 Role-Based Access

Role     Permissions
Admin    Full access: users, models, rules, all data
Analyst  Read/write: reports, alerts, analytics
Viewer   Read-only: dashboards, reports, analytics

📊 Monitoring

Flower Dashboard (Celery Tasks)

Access at http://localhost:5555

Monitors:

  • Active tasks
  • Task history
  • Worker status
  • Task schedules (Beat)

Scheduled Tasks

# View all schedules
docker compose exec celery-beat celery -A app.celery_app inspect scheduled

# Force run a task
docker compose exec celery-worker celery -A app.celery_app call \
  app.tasks.ml_tasks.train_anomaly_model_task

🧪 Testing

# Run all tests with coverage
docker compose exec backend pytest -v --cov=app

# Run specific test suite
docker compose exec backend pytest tests/unit/ -v
docker compose exec backend pytest tests/integration/ -v

# Generate HTML coverage report
docker compose exec backend pytest --cov=app --cov-report=html

Coverage: 70%+ enforced in CI/CD




πŸ—οΈ Architecture

┌─────────────┐     ┌──────────────┐     ┌────────────┐
│   Nginx     │────▶│   Backend    │────▶│ PostgreSQL │
│   (Port 80) │     │  (FastAPI)   │     │    (DB)    │
└─────────────┘     └──────┬───────┘     └────────────┘
                           │
                           ▼
                    ┌─────────────┐     ┌──────────────┐
                    │    Redis    │◀───▶│Celery Worker │
                    │   (Broker)  │     │   + Beat     │
                    └─────────────┘     └──────────────┘
                           │
                           ▼
                    ┌─────────────┐
                    │   Flower    │
                    │  (Monitor)  │
                    └─────────────┘

🔧 Development

# View logs
docker compose logs -f backend
docker compose logs -f celery-worker

# Rebuild after code changes
docker compose up --build -d backend

# Create new migration
docker compose exec backend alembic revision --autogenerate -m "description"

# Reset database (WARNING: deletes all data)
docker compose down -v
docker compose up -d
docker compose exec backend alembic upgrade head
docker compose exec backend python scripts/create_admin_user.py

🚒 Production Deployment

See backend/DEPLOYMENT.md for:

  • SSL/TLS with Let's Encrypt
  • Database backups
  • Security hardening
  • Performance tuning
  • Monitoring setup

📈 System Requirements

Minimum:

  • CPU: 2 cores
  • RAM: 4GB
  • Storage: 10GB

Recommended:

  • CPU: 4+ cores
  • RAM: 8GB
  • Storage: 50GB+ (depends on volume)

📄 License

MIT


Version: 2.0.0 (Enterprise Edition)
Last Updated: January 2026
Status: ✅ Production Ready
