A FastAPI application for identifying and cataloguing police data sources. Part of the Police Data Accessibility Project (PDAP).
The Source Manager collects URLs from various sources, enriches them with metadata using automated tasks and ML models, supports human annotation for validation, and synchronizes approved data sources to the Data Sources App.
```bash
# Install dependencies
uv sync

# Start the local database
cd local_database && docker compose up -d && cd ..

# Create a .env file (see ENV.md for all variables)
# At minimum, set the POSTGRES_* variables to match local_database defaults.

# Run the app
fastapi dev main.py
```

Then open http://localhost:8000/api for the interactive API docs.
Note: accessing API endpoints requires a valid Bearer token from the Data Sources API.
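
As a rough illustration of the note above, the sketch below attaches a Bearer token to a request aimed at the local dev server. It uses only the Python standard library. The `/example` path and the `build_authorized_request` helper are hypothetical and not part of this project; how you obtain the token from the Data Sources API is not covered here.

```python
from urllib.request import Request

# Local dev server address from the quick start above.
BASE_URL = "http://localhost:8000"


def build_authorized_request(path: str, token: str) -> Request:
    """Build a GET request that carries the token in the Authorization header.

    Hypothetical helper for illustration; `path` must be a real route of the app.
    """
    request = Request(f"{BASE_URL}{path}", method="GET")
    request.add_header("Authorization", f"Bearer {token}")
    return request


# "/example" is a placeholder path, not an actual endpoint of this API.
request = build_authorized_request("/example", "your-token-here")
```

With the app running, the resulting object can be passed to `urllib.request.urlopen`; a missing or invalid token would be expected to produce an authorization error rather than data.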
| Document | Description |
|---|---|
| Architecture | System design, module structure, task system, data flow |
| API Reference | All 65 endpoints across 15 route groups |
| Development Guide | Local setup, environment variables, common workflows |
| Testing Guide | Running tests, CI pipeline, writing new tests |
| Deployment | Docker, Alembic migrations, DS App synchronization |
| Collectors | Collector architecture and how to build new ones |
| Environment Variables | Full reference for all env vars and feature flags |
```
src/
├── api/          # FastAPI routers and endpoint logic
├── core/         # Integration layer and task system
├── db/           # SQLAlchemy models, async DB client, queries
├── collectors/   # Pluggable URL collection strategies
├── external/     # Clients for external services (HuggingFace, PDAP, etc.)
├── security/     # JWT auth and permissions
└── util/         # Shared helpers
```
Thank you for your interest in contributing to this project! Please follow these guidelines:
- These Design Principles may be used to make decisions or guide your work.
- If you want to work on something, create an issue first so the broader community can discuss it.
- If you make a utility, script, app, or other useful bit of code, put it in a top-level directory with an appropriate name and a dedicated README, and add it to the index.
Docstrings and type hints are checked via a GitHub Action (python_checks.yml) using pydocstyle and mypy. These produce advisory PR comments and do not block merges.
Note: python_checks.yml only runs on pull requests from within the repo, not from forks.
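
To give a sense of what those checks look for, here is a minimal sketch of a helper with a module docstring, a function docstring, and full type hints. The `normalize_url` function is a hypothetical example, not code from this repository, and the exact pydocstyle convention and mypy settings the project uses are assumptions here; check `python_checks.yml` for the real configuration.

```python
"""Example module showing a documented, type-annotated helper."""

from urllib.parse import urlsplit, urlunsplit


def normalize_url(url: str) -> str:
    """Return the URL with a lowercased scheme and host and no fragment.

    Surrounding whitespace is stripped before parsing. The path and query
    string are left unchanged.
    """
    parts = urlsplit(url.strip())
    # Rebuild the URL, dropping the fragment (the fifth component).
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), parts.path, parts.query, "")
    )
```

The summary line is a single imperative sentence ending in a period, and every parameter and return value is annotated, which is the general shape both tools tend to expect.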