A Rust-based geocoding system with Elasticsearch, inspired by Pelias and Nominatim.
- OSM PBF Ingestion - Parses OpenStreetMap data with multilingual name support
- Point-in-Polygon Admin Lookup - Assigns administrative hierarchy to each place using R-tree spatial indexing
- Elasticsearch Backend - Full-text search with edge n-gram autocomplete
- Wikidata Integration - Enriches place names with multilingual labels from Wikidata
- Location & Bounding Box Bias - Boosts results near the user's location or viewport
- Data Refresh - Re-imports files with automatic stale document cleanup
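
Edge n-gram autocomplete, listed above, works by indexing every prefix of a token so that a partial query such as `zur` can match `zurich`. The following is a minimal sketch of that token expansion, not the project's code: the function name `edge_ngrams` and the `min_gram`/`max_gram` values are illustrative assumptions. Elasticsearch performs this step inside its own analyzer, and the settings actually used by this project may differ.

```rust
/// Returns the edge n-grams (prefixes) of `token` whose length in characters
/// lies between `min_gram` and `max_gram`, inclusive.
/// Illustrative only: Elasticsearch does this inside its analysis chain.
fn edge_ngrams(token: &str, min_gram: usize, max_gram: usize) -> Vec<String> {
    // Work on chars, not bytes, so multi-byte letters like 'ü' stay intact.
    let chars: Vec<char> = token.chars().collect();
    // Never ask for a prefix longer than the token itself.
    let upper = max_gram.min(chars.len());
    // An empty range (min_gram > upper) simply produces no grams.
    (min_gram.max(1)..=upper)
        .map(|len| chars[..len].iter().collect())
        .collect()
}
```

For example, `edge_ngrams("zurich", 1, 3)` yields `z`, `zu`, and `zur`, which is why the query `zur` can match the indexed token at search time.
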
- Rust 1.70+
- Elasticsearch 8.x
- 8GB+ RAM for Switzerland import
docker run -d --name cypress-es -p 9200:9200 \
-e "discovery.type=single-node" \
-e "xpack.security.enabled=false" \
docker.elastic.co/elasticsearch/elasticsearch:8.11.0

cargo build --release

# Configure regions.toml and run:
cargo run --release --bin ingest -- batch --config regions.toml
# Or run directly:
cargo run --release --bin ingest -- \
--file switzerland-latest.osm.pbf \
--create-index \
--refresh \
--wikidata

cargo run --release --bin query -- --listen 0.0.0.0:3000

If you need to remove data for a specific region (e.g., to re-import it or free up space), you can use the wipe_region.sh script:
# Wipe data for Albania
./scripts/wipe_region.sh Albania
# Wipe data using a custom Elasticsearch URL
./scripts/wipe_region.sh Germany --url http://10.0.0.5:9200

The script identifies the correct records using the source_file field based on the regions defined in scripts/import_global.sh.
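
Conceptually, removing a region by its `source_file` value corresponds to an Elasticsearch delete-by-query request against the `places` index. The sketch below builds such a request. It is an illustration under assumptions, not the script's actual logic: it assumes `source_file` holds an exact, keyword-style value that a `term` query can match. The helper names `json_escape` and `delete_by_source_file` are hypothetical.

```rust
/// Escapes a string for embedding inside a JSON string literal.
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            // Remaining control characters use the \uXXXX form.
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Builds the URL and JSON body of a delete-by-query request that removes
/// every document whose `source_file` equals the given value.
/// Assumption: `source_file` is matched exactly with a `term` query.
fn delete_by_source_file(es_url: &str, index: &str, source_file: &str) -> (String, String) {
    let url = format!("{}/{}/_delete_by_query", es_url.trim_end_matches('/'), index);
    let body = format!(
        r#"{{"query":{{"term":{{"source_file":"{}"}}}}}}"#,
        json_escape(source_file)
    );
    (url, body)
}
```

The resulting body would be sent as a `POST` with `Content-Type: application/json`. Which `source_file` value belongs to which region is defined by the import scripts and is not assumed here.
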
Deleting places index
curl -X DELETE "http://localhost:9200/places"

Deleting versions index
curl -X DELETE "http://localhost:9200/cypress_versions"

Or use the reset-versions subcommand:
cargo run --bin ingest -- reset-versions

# Basic search
curl "http://localhost:3000/v1/search?text=Zurich"
# With language preference
curl "http://localhost:3000/v1/search?text=Genève&lang=fr"
# With bounding box filter
curl "http://localhost:3000/v1/search?text=bahnhof&bbox=8.5,47.3,8.6,47.4"
# With location bias
curl "http://localhost:3000/v1/search?text=restaurant&focus.point.lat=47.37&focus.point.lon=8.54"

# Reverse geocoding
curl "http://localhost:3000/v1/reverse?point.lat=47.37&point.lon=8.54"

# Autocomplete
curl "http://localhost:3000/v1/autocomplete?text=zur"

Results are returned in a GeoJSON-like format with all available language variants:
{
  "features": [
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [8.54, 47.37]
      },
      "properties": {
        "id": "node/123456",
        "layer": "locality",
        "name": "Zürich",
        "names": {
          "default": "Zürich",
          "de": "Zürich",
          "fr": "Zurich",
          "it": "Zurigo",
          "en": "Zurich"
        },
        "country": "Switzerland",
        "region": "Zürich",
        "confidence": 42.5
      }
    }
  ]
}

cypress/
├── src/
│ ├── lib.rs # Shared library
│ ├── models/ # Place, AdminHierarchy, etc.
│ ├── elasticsearch/ # ES client, schema, bulk indexer
│ ├── pip/ # Point-in-Polygon admin lookup
│ ├── wikidata/ # SPARQL label fetcher
│ ├── ingest/ # OSM PBF import binary
│ └── query/ # HTTP query server
└── schema/
└── places_mapping.json # Elasticsearch index mapping
MIT
