catenarytransit/cypress

Cypress

A Rust-based geocoding system with Elasticsearch, inspired by Pelias and Nominatim.

Features

  • OSM PBF Ingestion - Parses OpenStreetMap data with multilingual name support
  • Point-in-Polygon Admin Lookup - Assigns administrative hierarchy to each place using R-tree spatial indexing
  • Elasticsearch Backend - Full-text search with edge n-gram autocomplete
  • Wikidata Integration - Enriches place names with multilingual labels from Wikidata
  • Location & Bounding Box Bias - Boosts results near the user's location or viewport
  • Data Refresh - Re-imports files and automatically cleans up stale documents
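
The point-in-polygon admin lookup listed above can be pictured roughly as follows. This is a minimal sketch, not Cypress's actual code: it swaps the R-tree for a plain bounding-box scan, handles only a single outer ring per area, and the names `AdminArea` and `lookup_admins` are hypothetical.

```rust
/// Hypothetical admin boundary: a name, an admin level, and one outer ring
/// of (lon, lat) vertices. Real boundaries are multipolygons with holes.
#[derive(Debug, Clone)]
pub struct AdminArea {
    pub name: String,
    pub level: u8,
    pub ring: Vec<(f64, f64)>,
}

impl AdminArea {
    /// Axis-aligned bounding box as (min_lon, min_lat, max_lon, max_lat).
    fn bbox(&self) -> (f64, f64, f64, f64) {
        let mut b = (f64::INFINITY, f64::INFINITY, f64::NEG_INFINITY, f64::NEG_INFINITY);
        for &(x, y) in &self.ring {
            b = (b.0.min(x), b.1.min(y), b.2.max(x), b.3.max(y));
        }
        b
    }

    /// Ray-casting test: toggle `inside` for every edge crossed by a
    /// horizontal ray starting at the point.
    fn contains(&self, lon: f64, lat: f64) -> bool {
        let n = self.ring.len();
        if n < 3 {
            return false;
        }
        let mut inside = false;
        let mut j = n - 1;
        for i in 0..n {
            let (xi, yi) = self.ring[i];
            let (xj, yj) = self.ring[j];
            if (yi > lat) != (yj > lat) && lon < (xj - xi) * (lat - yi) / (yj - yi) + xi {
                inside = !inside;
            }
            j = i;
        }
        inside
    }
}

/// Returns the areas containing the point, sorted by ascending admin level.
/// The cheap bounding-box check stands in for the R-tree candidate query.
pub fn lookup_admins<'a>(areas: &'a [AdminArea], lon: f64, lat: f64) -> Vec<&'a AdminArea> {
    let mut hits: Vec<&AdminArea> = areas
        .iter()
        .filter(|a| {
            let (min_x, min_y, max_x, max_y) = a.bbox();
            lon >= min_x && lon <= max_x && lat >= min_y && lat <= max_y
        })
        .filter(|a| a.contains(lon, lat))
        .collect();
    hits.sort_by_key(|a| a.level);
    hits
}
```

An R-tree replaces the linear bounding-box filter with a logarithmic candidate search, which matters once thousands of boundaries are loaded; the exact containment test on the remaining candidates stays the same.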

Requirements

  • Rust 1.70+
  • Elasticsearch 8.x
  • 8 GB+ RAM for the Switzerland import

Quick Start

1. Start Elasticsearch

docker run -d --name cypress-es -p 9200:9200 \
  -e "discovery.type=single-node" \
  -e "xpack.security.enabled=false" \
  docker.elastic.co/elasticsearch/elasticsearch:8.11.0

2. Build

cargo build --release

3. Import Data

# Configure regions.toml and run:
cargo run --release --bin ingest -- batch --config regions.toml

# Or run directly:
cargo run --release --bin ingest -- \
  --file switzerland-latest.osm.pbf \
  --create-index \
  --refresh \
  --wikidata

4. Start Query Server

cargo run --release --bin query -- --listen 0.0.0.0:3000

Data Management

Wiping a Region

If you need to remove data for a specific region (e.g., to re-import it or free up space), you can use the wipe_region.sh script:

# Wipe data for Albania
./scripts/wipe_region.sh Albania

# Wipe data using a custom Elasticsearch URL
./scripts/wipe_region.sh Germany --url http://10.0.0.5:9200

The script identifies the correct records using the source_file field based on the regions defined in scripts/import_global.sh.

Index Management

Deleting the places index

curl -X DELETE "http://localhost:9200/places"

Deleting the versions index

curl -X DELETE "http://localhost:9200/cypress_versions"

Alternatively, reset the versions index with the ingest binary's reset-versions subcommand:

cargo run --bin ingest -- reset-versions

API Endpoints

Forward Geocoding

# Basic search
curl "http://localhost:3000/v1/search?text=Zurich"

# With language preference
curl "http://localhost:3000/v1/search?text=Genève&lang=fr"

# With bounding box filter
curl "http://localhost:3000/v1/search?text=bahnhof&bbox=8.5,47.3,8.6,47.4"

# With location bias
curl "http://localhost:3000/v1/search?text=restaurant&focus.point.lat=47.37&focus.point.lon=8.54"
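
To make the bbox parameter concrete, here is a small sketch of how such a value could be parsed and applied. It assumes the order min_lon,min_lat,max_lon,max_lat, which is inferred from the example values above and not confirmed by the source; `BoundingBox` and `parse_bbox` are hypothetical names, not Cypress's API.

```rust
/// Hypothetical viewport type; field order mirrors the assumed parameter order.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct BoundingBox {
    pub min_lon: f64,
    pub min_lat: f64,
    pub max_lon: f64,
    pub max_lat: f64,
}

/// Parses "min_lon,min_lat,max_lon,max_lat"; returns None on malformed,
/// inverted, or out-of-range input.
pub fn parse_bbox(raw: &str) -> Option<BoundingBox> {
    let parts: Vec<f64> = raw
        .split(',')
        .map(|p| p.trim().parse::<f64>().ok())
        .collect::<Option<Vec<f64>>>()?;
    if parts.len() != 4 {
        return None;
    }
    let b = BoundingBox {
        min_lon: parts[0],
        min_lat: parts[1],
        max_lon: parts[2],
        max_lat: parts[3],
    };
    // NaN fails every comparison below, so it is rejected as well.
    let valid = b.min_lon <= b.max_lon
        && b.min_lat <= b.max_lat
        && (-180.0..=180.0).contains(&b.min_lon)
        && (-180.0..=180.0).contains(&b.max_lon)
        && (-90.0..=90.0).contains(&b.min_lat)
        && (-90.0..=90.0).contains(&b.max_lat);
    if valid {
        Some(b)
    } else {
        None
    }
}

impl BoundingBox {
    /// True when the point lies inside the box, edges included.
    pub fn contains(&self, lon: f64, lat: f64) -> bool {
        lon >= self.min_lon && lon <= self.max_lon && lat >= self.min_lat && lat <= self.max_lat
    }
}
```

With the example value, `parse_bbox("8.5,47.3,8.6,47.4")` yields a box that contains central Zurich (8.54, 47.37). This sketch does not handle boxes that cross the antimeridian.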

Reverse Geocoding

curl "http://localhost:3000/v1/reverse?point.lat=47.37&point.lon=8.54"
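
Conceptually, reverse geocoding ranks candidate places by their distance to the query point. The sketch below illustrates that idea with the haversine formula over an in-memory list. It is an assumption about the general technique, not a description of how Cypress queries Elasticsearch, and `haversine_m` and `nearest` are hypothetical helpers.

```rust
/// Mean Earth radius in metres (spherical approximation).
const EARTH_RADIUS_M: f64 = 6_371_000.0;

/// Great-circle distance in metres between two points given as
/// (lat, lon) in degrees.
pub fn haversine_m(lat1: f64, lon1: f64, lat2: f64, lon2: f64) -> f64 {
    let (phi1, phi2) = (lat1.to_radians(), lat2.to_radians());
    let dphi = (lat2 - lat1).to_radians();
    let dlambda = (lon2 - lon1).to_radians();
    let a = (dphi / 2.0).sin().powi(2)
        + phi1.cos() * phi2.cos() * (dlambda / 2.0).sin().powi(2);
    // Clamp guards against rounding pushing sqrt(a) slightly above 1.
    2.0 * EARTH_RADIUS_M * a.sqrt().min(1.0).asin()
}

/// Returns the index and distance (metres) of the candidate closest to the
/// query point, or None when there are no candidates.
pub fn nearest(lat: f64, lon: f64, candidates: &[(f64, f64)]) -> Option<(usize, f64)> {
    candidates
        .iter()
        .enumerate()
        .map(|(i, &(clat, clon))| (i, haversine_m(lat, lon, clat, clon)))
        .min_by(|a, b| a.1.total_cmp(&b.1))
}
```

Note that the helpers take latitude first, matching the point.lat and point.lon parameters, whereas GeoJSON coordinates in the response are ordered longitude first.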

Autocomplete

curl "http://localhost:3000/v1/autocomplete?text=zur"
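
Autocomplete relies on edge n-grams: at index time each token is expanded into its leading prefixes, so a partial query such as "zur" matches an indexed gram exactly. The sketch below mimics that expansion in plain Rust. The gram bounds are illustrative assumptions, not values taken from Cypress's index mapping, and both function names are hypothetical.

```rust
/// Emits the lowercase prefixes of `token` whose length in characters lies
/// in min_gram..=max_gram (capped at the token length).
pub fn edge_ngrams(token: &str, min_gram: usize, max_gram: usize) -> Vec<String> {
    let lower = token.to_lowercase();
    let char_count = lower.chars().count();
    (min_gram.max(1)..=max_gram.min(char_count))
        .map(|n| lower.chars().take(n).collect())
        .collect()
}

/// True when the typed text equals one of the grams generated for any
/// whitespace-separated token of `name`.
pub fn prefix_matches(name: &str, typed: &str, min_gram: usize, max_gram: usize) -> bool {
    let typed = typed.to_lowercase();
    name.split_whitespace()
        .flat_map(|t| edge_ngrams(t, min_gram, max_gram))
        .any(|g| g == typed)
}
```

For instance, `edge_ngrams("Zürich", 1, 4)` produces "z", "zü", "zür", and "züri". Counting characters instead of bytes keeps multi-byte letters such as "ü" intact.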

Response Format

Results are returned in a GeoJSON-like format with all available language variants:

{
  "features": [
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [8.54, 47.37]
      },
      "properties": {
        "id": "node/123456",
        "layer": "locality",
        "name": "Zürich",
        "names": {
          "default": "Zürich",
          "de": "Zürich",
          "fr": "Zurich",
          "it": "Zurigo",
          "en": "Zurich"
        },
        "country": "Switzerland",
        "region": "Zürich",
        "confidence": 42.5
      }
    }
  ]
}
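
Because every feature carries the full names map, a client can choose its own display label. The helper below is a hypothetical client-side sketch, not part of Cypress: it tries the requested language tag, then its primary subtag (so "it-CH" falls back to "it"), then the "default" entry shown in the response above.

```rust
use std::collections::HashMap;

/// Picks a label from a `names` map shaped like the one in the response:
/// exact language tag first, then the primary subtag, then "default".
pub fn display_name<'a>(names: &'a HashMap<String, String>, lang: &str) -> Option<&'a str> {
    // "it-CH" or "it_CH" reduces to "it"; a bare "it" stays unchanged.
    let primary = lang
        .split(|c: char| c == '-' || c == '_')
        .next()
        .unwrap_or(lang);
    names
        .get(lang)
        .or_else(|| names.get(primary))
        .or_else(|| names.get("default"))
        .map(String::as_str)
}
```

With the names from the example response, a request for "fr" would resolve to "Zurich", while an unlisted language such as "ja" would fall back to the default "Zürich".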

Architecture

cypress/
├── src/
│   ├── lib.rs              # Shared library
│   ├── models/             # Place, AdminHierarchy, etc.
│   ├── elasticsearch/      # ES client, schema, bulk indexer
│   ├── pip/                # Point-in-Polygon admin lookup
│   ├── wikidata/           # SPARQL label fetcher
│   ├── ingest/             # OSM PBF import binary
│   └── query/              # HTTP query server
└── schema/
    └── places_mapping.json # Elasticsearch index mapping

License

MIT
