initial draft for water layer #38
Currently this just tests that particular columns aren't empty and that particular prefix_maps aren't empty. This is fairly reliant on the imported region having tagged elements for every tag being tested.
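A minimal sketch of what such a non-empty check might look like in DuckDB. The table name `water_layer` and the columns `waterway` and `whitewater` are hypothetical stand-ins, not names taken from this PR, and `whitewater` is assumed to be a MAP column. If the imported region has no elements with a given tag, the matching count is zero and the check fails even though the query itself is correct.

```sql
-- Hypothetical table and column names, for illustration only.
SELECT
    count(*)        AS total_rows,
    -- count(col) skips NULLs, so 0 here means the column is empty.
    count(waterway) AS non_null_waterway,
    -- cardinality() gives the number of entries in a MAP;
    -- 0 here means no row carried any prefixed tag.
    count(*) FILTER (WHERE cardinality(whitewater) > 0) AS non_empty_whitewater
FROM water_layer;
```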
Edit: The problem discussed here should be fixed in 9c4db48 and f1094b5; however, the bridge query still slows down the import process quite a bit, so it probably needs to be discussed.

Hmm, I was looking into the `whitewater:*` prefix_map and looking for a good Geofabrik region to test with. The import has been running for over an hour, compared to a minute or two for every other region I've tested with. I suspect this might be the bridge query (without any actual evidence to suggest it), but it is taking vastly more resources than even larger regions like the UK and California. I'll need to get to the bottom of what's causing this for this to be viable.
The bridge query was resource heavy on areas with lots of water,
like Alaska and Finland.
This pulls them into a temporary table first before doing the join.
When doing this I ran into an issue with multi-threaded query planning.
For instance:
```
CREATE OR REPLACE TEMP TABLE water_bridges AS
SELECT b.type, b.id, b.tags, b.geometry
FROM bridges_unfiltered b
JOIN water_features w
ON b.geometry && w.geometry
WHERE ST_Intersects(b.geometry, w.geometry);
...
UNION ALL
SELECT DISTINCT type, id, tags, geometry FROM water_bridges
```
works, while
```
CREATE OR REPLACE TEMP TABLE water_bridges AS
SELECT DISTINCT b.type, b.id, b.tags, b.geometry
```
crashes with a floating point exception unless `PRAGMA threads=1` is set.
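For reference, the single-threaded fallback mentioned above could be scoped to just the failing statement. This is only a sketch of that alternative, not what the commit does (the commit keeps the `DISTINCT` in the later `SELECT`). It assumes the spatial extension is loaded and that `bridges_unfiltered` and `water_features` already exist, as in the snippets above.

```sql
-- Pin DuckDB to one thread only around the statement that crashes.
SET threads = 1;

CREATE OR REPLACE TEMP TABLE water_bridges AS
SELECT DISTINCT b.type, b.id, b.tags, b.geometry
FROM bridges_unfiltered b
JOIN water_features w
  ON b.geometry && w.geometry
WHERE ST_Intersects(b.geometry, w.geometry);

-- Restore the default thread count for the rest of the import.
RESET threads;
```

The trade-off is that this one statement loses parallelism, which matters on exactly the water-heavy regions where the bridge query is already slow.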
The previous commit helped, but we can probably reduce the number
of IO passes over {{INPUT}} to 2, which seems to help a bit.
Here is an initial draft for the water layer (#35). This is going to conflict with PR #36; I'm happy to fix my PR up after that one is finalized, and this one is probably not ready anyway.
I'm pretty new to both OSM and DuckDB, so I just tried to base it on the existing scripts, changing the tags I could think of.
My initial thought was to not include POIs like boat ramps from `leisure`, which would be contained in the POI layer in #36, but I figure if someone is using both the water layer and the POI layer they can just ignore the duplicate data in the water layer pretty easily. For now I figure I'll just include anything remotely related to water, and we can decide at a later date what to include once we have a fairly complete set of tags.
Tagging notes:
- `ST_Intersects` query
- `ford=no` under the assumption that if it is covered in water it is unsafe to ford.
- Are there any tags where we want to exclude values of `no`, like `where tags['foo'] IS NOT NULL and tags['foo'] != 'no'`?
- `amenity=drinking_water`, showers, foot_washes, sports, and swimming_pools, water_parks?

I have yet to really use this in anger.
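On the question of excluding `no` values, here is a small sketch of how that filter behaves. The sample rows are made up, and it assumes `tags` is a `MAP(VARCHAR, VARCHAR)` and that `tags['key']` returns the value directly (as far as I know, older DuckDB releases return a one-element list instead, in which case the comparison would need adjusting).

```sql
-- Hypothetical sample rows; the real scripts read tags from {{INPUT}}.
SELECT id, tags['ford'] AS ford
FROM (VALUES
    (1, MAP {'ford': 'yes'}),
    (2, MAP {'ford': 'no'}),
    (3, MAP {'bridge': 'yes'})
) AS sample(id, tags)
-- A missing key yields NULL, and NULL != 'no' evaluates to NULL, so rows
-- without the key are dropped either way. The IS NOT NULL check mostly
-- makes the intent explicit.
WHERE tags['ford'] IS NOT NULL
  AND tags['ford'] != 'no';
```

The intent is that only elements tagged with the key and a value other than `no` survive the filter.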