
Conversation


@kylebarron kylebarron commented Oct 10, 2025

Edit 2: The performance issues below still stand, but I want to merge this because it might be easier to implement the A5 layer on top of it. So I'll remove the H3 layer and trait from the public API and then merge.


cc @felixpalmer


Works in principle with the latest deck.gl-layers release.

Change list

  • Adds H3HexagonLayer as a core layer type.
  • Implements H3 index validation in pure numpy, so that data can be validated before it goes to JS (where it's hard to surface data errors).
  • Implements a vectorized str_to_h3 function that converts str input into a uint64 H3 array (see the sketch after this list).
  • Implements an H3Accessor traitlet that accepts an array of either str or int, validates it, and packs it as a uint64 array to send to the frontend.
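
As a rough illustration of how the vectorized string parsing can work in pure numpy, here is a minimal sketch (assuming lowercase, 15-character H3 strings; not necessarily the actual str_to_h3 implementation):

import numpy as np

def str_to_h3_sketch(hex_strings) -> np.ndarray:
    """Vectorized parse of lowercase, 15-character H3 hex strings into uint64."""
    arr = np.asarray(hex_strings, dtype="U15")
    # Each unicode character is a uint32 code point; reshape to (n_cells, 15 digits).
    codes = arr.view(np.uint32).reshape(len(arr), -1).astype(np.int64)
    # Map ASCII hex characters ('0'-'9', 'a'-'f') to their numeric values.
    digits = np.where(codes >= ord("a"), codes - ord("a") + 10, codes - ord("0"))
    # Horner's method: fold the 15 base-16 digits into a single uint64 per row.
    out = np.zeros(len(arr), dtype=np.uint64)
    for col in range(codes.shape[1]):
        out = out * np.uint64(16) + digits[:, col].astype(np.uint64)
    return out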

todo

  • Update layer docs.
  • Update website docs for this layer.
  • Implement layer bounds computation. For now, if the h3 binding exists in the environment, pick a random sample of ~10,000 input rows (with a stable seed) and compute viewport info from that sample (sketched below).
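
A minimal sketch of that sampling approach, assuming the h3 package (v4 API) is importable; it uses cell centers rather than full boundaries, which is enough for an initial viewport guess:

import numpy as np
import h3.api.numpy_int as h3

def sample_viewport_bounds(cells: np.ndarray, n: int = 10_000, seed: int = 0):
    """Estimate (min_lon, min_lat, max_lon, max_lat) from a stable random sample."""
    rng = np.random.default_rng(seed)  # fixed seed so the computed viewport is reproducible
    if len(cells) > n:
        cells = rng.choice(cells, size=n, replace=False)
    # cell_to_latlng returns (lat, lng) cell centers.
    centers = np.array([h3.cell_to_latlng(c) for c in cells])
    lats, lngs = centers[:, 0], centers[:, 1]
    return lngs.min(), lats.min(), lngs.max(), lats.max()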

Edit: Sadly, this is unacceptably slow. Using the Kontur population dataset (22 km resolution) as an example, it takes 15 seconds to render on the JS side:

Screen.Recording.2025-10-28.at.12.15.21.PM.mov

Because the readParquet console.log statements appear immediately, I think all of that time is overhead in the deck.gl code.

(screenshot: browser performance profile)

The main task took 16.25 seconds, and 85% of that (I think that's what the first three "self time" numbers mean?) was just allocations and GC.

The implementation on the geoarrow/deck.gl-layers side seems pretty straightforward.

So I don't know whether I'm doing something wrong (very possible) or whether this is just the performance of sending 70k H3 cells to the H3HexagonLayer.

I don't think we can merge this as-is with this performance. Users expect to be able to render hundreds of thousands of polygons, and a 15 s render for 70k cells doesn't meet this library's standards.

I'd rather re-implement this to just convert H3 hexagons to GeoArrow polygons on the Python side.
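
For illustration, one way that conversion could look (a sketch using shapely and geopandas rather than building GeoArrow directly; cells crossing the antimeridian would need extra handling):

import geopandas as gpd
import numpy as np
import h3.api.numpy_int as h3
from shapely import Polygon

def h3_to_geodataframe(cells: np.ndarray) -> gpd.GeoDataFrame:
    """Materialize uint64 H3 cells as polygons for a regular polygon layer."""
    # cell_to_boundary returns (lat, lng) pairs; shapely expects (x=lng, y=lat).
    polygons = [
        Polygon([(lng, lat) for lat, lng in h3.cell_to_boundary(c)]) for c in cells
    ]
    return gpd.GeoDataFrame({"h3": cells}, geometry=polygons, crs="EPSG:4326")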

Code:

# Run in a Jupyter notebook (Python 3.12.7, "lonboard" kernel)
import geopandas as gpd

import lonboard
from lonboard import H3HexagonLayer, Map
from lonboard.colormap import apply_continuous_cmap
from matplotlib.colors import LogNorm
from palettable.colorbrewer.diverging import BrBG_10

path = "/Users/kyle/Downloads/kontur_population_20231101_r4.gpkg"
gdf = gpd.read_file(path)
df = gdf[["h3", "population"]]

layer = H3HexagonLayer.from_pandas(df, get_hexagon=df["h3"])
m = Map(layer)
m

pop = df["population"]
min_bound = pop.min()
max_bound = pop.max()
normalizer = LogNorm(min_bound, max_bound, clip=True)
normalized = normalizer(pop)

colors = apply_continuous_cmap(normalized, BrBG_10, alpha=0.7)
layer.get_fill_color = colors

Closes #302, for #885

@kylebarron kylebarron changed the title from "WIP: h3 layer" to "feat: H3 layer" Oct 13, 2025
@github-actions github-actions bot added the feat label Oct 13, 2025
@kylebarron
Member Author

Benchmark of h3 string parsing:

import numpy as np
import pandas as pd
import pyarrow as pa

import h3.api.numpy_int as h3
from lonboard import H3HexagonLayer, Map
from lonboard._h3 import h3_to_str
from lonboard._h3._str_to_h3 import str_to_h3

VALID_INDICES = np.array(
    [
        0x8075FFFFFFFFFFF,
        0x81757FFFFFFFFFF,
        0x82754FFFFFFFFFF,
        0x83754EFFFFFFFFF,
        0x84754A9FFFFFFFF,
        0x85754E67FFFFFFF,
        0x86754E64FFFFFFF,
        0x87754E64DFFFFFF,
        0x88754E6499FFFFF,
        0x89754E64993FFFF,
        0x8A754E64992FFFF,
        0x8B754E649929FFF,
        0x8C754E649929DFF,
        0x8D754E64992D6FF,
        0x8E754E64992D6DF,
        0x8F754E64992D6D8,
    ],
    dtype=np.uint64,
)

hex_str = h3_to_str(VALID_INDICES)
large_hex_str = np.repeat(hex_str, 10000)

%timeit parsed_loop = np.array([int(h, 16) for h in large_hex_str])
# 20.3 ms ± 886 μs per loop (mean ± std. dev. of 7 runs, 10 loops each)

%timeit parsed_h3_api = np.array([h3.str_to_int(h) for h in large_hex_str])
# 26.9 ms ± 200 μs per loop (mean ± std. dev. of 7 runs, 10 loops each)

%timeit parsed = str_to_h3(large_hex_str)
# 7.25 ms ± 170 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)

@kylebarron
Member Author

Benchmark of h3 cell validation:

import numpy as np
import pandas as pd
import pyarrow as pa

import h3.api.numpy_int as h3
from lonboard import H3HexagonLayer, Map
from lonboard._h3 import h3_to_str, validate_h3_indices

VALID_INDICES = np.array(
    [
        0x8075FFFFFFFFFFF,
        0x81757FFFFFFFFFF,
        0x82754FFFFFFFFFF,
        0x83754EFFFFFFFFF,
        0x84754A9FFFFFFFF,
        0x85754E67FFFFFFF,
        0x86754E64FFFFFFF,
        0x87754E64DFFFFFF,
        0x88754E6499FFFFF,
        0x89754E64993FFFF,
        0x8A754E64992FFFF,
        0x8B754E649929FFF,
        0x8C754E649929DFF,
        0x8D754E64992D6FF,
        0x8E754E64992D6DF,
        0x8F754E64992D6D8,
    ],
    dtype=np.uint64,
)

large_hex = np.repeat(VALID_INDICES, 10000)
%timeit validate_h3_indices(large_hex)
# 3.68 ms ± 96.5 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)

%timeit all([h3.is_valid_cell(h) for h in large_hex])
# 15 ms ± 157 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)
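
For reference, here is a rough sketch of the kind of bit-layout checks a vectorized numpy validator can perform (a simplified illustration, not necessarily the actual validate_h3_indices implementation; it omits the pentagon deleted-subsequence rule):

import numpy as np

def validate_h3_sketch(cells: np.ndarray) -> np.ndarray:
    """Boolean mask of structurally valid H3 cell indices (simplified)."""
    cells = cells.astype(np.uint64)
    high_bit = (cells >> np.uint64(63)) == 0                   # reserved bit must be 0
    mode = ((cells >> np.uint64(59)) & np.uint64(0xF)) == 1    # index mode 1 = cell
    reserved = ((cells >> np.uint64(56)) & np.uint64(0x7)) == 0
    res = (cells >> np.uint64(52)) & np.uint64(0xF)            # resolution 0-15
    base_cell = ((cells >> np.uint64(45)) & np.uint64(0x7F)) <= 121
    ok = high_bit & mode & reserved & base_cell
    # Digits 1..res must be 0-6; digits res+1..15 must be the unused marker 7.
    for digit in range(1, 16):
        d = (cells >> np.uint64(3 * (15 - digit))) & np.uint64(0x7)
        ok &= np.where(digit <= res, d != 7, d == 7)
    return ok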

@kylebarron
Member Author

kylebarron commented Oct 13, 2025

Benchmark of bounds computation using h3:

import numpy as np
import pandas as pd
import pyarrow as pa

import h3.api.numpy_int as h3
from lonboard import H3HexagonLayer, Map
from lonboard._h3 import h3_to_str, validate_h3_indices

VALID_INDICES = np.array(
    [
        0x8075FFFFFFFFFFF,
        0x81757FFFFFFFFFF,
        0x82754FFFFFFFFFF,
        0x83754EFFFFFFFFF,
        0x84754A9FFFFFFFF,
        0x85754E67FFFFFFF,
        0x86754E64FFFFFFF,
        0x87754E64DFFFFFF,
        0x88754E6499FFFFF,
        0x89754E64993FFFF,
        0x8A754E64992FFFF,
        0x8B754E649929FFF,
        0x8C754E649929DFF,
        0x8D754E64992D6FF,
        0x8E754E64992D6DF,
        0x8F754E64992D6D8,
    ],
    dtype=np.uint64,
)

large_hex = np.repeat(VALID_INDICES, 10000)

def cell_bounds(h):
    boundary = np.array(h3.cell_to_boundary(h))  # lat/lon pairs
    min_lat = boundary[:, 0].min()
    max_lat = boundary[:, 0].max()
    min_lon = boundary[:, 1].min()
    max_lon = boundary[:, 1].max()
    return min_lat, max_lat, min_lon, max_lon

%%timeit
# Apply to all cells
bounds_array = np.array([cell_bounds(c) for c in large_hex])
min_lat = bounds_array[:, 0].min()
max_lat = bounds_array[:, 1].max()
min_lon = bounds_array[:, 2].min()
max_lon = bounds_array[:, 3].max()
# 856 ms ± 6.26 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

Almost a second for 160,000 h3 cells 😬

@kylebarron kylebarron added this to the 0.13 milestone Oct 22, 2025
@kylebarron kylebarron removed this from the 0.13 milestone Oct 28, 2025
@kylebarron kylebarron marked this pull request as draft October 28, 2025 17:25
@kylebarron kylebarron marked this pull request as ready for review October 29, 2025 16:32
@kylebarron
Member Author

I removed the public exports of H3HexagonLayer and H3Accessor, so we can merge this and iterate on top of it to create the A5Layer.

@kylebarron kylebarron enabled auto-merge (squash) October 29, 2025 16:33
@kylebarron kylebarron merged commit a2924a2 into main Oct 29, 2025
6 checks passed
@kylebarron kylebarron deleted the kyle/h3-layer branch October 29, 2025 16:39
kylebarron added a commit that referenced this pull request Oct 29, 2025
In #917 I documented some performance issues during rendering.

Removing these settings for `typedArrayManagerProps` fixes the rendering
performance.

The issue is that we were never using deck.gl to allocate data before
this layer. So I essentially turned off the typed array manager to avoid
any extra memory usage.

But with the H3Layer, we're now passing h3 strings to deck.gl and
letting deck.gl manage the geometry construction. This means that, with
the typed array manager turned off, we were getting massive performance
hits from allocations and GC.

Also improves perf of the [upcoming
A5Layer](#1001)

cc @felixpalmer
kylebarron added a commit that referenced this pull request Oct 29, 2025
Follow up to #917 after solving the performance issue in #1003
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
