Benchmarks
Measured on AMD Ryzen 9 3950X (16C/32T, 3.49 GHz base), Windows 11 24H2, .NET 10.0.6, X64 RyuJIT AVX2 / x86-64-v3, BenchmarkDotNet 0.14 / 0.15, Release mode. The authoritative report with full per-suite tables lives at BENCHMARKS.md in the repo root.
v1.9.0 adds 12 indicators across Tier 3a/b/c (52 total) and does not touch any rendering path — every SVG/indicator/numerics number published at v1.7 / v1.8 is still current. A dedicated Tier3IndicatorBenchmarks suite will land post-v1.9.0.
Headline numbers (Ryzen 9 3950X, 100 000-point series unless noted):
| Scenario | Time | Allocation |
|---|---|---|
| SVG render a 1 000-point line | 66 µs | 127 KB |
| SVG render with LTTB downsample (100 000 pts) | 1.78 ms | 2.4 MB |
| SIMD Vec.Sum / Vec.Mean on 100 000 pts | 19 µs | 0 B |
| SMA(20) on 100 000 pts | 196 µs | 781 KB |
| VWAP on 100 000 pts | 204 µs | 781 KB |
| EquityCurve on 100 000 pts | 189 µs | 781 KB |
| JSON round-trip (Figure ⇄ JSON) | 40 µs | 19.6 KB |
| PNG export via SkiaSharp | 27 ms | 88 KB |
| TransformBatch (AVX) on 100 000 pts | 208 µs | 1.5 MB |
See BENCHMARKS.md for full per-suite tables, allocation breakdowns, and historical v0.5 → v1.1 comparisons.
| Benchmark | Result | Notes |
|---|---|---|
| RingBuffer.Append | 40M ops/sec | Single-writer, ReaderWriterLockSlim, 100K capacity |
| RingBuffer.ToArray (10K snapshot) | 190K snapshots/sec | 10K-element copy per snapshot |
| StreamingLine 100K appends + snapshot | 13ms total | Append 100K points + create snapshot |
| MathText parse `\sum_{i=0}^{n} \frac{1}{i!} = e` | 473K parses/sec | Operator limits + fraction in single expression |
| MathText parse `\begin{pmatrix} a & b \\ c & d \end{pmatrix}` | 554K parses/sec | 2×2 matrix environment |
| 26 theme presets loaded | 370µs | All 26 themes instantiated from static properties |
| SVG render (1K-point line chart) | 1.12ms | Full pipeline: data range → ticks → axes → series → SVG string |
| SVG render (3D surface + rotation script) | 1.22ms | Includes Projection3D, depth sort, JS injection |
| Natural Earth 110m coastlines loaded | 5ms | 134 features parsed from embedded GeoJSON |
| Natural Earth 110m countries loaded | 48ms | 177 features (cached after first load) |
| Projection | ops/sec | Notes |
|---|---|---|
| Sinusoidal | 83M/sec | Simplest — single cosine |
| NaturalEarth | 77M/sec | Polynomial — no trig |
| Robinson | 33M/sec | Table interpolation |
| AlbersEqualArea | 29M/sec | Conic |
| Mercator | 25M/sec | Log tangent |
| Stereographic | 24M/sec | Azimuthal |
| EqualEarth | 19M/sec | Modern polynomial |
| Orthographic | 17M/sec | Globe view |
| PlateCarree | 16M/sec | Identity (overhead is the method call) |
| LambertConformal | 16M/sec | Conic |
| TransverseMercator | 14M/sec | Rotated cylindrical |
| AzimuthalEquidistant | 13M/sec | Acos + trig |
| Mollweide | 6M/sec | Newton iteration (slowest) |
All projections exceed 5M projections/sec — projecting all 177 countries takes < 1ms even with Mollweide.
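The gap between Sinusoidal ("single cosine") and Mollweide ("Newton iteration") in the table above follows from the formulas themselves. The sketch below is a language-neutral illustration in Python using the standard textbook definitions on a unit sphere; it is not the library's .NET code, and the function names, tolerance, and iteration cap are assumptions made for the example.

```python
import math

def sinusoidal(lon, lat):
    """Sinusoidal projection: one cosine per point (angles in radians)."""
    return lon * math.cos(lat), lat

def mollweide(lon, lat, tol=1e-12, max_iter=50):
    """Mollweide projection: needs an auxiliary angle theta solving
    2*theta + sin(2*theta) = pi * sin(lat), found here by Newton's method."""
    if abs(abs(lat) - math.pi / 2) < 1e-12:
        # At the poles the Newton denominator vanishes; theta is known exactly.
        theta = math.copysign(math.pi / 2, lat)
    else:
        theta = lat                       # initial guess
        target = math.pi * math.sin(lat)
        for _ in range(max_iter):
            f = 2 * theta + math.sin(2 * theta) - target
            step = f / (2 + 2 * math.cos(2 * theta))
            theta -= step
            if abs(step) < tol:
                break
    x = (2 * math.sqrt(2) / math.pi) * lon * math.cos(theta)
    y = math.sqrt(2) * math.sin(theta)
    return x, y
```

Each Mollweide point pays for several sine and cosine evaluations inside the loop, while Sinusoidal pays for one cosine, which is consistent with the ordering in the table.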
| Scenario | Buffer | Append rate | Render FPS | Memory |
|---|---|---|---|---|
| Dashboard | 1K | 1/sec | 1 | 16 KB |
| Telemetry | 10K | 100/sec | 30 | 640 KB |
| Oscilloscope | 100K | 10K/sec | 60 | 800 KB |
| Trading (OHLC) | 5K | 10/sec | 10 | 160 KB |
With 40M appends/sec and 190K snapshots/sec, the ring buffer is never the bottleneck; rendering is. At 30 fps (33 ms budget), a 10K-element snapshot costs about 5 µs (1 s / 190K), so each frame keeps essentially the whole budget for rendering.
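To make the append and snapshot operations concrete, here is a minimal fixed-capacity ring buffer sketched in Python. It is an illustration of the general technique only: the class and method names are hypothetical, and a plain `threading.Lock` stands in for the `ReaderWriterLockSlim` mentioned in the results table, since Python's standard library has no reader-writer lock.

```python
import threading

class RingBuffer:
    """Fixed-capacity buffer that overwrites the oldest value when full."""

    def __init__(self, capacity):
        self._data = [0.0] * capacity
        self._capacity = capacity
        self._head = 0                  # index of the next write
        self._count = 0                 # number of valid elements
        self._lock = threading.Lock()   # stand-in for a reader-writer lock

    def append(self, value):
        # O(1): write in place and advance the head, wrapping at capacity.
        with self._lock:
            self._data[self._head] = value
            self._head = (self._head + 1) % self._capacity
            if self._count < self._capacity:
                self._count += 1

    def to_array(self):
        # O(n): copy the valid elements out in oldest-to-newest order.
        with self._lock:
            start = (self._head - self._count) % self._capacity
            return [self._data[(start + i) % self._capacity]
                    for i in range(self._count)]
```

Appending never allocates or shifts elements, while a snapshot copies every live element once. That asymmetry is why the append rate in the table is orders of magnitude higher than the snapshot rate.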
PolarHeatmapSeries rendering and the NumPy-style numeric operations added in v1.1.1 are not yet in the benchmark suite. The polar heatmap renders via the same 12-segment polygon path as PolarBarSeries — expected render time is comparable to the existing PolarLine (33 µs) given similar polygon counts per chart.
Run the full suite to establish a baseline:
dotnet run -c Release --project Benchmarks/MatPlotLibNet.Benchmarks -- --filter "*"
New DataFrameBenchmarks class covers every public method in MatPlotLibNet.DataFrame: column reader, all 16 financial indicators, polynomial numerics, and figure builder extensions (with and without hue grouping).
27 benchmarks across 5 groups, 3 data sizes (1K / 10K / 100K rows):
| Group | Benchmarks |
|---|---|
| Column reader | ToDoubleArray (baseline), ToStringArray |
| Price indicators | SMA, EMA, RSI, MACD, BollingerBands, DrawDown, OBV |
| OHLCV indicators | ATR, ADX, ADXFull, CCI, Stochastic, WilliamsR, KeltnerChannels, VWAP, ParabolicSAR |
| Numerics | PolyFit(deg 3), PolyEval(deg 3), ConfidenceBand(95%) |
| Figure builders | Line, Scatter, Hist (plain + with 3-group hue split) |
Key insight: comparing Line_Close with Line_WithHue measures the hue-grouping overhead, because the difference isolates the HueGrouper cost from the render cost.
dotnet run -c Release --project Benchmarks/MatPlotLibNet.Benchmarks -- --filter "*DataFrame*"
dotnet run -c Release --project Benchmarks/MatPlotLibNet.Benchmarks -- --filter "*DataFrame*Indicator*"
dotnet run -c Release --project Benchmarks/MatPlotLibNet.Benchmarks -- --filter "*DataFrame*Hue*"
Added benchmarks for one new series type and a SIMD improvement. (The GeoMap_Equirectangular and Choropleth_Viridis benchmarks from v1.1.0 were removed in v1.1.4 along with the Geo/Map subsystem itself; see the Roadmap for rationale.)
| Benchmark | Time | Allocated |
|---|---|---|
| Surface3D_WithLighting (10×10 grid + directional light) | 82 µs | 148 KB |
VectorMath.SplitPositiveNegative — replaced per-element branching with two TensorPrimitives.Max/Min SIMD passes. Faster for all spans > ~16 elements on AVX2 hardware.
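The idea behind replacing a per-element branch with two max/min passes can be shown without any SIMD at all. The Python sketch below is illustrative only: the function names are hypothetical, and the list comprehensions stand in for the vectorised `TensorPrimitives.Max/Min` passes named above.

```python
def split_branching(values):
    """Reference version: one pass with a branch per element."""
    positive, negative = [], []
    for v in values:
        if v > 0:
            positive.append(v)
            negative.append(0.0)
        else:
            positive.append(0.0)
            negative.append(v)
    return positive, negative

def split_two_pass(values):
    """Branch-free version: clamp against zero from each side.
    Each pass is a uniform element-wise operation, which is the shape
    of work that SIMD instructions handle well."""
    positive = [max(v, 0.0) for v in values]
    negative = [min(v, 0.0) for v in values]
    return positive, negative
```

Both versions return the same pair of lists, and `positive[i] + negative[i]` always reproduces the input. The two-pass form reads the data twice but removes the data-dependent branch.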
dotnet run -c Release --project Benchmarks/MatPlotLibNet.Benchmarks -- --filter "*Surface3D_WithLighting*" --memory
MatPlotLibNet renders charts server-side as SVG and delivers them to clients. There is no JavaScript chart library on the client: the browser just swaps innerHTML.
| Benefit | Detail |
|---|---|
| Zero client-side cost | Browser swaps innerHTML — no canvas redraws, no layout recalculation |
| Inline SVG | Part of the DOM — styleable via CSS, accessible to screen readers, prints as vector |
| Consistent | Every client sees the exact same chart, no browser rendering differences |
| Bandwidth-efficient | Typical chart SVG is 5–15 KB; SignalR pushes only changed charts |
| Scales with hardware | Parallel subplot rendering uses all available cores |
The hot path: data space → pixel space. v0.6.0 replaced a two-pass TensorPrimitives approach with a single-pass AVX SIMD interleave — Vector256.Multiply + Add (FMA when available) → Avx.UnpackLow/High → Avx.Permute2x128 → direct store via MemoryMarshal.Cast. Scalar fallback on non-x86.
| Size | v0.5.1 | v0.6.0 | Speedup | Alloc reduction |
|---|---|---|---|---|
| 1K pts | 9 µs | 764 ns | 11.8× | 2× |
| 10K pts | 124 µs | 53 µs | 2.3× | 2× |
| 100K pts | 1,298 µs / 3,047 KB | 208 µs / 1,563 KB | 6.2× | 2× |
Every LineSeries, ScatterSeries, AreaSeries, and BubbleSeries renderer uses this path — all indicator output benefits automatically.
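For readers unfamiliar with the data-to-pixel step, the following Python sketch shows what the transform computes: one multiply-add per coordinate, with the results written as interleaved x/y pairs in a single pass. It is a scalar illustration under assumed names (`transform_batch`, `scale_x`, `offset_x`, and so on are hypothetical) and does not reproduce the AVX intrinsics described above.

```python
def transform_batch(xs, ys, scale_x, offset_x, scale_y, offset_y):
    """Map data coordinates to pixel coordinates in one pass.

    Returns a flat list [px0, py0, px1, py1, ...], the interleaved
    layout a renderer can consume directly as a sequence of points.
    """
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    out = [0.0] * (2 * len(xs))
    for i, (x, y) in enumerate(zip(xs, ys)):
        out[2 * i] = x * scale_x + offset_x        # multiply-add for x
        out[2 * i + 1] = y * scale_y + offset_y    # multiply-add for y
    return out
```

A negative `scale_y` is the usual way to flip the axis, because SVG's y coordinate grows downward. For example, `transform_batch([0.0, 1.0], [0.0, 10.0], 100.0, 50.0, -2.0, 400.0)` yields `[50.0, 400.0, 150.0, 380.0]`.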
| Chart | Time | Allocated |
|---|---|---|
| Simple line (100 pts) | 94 µs | 136 KB |
| Line + scatter + bar | 109 µs | 133 KB |
| 3×3 subplot grid | 754 µs | 933 KB |
| Treemap (6 nodes) | 60 µs | 109 KB |
| Sunburst (4 nodes, depth 2) | 65 µs | 118 KB |
| Sankey (4 nodes, 4 links) | 63 µs | 118 KB |
| Polar line (50 pts) | 33 µs | 56 KB |
| 3D surface (10×10) | 69 µs | 124 KB |
| 3D surface (10×10) + directional lighting | 82 µs | 148 KB |
| Line + legend (3 series) | 140 µs | 214 KB |
| Large line (10K pts) | 3,105 µs | 3,714 KB |
| Large line (100K pts, LTTB→2K) | 1,332 µs | 2,429 KB |
LTTB downsampling makes 100K-point charts faster than full-resolution 10K charts.
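LTTB (Largest-Triangle-Three-Buckets) keeps the first and last points, splits the rest into equal buckets, and from each bucket keeps the point that forms the largest triangle with the previously kept point and the average of the next bucket. The Python sketch below follows the commonly published form of the algorithm; it is a reference illustration, not the library's implementation, and the function name and tuple-based input are assumptions.

```python
def lttb(points, threshold):
    """Downsample a list of (x, y) tuples to `threshold` points."""
    n = len(points)
    if threshold >= n or threshold < 3:
        return list(points)

    bucket = (n - 2) / (threshold - 2)   # width of each interior bucket
    sampled = [points[0]]                # always keep the first point
    a = 0                                # index of the last kept point

    for i in range(threshold - 2):
        # The average of the next bucket is the third triangle vertex.
        nxt_start = int((i + 1) * bucket) + 1
        nxt_end = min(int((i + 2) * bucket) + 1, n)
        span = nxt_end - nxt_start
        avg_x = sum(p[0] for p in points[nxt_start:nxt_end]) / span
        avg_y = sum(p[1] for p in points[nxt_start:nxt_end]) / span

        # Scan the current bucket for the point with the largest area.
        start = int(i * bucket) + 1
        end = int((i + 1) * bucket) + 1
        ax, ay = points[a]
        best, best_area = start, -1.0
        for j in range(start, end):
            px, py = points[j]
            # Proportional to the triangle area (the factor 1/2 is dropped).
            area = abs((ax - avg_x) * (py - ay) - (ax - px) * (avg_y - ay))
            if area > best_area:
                best, best_area = j, area
        sampled.append(points[best])
        a = best

    sampled.append(points[-1])           # always keep the last point
    return sampled
```

The downsample itself is a single linear scan, after which the renderer only has to emit `threshold` points. That is consistent with the 100K-point chart (reduced to 2K) rendering faster than the full-resolution 10K chart in the table above.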

At 100K points (a full trading day at 1-second bars), every indicator completes in roughly 3.3 ms or less; the slowest, Stochastic(14,3), takes 3,308 µs. Multiple indicators run in parallel on separate cores.
| Indicator | v0.5.1 | v0.6.0 | Note |
|---|---|---|---|
| SMA(20) | 196 µs | 195 µs | Sliding sum |
| EMA(20) | 496 µs | 491 µs | Sequential |
| RSI(14) | 851 µs | 892 µs | |
| VWAP | 212 µs | 238 µs | |
| EquityCurve | 349 µs | 226 µs | CumulativeSum + Linspace |
| BollingerBands(20) | 2,016 µs | 2,231 µs | SIMD inner loop |
| MACD(12,26,9) | 1,574 µs | 1,495 µs | |
| ADX(14) | 2,609 µs | 2,434 µs | |
| Stochastic(14,3) | 7,669 µs | 3,308 µs | 2.3× — O(n*p) → O(n) monotone deque |
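The Stochastic row credits its speedup to a monotone deque, which computes a rolling maximum (or minimum) in O(n) total instead of rescanning the last p values at every step. A minimal Python sketch of the technique follows. The function names are hypothetical, and how the library handles the first p-1 partial windows is not stated in the source, so this version simply reports the extreme of the values seen so far.

```python
from collections import deque

def rolling_max(values, period):
    """Maximum of the last `period` values at each index, in O(n) total."""
    out = []
    window = deque()   # indices whose values decrease from front to back
    for i, v in enumerate(values):
        # Drop candidates that can never be the maximum again.
        while window and values[window[-1]] <= v:
            window.pop()
        window.append(i)
        # Drop the front index once it falls out of the window.
        if window[0] <= i - period:
            window.popleft()
        out.append(values[window[0]])
    return out

def rolling_min(values, period):
    """Rolling minimum, obtained by negating the rolling maximum."""
    return [-m for m in rolling_max([-v for v in values], period)]
```

Each index is pushed and popped at most once, so the cost no longer depends on the period. For example, `rolling_max([1, 3, 2, 5, 4, 1, 0], 3)` returns `[1, 3, 3, 5, 5, 5, 4]`.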
| Indicator | Time | Allocated |
|---|---|---|
| OBV | 645 µs | 781 KB |
| ParabolicSAR | 1,211 µs | 879 KB |
| CCI(20) | 2,159 µs | 2,344 KB |
| WilliamsR(14) | 2,972 µs | 3,125 KB |
Vec is a readonly record struct wrapping double[] with SIMD-accelerated operators via TensorPrimitives.
| Operation | 1K | 10K | 100K |
|---|---|---|---|
| a + b | 452 ns | 3.8 µs | 120 µs |
| a × scalar | 386 ns | 2.9 µs | 120 µs |
| (a+b)×1.5−b | 1.3 µs | 11 µs | 447 µs |
| Std | 803 ns | 8.4 µs | 172 µs |
| Operation | 1K | 10K | 100K |
|---|---|---|---|
| Sum | 164 ns | 1.8 µs | 18 µs |
| Mean | 166 ns | 1.8 µs | 18 µs |
| Min | 434 ns | 4.5 µs | 44 µs |
| Max | 335 ns | 3.4 µs | 34 µs |
Reductions are zero-alloc and ~6× faster than element-wise ops at 100K.
Round-trip under 50 µs → >20,000 chart specs/sec on a single core.
| Method | Time | Allocated |
|---|---|---|
| ToJson | 26 µs | 8 KB |
| FromJson | 21 µs | 12 KB |
| Round-trip | 41 µs | 20 KB |
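The single-core throughput figure quoted for the round-trip is the reciprocal of the measured latency. The small Python helper below (a hypothetical name, for illustration only) makes the conversion explicit.

```python
def ops_per_second(latency_us):
    """Convert a per-operation latency in microseconds to operations/sec."""
    return 1_000_000 / latency_us

# The measured 41 µs round-trip, and the 50 µs bound used in the text.
print(round(ops_per_second(41)))   # → 24390
print(round(ops_per_second(50)))   # → 20000
```

The measured 41 µs therefore corresponds to about 24,000 round-trips per second, comfortably above the 20,000 implied by the 50 µs bound.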
Dominated by SkiaSharp rasterization. Suited for batch export, not real-time streaming.
| Method | Time | Allocated |
|---|---|---|
| PNG (simple) | 27 ms | 88 KB |
| PNG (complex) | 22 ms | 81 KB |
| PDF (simple) | 47 ms | 3,925 KB |
| PDF (complex) | 47 ms | 3,922 KB |
cd Benchmarks/MatPlotLibNet.Benchmarks
dotnet run -c Release -- --filter "*SvgRendering*"
dotnet run -c Release -- --filter "*DataTransform*"
dotnet run -c Release -- --filter "*Indicator*"
dotnet run -c Release -- --filter "*VectorMath*"
dotnet run -c Release -- --filter "*Serialization*"
dotnet run -c Release -- --filter "*SkiaExport*"
dotnet run -c Release -- --filter "*" # all suites
Run one suite at a time: concurrent benchmark runs inflate timings due to CPU contention.