Experiment of rewriting in Ruby #218
Performance Analysis: Ruby Implementation vs C Extension
Benchmarks run on Ruby 4.0.2 (2026-03-17, +PRISM, +YJIT, arm64-darwin25), Rails 8.0.
panko_json: JSON string output (primary benchmark)
Throughput (iterations/sec, higher is better)
Allocations (allocs / retained)
panko_object: Ruby Hash/Array output
plain_object: non-AR objects
object_writer: Oj::StringWriter layer
type_casts: selected highlights
components: Ruby impl internal benchmarks (no C ext equivalent)
Summary

The Ruby implementation is faster than the C extension for all ActiveRecord serialization. On Ruby 4.0 with YJIT (the default), panko_json throughput improves by +10% to +101% across every benchmark. The largest gains at scale are on filtered queries: Only +101%, HasOne +68%, Simple +66%, Except +65%. Even the smallest gain (MethodCall, 50 posts) is still +10%.

panko_object sees even larger gains (+42% to +80%). The Ruby Hash/Array output path benefits from the same YJIT optimizations. Only, at +80% on 2300 records, is the standout.

plain_object is a regression. Non-AR serialization is 1% to 31% slower, with MethodCall taking the biggest hit (−27% to −31%). Simple attribute access is flat (±1%). This is the one area where the C extension still wins.

object_writer is unchanged (~0%), as expected.

Type cast microbenchmarks are slower in pure Ruby. Individual type conversions run 2–87% slower, with Boolean TypeCast (−87%) and Json NoTypeCast (−71%) being the outliers. However, these microbenchmarks measure isolated type conversions at millions of iterations/sec; the overhead is invisible in end-to-end serialization, where the Ruby impl is faster overall.

Allocations are dramatically lower. The Ruby impl now allocates significantly fewer objects than the C extension in most benchmarks. For example, Simple 2300 drops from 4,620 to 35 allocations, a 99% reduction. Except 2300 drops from 4,631 to 66. HasMany 2300 drops from 24,928 to 18,058. The only cases where the Ruby impl allocates more are MethodCall (37 vs 22) and Only (66 vs 31), both small in absolute terms. This is a major improvement over the previous iteration, where the Ruby impl allocated ~50% more objects.
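For readers who want to re-derive the allocation percentages, here is a small sketch. The helper name `percent_reduction` is made up for illustration; the counts are the ones quoted in the summary (C extension first, Ruby impl second).

```ruby
# Hypothetical helper: percentage drop from `before` to `after`,
# rounded to one decimal place.
def percent_reduction(before, after)
  ((before - after) * 100.0 / before).round(1)
end

# Allocation counts quoted in the summary: [C extension, Ruby impl].
ALLOCATIONS = {
  "Simple 2300"  => [4_620, 35],
  "Except 2300"  => [4_631, 66],
  "HasMany 2300" => [24_928, 18_058]
}.freeze

ALLOCATIONS.each do |name, (before, after)|
  reduction = percent_reduction(before, after)
  puts format("%-13s %6d -> %6d  (-%.1f%%)", name, before, after, reduction)
end
```

With these inputs the reductions come out to roughly 99.2%, 98.6% and 27.6%, so the HasMany improvement is real but much smaller than the Simple and Except cases.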
593c73d to 8be0e83
📊 Benchmarks: Ruby 4.0.1, Rails 8.1.0
• Master + YJIT: 5-39% speedup
@rus-max thanks for this comment. I was pretty sure that the benchmarks auto-enable YJIT, but this is not the case. I updated the performance comments and the results are impressive. Hopefully I'll be able to release the Ruby version soon :)
@yosiat Allocation benchmarks show no difference between Ruby impl and Ruby+YJIT. |
@rus-max I am not too familiar with YJIT internals, but I don't expect it to reduce allocations. |
d99c926 to 0d6f037
…fter first call. +3.6% IPS
Result: {"status":"keep","total_ips":13443.41,"total_allocs":4818}
…aseline! Simple 2300: 209 ips (was 175)
Result: {"status":"keep","total_ips":15600.3,"total_allocs":4818}
…er baseline! Simple 2300: 265 ips
Result: {"status":"keep","total_ips":18463.56,"total_allocs":4818}
…11% over baseline. 14422 vs 12978
Result: {"status":"keep","total_ips":14421.88,"total_allocs":4818}
…allocs unchanged at 4818
Result: {"status":"keep","total_ips":17167.58,"total_allocs":4818}
…6% over baseline
Result: {"status":"keep","total_ips":17339.66,"total_allocs":4818}
…one call. +40.5% over baseline
Result: {"status":"keep","total_ips":18243.27,"total_allocs":4818}
The 5-argument form of bytesplice was added in Ruby 3.3. Fall back to the 3-argument form with an explicit byteslice to support Ruby 3.2.
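A minimal sketch of that fallback, assuming Ruby 3.2 or newer (where `String#bytesplice` exists at all). The helper name `copy_bytes` and the constant are hypothetical and not taken from the PR; the version check is just one possible way to pick the branch.

```ruby
# Hypothetical constant: true when the 5-argument bytesplice is available.
FIVE_ARG_BYTESPLICE =
  Gem::Version.new(RUBY_VERSION) >= Gem::Version.new("3.3.0")

# Hypothetical helper: overwrite dst[dst_off, dst_len] (byte offsets)
# with src[src_off, src_len] and return dst.
def copy_bytes(dst, dst_off, dst_len, src, src_off, src_len)
  if FIVE_ARG_BYTESPLICE
    # Ruby 3.3+: copies straight from src, no intermediate String.
    dst.bytesplice(dst_off, dst_len, src, src_off, src_len)
  else
    # Ruby 3.2: slice first (allocates a String), then splice it in.
    dst.bytesplice(dst_off, dst_len, src.byteslice(src_off, src_len))
  end
  dst
end

buffer = +"hello world"
copy_bytes(buffer, 0, 5, "say HOWDY now", 4, 5)
puts buffer
```

Both branches produce the same string; only the number of temporary objects differs.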
… C extension fallback
Deletes the unused Context class from context.rb, the @aliases attr_accessor and apply_fields_filters method from SerializationDescriptor, the stale aliases={} init in Serializer#inherited, and the C extension fallback branch in Impl::Serializer#write_fields that referenced the non-existent Panko._sd_set_writer and Panko._write_attributes methods. Updates the one spec that asserted on aliases.
…-convention visibility
Move apply_filters, resolve_filters, apply_attribute_filters, and apply_association_filters out of SerializationDescriptor into a new Panko::Filters class backed by a single frozen instance (INSTANCE). SerializationDescriptor#apply_filters becomes a one-liner delegation. Adds spec/unit/panko/filters_spec.rb with full unit coverage.
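The shape described above (stateless filter logic behind one frozen shared instance, with the descriptor delegating to it) could look roughly like this. Everything here is an illustrative stand-in: `FilterSketch`, the `apply` signature and the `Descriptor` class are assumptions, not the actual `Panko::Filters` code.

```ruby
module FilterSketch
  # Stateless filter logic. With no instance variables to mutate,
  # a single frozen instance can be shared by every caller.
  class Filters
    def apply(names, only: nil, except: nil)
      result = names
      result = result.select { |name| only.include?(name) } if only
      result = result.reject { |name| except.include?(name) } if except
      result
    end

    INSTANCE = new.freeze
  end

  # Hypothetical descriptor that keeps only a one-line delegation.
  class Descriptor
    attr_reader :attributes

    def initialize(attributes)
      @attributes = attributes
    end

    def apply_filters(only: nil, except: nil)
      Filters::INSTANCE.apply(attributes, only: only, except: except)
    end
  end
end

descriptor = FilterSketch::Descriptor.new(%i[id title body])
p descriptor.apply_filters(only: %i[id title])
p descriptor.apply_filters(except: %i[body])
```

Freezing the instance makes any accidental attempt to add state raise immediately, which is one way to keep the class honest about being stateless.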
…d edge-case specs
- Remove @record_class ivar and all AR alias-resolution logic from Attribute
- Simplify invalidate! to take no arguments (just clears @type and @cached_writer)
- Move AR attribute_aliases resolution into ActiveRecord::Writer where it belongs
- Expose alias_name= writer (attr_accessor) and make name= public
- Add spec/unit/panko/attribute_spec.rb covering the new behaviour
Extract all record-level state (column_indexes, row, is_indexed_row, attributes_hash, types, additional_types, values, etc.) from the 252-line Writer god object into a new RecordState class.

RecordState owns:
- setup(object): handles the IndexedRow fast path (identity check on column_indexes) and the full initialization path; returns true when the record class changes so the caller knows to invalidate attributes and re-resolve AR aliases
- read_attribute(attribute): non-indexed (Rails 7.x) value lookup with dirty-attributes-hash priority

Writer becomes a thin orchestrator: it calls record_state.setup, handles attribute invalidation and alias resolution on class change, then delegates into the existing indexed-row fast paths using state exposed via record_state accessors.

Adds spec/unit/panko/impl/record_state_spec.rb with 29 unit examples covering initialization defaults, setup return values, fast-path detection, class-change tracking, and read_attribute priority logic.
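The "setup returns true on class change" contract can be illustrated with a stripped-down sketch. It deliberately leaves out the IndexedRow fast path and all ActiveRecord details; `RecordStateSketch`, `WriterSketch` and the `invalidations` counter are hypothetical names used only for this illustration.

```ruby
# Hypothetical, heavily simplified state holder.
class RecordStateSketch
  attr_reader :last_record_class

  def initialize
    @last_record_class = nil
  end

  # Returns true when `object` has a different class than the previous
  # record, signalling that per-class caches must be rebuilt.
  def setup(object)
    klass = object.class
    return false if klass.equal?(@last_record_class)

    @last_record_class = klass
    true
  end
end

# Hypothetical thin orchestrator: reacts to the boolean, owns no state.
class WriterSketch
  attr_reader :invalidations

  def initialize
    @state = RecordStateSketch.new
    @invalidations = 0
  end

  def write(object)
    @invalidations += 1 if @state.setup(object)
    object
  end
end

Post = Struct.new(:title)
Comment = Struct.new(:body)

writer = WriterSketch.new
[Post.new("a"), Post.new("b"), Comment.new("c")].each { |r| writer.write(r) }
puts writer.invalidations
```

In a homogeneous batch the invalidation branch runs once, for the first record, which is what keeps the per-record path cheap.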
…ngine
The Impl namespace was a historical artifact from when this code was a direct port of the C extension. Now that it is idiomatic Ruby, Engine better reflects its role: the internal hot-path machinery that powers serialization, distinct from the public Panko::* API surface.
- Rename lib/panko/impl/ → lib/panko/engine/ (git mv)
- Rename spec/unit/panko/impl/ → spec/unit/panko/engine/ (git mv)
- Replace all Panko::Impl → Panko::Engine references across lib/, spec/, benchmarks/, and panko_serializer.rb
- Update CLAUDE.md to reflect new paths and namespace
- Add .DS_Store to .gitignore
Replace 29 benchmark files with a clean 12-file structure:
- support/benchmark.rb: ~200 LOC infra with benchmark() and benchmark_with_records() API, BENCH=/SIZE=/PROFILE= env vars, YJIT
- support/setup.rb: SQLite in-memory DB with 2300 seed records
- support/datasets.rb: 4 reusable datasets (posts, authors, aliased, plain)
- 5 benchmark files: panko_json, panko_object, plain_object, object_writer, components (Filters, RecordState, SerializationDescriptor)
- type_casts/: per-provider files (generic, postgresql, mysql, sqlite)

Simplify Rakefile from PTY/JSON runner to simple system() calls. Remove active_model_serializers and terminal-table dependencies.
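Since the thread above turned on whether the benchmarks actually ran with YJIT, here is one way a benchmark helper might switch it on explicitly. This is a sketch, not the contents of support/benchmark.rb; `ensure_yjit` is a hypothetical name. `RubyVM::YJIT.enable` is only available on Ruby 3.3+, so older versions still need `--yjit` or `RUBY_YJIT_ENABLE=1` at boot.

```ruby
# Hypothetical helper: returns true if YJIT is running after the call.
def ensure_yjit
  # Not defined on builds without YJIT or on other Ruby implementations.
  return false unless defined?(RubyVM::YJIT)
  return true if RubyVM::YJIT.enabled?
  # Runtime enabling was added in Ruby 3.3.
  return false unless RubyVM::YJIT.respond_to?(:enable)

  RubyVM::YJIT.enable
  RubyVM::YJIT.enabled?
end

puts "YJIT: #{ensure_yjit ? 'on' : 'off'}"
```

Printing the result at the top of every run makes it harder to publish numbers from an interpreter-only run by accident.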
… allocations
On Ruby 3.3+, bytesplice(dst_off, dst_len, src, src_off, src_len) copies directly without allocating intermediate byteslice strings. This removes ~6,900 object allocations and ~276KB per 2,300-record serialization run. Falls back to the 3-arg bytesplice + byteslice path on Ruby < 3.3.
association().target returns nil when the association hasn't been loaded yet. Check loaded? first and fall back to public_send for lazy loading.
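A sketch of the loaded? check, using plain Ruby stand-ins so it runs without ActiveRecord. `FakeAssociation` and `FakePost` are invented for the example; only `association`, `loaded?`, `target` and `public_send` mirror the calls named in the commit message.

```ruby
# Stand-in for an ActiveRecord association object (assumption: only
# #target and #loaded? matter here).
FakeAssociation = Struct.new(:target, :loaded) do
  def loaded?
    loaded
  end
end

# Stand-in for a model with one association called :author.
class FakePost
  attr_reader :lazy_loads

  def initialize(association)
    @association = association
    @lazy_loads = 0
  end

  def association(_name)
    @association
  end

  # Plays the role of the reader that would trigger a query.
  def author
    @lazy_loads += 1
    :author_from_query
  end
end

# Use the cached target only when it has really been loaded;
# otherwise go through the public reader so lazy loading still happens.
def read_association(object, name)
  association = object.association(name)
  return association.target if association.loaded?

  object.public_send(name)
end

preloaded = FakePost.new(FakeAssociation.new(:cached_author, true))
unloaded = FakePost.new(FakeAssociation.new(nil, false))
p read_association(preloaded, :author)
p read_association(unloaded, :author)
```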
Break the monolithic 180-line method into 9 small methods with clear responsibilities: write_attributes (dispatcher), handle_class_change, resolve_type, write_value, write_indexed_with_hash, write_indexed_cached, write_indexed_first_pass, write_non_indexed, and build_column_caches. Add nil_safe_push? to all value writers, replacing the is_a? chain in build_column_caches. Expose last_record_class on RecordState.
…. serialize)
The subtype branch in ValuesWriter::Writer#write never cached a writer on the attribute, so the second record in a batch hit write_indexed_cached with a nil writer_cache entry and raised NoMethodError. Introduce SubtypeWriter to wrap the AR type and cache it like every other writer. Also removes stale TODO comments from attribute.rb and writer.rb.
- design-choices.md: Replace C extension references with Ruby Engine architecture, document fast paths, value writers, and IndexedRow
- performance.md: Update benchmarks to Ruby 4.0.2/Rails 8.0, simplify to core JSON benchmarks, add instructions for running locally
Passing non-Array/Hash values to :only/:except now raises ArgumentError instead of NoMethodError, giving callers a clear message.
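A guard of this kind might look like the following. The method name `validate_filter!` and the error wording are assumptions for illustration; the actual message in the PR is not shown here.

```ruby
# Hypothetical validation helper for the :only / :except options.
# nil means "option not given" and is passed through untouched.
def validate_filter!(name, value)
  return value if value.nil? || value.is_a?(Array) || value.is_a?(Hash)

  raise ArgumentError,
        "#{name} must be an Array or Hash, got #{value.class}"
end

validate_filter!(:only, %i[id title])        # passes
validate_filter!(:except, { author: [:id] }) # passes

begin
  validate_filter!(:only, "title")
rescue ArgumentError => e
  puts e.message
end
```

Failing at the option boundary means the caller sees which option was wrong, instead of a NoMethodError raised deep inside filter resolution.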
…ociations
When a serializer declares has_one pointing to a plain method on the model (not a real AR association), object.association() raises AssociationNotFoundError. Rescue and fall back to public_send, matching has_many behavior, which already uses public_send directly.
…ocal caching
When Serializer.new is called without :only/:except/:context/:scope options, SerializationDescriptor.build now returns a thread-local cached copy instead of duplicating the descriptor on every call. This avoids allocating ~8 objects per call (descriptor, 2 array dups, serializer instance, 3 association duplicates with recursive sub-descriptors).

Each thread gets its own duplicate (created once, reused forever), keeping the mutable per-call state (serializer @object, association Writer/RecordState) thread-safe under Puma.

Also fixes a latent bug in RecordState#setup where the IndexedRow fast path could crash when a reused RecordState transitions from an IndexedRow-backed object to a non-indexed object.

Benchmark (panko_oj, single Game object, Rails 8.0):
Unpersisted: 119k → 190k i/s (+59%)
Persisted: 96k → 171k i/s (+78%)
Allocations per call: 79 → 23 objects (-71%)
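The per-thread caching idea can be sketched as follows. `DescriptorSketch` and `DescriptorCache` are hypothetical stand-ins, and the use of `thread_variable_get`/`thread_variable_set` is an assumption: the PR may store its cache differently (note that `Thread.current[:key]` would be fiber-local rather than thread-local).

```ruby
# Hypothetical descriptor with mutable per-call state.
class DescriptorSketch
  attr_reader :fields

  def initialize(fields)
    @fields = fields
  end

  # dup must not share the mutable array with the template.
  def initialize_copy(source)
    super
    @fields = source.fields.dup
  end
end

module DescriptorCache
  KEY = :descriptor_sketch_cache
  FILTER_OPTIONS = %i[only except context scope].freeze

  # Returns a fresh copy when per-call options are present, otherwise
  # one copy per thread that is created once and then reused.
  def self.build(template, options = {})
    return template.dup if FILTER_OPTIONS.any? { |key| options.key?(key) }

    cache = Thread.current.thread_variable_get(KEY)
    unless cache
      cache = {}.compare_by_identity
      Thread.current.thread_variable_set(KEY, cache)
    end
    cache[template] ||= template.dup
  end
end

template = DescriptorSketch.new(%i[id title])
first = DescriptorCache.build(template)
second = DescriptorCache.build(template)
puts first.equal?(second)
```

Because each thread mutates only its own copy, no locking is needed on the hot path; the cost is one extra descriptor per thread per serializer.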
Instead of creating a new Engine::Serializer on every serialize_to_json / to_json call, cache it on the descriptor via engine_serializer. Since descriptors are thread-local (from the previous commit), the cached engine is also per-thread and safe to reuse.

Both write_fields and _serialize_many validate that the cached attributes_writer matches the current object type via AttributesWriter.writer_for, handling the edge case where a reused engine encounters a different object type (e.g. AR model then Hash).

Also refactors AttributesWriter.create into writer_for (returns class) and create (instantiates), enabling cheap is_a? checks without allocation.

Benchmark (panko_oj, single Game object, Rails 8.0):
Unpersisted: 190k → 225k i/s (+18%)
Persisted: 171k → 213k i/s (+25%)

Cumulative from baseline:
Unpersisted: 120k → 225k i/s (+88%)
Persisted: 96k → 213k i/s (+122%)

panko_json (array serialization): no change, since there is one engine per batch.
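The writer_for/create split might be pictured like this. The module, the two writer classes and `EngineSketch` are invented for the example (the real code also has an ActiveRecord writer, which is omitted here); only the idea of returning a class so that a reused engine can do an allocation-free is_a? check comes from the commit message.

```ruby
module AttributesWriterSketch
  class HashWriter; end
  class ObjectWriter; end

  # Returns the writer class for an object without allocating anything.
  def self.writer_for(object)
    object.is_a?(Hash) ? HashWriter : ObjectWriter
  end

  # Instantiates the writer chosen by writer_for.
  def self.create(object)
    writer_for(object).new
  end
end

# Hypothetical long-lived engine that keeps its writer between calls.
class EngineSketch
  def initialize
    @attributes_writer = nil
  end

  # Reuses the cached writer when it still matches the object type and
  # replaces it when a different kind of object shows up.
  def attributes_writer_for(object)
    expected = AttributesWriterSketch.writer_for(object)
    @attributes_writer = expected.new unless @attributes_writer.is_a?(expected)
    @attributes_writer
  end
end

engine = EngineSketch.new
first = engine.attributes_writer_for({ id: 1 })
second = engine.attributes_writer_for({ id: 2 })
third = engine.attributes_writer_for(Object.new)
puts first.equal?(second)
puts third.class
```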