Optimize memory allocation when rendering partials #8
  end

- set! name, value
+ _set_value name, value
We can set the value directly here instead of going back in through `set!` with different options. A call to `set!` with these parameters just ends up calling `_set_value` anyway.
This saves a bit of processing and also avoids an extra memory allocation for `*args`.
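To illustrate the point about `*args`, here is a minimal sketch. `TinyBuilder` is a hypothetical stand-in, not the real Jbuilder class, and its `set!` signature is only assumed to resemble the real one. A method that declares a splat parameter typically collects the remaining arguments into a fresh Array on each call, while a fixed-arity method has nothing to collect.

```ruby
# Hypothetical stand-in for a builder class; not the real Jbuilder.
class TinyBuilder
  attr_reader :attributes

  def initialize
    @attributes = {}
  end

  # Generic entry point: the splat gathers any extra arguments into an Array,
  # which typically means one more allocation per call, even when it is empty.
  def set!(key, value = nil, *args)
    _set_value(key, value)
  end

  # Direct path: fixed arity, so there is no argument Array to build.
  # (Left public here for the sake of the example.)
  def _set_value(key, value)
    @attributes[key] = value
  end
end

builder = TinyBuilder.new
builder.set!(:title, "via set!")        # goes through the splat method first
builder._set_value(:author, "direct")   # same effect without the detour
```

Both calls end up storing the value the same way; the second simply skips the intermediate method and its argument handling.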
- options[:locals].merge! json: self
- @context.render options
+ options[:locals][:json] = self
+ @context.render options, nil
The `render` helper in Rails defaults its second parameter to `{}`. Passing `nil` here avoids that extra memory allocation.
That second parameter is meant to carry the options for the partial when the first parameter is the partial name (e.g. `render 'foo', options`). Since the partial name is already included in the options hash, the second parameter isn't actually used.
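The effect of the default argument can be sketched as follows. `render_stub` is a hypothetical method that only mirrors the shape described above (a hash or a partial name first, a defaulted hash second); it is not the Rails implementation. Ruby evaluates a default expression only when the argument is omitted, so passing `nil` explicitly skips creating the `{}`.

```ruby
# Hypothetical stand-in with a defaulted second parameter; not the Rails helper.
def render_stub(options = {}, locals = {})
  if options.is_a?(Hash)
    # Hash form: everything needed is in `options`; `locals` is never read.
    "partial=#{options[:partial]} locals=#{options[:locals].keys.inspect}"
  else
    # String form: `options` is the partial name and `locals` is used.
    "partial=#{options} locals=#{locals.keys.inspect}"
  end
end

opts = { partial: "posts/post", locals: { json: :builder } }

with_default = render_stub(opts)       # the `{}` default is evaluated for `locals`
with_nil     = render_stub(opts, nil)  # the default expression is skipped
```

In the hash form both calls return the same string, which is why passing `nil` is safe on that code path.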
Mind linking the profiles? Also did you start taking snapshots of real perf so we can track the differences?
Re-implements rails#591 for our fork.
We're seeing calls to `reverse_merge!`, `merge!`, and `merge` from `JbuilderTemplate` come up as CPU and memory hot spots in our profiles.

The changes proposed in this PR are inspired by https://github.com/fastruby/fast-ruby#hashmerge-vs-hash-code, and favour mutating the `options` hash via element assignment over merge methods. This saves on both CPU and memory allocation.

Comparing `options[:locals].merge!(json: self)` to `options[:locals][:json] = self` for example produced:

This PR replaces all instances of `reverse_merge!` with `[] ||=`, and all instances of `merge!` with `[]=`. The `options` were already being mutated, so this introduces no change in behaviour.

There are a handful of non-mutating calls to `merge` as well that I was hesitant to change, but upon further analysis the `options` hash ends up being mutated further down the call chain anyway; every place where the `options` hash is merged is on a code path that renders to partials, which already mutate the options.

I've run some benchmarks against something simple yet representative of a template structure that would exercise some of the changes being proposed.
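For reference, the substitutions described above can be sketched in plain Ruby. `reverse_merge_bang` is a hypothetical helper standing in for ActiveSupport's `Hash#reverse_merge!` so the example runs without Rails, and the keys and values are made up for illustration.

```ruby
# Plain-Ruby stand-in for ActiveSupport's Hash#reverse_merge!:
# keep existing values and fill in only the keys that are missing.
def reverse_merge_bang(hash, defaults)
  hash.merge!(defaults) { |_key, existing, _default| existing }
end

merged   = { partial: "posts/post" }
assigned = { partial: "posts/post" }

# Before: each call builds a temporary hash just to pass it as an argument.
reverse_merge_bang(merged, formats: [:json])
merged.merge!(json: :builder)

# After: element assignment, with no temporary hash.
assigned[:formats] ||= [:json]
assigned[:json] = :builder

# One difference worth keeping in mind: `||=` also replaces a key that is
# present but nil or false, whereas reverse_merge! leaves such a key alone.
nil_merged   = { formats: nil }
nil_assigned = { formats: nil }
reverse_merge_bang(nil_merged, formats: [:json])
nil_assigned[:formats] ||= [:json]
```

For keys that are either absent or already truthy, both forms leave the hash with the same contents, which is the case this sketch assumes for the `options` hash.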
The measurements below are for 100 posts, each with a single author.
CPU
Memory
I was surprised to see no difference in IPS given the earlier benchmarks, but that can be explained by `actionview` diluting it; this benchmark includes the entire `render` lifecycle, which means that my code changes are only running a couple hundred times per second.

The impactful improvement is the ~20% reduction in memory. Note that the memory allocation savings would depend entirely on your template: templates rendering to fewer or no partials would see less of an improvement, while templates rendering to more partials could see a much larger improvement. As your API serves requests over time, this improvement would go a long way towards saving on garbage collection cycles.
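One rough way to see the allocation difference in isolation is to count allocated objects around each variant. This is a sketch using only `GC.stat` from core Ruby, not the tooling used for the numbers in this PR; the `allocations` helper, the iteration count, and the values are assumptions made for the example, and the exact counts will vary by Ruby version.

```ruby
# Counts objects allocated while the block runs. GC is disabled during the
# measurement so a collection cannot run in the middle; treat the result as
# approximate.
def allocations
  GC.disable
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
ensure
  GC.enable
end

iterations = 10_000
locals = {}

# merge! receives a freshly built hash argument on every call.
with_merge = allocations { iterations.times { locals.merge!(json: :builder) } }

# Element assignment stores the value without building an argument hash.
with_assign = allocations { iterations.times { locals[:json] = :builder } }

puts "merge!: #{with_merge} objects, []=: #{with_assign} objects"
```

Fewer allocated objects per render means less work for the garbage collector over the life of the process, which is the effect described above.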