⚡️ Speed up method __Timer__.toc by 54%
#40
📄 54% (0.54x) speedup for `__Timer__.toc` in `quantecon/util/timing.py`
⏱️ Runtime: 980 microseconds → 635 microseconds (best of 34 runs)
📝 Explanation and details
The optimization extracts the time decomposition logic (`divmod` operations and formatting calculations) into a separate `@njit`-decorated helper function, `_decompose_time`. This achieves a 54% speedup by leveraging Numba's just-in-time compilation to accelerate the mathematical operations.

Key changes:
- The `divmod(elapsed, 60)`, `divmod(m, 60)`, and `(s % 1)*(10**digits)` calculations are moved to a separate `_decompose_time` function.
- The helper is decorated with `@njit(cache=True, fastmath=True)`, enabling compiled execution of the mathematical operations.

Why this optimization works:
- The `cache=True` parameter ensures the compiled function is cached after first use, avoiding recompilation overhead.

Impact on workloads:
Based on the function references, `toc()` is called within `loop_timer()` for performance benchmarking scenarios where it may be invoked thousands of times. The test results show significant improvements in large-scale scenarios (up to 122% faster for 1000 calls), making this optimization particularly valuable for:
- cases where `loop_timer` calls `toc()` repeatedly

The optimization is most effective when `verbose=True` (the default), as this is when the time decomposition logic executes. For `verbose=False` calls, the speedup is more modest since the mathematical operations are skipped entirely.

✅ Correctness verification report:
🌀 Generated Regression Tests and Runtime
To edit these changes, run `git checkout codeflash/optimize-__Timer__.toc-mj9qhw5x` and push.