
feat(utils): Accelerate data generation by 168x using NumPy #1

Open
raphaelgimenezneto wants to merge 1 commit into Ismail-Dagli:main from raphaelgimenezneto:feature/optimize-greedy-solver

@raphaelgimenezneto

Hello!

This PR replaces the generate_historical_rides function with a vectorized NumPy implementation, making the simulation's data setup phase about 168x faster.

The Problem: Identifying the Bottleneck
Profiling with cProfile showed that the original generate_historical_rides function was the largest bottleneck, taking over 10 seconds of execution time. The cost came mainly from its loop-based approach, which generates a large number of records one at a time.

Profiler Output (Before):
[Image: profiler output before the change]
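For readers who want to reproduce this kind of measurement, here is a minimal sketch of profiling a function with cProfile and reading the report sorted by cumulative time. The function generate_rides_loop is a hypothetical stand-in for a loop-based generator, not the project's actual code, and profile_call is an illustrative helper name.

```python
import cProfile
import io
import pstats


def generate_rides_loop(n):
    """Hypothetical loop-based generator (stand-in, not the project's code)."""
    rides = []
    for i in range(n):
        rides.append({"id": i, "hour": i % 24})
    return rides


def profile_call(func, *args):
    """Run func under cProfile and return the stats report as text."""
    profiler = cProfile.Profile()
    profiler.enable()
    func(*args)
    profiler.disable()

    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Sort by cumulative time so the most expensive call chains come first.
    stats.sort_stats("cumulative").print_stats(10)
    return buffer.getvalue()


report = profile_call(generate_rides_loop, 50_000)
print(report)
```

A function that dominates the cumulative-time column in a report like this is a natural first candidate for optimization.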

The Solution: Vectorization with NumPy
The fix replaces the iterative method with NumPy vectorization. Instead of processing records one by one in the Python interpreter, the new code operates on entire arrays at once, so the per-record work runs inside NumPy's optimized C backend.
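To illustrate the general idea, the sketch below samples ride hours with a Python loop and with a single NumPy call. It is not the code from this PR: the hourly weights, the rush-hour windows, and both function names are assumptions chosen only for the example.

```python
import random

import numpy as np

# Hypothetical weights: rush hours (7-9 and 17-19) are three times as likely.
HOUR_WEIGHTS = np.ones(24)
HOUR_WEIGHTS[[7, 8, 9, 17, 18, 19]] = 3.0
HOUR_PROBS = HOUR_WEIGHTS / HOUR_WEIGHTS.sum()


def sample_hours_loop(n, seed=0):
    """Draw one hour per iteration in pure Python."""
    rng = random.Random(seed)
    weights = HOUR_WEIGHTS.tolist()
    hours = []
    for _ in range(n):
        hours.append(rng.choices(range(24), weights=weights)[0])
    return hours


def sample_hours_vectorized(n, seed=0):
    """Draw all n hours in one call; the sampling loop runs inside NumPy."""
    rng = np.random.default_rng(seed)
    return rng.choice(24, size=n, p=HOUR_PROBS)


loop_hours = sample_hours_loop(10_000)
vector_hours = sample_hours_vectorized(10_000)
print(len(loop_hours), vector_hours.shape)
```

The two functions use different random number generators, so their outputs match in distribution rather than value by value. That is why a change like this calls for statistical validation instead of an exact comparison.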

The Results: Performance Gain & Validation
The new implementation is 168.29x faster in the benchmark, reducing execution time from 10.68 seconds to about 0.06 seconds.

More importantly, the speedup does not sacrifice correctness. A comprehensive statistical validation suite confirms that the new function produces a dataset statistically equivalent to the original, preserving all key patterns such as the rush-hour distribution and hotspot logic.

Benchmark & Validation Results:
[Image: Screenshot 2026-01-17 013248]
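One simple way to check statistical equivalence of this kind is to compare the hourly distributions of two generated samples. The sketch below uses the total variation distance between hourly shares. It is an assumption about what such a check could look like, not the validation suite in this PR, and the probabilities and names are hypothetical.

```python
import numpy as np


def hourly_shares(hours, n_bins=24):
    """Fraction of rides falling in each hour of the day."""
    counts = np.bincount(np.asarray(hours), minlength=n_bins)
    return counts / counts.sum()


def total_variation_distance(sample_a, sample_b, n_bins=24):
    """Half the summed absolute difference between two hourly distributions."""
    diff = hourly_shares(sample_a, n_bins) - hourly_shares(sample_b, n_bins)
    return 0.5 * np.abs(diff).sum()


# Hypothetical rush-hour weighting, used only to build example data.
weights = np.ones(24)
weights[[7, 8, 9, 17, 18, 19]] = 3.0
probs = weights / weights.sum()

# Two samples from the same distribution, drawn with different seeds.
old_sample = np.random.default_rng(1).choice(24, size=100_000, p=probs)
new_sample = np.random.default_rng(2).choice(24, size=100_000, p=probs)
# A sample that ignores rush hours, to show what a mismatch looks like.
flat_sample = np.random.default_rng(3).integers(0, 24, size=100_000)

same_distance = total_variation_distance(old_sample, new_sample)
flat_distance = total_variation_distance(old_sample, flat_sample)
print(same_distance, flat_distance)
```

A distance near zero suggests the two generators agree on the hourly pattern, while a large value flags a regression such as a lost rush-hour peak.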

Profiler Output (After):
As a result, generate_historical_rides no longer appears as a major bottleneck in the profiler output.
[Image: profiler output after the change]

Changes in this Pull Request

  • src/utils.py: The original function has been replaced with the high-performance vectorized version.
  • benchmarking/benchmark_data_generation.py: A new, self-contained script has been added. It contains the original "frozen" code and the logic used to generate the benchmark results above, serving as reproducible proof of the improvement.
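As a rough illustration of how a benchmark like this can be structured, the sketch below times a baseline and an optimized function and reports the ratio. It does not reproduce benchmark_data_generation.py: best_time, speedup, and the two workload functions are hypothetical names chosen for the example.

```python
import time


def best_time(func, *args, repeats=3):
    """Return the fastest wall-clock time over several runs of func."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        func(*args)
        timings.append(time.perf_counter() - start)
    return min(timings)


def speedup(baseline_seconds, optimized_seconds):
    """How many times faster the optimized run is than the baseline."""
    return baseline_seconds / optimized_seconds


def total_loop(n):
    """Hypothetical baseline: sum the integers below n with an explicit loop."""
    total = 0
    for i in range(n):
        total += i
    return total


def total_builtin(n):
    """Hypothetical optimized version: let the built-in sum do the loop."""
    return sum(range(n))


baseline = best_time(total_loop, 200_000)
optimized = best_time(total_builtin, 200_000)
print(f"baseline={baseline:.4f}s optimized={optimized:.4f}s")
```

Taking the minimum over several runs reduces noise from other processes. The ratio from speedup(baseline, optimized) is the kind of figure reported in the results above.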

This PR serves as a practical case study in applying HPC principles to scientific Python code.
