Skip to content

Conversation

@Harshit28j
Copy link

@Harshit28j Harshit28j commented Nov 24, 2025

This PR is follow up PR related to this issue #678 #491
image

Summary by CodeRabbit

Release Notes

  • New Features

    • Added new "clean" setup mode that automatically removes temporary Docker images, generated script files, and cached OpenML datasets and tasks following benchmark completion to optimize disk space and system resources
  • Improvements

    • Refined caching mechanisms and modernized the package installation process to enhance performance and efficiency

✏️ Tip: You can customize this high-level summary in your review settings.

@coderabbitai
Copy link

coderabbitai bot commented Nov 24, 2025

Walkthrough

A new "clean" setup mode is introduced across the benchmarking system to perform cleanup operations including docker image removal, script deletion, and cache cleanup. The pip installation mechanism is refactored to use subprocess-based invocation, and caching is updated from cache to lru_cache. The mode is exposed via CLI and integrated into the container runner.

Changes

Cohort / File(s) Change Summary
New Setup Mode Definition
amlb/benchmark.py, runbenchmark.py
Added SetupMode.clean = 5 enum member; extended CLI --setup choices to include "clean" alongside existing options
Container Cleanup Logic
amlb/runners/container.py
Added setup_mode tracking in setup method; enhanced cleanup to conditionally execute (docker image removal, script deletion, OpenML cache cleanup) when SetupMode.clean is active; stored task_defs in run for cleanup access; added os and run_cmd imports
Caching Strategy Update
amlb/resources.py
Replaced cache with lru_cache(maxsize=None) import; updated constraint_definition method decorator from @cache to @lru_cache(maxsize=None)
Pip Installation Refactoring
amlb/utils/modules.py
Replaced dynamic import-based pip access with subprocess-based invocation (sys.executable -m pip install --no-cache-dir); removed register_module and register_submodule helper functions; changed exception handling from SystemExit to subprocess.CalledProcessError

Sequence Diagram

sequenceDiagram
    participant CLI as CLI (runbenchmark.py)
    participant Runner as ContainerBenchmark
    participant Cleanup as Cleanup Operations
    participant Docker as Docker
    participant Cache as OpenML Cache

    CLI->>Runner: setup(setup_mode="clean")
    activate Runner
    Runner->>Runner: Record setup_mode = SetupMode.clean
    deactivate Runner

    CLI->>Runner: run(tasks)
    activate Runner
    Runner->>Runner: Store task_defs for later use
    deactivate Runner

    CLI->>Runner: cleanup()
    activate Runner
    Note over Runner: Check if setup_mode == SetupMode.clean
    alt SetupMode.clean
        Runner->>Cleanup: Initiate conditional cleanup
        activate Cleanup
        
        Cleanup->>Docker: Remove built docker image
        Docker-->>Cleanup: Image removed (if exists)
        
        Cleanup->>Cleanup: Delete generated script file
        
        Cleanup->>Cache: Clean OpenML caches<br/>(for each task_def)
        Cache-->>Cleanup: Caches cleared
        
        deactivate Cleanup
    else Other modes
        Note over Runner: Skip cleanup
    end
    deactivate Runner
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

  • Cleanup logic with error handling: The container cleanup implementation includes conditional branching, per-item error handling with logging, and multiple resource cleanup operations that require verification of correctness
  • Setup mode state tracking: Verify that setup_mode is correctly captured and propagated from CLI through setup to cleanup phases
  • Subprocess refactoring in modules.py: The pip installation changes replace dynamic imports with subprocess calls; requires careful review of command construction, error handling migration (SystemExit → CalledProcessError), and removal of helper functions
  • Task definition storage: Ensure task_defs are properly stored and accessible during cleanup without unintended side effects

Poem

🐰 A setup mode called "clean" takes the stage,
Sweeping docker images, scripts—all the rage!
With subprocess pipes and caches so tight,
The benchmarks run fresh, oh what a delight! 🧹✨

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 14.29% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title '[Fix #491] : Add cache cleanup follow up' directly corresponds to the main changes in the PR: adding a new 'clean' setup mode for cache cleanup across multiple files.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@Harshit28j
Copy link
Author

Manually tested after building docker image

image

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
runbenchmark.py (1)

130-137: Add documentation for the new "clean" setup option.

The new "clean" setup mode has been added to the choices but is not documented in the help text (lines 132-137). Users running --help won't know what this option does.

Based on the changes in amlb/runners/container.py, the help text should describe that "clean" mode removes Docker images, generated scripts, and OpenML caches. Consider adding a bullet point like:

     help="Framework/platform setup mode. Available values are:"
     "\n• auto: setup is executed only if strictly necessary."
     "\n• skip: setup is skipped."
     "\n• force: setup is always executed before the benchmark."
     "\n• only: only setup is executed (no benchmark)."
+    "\n• clean: performs cleanup operations (removes Docker images, scripts, and caches)."
     "\n(default: '%(default)s')",
🧹 Nitpick comments (1)
amlb/utils/modules.py (1)

19-23: Simplify error logging.

The current implementation uses logging.error followed by logging.exception(e). Since logging.exception automatically captures exception information, you can simplify this to a single call without passing the exception object explicitly.

Apply this diff:

-    except subprocess.CalledProcessError as e:
-        log.error(
-            "Error when trying to install python modules %s.", module_or_requirements
-        )
-        log.exception(e)
+    except subprocess.CalledProcessError:
+        log.exception(
+            "Error when trying to install python modules %s.", module_or_requirements
+        )
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between dfe8d21 and f6a8499.

📒 Files selected for processing (5)
  • amlb/benchmark.py (1 hunks)
  • amlb/resources.py (2 hunks)
  • amlb/runners/container.py (4 hunks)
  • amlb/utils/modules.py (1 hunks)
  • runbenchmark.py (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-03-13T18:54:25.520Z
Learnt from: PGijsbers
Repo: openml/automlbenchmark PR: 700
File: amlb/data.py:131-132
Timestamp: 2025-03-13T18:54:25.520Z
Learning: The `clear_cache` function in the automlbenchmark library explicitly handles `cached_property` decorated attributes by removing them from the object's __dict__, but for other cache decorators like `cache`, it depends on the presence of a `cache_clear` method.

Applied to files:

  • amlb/resources.py
🪛 Ruff (0.14.5)
amlb/resources.py

174-174: Use of functools.lru_cache or functools.cache on methods can lead to memory leaks

(B019)

amlb/runners/container.py

116-116: Do not catch blind exception: Exception

(BLE001)

amlb/utils/modules.py

18-18: subprocess call: check for execution of untrusted input

(S603)


20-22: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


23-23: Redundant exception object included in logging.exception call

(TRY401)

🔇 Additional comments (9)
amlb/resources.py (2)

15-15: LGTM: Import change aligns with cache clearing requirements.

The change from cache to lru_cache is appropriate given the PR's focus on cache cleanup. lru_cache provides the cache_clear() method needed for cleanup operations.

Based on learnings, the clear_cache function in the codebase depends on the presence of a cache_clear method for cache decorators.


174-174: Acceptable use of lru_cache on method despite static analysis warning.

The static analysis tool flags this as potentially causing memory leaks (B019). However, since Resources is a singleton class (see __INSTANCE__ at line 329), only one instance will ever exist, which mitigates the memory leak concern. The change to lru_cache enables the cache_clear() method needed for the cleanup functionality introduced in this PR.

amlb/utils/modules.py (1)

11-18: LGTM: Subprocess-based pip invocation is more robust.

The refactoring from internal pip APIs to subprocess-based invocation is a best practice, as internal pip APIs are not guaranteed to be stable. The command construction is safe and uses sys.executable to ensure the correct Python environment.

The S603 static analysis warning about untrusted input can be safely dismissed here, as this is an internal utility function where module_or_requirements comes from controlled sources within the benchmarking framework.

amlb/benchmark.py (1)

68-68: LGTM: Clean enum addition.

The new clean setup mode is properly added to the SetupMode enum with sequential numbering, consistent with the existing pattern.

amlb/runners/container.py (5)

12-12: LGTM: Import additions support cleanup functionality.

The new imports (os and run_cmd) are appropriately added to support the cleanup operations introduced in this PR.

Also applies to: 21-21


83-83: LGTM: Setup mode tracking enables conditional cleanup.

Storing the setup mode allows the cleanup method to perform different actions based on the chosen mode, which is essential for the new clean functionality.


99-101: Consider the implications of forced Docker image removal.

The cleanup uses docker rmi -f with the force flag, which will remove the image even if containers are using it. While this is appropriate for a cleanup mode, ensure this behavior aligns with user expectations, especially if multiple benchmarking sessions could be running concurrently.

If concurrent benchmark runs are possible, verify that forced image removal won't disrupt other active sessions.


105-117: Appropriate error handling for cleanup operations.

The per-task error handling with warning logs is well-designed for cleanup operations. The broad Exception catch (flagged by static analysis) is acceptable here because:

  • Cleanup should be resilient and continue even if individual operations fail
  • Warnings are logged so failures aren't silent
  • The lazy import of openml only loads the library when needed

122-122: LGTM: Task definitions stored for cleanup.

Storing the task definitions enables the cleanup method to iterate over tasks and clean their OpenML caches. The comment correctly notes this also validates the tasks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant