DSLighting support for DA-Code #8

@luckyfan-cs

Hi!

Thanks for your wonderful work on DA-Code. We really enjoyed it and found it very valuable for evaluating the data science code generation capabilities of large language models 🙌

We’d like to briefly introduce our project, DSLighting.

About DSLighting

DSLighting is a data science agent harness: an LLM-driven autonomous execution engine that turns task descriptions and datasets into iterative workflows consisting of:

  • Code generation
  • Execution
  • Evaluation
  • Refinement

It is designed to make it easy to build, run, and evaluate data science agents in a reproducible and extensible way.
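
To make the cycle above concrete, here is a minimal sketch of a generate, execute, evaluate, refine loop in plain Python. It is not DSLighting's implementation: `Attempt`, `run_loop`, and the `toy_*` functions are hypothetical names invented for illustration, and the toy stand-ins replace the LLM call, the sandboxed execution, and the benchmark metric.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Attempt:
    """One pass through the loop (hypothetical structure, for illustration)."""
    code: str
    score: float
    feedback: str


def run_loop(
    generate: Callable[[str, Optional[Attempt]], str],
    execute: Callable[[str], str],
    evaluate: Callable[[str], float],
    task: str,
    max_iters: int = 3,
    target: float = 1.0,
) -> Attempt:
    """Iterate generate -> execute -> evaluate, feeding each attempt back in."""
    if max_iters < 1:
        raise ValueError("max_iters must be at least 1")
    best: Optional[Attempt] = None
    previous: Optional[Attempt] = None
    for _ in range(max_iters):
        code = generate(task, previous)   # code generation (refinement after round 1)
        output = execute(code)            # execution
        score = evaluate(output)          # evaluation
        previous = Attempt(code=code, score=score, feedback=output)
        if best is None or score > best.score:
            best = previous
        if score >= target:               # stop early once the target is reached
            break
    assert best is not None
    return best


# Toy stand-ins: a real harness would call an LLM, run the code in a sandbox,
# and score the output with the benchmark's own metric.
def toy_generate(task: str, previous: Optional[Attempt]) -> str:
    return "0" if previous is None else str(int(previous.code) + 21)


def toy_execute(code: str) -> str:
    return code  # pretend the "program" prints its own value


def toy_evaluate(output: str) -> float:
    return 1.0 - abs(42 - int(output)) / 42


best = run_loop(toy_generate, toy_execute, toy_evaluate, task="reach 42")
print(best.code, best.score)  # prints "42 1.0"
```

Passing the previous attempt back into `generate` is what separates this from single-pass evaluation: each round can use the last round's output and score as feedback.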

Support for DA-Code

We’ve recently added support for running DA-Code within DSLighting. With just a few lines of code, users can easily run the benchmark:

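# Load environment variables (for example, API credentials) from a local .env file
from dotenv import load_dotenv
load_dotenv()

from dslighting.api import DSBenchmark
from dslighting.core import ConfigBuilder

# Choose the agent workflow and the underlying model
config = ConfigBuilder().build_config(
    workflow="aide",
    model="gpt-4o",
)

# Point the benchmark at a local copy of the DA-Code data and run it
benchmark = DSBenchmark("dacode", data_dir="/path/to/dacode")
result = benchmark.run(config=config)

# Locations of the run's results and metadata
print(result.results_path)
print(result.metadata_path)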

Why this might be useful

  • Minimal setup to run DA-Code
  • Unified interface across multiple benchmarks
  • Supports iterative agent workflows (not just single-pass evaluation)
  • Easy to configure for different models and workflows

Other supported benchmarks

DSLighting currently also supports:

  • DABench (ICML 2024)
  • MoSciBench (ICLR 2026)
  • MLE-Bench
  • ScienceAgentBench (ICLR 2025)

We hope this can make it easier for researchers to run and extend DA-Code in agent-based workflows.

Happy to hear your thoughts, and we’d love to explore potential alignment or integration!

Thanks again for your great work 🙌
