Pull requests: Aleph-Alpha-Research/eval-framework

Pull requests list

chore(main): release 0.3.8 (label: autorelease: pending)
#237 opened May 8, 2026 by github-actions Bot
feat(vllm): add local vLLM OpenAI-server backend and GPU tests
#235 opened May 7, 2026 by martinreinhardt01 Collaborator
12 tasks
refactor: Make linter stricter
#233 opened May 7, 2026 by martinreinhardt01 Collaborator
12 tasks
chore: remove kubernetes
#230 opened Apr 30, 2026 by prabhuteja12 Contributor
2 of 12 tasks
ci: make CPU tests runnable without GPU deps
#229 opened Apr 30, 2026 by prabhuteja12 Contributor Draft
12 tasks
fix: minerva sympy memory limit
#223 opened Apr 20, 2026 by prabhuteja12 Contributor Draft
4 of 12 tasks
fix: limit minerva sympy memory consumption
#222 opened Apr 16, 2026 by mitja-kleider
feat: bpb implementations
#212 opened Apr 9, 2026 by prabhuteja12 Contributor
12 tasks
Update citation year and add version+author to README
#159 opened Jan 26, 2026 by tfburns Collaborator
1 task done
chore: Bump pyasn1 from 0.6.1 to 0.6.2 in the uv group across 1 directory (labels: dependencies, python:uv)
#157 opened Jan 16, 2026 by dependabot Bot
docs: add LLM as judge guide
#151 opened Jan 12, 2026 by AhmedHammam-AA Collaborator
fix(main): duplicated task that are actually the same
#144 opened Jan 7, 2026 by benureau
3 of 13 tasks
fix(wmt): use HuggingFace datasets instead of sacrebleu
#137 opened Dec 19, 2025 by AhmedHammam-AA Collaborator
Remove leading space in ground truth formatting
#129 opened Dec 10, 2025 by SohirMaskey
3 of 13 tasks
hardcoded date for consistent evals
#99 opened Nov 4, 2025 by GrS-AA Collaborator Draft
13 tasks
Refactor Dataloading
#13 opened Aug 26, 2025 by bastitx Draft
1 of 13 tasks