Codex128187/ahoum-conversation-eval

About

Production-ready benchmark that scores conversations on 300+ facets (scalable to 5000+) using open-weights LLMs of ≤16B parameters, with per-score confidence, 4 backends, a FastAPI + Streamlit UI, and Docker support.
