aims-foundations/torch_measure

torch_measure

License: MIT | Python 3.10+ | PyTorch | Discord

PyTorch-native toolkit for predictive evaluation of AI systems.

Benchmark scores increasingly gate deployment decisions but rarely predict how a model will behave in production. torch_measure treats evaluation itself as a predictive modeling problem: latent-variable models infer a system's capability directly from sparse benchmark observations and predict its performance on unseen tasks. It is built on PyTorch and provides GPU-accelerated item response theory (IRT), factor models, amortized inference, adaptive testing, and tabular baselines.
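To make the idea concrete, here is a minimal sketch of the simplest latent-variable model of this kind, a one-parameter (Rasch) IRT model, fitted to a sparse set of pass/fail observations and then used to predict a model/item pair that was never observed. It is written in plain Python on purpose: this README does not show torch_measure's API, so nothing below should be read as the library's interface. The names `fit_rasch` and `predict`, the toy data, and the hyperparameters are all illustrative assumptions.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def fit_rasch(observations, n_models, n_items, lr=0.1, steps=500, l2=0.1):
    """Fit a Rasch (1PL) model by gradient ascent on the penalized log-likelihood.

    observations: list of (model_index, item_index, outcome) with outcome in {0, 1}.
    Returns (theta, b): per-model abilities and per-item difficulties.
    Hypothetical helper for illustration; not part of torch_measure.
    """
    theta = [0.0] * n_models  # latent ability of each evaluated system
    b = [0.0] * n_items       # latent difficulty of each benchmark item
    for _ in range(steps):
        # The L2 penalty keeps estimates finite when a row is all-pass or all-fail.
        g_theta = [-l2 * t for t in theta]
        g_b = [-l2 * d for d in b]
        for i, j, y in observations:
            resid = y - sigmoid(theta[i] - b[j])  # observed minus predicted
            g_theta[i] += resid
            g_b[j] -= resid
        theta = [t + lr * g for t, g in zip(theta, g_theta)]
        b = [d + lr * g for d, g in zip(b, g_b)]
    return theta, b

def predict(theta, b, i, j):
    """Predicted probability that model i answers item j correctly."""
    return sigmoid(theta[i] - b[j])

# Toy, made-up data: 3 models x 4 items, with one cell per model left unobserved.
observations = [
    (0, 0, 1), (0, 1, 1), (0, 3, 1),  # model 0: item 2 unseen
    (1, 0, 1), (1, 1, 1), (1, 2, 0),  # model 1: item 3 unseen
    (2, 1, 0), (2, 2, 0), (2, 3, 0),  # model 2: item 0 unseen
]

theta, b = fit_rasch(observations, n_models=3, n_items=4)
p_unseen = predict(theta, b, 0, 2)  # model 0 on the item it was never run on
print("abilities:", [round(t, 2) for t in theta])
print("difficulties:", [round(d, 2) for d in b])
print("P(model 0 solves item 2):", round(p_unseen, 2))
```

Because every model shares the same item difficulties, the fit can score a cell that was never measured. The richer models listed above follow the same pattern with more expressive latent structure.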

Installation

With pip:

pip install torch_measure

With uv (faster; drop-in replacement for pip):

uv pip install torch_measure        # into the active environment
uv add torch_measure                # into a uv-managed project
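After installing, a quick way to confirm that the distribution landed in the active environment is to query its metadata from Python. The sketch below uses only the standard library and looks up the distribution name used in the install commands above. It deliberately does not import the package, since the import name is not stated here. `installed_version` is a hypothetical helper.

```python
from importlib.metadata import PackageNotFoundError, version

def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None

v = installed_version("torch_measure")
if v is None:
    print("torch_measure is not installed in this environment")
else:
    print("torch_measure", v)
```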

Contributing

We welcome contributions! Please see our contributing guidelines for details, or drop by our Discord to chat.

About

A package for AI measurement science
