The performance of GLM-5 on Vending-Bench 2 is quite surprising. However, I could not find the evaluation data or code in their paper or on their blog. Could you please explain how to evaluate the model on Vending-Bench 2?
Thanks in advance :)