How to evaluate a model on Vending-Bench 2 #70

@Richar-Du

The performance of GLM-5 on Vending-Bench 2 is quite surprising. However, I could not find the evaluation data or code in their paper or on their blog. Could you please explain how to evaluate the model on Vending-Bench 2?

Thanks in advance :)
