Commit e1f40f1

Update README.md
1 parent 06881bf commit e1f40f1

1 file changed

Lines changed: 14 additions & 1 deletion

README.md

@@ -23,7 +23,7 @@ git clone [https://huggingface.co/datasets/opendatalab-raiser/Envision](https://

 ## 📐 Evaluation

-The evaluation of generated sequential images is managed by the `eval.py` script, which automates the quality assessment using a commercial LLM (e.g., OpenAI models) as the judge. The scoring adheres to a strict hierarchical protocol.
+The evaluation of generated sequential images is managed by the `eval.py` script, which automates the quality assessment using a commercial VLM (e.g., OpenAI models) as the judge. The scoring adheres to a strict hierarchical protocol.

 ### 1\. Evaluation Dimensions and Weights

@@ -91,6 +91,19 @@ For the latest official results and model rankings on the Envision benchmark, pl

 **[https://opendatalab-raiser.github.io/Envision/](https://opendatalab-raiser.github.io/Envision/)**

+-----
+## 🌐 Community Contribution
+
+We strongly encourage the research community to expand and enhance the Envision benchmark. We welcome contributions in the form of new model results, additional evaluation metrics, or new causal process categories to further challenge the capabilities of unified multimodal models.
+
+How to Contribute:
+
+Submit New Results: If you have evaluated a novel model on the Envision benchmark using the provided eval.py script, please submit your quantitative results to us. We will periodically update the official leaderboard to reflect the state-of-the-art.
+
+Code and Data Extensions: We welcome pull requests for any improvements to the evaluation script, bug fixes, or the inclusion of supplementary causal event data to diversify the benchmark's coverage.
+
+By collaborating, we can ensure the Envision benchmark remains a robust and evolving resource for measuring true world knowledge internalization and dynamic process modeling in multimodal generation.
+
 -----

 ## ✍️ Citation
