Hi, thanks for sharing this interesting work!
While reading the paper, I noticed that the Agent-as-a-Judge pipeline described in Section 6.2.1 appears quite similar to the Agentic Reward Framework we proposed in:
Code Aesthetics with Agentic Reward Feedback (https://arxiv.org/abs/2510.23272)
In particular, both approaches use one or more agents backed by a multimodal LM to judge outputs and provide comprehensive reward feedback. When evaluating webpages, the Agent-as-a-Judge pipeline takes screenshots to perform static analyses and executes actions (such as clicking) to verify functionality. In our work, a Static Aesthetics Agent evaluates the webpage from a static screenshot, and an Interactive Aesthetics Agent evaluates its functionality by clicking elements on the page or scrolling.
I was wondering whether you were aware of this work, and if so, it might be helpful to include a citation for completeness.
Thanks again for your work—looking forward to your thoughts!