Agent-as-a-Judge: Evaluate Agents with Agents

BibTeX
@Article{Zhuge2024AgentasaJudgeEA,
 author = {Mingchen Zhuge and Changsheng Zhao and Dylan R. Ashley and Wenyi Wang and Dmitrii Khizbullin and Yunyang Xiong and Zechun Liu and Ernie Chang and Raghuraman Krishnamoorthi and Yuandong Tian and Yangyang Shi and Vikas Chandra and J{\"u}rgen Schmidhuber},
 booktitle = {arXiv.org},
 journal = {ArXiv},
 title = {Agent-as-a-Judge: Evaluate Agents with Agents},
 volume = {abs/2410.10934},
 year = {2024}
}