SRPO: Enhancing Multimodal LLM Reasoning via Reflection-Aware Reinforcement Learning
BibTeX
@misc{zhang2025srpoenhancingmultimodal,
  title={SRPO: Enhancing Multimodal LLM Reasoning via Reflection-Aware Reinforcement Learning},
  author={Yu Zhang and Yangfan He and Yifan Jiang and Che Liu and Jing Xiong and Mi Zhang and Zhongwei Wan and Shen Yan and Yi Xin and Hui Shen and Zhihao Dou and Dongfei Cui and Qinjian Zhao},
  year={2025},
  eprint={2506.01713},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2506.01713},
}