Direct Preference Optimization of Video Large Multimodal Models from Language Model Reward

BibTeX

@misc{zhang2024directpreferenceoptimization,
      title={Direct Preference Optimization of Video Large Multimodal Models from Language Model Reward},
      author={Ruohong Zhang and Liangke Gui and Zhiqing Sun and Yihao Feng and Keyang Xu and Yuanhan Zhang and Di Fu and Chunyuan Li and Alexander Hauptmann and Yonatan Bisk and Yiming Yang},
      year={2024},
      eprint={2404.01258},
      archivePrefix={arXiv},
      primaryClass={cs.AI},
      url={https://arxiv.org/abs/2404.01258},
}
