Unified Multimodal Understanding and Generation Models: Advances, Challenges, and Opportunities

BibTeX

@misc{chng2025unifiedmultimodalunderstanding,
      title={Unified Multimodal Understanding and Generation Models: Advances, Challenges, and Opportunities},
      author={Yong Xien Chng and Qing-Guo Chen and Zhao Xu and Weihua Luo and Kaifu Zhang and Lunhao Duan and Shanshan Zhao and Xinjie Zhang and Jiakui Hu and Minghao Fu and Jintao Guo and Guo-Hua Wang},
      year={2025},
      eprint={2505.02567},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2505.02567},
}

GitHub

Awesome-Unified-Multimodal-Models

https://github.com/AIDC-AI/Awesome-Unified-Multimodal-Models
