PaddleOCR-VL introduces an ultra-compact 0.9B-parameter vision-language model (VLM) that achieves state-of-the-art multilingual document parsing by decoupling layout analysis from element-level recognition. The model supports 109 languages and scores 92.86 overall on OmniDocBench v1.5, while delivering 53.1% higher page throughput than leading baselines.
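
The decoupling of layout analysis from element-level recognition can be pictured as a two-stage pipeline. The following is a minimal sketch under stated assumptions, not the PaddleOCR-VL implementation or its API: `Region`, `parse_page`, `toy_layout`, and `toy_recognizer` are hypothetical names introduced only for illustration, and the two toy functions stand in for a layout model and a recognition model.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Tuple


@dataclass
class Region:
    """One layout element found on a page (hypothetical structure)."""
    kind: str                          # e.g. "text", "table", "formula"
    bbox: Tuple[int, int, int, int]    # (x0, y0, x1, y1) in page pixels
    order: int                         # reading-order index from the layout stage


def parse_page(
    page: Any,
    detect_layout: Callable[[Any], List[Region]],
    recognize_element: Callable[[Any, Region], str],
) -> str:
    """Stage 1 finds and orders regions; stage 2 recognizes each one separately."""
    regions = sorted(detect_layout(page), key=lambda r: r.order)
    return "\n\n".join(recognize_element(page, r) for r in regions)


# Toy stand-ins (assumptions): a real system would call a layout model
# and a recognition model here.
def toy_layout(page: Any) -> List[Region]:
    return [
        Region("table", (0, 50, 100, 90), order=1),
        Region("text", (0, 0, 100, 40), order=0),
    ]


def toy_recognizer(page: Any, region: Region) -> str:
    return f"[{region.kind} @ {region.bbox}]"


result = parse_page(None, toy_layout, toy_recognizer)
```

Because the recognizer in this sketch only ever sees one region at a time, the two stages can be swapped or scaled independently, which is one plausible reason a decoupled design can stay compact.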