Jina AI introduced JINA-VLM, a 2.4-billion-parameter vision-language model that sets a new benchmark for multilingual visual question answering among open models of similar size. The model also performs strongly on general English VQA tasks and uses an attention-pooling connector that reduces the number of visual tokens by 4x, improving efficiency.
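The summary does not detail how the connector works; a minimal sketch of one common way to achieve a 4x token reduction, pooling each window of four visual tokens into one via attention against a learned query, is shown below. All function names, the windowing scheme, and the single-head formulation are illustrative assumptions, not the model's actual design.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention_pool(tokens, query, w_k, w_v, window=4):
    """Hypothetical attention-pooling connector: each group of
    `window` visual tokens is compressed into one output token by
    attending a learned query over the group's keys/values."""
    n, d = tokens.shape
    assert n % window == 0, "token count must divide evenly into windows"
    groups = tokens.reshape(n // window, window, d)
    k = groups @ w_k                       # keys:   (n/window, window, d)
    v = groups @ w_v                       # values: (n/window, window, d)
    scores = (k @ query) / np.sqrt(d)      # (n/window, window)
    weights = softmax(scores, axis=-1)     # attention over each window
    return np.einsum('gw,gwd->gd', weights, v)  # (n/window, d)

# Example: 256 visual tokens of dim 64 pooled down to 64 tokens.
rng = np.random.default_rng(0)
d = 64
tokens = rng.standard_normal((256, d))
query = rng.standard_normal(d)
w_k = rng.standard_normal((d, d)) / np.sqrt(d)
w_v = rng.standard_normal((d, d)) / np.sqrt(d)
pooled = attention_pool(tokens, query, w_k, w_v)
print(pooled.shape)  # (64, 64)
```

With `window=4`, the connector emits one token per four inputs, which is the 4x reduction the summary describes; the learned query lets the pooling weight informative patches more heavily than a plain average would.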