
UTD

Dr.V: A Hierarchical Perception-Temporal-Cognition Framework to Diagnose Video Hallucination by Fine-grained Spatial-Temporal Grounding

15 Sep 2025

NUS UCSB

The Dr.V framework introduces a hierarchical taxonomy and a benchmark with fine-grained spatial-temporal grounding for diagnosing video hallucinations in large video models (LVMs). Its Dr.V-Agent, a training-free diagnostic system, identifies and mitigates LVM errors. For instance, it increases VideoChat2's accuracy by 18.60%, compared with a human-level performance of 95.25%.

#agents #computer-science #computer-vision-and-pattern-recognition
