CASIA MAIS-NLPR
Towards Generalist Robot Policies: What Matters in Building Vision-Language-Action Models

An extensive empirical study identifies key design choices for Vision-Language-Action (VLA) models, leading to a proposed RoboVLMs framework that achieves state-of-the-art performance on robotic manipulation benchmarks and demonstrates robust generalization in real-world tasks. The research provides actionable insights into VLM backbones, action formulations, and the effective use of large-scale robotic datasets.

AI-Driven Virtual Teacher for Enhanced Educational Efficiency: Leveraging Large Pretrain Models for Autonomous Error Analysis and Correction
Students frequently make mistakes while solving mathematical problems, and traditional error correction methods are both time-consuming and labor-intensive. This paper introduces an innovative Virtual AI Teacher system designed to autonomously analyze and correct student Errors (VATE). Leveraging advanced large language models (LLMs), the system uses student drafts as a primary source for error analysis, which enhances understanding of the student's learning process. It incorporates sophisticated prompt engineering and maintains an error pool to reduce computational overhead. The AI-driven system also features a real-time dialogue component for efficient student interaction. Our approach demonstrates significant advantages over traditional and machine learning-based error correction methods, including reduced educational costs, high scalability, and superior generalizability. The system has been deployed on the Squirrel AI learning platform for elementary mathematics education, where it achieves 78.3% accuracy in error analysis and shows a marked improvement in student learning efficiency. Satisfaction surveys indicate a strong positive reception, highlighting the system's potential to transform educational practices.