Researchers from Carnegie Mellon University and Apple developed a method to enhance Vision-Language Models' chain-of-thought reasoning by distilling 193k detailed rationales from GPT-4o, then applying supervised fine-tuning and Direct Preference Optimization. This approach yielded an average gain of 11.7 points on CoT prediction benchmarks and improved generalization to direct-answer tasks, while also enabling the model to act as a reasoning verifier.
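As a rough illustration of the preference-optimization stage, here is a minimal PyTorch sketch of the standard DPO objective over (chosen, rejected) rationale pairs; the function name and batching conventions are assumptions, not the authors' code.

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """Direct Preference Optimization loss over (chosen, rejected) pairs.

    Each argument is a tensor of summed token log-probabilities for a batch
    of responses under the policy or the frozen reference model.
    """
    # Implicit reward: log-ratio of policy to reference for each response
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)
    # Maximize the margin between chosen and rejected via a logistic loss
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()
```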
This research introduces a method for directly optimizing video large multimodal models (LMMs) for factual consistency, using a reward from a language model that judges responses against detailed video captions. The approach significantly improves factual alignment, achieving an 8.1% average accuracy gain on video question-answering benchmarks while drastically reducing the cost of collecting alignment data.
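The core data-construction idea is to score candidate answers with a language model that sees the detailed caption, then keep the best and worst as a preference pair for DPO. A minimal sketch, assuming a generic scalar-scoring judge callable (all names here are hypothetical, not the paper's API):

```python
def build_preference_pair(question, candidates, detailed_caption, judge):
    """Rank candidate answers by a language-model reward computed against a
    detailed video caption; return a (chosen, rejected) pair for DPO.

    `judge` is any callable returning a factual-consistency score; its exact
    prompt and output format are an assumption, not the paper's setup.
    """
    scored = [(judge(question=question, answer=a, reference=detailed_caption), a)
              for a in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)  # best score first
    chosen, rejected = scored[0][1], scored[-1][1]
    return {"prompt": question, "chosen": chosen, "rejected": rejected}
```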
Researchers from MIT, CMU, UMass Amherst, and the MIT-IBM Watson AI Lab introduced the Scientific Generative Agent (SGA), a bilevel optimization framework that couples large language models with differentiable physical simulations for scientific discovery. The framework quantitatively outperformed baseline methods on tasks such as constitutive-law discovery and molecular design, yielding novel, expert-validated solutions.
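The bilevel structure pairs an outer loop, in which the LLM proposes discrete structure such as a symbolic constitutive law, with an inner loop that fits the law's continuous parameters by gradient descent through a differentiable simulator. A minimal sketch under those assumptions, with `llm_propose` and `build_simulator` as hypothetical callables:

```python
import torch
import torch.nn.functional as F

def inner_optimize(simulate, init_params, observations, steps=200, lr=1e-2):
    """Inner loop: fit the continuous parameters of a proposed law by
    differentiating through the physical simulation."""
    params = init_params.clone().requires_grad_(True)
    opt = torch.optim.Adam([params], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = F.mse_loss(simulate(params), observations)
        loss.backward()
        opt.step()
    final_loss = F.mse_loss(simulate(params), observations).item()
    return params.detach(), final_loss

def outer_loop(llm_propose, build_simulator, observations, iterations=10):
    """Outer loop: the LLM proposes a discrete structure; the fitted
    parameters and loss feed back into the next proposal."""
    history = []
    for _ in range(iterations):
        expr, init_params = llm_propose(history)  # discrete proposal from the LLM
        simulate = build_simulator(expr)          # differentiable simulator for this law
        params, loss = inner_optimize(simulate, init_params, observations)
        history.append((expr, params, loss))
    return min(history, key=lambda h: h[2])       # best (expr, params, loss)
```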
A multi-agent reinforcement learning framework from researchers at AI2, MIT, CMU, and Boston University trains a smaller language model to generate natural-language critiques, which then guide fixed, black-box large language models in refining their outputs. The method consistently improves the fixed model's performance, with gains such as a 27-point absolute increase in exact-match accuracy on a synthetic alphabetization task over a supervised critique-generation baseline.
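At inference time, the trained critic and the fixed model interact in a simple refinement loop: the critic writes a critique, the black-box model revises, and the best-scoring output is kept. A sketch assuming generic callables for the critic, the black-box LLM, and the task metric (all interfaces are illustrative, and the reward used to train the critic, the improvement in the metric after refinement, is not shown):

```python
def refine_with_critique(task_input, blackbox_llm, critic, metric, rounds=2):
    """Refine a fixed black-box LLM's output using natural-language critiques
    from a smaller trained critic model; return the best output seen."""
    output = blackbox_llm(task_input)
    trajectory = [(output, metric(output))]
    for _ in range(rounds):
        critique = critic(task_input, output)                 # small model writes feedback
        output = blackbox_llm(task_input, critique=critique)  # fixed LLM revises its answer
        trajectory.append((output, metric(output)))
    return max(trajectory, key=lambda t: t[1])[0]
```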