OpenLocus

DiscoveryBench: Towards Data-Driven Discovery with Large Language Models

01 Jul 2024

Allen Institute for AI, University of Massachusetts Amherst

Researchers from AI2, OpenLocus, and UMass Amherst introduce DISCOVERYBENCH, a new benchmark designed to evaluate large language models' ability to perform multi-step data-driven scientific discovery. The benchmark, comprising 264 real-world tasks and 903 synthetic tasks, reveals that current state-of-the-art LLMs achieve a maximum Hypothesis Matching Score of 25%, indicating significant limitations in autonomous discovery.

#computer-science #artificial-intelligence #computation-and-language

Data-driven Discovery with Large Generative Models

21 Feb 2024

University of Utah, Allen Institute for AI

This position paper explores the use of Large Generative Models (LGMs) for end-to-end data-driven scientific discovery, proposing a hybrid system that combines LGM capabilities with robust external tools and active human feedback. The authors' proof-of-concept, DATAVOYAGER, demonstrates the potential for automated hypothesis generation and verification from existing datasets while also highlighting the need for human oversight and tool integration to mitigate LGM limitations.

#computer-science #artificial-intelligence #computation-and-language
