SDAIA-KFUPM Joint Research Center for AI
MeXtract: Light-Weight Metadata Extraction from Scientific Papers
08 Oct 2025

MeXtract is a family of lightweight large language models (LLMs), ranging from 0.5B to 3B parameters, fine-tuned to extract structured metadata from long scientific papers. The models achieve state-of-the-art performance within their size class: the 3B model reaches an F1 score of 73.23% on the MOLE benchmark, and the family shows strong generalization to unseen metadata schemas.
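To make the reported metric concrete, the following minimal sketch shows one way an F1 score over extracted metadata could be computed. It assumes exact matching of normalized (field, value) pairs; the summary does not say how the MOLE benchmark scores predictions, so the function `field_f1`, its normalization, and the sample records are hypothetical illustrations rather than the benchmark's actual scorer.

```python
def field_f1(predicted: dict, gold: dict) -> float:
    """F1 over exact-match (field, value) pairs.

    Illustrative only: an assumed scoring rule, not MOLE's official metric.
    """
    def to_pairs(record: dict) -> set:
        # Normalize values to lowercase strings and skip empty fields.
        return {
            (field, str(value).strip().lower())
            for field, value in record.items()
            if value not in (None, "")
        }

    pred_pairs = to_pairs(predicted)
    gold_pairs = to_pairs(gold)
    true_positives = len(pred_pairs & gold_pairs)
    if true_positives == 0:
        return 0.0

    precision = true_positives / len(pred_pairs)
    recall = true_positives / len(gold_pairs)
    return 2 * precision * recall / (precision + recall)


# Hypothetical records: two of three fields match after normalization.
predicted = {"name": "ExampleCorpus", "year": 2024, "license": "MIT"}
gold = {"name": "examplecorpus", "year": 2024, "license": "Apache-2.0"}
score = field_f1(predicted, gold)
```

Here precision and recall are both 2/3, so the resulting F1 is also 2/3. A real scorer may weight fields differently or allow partial matches.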
