alphaXiv

History

Papers Benchmarks

Jožef Stefan Institute

405

05 Oct 2025

agentic-frameworks agents computer-science

Optimas: Optimizing Compound AI Systems with Globally Aligned Local Rewards

Stanford University Jožef Stefan Institute

Rutgers University Amazon

OPTIMAS is a framework for optimizing compound AI systems, utilizing globally aligned local reward functions (LRFs) to enable efficient, decentralized optimization of heterogeneous components. It achieved an average relative improvement of 11.92% over strong baselines across five real-world benchmarks, while also demonstrating high data efficiency and interpretability of its LRFs.

179

16 Jun 2025

agent-based-systems ai-for-health computer-science

A Survey on Imitation Learning for Contact-Rich Tasks in Robotics

Google DeepMind Jožef Stefan Institute

University of Tokyo Università di Genova Istituto Italiano di Tecnologia Saitama University

A survey systematically reviews imitation learning (IL) research for contact-rich robotic tasks, detailing demonstration collection, learning algorithms, and real-world applications. It highlights the growing role of multimodal data and foundation models in advancing robotic capabilities for complex physical interactions, while also identifying key challenges and future directions in the field.

158

08 Oct 2024

computer-science computation-and-language machine-learning

Counterfactual Causal Inference in Natural Language with Large Language Models

Jožef Stefan Institute University of Auckland

Researchers from the University of Auckland and the Jožef Stefan Institute developed Counterfactual-CI, an end-to-end method that leverages Large Language Models to extract causal graphs and perform counterfactual causal inference directly from unstructured natural language. The work reveals that while LLMs excel at discovering causal relationships in text, their capacity for accurate counterfactual reasoning is limited by their ability to perform logical predictions on given causal inputs, even when the correct causal structure is provided.

26 May 2025

agent-based-systems autonomous-vehicles computer-science

Automated Scientific Discovery: From Equation Discovery to Autonomous Discovery Systems

University of Cambridge Jožef Stefan Institute Chalmers University Johannes Gutenberg University

A comprehensive survey on automated scientific discovery critically analyzes its history, current state, and future, consolidating efforts from equation discovery to autonomous systems. It identifies a key research gap in integrating interpretable knowledge generation with autonomous experimentation, proposing a path towards AI scientists capable of producing human-interpretable scientific knowledge.

191

14 Sep 2024

computer-science computation-and-language

ClarQ-LLM: A Benchmark for Models Clarifying and Requesting Information in Task-Oriented Dialog

Jožef Stefan Institute Guangxi Normal University

Queen Mary University of London University of Utrecht

ClarQ-LLM presents a comprehensive benchmark designed to assess large language models' capacity for asking effective clarification questions in complex task-oriented dialogues. Experiments using this benchmark indicate that even state-of-the-art LLMs achieve success rates of only around 50-60%, significantly trailing human performance of 80-85%, and often struggle with uncertainty analysis, context retention, and conciseness.

24 Jun 2025

ai-for-health computer-science artificial-intelligence

Multimodal Machine Learning in Mental Health: A Survey of Data, Algorithms, and Challenges

Jožef Stefan Institute

Queen Mary University of London

Multimodal machine learning (MML) is rapidly reshaping the way mental-health disorders are detected, characterized, and longitudinally monitored. Whereas early studies relied on isolated data streams -- such as speech, text, or wearable signals -- recent research has converged on architectures that integrate heterogeneous modalities to capture the rich, complex signatures of psychiatric conditions. This survey provides the first comprehensive, clinically grounded synthesis of MML for mental health. We (i) catalog 26 public datasets spanning audio, visual, physiological signals, and text modalities; (ii) systematically compare transformer, graph, and hybrid-based fusion strategies across 28 models, highlighting trends in representation learning and cross-modal alignment. Beyond summarizing current capabilities, we interrogate open challenges: data governance and privacy, demographic and intersectional fairness, evaluation explainability, and the complexity of mental health disorders in multimodal settings. By bridging methodological innovation with psychiatric utility, this survey aims to orient both ML researchers and mental-health practitioners toward the next generation of trustworthy, multimodal decision-support systems.

14 Jun 2025

computer-science artificial-intelligence computation-and-language

Recent Advances and Future Directions in Literature-Based Discovery

University of Ljubljana Jožef Stefan Institute Temida d.o.o

A survey details how Literature-Based Discovery (LBD) has advanced between 2000 and 2025, particularly through the integration of artificial intelligence technologies such as knowledge graphs, deep learning, and large language models, while also outlining remaining challenges and future research directions.

444

17 Feb 2025

computer-science artificial-intelligence machine-learning

LLM Embeddings for Deep Learning on Tabular Data

University of Cambridge Jožef Stefan Institute

Bosko Koloski

Andrei Margeloiu

This paper introduces an approach that leverages large language models (LLMs) as frozen feature encoders to improve deep learning performance on tabular data. The method converts tabular feature-value pairs into natural language sentences, generates embeddings with an LLM, and then feeds these to a shallow neural network and a downstream predictive model, achieving improved accuracy across various datasets and narrowing the performance gap with tree-based methods.

04 Sep 2025

high-energy-physics-phenomenology physics

UV origins of CP-violating leptonic Yukawa couplings

University of Ljubljana Jožef Stefan Institute

University of Basel

Probing the CP nature of the Higgs Yukawa couplings is a promising avenue for unraveling new physics effects. In this work, we investigate the possible single- and two-field UV origins of CP-violating leptonic Yukawa couplings, using the Standard Model Effective Field Theory as a stepping stone. We demonstrate a rich set of constraints on the UV model parameters, including direct Higgs measurements, electric and magnetic dipole moments of leptons, charged lepton flavor violating observables, and electroweak precision tests. Studying representative flavor assumptions we find that the precision constraints are often, but not always, more constraining than the dedicated LHC analyses of modified leptonic Yukawa couplings.

117

22 May 2025

ai-for-health chain-of-thought computer-science

Advancing the Scientific Method with Large Language Models: From Hypothesis to Discovery

Harvard University

University of Chicago

Stanford University The University of Edinburgh Jožef Stefan Institute

NVIDIA Oxford Immune Algorithmics

University of Southampton The Alan Turing Institute Santa Fe Institute

King’s College London King Abdullah University of Science and Technology Karolinska Institutet Zuse Institute Berlin Francis Crick Institute Allen Discovery Center at Tufts University Network Rail

A comprehensive review analyzes how Large Language Models can function as 'creative engines' in scientific discovery, augmenting each stage of the scientific method from hypothesis generation to experimentation, highlighting their potential beyond traditional research assistance.

364

29 Jun 2024

computer-science computation-and-language information-extraction

Universal NER: A Gold-Standard Multilingual Named Entity Recognition Benchmark

University of Washington Allen Institute for Artificial Intelligence

Georgia Institute of Technology LMU Munich IT University of Copenhagen Jožef Stefan Institute Beijing Academy of Artificial Intelligence

Sorbonne Université INRIA Paris University of Bath National University Ben Gurion University Comenius University in Bratislava Cisco Duolingo

Yuval Pinter

Marek Suppa

We introduce Universal NER (UNER), an open, community-driven project to develop gold-standard NER benchmarks in many languages. The overarching goal of UNER is to provide high-quality, cross-lingually consistent annotations to facilitate and standardize multilingual NER research. UNER v1 contains 18 datasets annotated with named entities in a cross-lingual consistent schema across 12 diverse languages. In this paper, we detail the dataset creation and composition of UNER; we also provide initial modeling baselines on both in-language and cross-lingual learning settings. We release the data, code, and fitted models to the public.

11 Jun 2025

computer-science conversational-ai artificial-intelligence

From Symbolic to Neural and Back: Exploring Knowledge Graph-Large Language Model Synergies

Jožef Stefan Institute Jožef Stefan International Postgraduate School

This survey paper systematically categorizes and analyzes the synergy between Knowledge Graphs (KGs) and Large Language Models (LLMs), identifying critical challenges in scalability, computational efficiency, and data quality that differentiate it from prior reviews. It provides a structured framework for understanding current integration approaches and outlines key open problems for future research in creating more reliable and interpretable AI systems.

12 Sep 2025

computer-science sound audio-and-speech-processing

An Adaptive CMSA for Solving the Longest Filled Common Subsequence Problem with an Application in Audio Querying

Jožef Stefan Institute University of Belgrade TU Wien University of Banja Luka Artificial Intelligence Research Institute (IIIA-CSIC)University of Nova Gorica

This paper addresses the Longest Filled Common Subsequence (LFCS) problem, a challenging NP-hard problem with applications in bioinformatics, including gene mutation prediction and genomic data reconstruction. Existing approaches, including exact, metaheuristic, and approximation algorithms, have primarily been evaluated on small-sized instances, which offer limited insights into their scalability. In this work, we introduce a new benchmark dataset with significantly larger instances and demonstrate that existing datasets lack the discriminative power needed to meaningfully assess algorithm performance at scale. To solve large instances efficiently, we utilize an adaptive Construct, Merge, Solve, Adapt (CMSA) framework that iteratively generates promising subproblems via component-based construction and refines them using feedback from prior iterations. Subproblems are solved using an external black-box solver. Extensive experiments on both standard and newly introduced benchmarks show that the proposed adaptive CMSA achieves state-of-the-art performance, outperforming five leading methods. Notably, on 1,510 problem instances with known optimal solutions, our approach solves 1,486 of them -- achieving over 99.9% optimal solution quality and demonstrating exceptional scalability. We additionally propose a novel application of LFCS for song identification from degraded audio excerpts as an engineering contribution, using real-world energy-profile instances from popular music. Finally, we conducted an empirical explainability analysis to identify critical feature combinations influencing algorithm performance, i.e., the key problem features contributing to success or failure of the approaches across different instance types are revealed.

01 Oct 2025

computer-science artificial-intelligence machine-learning

Fiaingen: A financial time series generative method matching real-world data quality

Jožef Stefan Institute University of Innsbruck University of Twente SINTEF AS Peracton Ltd.

Data is vital in enabling machine learning models to advance research and practical applications in finance, where accurate and robust models are essential for investment and trading decision-making. However, real-world data is limited despite its quantity, quality, and variety. The data shortage of various financial assets directly hinders the performance of machine learning models designed to trade and invest in these assets. Generative methods can mitigate this shortage. In this paper, we introduce a set of novel techniques for time series data generation (we name them Fiaingen) and assess their performance across three criteria: (a) overlap of real-world and synthetic data on a reduced dimensionality space, (b) performance on downstream machine learning tasks, and (c) runtime performance. Our experiments demonstrate that the methods achieve state-of-the-art performance across the three criteria listed above. Synthetic data generated with Fiaingen methods more closely mirrors the original time series data while keeping data generation time close to seconds - ensuring the scalability of the proposed approach. Furthermore, models trained on it achieve performance close to those trained with real-world data.

01 Dec 2025

cosmology-and-nongalactic-astrophysics high-energy-physics-phenomenology physics

Linearly Polarized Gravitational Waves from Bubble Collisions

University of Ljubljana Jožef Stefan Institute

Katarina Trailović analytically demonstrated that the collision of two vacuum bubbles during an early Universe first-order phase transition yields a linearly polarized gravitational wave signal. The study also characterized the conditions under which such two-bubble transitions can occur and found the resulting signals to be detectable by future observatories like LISA and the Einstein Telescope.

16 May 2023

quantum-gases statistical-mechanics strongly-correlated-electrons

Quantum state complexity meets many-body scars

University College London Jožef Stefan Institute Indian Institute of Technology Gandhinagar Okinawa Institute of Science and Technology

Scar eigenstates in a many-body system refers to a small subset of non-thermal finite energy density eigenstates embedded into an otherwise thermal spectrum. This novel non-thermal behaviour has been seen in recent experiments simulating a one-dimensional PXP model with a kinetically-constrained local Hilbert space realized by a chain of Rydberg atoms. We probe these small sets of special eigenstates starting from particular initial states by computing the spread complexity associated to time evolution of the PXP hamiltonian. Since the scar subspace in this model is embedded only loosely, the scar states form a weakly broken representation of the Lie Algebra. We demonstrate why a careful usage of the Forward Scattering Approximation (or similar strategies thereof) is required to extract an appropriate set of Lanczos coefficients in this case as the consequence of this approximate symmetry. This leads to a well defined notion of a closed Krylov subspace and consequently, that of spread complexity. We show how the spread complexity shows approximate revivals starting from both

|\mathbb{Z}_2\rangle

and

|\mathbb{Z}_3\rangle

states and how these revivals can be made more accurate by adding optimal perturbations to the bare Hamiltonian. We also investigate the case of the vacuum as the initial state, where revivals can be stabilized using an iterative process of adding few-body terms.

11 Sep 2021

computer-science computation-and-language information-extraction

Natural SQL: Making SQL Easier to Infer from Natural Language Specifications

UC Berkeley Jožef Stefan Institute

Queen Mary University of London University of Leicester Guangxi University of Finance and Economics

Addressing the mismatch between natural language descriptions and the corresponding SQL queries is a key challenge for text-to-SQL translation. To bridge this gap, we propose an SQL intermediate representation (IR) called Natural SQL (NatSQL). Specifically, NatSQL preserves the core functionalities of SQL, while it simplifies the queries as follows: (1) dispensing with operators and keywords such as GROUP BY, HAVING, FROM, JOIN ON, which are usually hard to find counterparts for in the text descriptions; (2) removing the need for nested subqueries and set operators; and (3) making schema linking easier by reducing the required number of schema items. On Spider, a challenging text-to-SQL benchmark that contains complex and nested SQL queries, we demonstrate that NatSQL outperforms other IRs, and significantly improves the performance of several previous SOTA models. Furthermore, for existing models that do not support executable SQL generation, NatSQL easily enables them to generate executable SQL queries, and achieves the new state-of-the-art execution accuracy.

03 Mar 2025

cosmology-and-nongalactic-astrophysics general-relativity-and-quantum-cosmology high-energy-physics-phenomenology

Schwinger Current in de Sitter Space

Jožef Stefan Institute Universidad de Granada Institute for Fundamental Physics of the Universe (IFPU)Universidad de Coimbra

Mar Bastero-Gil and colleagues resolved a theoretical inconsistency regarding the Schwinger current in de Sitter space by implementing a revised renormalization approach that accounts for a necessary tachyonic photon mass. Their work produced a physically consistent, finite, and positive Schwinger current for various charged particles, even in the massless limit.

29 Aug 2025

high-energy-physics-phenomenology physics

Beyond Neutrino Mass: Observable $n$ - $\overline{n}$ Oscillations in UV Complete Seesaw Models

Jožef Stefan Institute University of Split

Next-generation experiments, such as the Deep Underground Neutrino Experiment and the European Spallation Source, are set to dramatically improve sensitivity to neutron-antineutron oscillation that is a direct probe of

\Delta B = 2

baryon number violation. The discovery of such a rare process would indicate physics beyond the Standard Model and could point to specific unified theories that allow observable

n-\overline{n}

transitions. We accordingly examine

n-\overline{n}

oscillations within a unified framework that accounts for charged fermion masses and generates viable neutrino masses via the seesaw mechanism. More specifically, we show that

n-\overline{n}

oscillations can arise from two specific topologies within two distinct

SU(5)

scenarios. One topology requires a presence of two color-sextet scalars in the Type II seesaw framework, whereas the other involves a scalar sextet and a color-octet fermion in the Type III seesaw framework. While the former topology can be realized in the

SO(10)

/Pati-Salam frameworks, the latter finds a natural embedding in

SU(5)

, which constitutes one of the key novelties of our work. Remarkably enough, the same dynamics responsible for fermion masses also induces baryon number violation, thus linking

n-\overline{n}

oscillations to the flavor structure of the theory. We show that upcoming searches of such

\Delta B = 2

processes can probe new colored states with masses up to

10^{11}

GeV that is well beyond the reach of colliders. This positions

n-\overline{n}

oscillations as a rare low-energy portal to grand unification and ultra-heavy new physics.

10 Feb 2025

disordered-systems-and-neural-networks quantum-gases statistical-mechanics

Many-Body Localization in the Age of Classical Computing

University of Ljubljana Jožef Stefan Institute ICREA Uniwersytet Jagielloński Abdus Salam International Centre of Theoretical Physics ICFO Institut de Ciencies Fotoniques

Statistical mechanics provides a framework for describing the physics of large, complex many-body systems using only a few macroscopic parameters to determine the state of the system. For isolated quantum many-body systems, such a description is achieved via the eigenstate thermalization hypothesis (ETH), which links thermalization, ergodicity and quantum chaotic behavior. However, tendency towards thermalization is not observed at finite system sizes and evolution times in a robust many-body localization (MBL) regime found numerically and experimentally in the dynamics of interacting many-body systems at strong disorder. Although the phenomenology of the MBL regime is well-established, the central question remains unanswered: under what conditions does the MBL regime give rise to an MBL phase in which the thermalization does not occur even in the asymptotic limit of infinite system size and evolution time? This review focuses on recent numerical investigations aiming to clarify the status of the MBL phase, and it establishes the critical open questions about the dynamics of disordered many-body systems. Persistent finite size drifts towards ergodicity consistently emerge in spectral properties of disordered many-body systems, excluding naive single-parameter scaling hypothesis and preventing comprehension of the status of the MBL phase. The drifts are related to tendencies towards thermalization and non-vanishing transport observed in the dynamics of many-body systems, even at strong disorder. These phenomena impede understanding of microscopic processes at the ETH-MBL crossover. Nevertheless, the abrupt slowdown of dynamics with increasing disorder strength suggests the proximity of the MBL phase. This review concludes that the questions about thermalization and its failure in disordered many-body systems remain a captivating area open for further explorations.

There are no more papers matching your filters at the moment.

Events

Personalize Your Feed

Install Browser Extension

We're hiring

alphaXiv

Explore

State of the Art

Sign In

Labs

Feedback

Dark mode

Optimas: Optimizing Compound AI Systems with Globally Aligned Local Rewards

A Survey on Imitation Learning for Contact-Rich Tasks in Robotics

Counterfactual Causal Inference in Natural Language with Large Language Models

Automated Scientific Discovery: From Equation Discovery to Autonomous Discovery Systems

ClarQ-LLM: A Benchmark for Models Clarifying and Requesting Information in Task-Oriented Dialog

Multimodal Machine Learning in Mental Health: A Survey of Data, Algorithms, and Challenges

Recent Advances and Future Directions in Literature-Based Discovery

LLM Embeddings for Deep Learning on Tabular Data

UV origins of CP-violating leptonic Yukawa couplings

Advancing the Scientific Method with Large Language Models: From Hypothesis to Discovery

Universal NER: A Gold-Standard Multilingual Named Entity Recognition Benchmark

From Symbolic to Neural and Back: Exploring Knowledge Graph-Large Language Model Synergies

An Adaptive CMSA for Solving the Longest Filled Common Subsequence Problem with an Application in Audio Querying

Fiaingen: A financial time series generative method matching real-world data quality

Linearly Polarized Gravitational Waves from Bubble Collisions

Quantum state complexity meets many-body scars

Natural SQL: Making SQL Easier to Infer from Natural Language Specifications

Schwinger Current in de Sitter Space

Beyond Neutrino Mass: Observable $n$ - $\overline{n}$ Oscillations in UV Complete Seesaw Models

Many-Body Localization in the Age of Classical Computing

Events

AI for Law

Personalize Your Feed

alphaXiv

Explore

State of the Art

Sign In

Labs

Feedback

Dark mode

Optimas: Optimizing Compound AI Systems with Globally Aligned Local Rewards

A Survey on Imitation Learning for Contact-Rich Tasks in Robotics

Counterfactual Causal Inference in Natural Language with Large Language Models

Automated Scientific Discovery: From Equation Discovery to Autonomous Discovery Systems

ClarQ-LLM: A Benchmark for Models Clarifying and Requesting Information in Task-Oriented Dialog

Multimodal Machine Learning in Mental Health: A Survey of Data, Algorithms, and Challenges

Recent Advances and Future Directions in Literature-Based Discovery

LLM Embeddings for Deep Learning on Tabular Data

UV origins of CP-violating leptonic Yukawa couplings

Advancing the Scientific Method with Large Language Models: From Hypothesis to Discovery

Universal NER: A Gold-Standard Multilingual Named Entity Recognition Benchmark

From Symbolic to Neural and Back: Exploring Knowledge Graph-Large Language Model Synergies

An Adaptive CMSA for Solving the Longest Filled Common Subsequence Problem with an Application in Audio Querying

Fiaingen: A financial time series generative method matching real-world data quality

Linearly Polarized Gravitational Waves from Bubble Collisions

Quantum state complexity meets many-body scars

Natural SQL: Making SQL Easier to Infer from Natural Language Specifications

Schwinger Current in de Sitter Space

Beyond Neutrino Mass: Observable nnn-n‾\overline{n}n Oscillations in UV Complete Seesaw Models

Many-Body Localization in the Age of Classical Computing

Events

AI for Law

Personalize Your Feed

Beyond Neutrino Mass: Observable $n$ - $\overline{n}$ Oscillations in UV Complete Seesaw Models