A computational tool, AcrosticSleuth, identifies and probabilistically ranks acrostics in multilingual corpora, achieving F1 scores up to 0.66 on known Russian acrostics. This tool, developed by researchers from Tufts, UW-Madison, UT Austin, and Harvard, also led to the discovery of previously unrecognized acrostics in significant historical texts, including one in Thomas Hobbes' *The Elements of Law*.
We explore the physical properties of five massive quiescent galaxies at $z\sim2.5$, revealing the presence of non-negligible dust reservoirs. JWST NIRSpec observations were obtained for each target, finding no significant line emission; multiple star formation tracers independently place upper limits between $0.1$–$10~M_\odot/\mathrm{yr}$. Spectral energy distribution modeling with Prospector infers stellar masses between $\log_{10}[M/M_\odot]\sim 10$–$11$ and stellar mass-weighted ages between 1–2 Gyr. The inferred mass-weighted effective radii ($r_{\rm eff}\sim 0.4$–$1.4$ kpc) and inner 1 kpc stellar surface densities ($\log_{10}[\Sigma/M_\odot\,\mathrm{kpc}^{-2}]\gtrsim 9$) are typical of quiescent galaxies at $z \gtrsim 2$. The galaxies display negative color gradients (redder core and bluer outskirts); for one galaxy, this effect results from a dusty core, while for the others it may be evidence of an "inside-out" growth process. Unlike local quiescent galaxies, we identify significant reddening in these typical cosmic noon passive galaxies; all but one require $A_V \gtrsim 0.4$. This finding is in qualitative agreement with previous studies, but our deep 20-band NIRCam imaging is able to significantly suppress the dust-age degeneracy and confidently determine that these galaxies are reddened. We speculate about the physical effects that may drive the decline in dust content in quiescent galaxies over cosmic time.
MMAU-Pro introduces a comprehensive benchmark of 5,305 expert-annotated instances designed to holistically evaluate audio general intelligence in AI models across complex, real-world scenarios. The benchmark reveals that even state-of-the-art models exhibit substantial limitations in multi-audio reasoning, spatial understanding, and long-form audio comprehension, indicating significant room for improvement.
Jansma and Hoel extend the Causal Emergence 2.0 framework to analyze all possible micro-to-macro paths within a system, identifying causally relevant emergent hierarchies and introducing new measures of complexity. Their work demonstrates that emergent properties can be engineered with precision, enabling the design of systems with specific emergent causal structures.
Complex systems can be described at myriad different scales, and their causal workings often have multiscale structure (e.g., a computer can be described at the microscale of its hardware circuitry, the mesoscale of its machine code, and the macroscale of its operating system). While scientists study and model systems across the full hierarchy of their scales, from microphysics to macroeconomics, there is debate about what the macroscales of systems can possibly add beyond mere compression. To resolve this longstanding issue, here a new theory of emergence is introduced wherein the different scales of a system are treated like slices of a higher-dimensional object. The theory can distinguish which of these scales possess unique causal contributions, and which are not causally relevant. Constructed from an axiomatic notion of causation, the theory's application is demonstrated in coarse-grains of Markov chains. It identifies all cases of macroscale causation: instances where reduction to a microscale is possible, yet lossy about causation. Furthermore, the theory posits a causal apportioning schema that calculates the causal contribution of each scale, showing what each uniquely adds. Finally, it reveals a novel measure of emergent complexity: how widely distributed a system's causal workings are across its hierarchy of scales.
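One standard measure from the causal-emergence literature, effective information (EI), illustrates how a coarse-grained Markov chain can carry more causal information than its microscale. The sketch below (illustrative transition matrices, not taken from the paper) computes EI for a micro chain and for one coarse-grain of it:

```python
import numpy as np

def effective_information(T):
    """EI of a Markov chain, in bits: mutual information between a uniformly
    intervened state and its successor (Hoel-style)."""
    T = np.asarray(T, dtype=float)
    avg = T.mean(axis=0)  # effect distribution under uniform interventions
    with np.errstate(divide="ignore", invalid="ignore"):
        kl = np.where(T > 0, T * np.log2(T / avg), 0.0).sum(axis=1)
    return kl.mean()

# Micro chain: two interchangeable noisy states {0, 1} plus an absorbing state 2.
T_micro = [[0.5, 0.5, 0.0],
           [0.5, 0.5, 0.0],
           [0.0, 0.0, 1.0]]
# Macro chain after grouping {0, 1} into one state: fully deterministic.
T_macro = [[1.0, 0.0],
           [0.0, 1.0]]

ei_micro = effective_information(T_micro)
ei_macro = effective_information(T_macro)  # exceeds ei_micro: macroscale causation
```

Here the coarse-grained description is lossless about the dynamics yet has higher EI, the kind of case the theory's apportioning schema is designed to diagnose.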
Graph generation is a critical task in numerous domains, including molecular design and social network analysis, due to its ability to model complex relationships and structured data. While most modern graph generative models utilize adjacency matrix representations, this work revisits an alternative approach that represents graphs as sequences of node sets and edge sets. We advocate for this approach due to its efficient encoding of graphs and propose a novel representation. Based on this representation, we introduce the Graph Generative Pre-trained Transformer (G2PT), an auto-regressive model that learns graph structures via next-token prediction. To further exploit G2PT's capabilities as a general-purpose foundation model, we explore fine-tuning strategies for two downstream applications: goal-oriented generation and graph property prediction. We conduct extensive experiments across multiple datasets. Results indicate that G2PT achieves superior generative performance on both generic graph and molecule datasets. Furthermore, G2PT exhibits strong adaptability and versatility in downstream tasks from molecular design to property prediction. Code is available at this https URL.
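The node-set/edge-set serialization that such a model trains on can be illustrated with a minimal sketch; the special tokens and helper below are hypothetical stand-ins, not the paper's actual vocabulary:

```python
def graph_to_tokens(nodes, edges):
    """Flatten a graph into one token sequence: node block, then edge block,
    suitable as a next-token-prediction target for an autoregressive model."""
    tokens = ["<graph>"]
    for n in nodes:                  # node set, e.g. atom types
        tokens += ["<node>", str(n)]
    for u, v in edges:               # edge set as (src, dst) index pairs
        tokens += ["<edge>", str(u), str(v)]
    tokens.append("<eos>")
    return tokens

# A two-node molecule fragment: C-O with one bond.
seq = graph_to_tokens(nodes=["C", "O"], edges=[(0, 1)])
```

A sequence like this is linear in the number of nodes plus edges, which is the efficiency argument for set-sequence encodings over dense adjacency matrices on sparse graphs.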
We study the physics potential of heavy QCD axions at high-energy muon colliders. Unlike typical axion-like particles, heavy QCD axions solve the strong CP problem with phenomenology driven by the anomalous gluon ($aG\widetilde G$) couplings. Several ultraviolet scenarios are presented in which QCD axions with TeV-scale masses and decay constants arise consistently with a solution to both the strong CP problem and the axion quality problem. We perform a detailed collider analysis for both a 3 and a 10 TeV muon collider, focusing on hadronic axion decays that give rise to a dijet-resonance signature. Our projections for the axion discovery reach in the multi-TeV mass range demonstrate that a muon collider can significantly extend sensitivity to heavy QCD axions compared to existing experiments.
Researchers from UT Austin, Oxford, Leeds, Tufts, and Alberta established a unified conceptual framework for curriculum learning in reinforcement learning (RL), systematically classifying existing methods and identifying key research gaps. The work defines curriculum learning through task generation, sequencing, and transfer learning components, providing a structured overview to address challenges in RL sample efficiency and complex task acquisition.
The discovery and identification of molecules in biological and environmental samples is crucial for advancing biomedical and chemical sciences. Tandem mass spectrometry (MS/MS) is the leading technique for high-throughput elucidation of molecular structures. However, decoding a molecular structure from its mass spectrum is exceptionally challenging, even when performed by human experts. As a result, the vast majority of acquired MS/MS spectra remain uninterpreted, thereby limiting our understanding of the underlying (bio)chemical processes. Despite decades of progress in machine learning applications for predicting molecular structures from MS/MS spectra, the development of new methods is severely hindered by the lack of standard datasets and evaluation protocols. To address this problem, we propose MassSpecGym -- the first comprehensive benchmark for the discovery and identification of molecules from MS/MS data. Our benchmark comprises the largest publicly available collection of high-quality labeled MS/MS spectra and defines three MS/MS annotation challenges: de novo molecular structure generation, molecule retrieval, and spectrum simulation. It includes new evaluation metrics and a generalization-demanding data split, therefore standardizing the MS/MS annotation tasks and rendering the problem accessible to the broad machine learning community. MassSpecGym is publicly available at this https URL.
We present an algorithm that uses block encoding on a quantum computer to exactly construct a Krylov space, which can be used as the basis for the Lanczos method to estimate extremal eigenvalues of Hamiltonians. While the classical Lanczos method has exponential cost in the system size to represent the Krylov states of quantum systems, our efficient quantum algorithm achieves this in polynomial time and memory. The construction presented is exact in the sense that the resulting Krylov space is identical to that of the Lanczos method, so the only approximation with respect to the exact method is due to finite sample noise. This is possible because, unlike previous quantum Krylov methods, our algorithm does not require simulating real or imaginary time evolution. We provide an explicit error bound for the resulting ground state energy estimate in the presence of noise. For our method to succeed efficiently, the only requirement on the input problem is that the overlap of the initial state with the true ground state must be $\Omega(1/\mathrm{poly}(n))$ for $n$ qubits.
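For context, the classical Lanczos procedure whose Krylov space the quantum construction reproduces can be sketched in plain NumPy (an illustration of the classical baseline, not the quantum algorithm itself):

```python
import numpy as np

def lanczos_ground_energy(H, v0, k):
    """Classical Lanczos: build a k-dimensional Krylov basis from v0 and
    return the smallest Ritz value as a ground-state energy estimate."""
    n = H.shape[0]
    V = np.zeros((n, k))
    alpha, beta = np.zeros(k), np.zeros(k - 1)
    V[:, 0] = v0 / np.linalg.norm(v0)
    for j in range(k):
        w = H @ V[:, j]                 # one matrix-vector product per step
        alpha[j] = V[:, j] @ w
        w -= alpha[j] * V[:, j]
        if j > 0:
            w -= beta[j - 1] * V[:, j - 1]
        if j < k - 1:
            beta[j] = np.linalg.norm(w)
            V[:, j + 1] = w / beta[j]
    # Diagonalize the small tridiagonal projection of H onto the Krylov space.
    T = np.diag(alpha) + np.diag(beta, 1) + np.diag(beta, -1)
    return np.linalg.eigvalsh(T)[0]

# Toy Hamiltonian with ground energy 0; the start vector overlaps all eigenstates.
H = np.diag([0.0, 1.0, 3.0])
e0 = lanczos_ground_energy(H, np.array([0.6, 0.6, 0.5]), k=3)
```

The exponential classical cost the abstract refers to is hidden in the vectors `V`: for a quantum system on $n$ qubits they have $2^n$ entries, which is what the block-encoding construction avoids storing.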
TabTreeFormer, developed by researchers at NUS and Betterdata AI, introduces a hybrid tree-transformer model for synthetic tabular data generation, achieving up to 44% utility gain over baselines while preserving privacy and enhancing efficiency through a novel dual-quantization tokenizer and ordinal-aware learning.
This paper proposes two algorithms for estimating squared Wasserstein distance matrices from a small number of entries. These matrices are used to compute manifold learning embeddings like multidimensional scaling (MDS) or Isomap but, unlike Euclidean distance matrices, are extremely costly to compute. We analyze matrix completion from upper-triangular samples and Nyström completion, in which $\mathcal{O}(d\log d)$ columns of the distance matrix are computed, where $d$ is the desired embedding dimension; we prove stability of MDS under Nyström completion and show that it can outperform matrix completion for a fixed budget of sampled distances. Finally, we show that classification of the OrganCMNIST dataset from the MedMNIST benchmark is stable on data embedded from the Nyström estimate of the distance matrix even when only 10% of the columns are computed.
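The Nyström idea, computing only a few columns of a symmetric distance matrix and reconstructing the rest, can be sketched on a cheap Euclidean stand-in (illustrative only; in the paper's setting each entry is a costly Wasserstein distance, which is why avoiding most entries matters):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(30, 2))
# Full squared-distance matrix (here Euclidean, as a cheap stand-in).
D = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)

# Nystrom completion: compute only a few landmark columns, fill in the rest
# from the low-rank structure D ~ C @ pinv(W) @ C.T.
idx = np.arange(6)                     # the only columns actually "computed"
C, W = D[:, idx], D[np.ix_(idx, idx)]  # landmark columns and their core block
D_hat = C @ np.linalg.pinv(W) @ C.T

err = np.linalg.norm(D - D_hat) / np.linalg.norm(D)
```

A squared Euclidean distance matrix of points in the plane has rank at most 4, so six generic landmark columns recover it essentially exactly; for approximately low-rank Wasserstein distance matrices the same reconstruction is an estimate rather than exact.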
The question of "what is life?" has challenged scientists and philosophers for centuries, producing an array of definitions that reflect both the mystery of its emergence and the diversity of disciplinary perspectives brought to bear on the question. Despite significant progress in our understanding of biological systems, psychology, computation, and information theory, no single definition for life has yet achieved universal acceptance. This challenge becomes increasingly urgent as advances in synthetic biology, artificial intelligence, and astrobiology challenge our traditional conceptions of what it means to be alive. We undertook a methodological approach that leverages large language models (LLMs) to analyze a set of definitions of life provided by a curated set of cross-disciplinary experts. We used a novel pairwise correlation analysis to map the definitions into distinct feature vectors, followed by agglomerative clustering, intra-cluster semantic analysis, and t-SNE projection to reveal underlying conceptual archetypes. This methodology revealed a continuous landscape of the themes relating to the definition of life, suggesting that what has historically been approached as a binary taxonomic problem should instead be conceived as differentiated perspectives within a unified conceptual latent space. We offer a new methodological bridge between reductionist and holistic approaches to fundamental questions in science and philosophy, demonstrating how computational semantic analysis can reveal conceptual patterns across disciplinary boundaries, and opening similar pathways for addressing other contested definitional territories across the sciences.
The detection of strong Balmer breaks and absorption features in Little Red Dots (LRDs) suggests they host AGN embedded within dense gas envelopes, potentially powered by super-Eddington accretion. We present GLIMPSE-17775, a luminous ($L_{\rm bol}\sim10^{45}$ erg s$^{-1}$) LRD at $z=3.501$ behind Abell S1063 ($\mu\sim2$), observed with deep JWST/NIRCam and a $\sim$20 hr (80 hr de-lensed) NIRSpec/G395M spectrum. The data reveal 40+ emission and absorption features, including a rich forest of low-ionization FeII lines and numerous broad hydrogen recombination transitions. We use this depth to test the dense-gas interpretation through five independent diagnostics. Nearly all permitted lines show exponential wings with consistent FWHM, the signature of Thomson scattering requiring $n_e\gtrsim10^8$ cm$^{-3}$. Adopting this width yields $M_{\rm BH}\sim10^{6.7}\,M_\odot$, a factor of ten lower than Gaussian fits, and $\lambda_{\rm Edd}\sim1.8$. Additional diagnostics support the same picture: a pronounced Balmer break ($f_{\nu,4050}/f_{\nu,3670}=2.0\pm0.1$), enhanced HeI $\lambda7065$ and $\lambda10830$ with P-Cygni absorption, Bowen-fluorescent OI $\lambda8446$–$\lambda11290$ emission requiring Ly$\beta$ pumping, and 16 FeII lines matching fluorescence models. These features indicate a dense ($n\sim10^8$ cm$^{-3}$), partially ionized cocoon where scattering and fluorescence dominate line formation, providing strong evidence that at least some LRDs are powered by super-Eddington black-hole growth in the early Universe.
Vision-language-action (VLA) models hold promise as generalist robotics solutions by translating visual and linguistic inputs into robot actions, yet they lack reliability due to their black-box nature and sensitivity to environmental changes. In contrast, cognitive architectures (CA) excel in symbolic reasoning and state monitoring but are constrained by rigid predefined execution. This work bridges these approaches by probing OpenVLA's hidden layers to uncover symbolic representations of object properties, relations, and action states, enabling integration with a CA for enhanced interpretability and robustness. Through experiments on LIBERO-spatial pick-and-place tasks, we analyze the encoding of symbolic states across different layers of OpenVLA's Llama backbone. Our probing results show consistently high accuracies (> 0.90) for both object and action states across most layers, though contrary to our hypotheses, we did not observe the expected pattern of object states being encoded earlier than action states. We demonstrate an integrated DIARC-OpenVLA system that leverages these symbolic representations for real-time state monitoring, laying the foundation for more interpretable and reliable robotic manipulation.
Capability evaluations play a critical role in ensuring the safe deployment of frontier AI systems, but this role may be undermined by intentional underperformance, or "sandbagging." We present a novel model-agnostic method for detecting sandbagging behavior using noise injection. Our approach is founded on the observation that introducing Gaussian noise into the weights of models either prompted or fine-tuned to sandbag can considerably improve their performance. We test this technique across a range of model sizes and multiple-choice question benchmarks (MMLU, AI2, WMDP). Our results demonstrate that noise injection improves the performance of sandbagging models but not of standard models. Leveraging this effect, we develop a classifier that consistently identifies sandbagging behavior. Our unsupervised technique can be immediately implemented by frontier labs or regulatory bodies with access to model weights to improve the trustworthiness of capability evaluations.
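The detection logic can be sketched abstractly: perturb the weights with Gaussian noise and flag any model whose evaluation score improves under perturbation. The evaluation functions below are hypothetical toy stand-ins, not real models or the paper's classifier:

```python
import numpy as np

def noise_injection_probe(eval_fn, weights, sigma=0.05, trials=20, seed=0):
    """Flag possible sandbagging: if Gaussian weight noise ever *improves*
    the evaluation score over the unperturbed model, the model is suspect."""
    rng = np.random.default_rng(seed)
    base = eval_fn(weights)
    best = max(eval_fn(weights + sigma * rng.normal(size=weights.shape))
               for _ in range(trials))
    return best > base  # True -> sandbagging suspected

w0 = np.zeros(4)
# Toy stand-ins: a well-trained model sits at a score maximum, so noise only
# hurts it; a sandbagger is deliberately off-peak, so noise can help.
normal = lambda w: -float(w @ w)
sandbag = lambda w: float(w @ w)
```

The probe is unsupervised in the same sense as the paper's method: it needs only weight access and an evaluation score, not labels for sandbagging behavior.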
New methods for modeling least-cost diets that meet nutritional requirements for health have emerged as important tools for informing nutrition policy and programming around the world. This study develops a three-step approach using the cost of a healthy diet to inform targeted nutrition programming in Indonesia. We combine detailed retail prices and household survey data from Indonesia to describe how reported consumption and expenditure patterns across all levels of household income diverge from least-cost healthy diets using items from nearby markets. In this analysis, we examine regional price variations, identify households with insufficient income for healthy diets, and analyze the nutrient adequacy of reported consumption patterns. We find that household food spending was sufficient to meet national dietary guidelines using the least expensive locally available items for over 98% of Indonesians, but almost all households consume substantial quantities of discretionary foods and mixed dishes while consuming too little energy from fruits, vegetables, and legumes, nuts, and seeds. Households with higher incomes have higher nutrient adequacy and are closer to meeting local dietary guidelines, but still fall short of recommendations. These findings shed new light on how actual food demand differs from least-cost healthy diets, due to factors other than affordability, such as taste, convenience, and aspirations shaped by marketing and other sociocultural influences.
We present an overview of the JWST GLIMPSE program, highlighting its survey design, primary science goals, gravitational lensing models, and first results. GLIMPSE provides ultra-deep JWST/NIRCam imaging across seven broadband filters (F090W, F115W, F150W, F200W, F277W, F356W, F444W) and two medium-band filters (F410M, F480M), with exposure times ranging from 20 to 40 hours per filter. This yields a 5$\sigma$ limiting magnitude of 30.9 AB (measured in a 0.2 arcsec diameter aperture). The field is supported by extensive ancillary data, including deep HST imaging from the Hubble Frontier Fields program, VLT/MUSE spectroscopy, and deep JWST/NIRSpec medium-resolution multi-object spectroscopy. Exploiting the strong gravitational lensing of the galaxy cluster Abell S1063, GLIMPSE probes intrinsic depths beyond 33 AB magnitudes and covers an effective source-plane area of approximately 4.4 arcmin$^2$ at $z \sim 6$. The program's central aim is to constrain the abundance of the faintest galaxies from $z \sim 6$ up to the highest redshifts, providing crucial benchmarks for galaxy formation models, which have so far been tested primarily on relatively bright systems. We present an initial sample of $\sim 540$ galaxy candidates identified at $6 < z < 16$, with intrinsic UV magnitudes spanning $M_{\mathrm{UV}} = -20$ to $-12$. This enables unprecedented constraints on the extreme faint end of the UV luminosity function at these epochs. In addition, GLIMPSE opens new windows for spatially resolved studies of star clusters in early galaxies and the detection and characterization of faint high-$z$ active galactic nuclei. This paper accompanies the first public data release, which includes reduced JWST and HST mosaics, photometric catalogs, and gravitational lensing models.
The effective Nambu-Goto description of $(2+1)$-dimensional domain walls predicts singular behavior of their worldsheets, resulting in swallowtail bifurcations. This phenomenon is intimately related to the formation of cusps, which emerge in different forms that we identify and classify. We describe in detail how swallowtail bifurcations generically arise in the collision of wiggles on straight domain wall strings, as well as in the collapse of closed loops, even for smooth initial conditions. Remarkably, by means of accurate lattice simulations, we find that these distinctive swallowtail features are reproduced in the field theory evolution of sufficiently thin walls, which typically emit a significant fraction of their initial energy in the process. These results suggest that such singular evolutions could have important implications for the observable signatures associated with the collapse of domain wall networks in $(3+1)$ dimensions in the early universe.
Researchers from a diverse group of academic institutions and industry labs critically examine the pervasive influence of Artificial General Intelligence (AGI) as a guiding principle in AI research. Their analysis reveals how an AGI focus exacerbates issues such as a lack of scientific rigor, masked values, and exclusion of diverse stakeholders. The paper advocates for alternative, more effective goal-setting strategies that prioritize specificity, pluralism, and inclusion, ultimately aiming to re-center AI development around supporting and benefiting human beings.