alphaXiv

University of Zürich

107

30 Mar 2025

instrumentation-and-methods-for-astrophysics computer-science machine-learning

Building Machine Learning Challenges for Anomaly Detection in Science

University of Washington Rensselaer Polytechnic Institute

California Institute of Technology

University of Illinois at Urbana-Champaign SLAC National Accelerator Laboratory

Chinese Academy of Sciences

Carnegie Mellon University

Imperial College London

University of Chicago National Taiwan University

Stanford University University of Hong Kong

ETH Zürich

University of California, San Diego

Northwestern University

University of Pennsylvania

University of Minnesota

University of Wisconsin-Madison University of Colorado

Lawrence Berkeley National Laboratory

MIT

The Ohio State University University of Delaware

Dartmouth College Chungnam National University University of Perugia Macau University of Science and Technology University of Maryland Baltimore County University of Zürich National Tsinghua University Ricoh Software Research Center (Beijing) Co. National Yang Ming-Chiao Tung University

Steven Dillmann

Scientific discoveries are often made by finding a pattern or object that was not predicted by the known rules of science. Oftentimes, these anomalous events or objects that do not conform to the norms are an indication that the rules of science governing the data are incomplete, and something new needs to be present to explain these unexpected outliers. The challenge of finding anomalies can be confounding since it requires codifying a complete knowledge of the known scientific behaviors and then projecting these known behaviors on the data to look for deviations. When utilizing machine learning, this presents a particular challenge since we require that the model not only understands scientific data perfectly but also recognizes when the data is inconsistent and out of the scope of its trained behavior. In this paper, we present three datasets aimed at developing machine learning-based anomaly detection for disparate scientific domains covering astrophysics, genomics, and polar science. We present the different datasets along with a scheme to make machine learning challenges around the three datasets findable, accessible, interoperable, and reusable (FAIR). Furthermore, we present an approach that generalizes to future machine learning challenges, enabling the possibility of large, more compute-intensive challenges that can ultimately lead to scientific discovery.

100

21 Sep 2024

computer-science computation-and-language machine-learning

One Model is All You Need: ByT5-Sanskrit, a Unified Model for Sanskrit NLP Tasks

Heinrich-Heine-University Düsseldorf University of Zürich

A unified, byte-level ByT5 model (ByT5-Sanskrit) achieves state-of-the-art results across multiple Sanskrit Natural Language Processing tasks, including word segmentation and dependency parsing, matching lexicon-based methods while demonstrating robustness to noisy data. The model also generalizes effectively to other morphologically rich languages for tasks like lemmatization and dependency parsing.

03 Jun 2025

attention-mechanisms computer-science machine-learning

The underlying structures of self-attention: symmetry, directionality, and emergent dynamics in Transformer training

ETH Zürich ETH AI Center University of Zürich ECLT European Centre for Living Technology Centre for Artificial Intelligence, Zürich University of Applied Sciences

Self-attention is essential to Transformer architectures, yet how information is embedded in the self-attention matrices and how different objective functions impact this process remains unclear. We present a mathematical framework to analyze self-attention matrices by deriving the structures governing their weight updates. Using this framework, we demonstrate that bidirectional training induces symmetry in the weight matrices, while autoregressive training results in directionality and column dominance. Our theoretical findings are validated across multiple Transformer models - including ModernBERT, GPT, LLaMA3, and Mistral - and input modalities like text, vision, and audio. Finally, we apply these insights by showing that symmetric initialization improves the performance of encoder-only models on language tasks. This mathematical analysis offers a novel theoretical perspective on how information is embedded through self-attention, thereby improving the interpretability of Transformer models.

26 Aug 2017

computer-science social-and-information-networks physics

A Comparative Analysis of Community Detection Algorithms on Artificial Networks

University of Zürich

Many community detection algorithms have been developed to uncover the mesoscopic properties of complex networks. However, how good an algorithm is, in terms of accuracy and computing time, remains an open question. Testing algorithms on real-world networks has certain restrictions that make the resulting insights potentially biased: the networks are usually small, and the underlying communities are not defined objectively. In this study, we employ the Lancichinetti-Fortunato-Radicchi benchmark graph to test eight state-of-the-art algorithms. We quantify the accuracy using complementary measures, along with the algorithms' computing time. Based on simple network properties and the aforementioned results, we provide guidelines that help to choose the most adequate community detection algorithm for a given network. Moreover, these rules allow uncovering limitations in the use of specific algorithms given macroscopic network properties. Our contribution is threefold: firstly, we provide actual techniques to determine which algorithm is best suited in most circumstances, based on observable properties of the network under consideration. Secondly, we use the mixing parameter as an easily measurable indicator for finding the ranges of reliability of the different algorithms. Finally, we study the dependence on network size, focusing on both the algorithms' predictive power and the effective computing time.

15 Apr 2025

physics quantum-physics

Entanglement scaling in matrix product state representation of smooth functions and their shallow quantum circuit approximations

Czech Technical University in Prague Haiqu, Inc. HSBC Lab University of Zürich Athena Research Center HSBC Service Delivery Sp. z o.o.

George Korpas

This research provides rigorous analytical derivations for entanglement scaling in matrix product state (MPS) representations of smooth functions, demonstrating universal exponential decay of entanglement with spatial scale. The study develops an improved MPS-based algorithm that constructs shallow, linear-depth quantum circuits for state preparation, successfully encoding heavy-tailed financial distributions on IBM quantum hardware for up to 25 qubits and showing classical scalability to 64 qubits with Tensor Cross Interpolation.

1,832

09 Mar 2025

cosmology-and-nongalactic-astrophysics physics

Dark Energy Survey: implications for cosmological expansion models from the final DES Baryon Acoustic Oscillation and Supernova data

SLAC National Accelerator Laboratory

Northeastern University

University College London

Stanford University

University of Michigan

Brookhaven National Laboratory

University of Wisconsin-Madison

Duke University

Australian National University Fermi National Accelerator Laboratory University of Queensland University of Portsmouth National Center for Supercomputing Applications Universidad de La Laguna Universidade Estadual Paulista Institut d’Estudis Espacials de Catalunya (IEEC) Instituto de Astrofisica de Canarias NSF’s National Optical-Infrared Astronomy Research Laboratory University of Zürich Cerro Tololo Inter-American Observatory Centro de Investigaciones Energéticas, Medioambientales y Tecnológicas (CIEMAT) University of Genova and INFN Institute of Space Sciences (ICE–CSIC) Laboratório Interinstitucional de e-Astronomia/LIneA Institut de Física d’Altes Energies (IFAE) Ludwig-Maximilians-Universität Center for Astrophysics Harvard & Smithsonian

Lluís Galbany

Giulia Giannini

This paper presents results from the final Dark Energy Survey's Baryon Acoustic Oscillation and Supernova datasets, finding approximately 3.2σ evidence that dark energy is not a cosmological constant but a dynamical field with an evolving equation of state. The analysis also confirms that the Hubble tension persists even when allowing for an evolving dark energy, suggesting additional physics may be required.

26 Jan 2025

computer-science human-computer-interaction signal-processing

Heterogeneous Population Encoding for Multi-joint Regression using sEMG Signals

ETH Zürich University of Sheffield University of Zürich

Regression-based decoding of continuous movements is essential for human-machine interfaces (HMIs), such as prosthetic control. This study explores a feature-based approach to encoding Surface Electromyography (sEMG) signals, focusing on the role of variability in neural-inspired population encoding. By employing heterogeneous populations of Leaky Integrate-and-Fire (LIF) neurons with varying sizes and diverse parameter distributions, we investigate how population size and variability in encoding parameters, such as membrane time constants and thresholds, influence decoding performance. Using a simple linear readout, we demonstrate that variability improves robustness and generalizability compared to single-neuron encoders. These findings emphasize the importance of optimizing variability and population size for efficient and scalable regression tasks in spiking neural networks (SNNs), paving the way for robust, low-power HMI implementations.
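To make the idea of a heterogeneous LIF population concrete, here is a minimal sketch. It is not the authors' encoder: the sEMG signal is replaced by a constant input, the parameter ranges, step size, and function names (`lif_spike_count`, `encode`) are all hypothetical, and the linear readout is omitted. It only shows how neurons with different membrane time constants and thresholds turn the same input into different spike counts.

```python
import random


def lif_spike_count(signal, tau, threshold, dt=0.001):
    """Count spikes of one leaky integrate-and-fire neuron driven by `signal`."""
    v = 0.0
    spikes = 0
    for x in signal:
        # Forward-Euler step of dv/dt = (-v + x) / tau
        v += dt * (-v + x) / tau
        if v >= threshold:
            spikes += 1
            v = 0.0  # reset the membrane potential after a spike
    return spikes


def encode(signal, population, dt=0.001):
    """Encode a signal as one spike count per (tau, threshold) neuron."""
    return [lif_spike_count(signal, tau, th, dt) for tau, th in population]


# Hypothetical heterogeneous population: each neuron draws its own
# membrane time constant (seconds) and firing threshold.
rng = random.Random(0)
population = [(rng.uniform(0.005, 0.05), rng.uniform(0.2, 0.8)) for _ in range(8)]

weak = [0.1] * 1000    # stays below every threshold, so no neuron fires
strong = [1.0] * 1000  # exceeds every threshold, so every neuron fires

weak_counts = encode(weak, population)
strong_counts = encode(strong, population)
print(weak_counts)
print(strong_counts)
```

The vector of spike counts is the feature a linear readout would regress on; a single-neuron encoder would reduce it to one number.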

18 Mar 2025

ai-for-health computer-science continual-learning

TAMER: A Test-Time Adaptive MoE-Driven Framework for EHR Representation Learning

University of Zürich

We propose TAMER, a Test-time Adaptive MoE-driven framework for Electronic Health Record (EHR) Representation learning. TAMER introduces a framework where a Mixture-of-Experts (MoE) architecture is co-designed with Test-Time Adaptation (TTA) to jointly mitigate the intertwined challenges of patient heterogeneity and distribution shifts in EHR modeling. The MoE focuses on latent patient subgroups through domain-aware expert specialization, while TTA enables real-time adaptation to evolving health status distributions when new patient samples are introduced. Extensive experiments across four real-world EHR datasets demonstrate that TAMER consistently improves predictive performance for both mortality and readmission risk tasks when combined with diverse EHR modeling backbones. TAMER offers a promising approach for dynamic and personalized EHR-based predictions in practical clinical settings.

20 Nov 2019

computer-science computer-vision-and-pattern-recognition

Photo-Realistic Monocular Gaze Redirection Using Generative Adversarial Networks

ETH Zürich University of Zürich

Gaze redirection is the task of changing the gaze to a desired direction for a given monocular eye patch image. Many applications such as videoconferencing, films, games, and generation of training data for gaze estimation require redirecting the gaze, without distorting the appearance of the area surrounding the eye and while producing photo-realistic images. Existing methods lack the ability to generate perceptually plausible images. In this work, we present a novel method to alleviate this problem by leveraging generative adversarial training to synthesize an eye image conditioned on a target gaze direction. Our method ensures perceptual similarity and consistency of synthesized images to the real images. Furthermore, a gaze estimation loss is used to control the gaze direction accurately. To attain high-quality images, we incorporate perceptual and cycle consistency losses into our architecture. In extensive evaluations we show that the proposed method outperforms state-of-the-art approaches in terms of both image quality and redirection precision. Finally, we show that generated images can bring significant improvement for the gaze estimation task if used to augment real training data.

144

14 Sep 2022

cosmology-and-nongalactic-astrophysics astrophysics-of-galaxies physics

Cosmological simulations of the same spiral galaxy: the impact of baryonic physics

CNRS

University of Oxford Aix-Marseille Univ CNES LAM University of Zürich

The interplay of star formation and supernova (SN) feedback in galaxy formation is a key element for understanding galaxy evolution. Since these processes occur at small scales, it is necessary to have sub-grid models that recover their evolution and environmental effects at the scales reached by cosmological simulations. We simulate the same spiral galaxy inhabiting a Milky Way (MW) size halo in a cosmological environment changing the sub-grid models for SN feedback and star formation. We test combinations of the Schmidt law and a multi-freefall based star formation with delayed cooling feedback or mechanical feedback. We reach a resolution of 35 pc in a zoom-in box of 36 Mpc. For this, we use the code RAMSES with the implementation of gas turbulence in time and trace the local hydrodynamical features of the star-forming gas. Finally, we compare the galaxies at redshift 0 with global and interstellar medium observations in the MW and local spiral galaxies. The simulations show successful comparisons with observations. Nevertheless, diverse galactic morphologies are obtained from different numerical implementations. We highlight the importance of detailed modelling of the star formation and feedback processes, especially when increasing the resolution of simulations. Future improvements could alleviate the degeneracies exhibited in our simulated galaxies under different sub-grid models.

16 Oct 2017

earth-and-planetary-astrophysics physics

Assessing the Interior Structure of Terrestrial Exoplanets with Implications for Habitability

University of Bern

ETH Zürich University of Zürich

Astrophysical observations reveal a large diversity of radii and masses of exoplanets. It is important to characterize the interiors of exoplanets to understand planetary diversity and further determine how unique, or not, Earth is. Assessing interior structure is challenging because there are few data and large uncertainties. Thus, for a given exoplanet a range of interior structure models can satisfy available data. Typically, interior models aim to constrain the radial structure and composition of the core and mantle, and additionally ice, ocean, and gas layer if appropriate. Constraining the parameters of these layers may also inform us about interior dynamics. However, it remains challenging to constrain interior dynamics using interior structure models because structure models are relatively insensitive to the thermal state of a planet. Nevertheless, elucidating interior dynamics remains a key goal in exoplanetology due to its role in determining surface conditions and hence habitability. Thus far, Earth-like habitability can be excluded for super-Earths that are in close proximity to their stars and therefore have high surface temperatures that promote local magma oceans.

09 Jun 2015

physics geophysics

Atomic clocks as a tool to monitor vertical surface motion

University of Mississippi

ETH Zürich University of Zürich Albert Einstein Institute Universitatea de Vest

Atomic clock technology is advancing rapidly, now reaching stabilities of $\Delta f/f \sim 10^{-18}$, which corresponds to resolving 1 cm in equivalent geoid height over an integration timescale of about 7 hours. At this level of performance, ground-based atomic clock networks emerge as a tool for monitoring a variety of geophysical processes by directly measuring changes in the gravitational potential. Vertical changes of the clock's position due to magmatic, volcanic, post-seismic or tidal deformations can result in measurable variations in the clock tick rate. As an example, we discuss the geopotential change arising due to an inflating point source (Mogi model), and apply it to the Etna volcano. Its effect on an observer on the Earth's surface can be divided into two different terms: one purely due to uplift and one due to the redistribution of matter. Thus, with the centimetre-level precision of current clocks it is already possible to monitor volcanoes. The matter redistribution term is estimated to be 2-3 orders of magnitude smaller than the uplift term, and should be resolvable when clocks improve their stability to the sub-millimetre level. Additionally, clocks can be compared over distances of thousands of kilometres on a short-term basis (e.g. hourly). These clock networks will improve our ability to monitor long-wavelength periodic effects such as the solid Earth tide.
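The correspondence between a fractional stability of 10^-18 and about 1 cm of height can be checked with the weak-field gravitational redshift relation Δf/f ≈ g·Δh/c². The sketch below is an illustration rather than the paper's calculation: it assumes a uniform field with standard gravity g = 9.80665 m/s², and the function names are hypothetical.

```python
# Weak-field gravitational redshift between two clocks separated by a
# height difference dh near the Earth's surface: df/f ~= g * dh / c**2.
G_ACCEL = 9.80665   # standard gravity in m/s^2 (assumed uniform)
C = 299_792_458.0   # speed of light in m/s (exact by definition)


def fractional_shift(dh_m: float) -> float:
    """Fractional frequency shift caused by a height change of dh_m metres."""
    return G_ACCEL * dh_m / C**2


def height_resolution(frac_stability: float) -> float:
    """Height change in metres that produces the given fractional shift."""
    return frac_stability * C**2 / G_ACCEL


shift_1cm = fractional_shift(0.01)       # about 1.1e-18
height_1e18 = height_resolution(1e-18)   # about 0.0092 m, i.e. roughly 1 cm
print(f"{shift_1cm:.3e}", f"{height_1e18 * 100:.2f} cm")
```

The same relation shows why resolving millimetre-scale uplift would require stabilities near 10^-19.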

08 Aug 2019

active-learning computer-science human-computer-interaction

An Extensible Interactive Interface for Agent Design

UC Berkeley

ETH Zürich University of Zürich

In artificial intelligence, we often specify tasks through a reward function. While this works well in some settings, many tasks are hard to specify this way. In deep reinforcement learning, for example, directly specifying a reward as a function of a high-dimensional observation is challenging. Instead, we present an interface for specifying tasks interactively using demonstrations. Our approach defines a set of increasingly complex policies. The interface allows the user to switch between these policies at fixed intervals to generate demonstrations of novel, more complex, tasks. We train new policies based on these demonstrations and repeat the process. We present a case study of our approach in the Lunar Lander domain, and show that this simple approach can quickly learn a successful landing policy and outperforms an existing comparison-based deep RL method.

22 Sep 2020

computer-science social-and-information-networks

Explaining Gender Differences in Academics' Career Trajectories

New York University University of Colorado University at Buffalo SUNY University of Zürich

Academic fields exhibit substantial levels of gender segregation. To date, most attempts to explain this persistent global phenomenon have relied on limited cross-sections of data from specific countries, fields, or career stages. Here we used a global longitudinal dataset assembled from profiles on ORCID.org to investigate which characteristics of a field predict gender differences among the academics who leave and join that field. Only two field characteristics consistently predicted such differences: (1) the extent to which a field values raw intellectual talent ("brilliance") and (2) whether a field is in Science, Technology, Engineering, and Mathematics (STEM). Women more than men moved away from brilliance-oriented and STEM fields, and men more than women moved toward these fields. Our findings suggest that stereotypes associating brilliance and other STEM-relevant traits with men more than women play a key role in maintaining gender segregation across academia.

08 Dec 2022

computer-science continual-learning computer-vision-and-pattern-recognition

Bio-Inspired, Task-Free Continual Learning through Activity Regularization

ETH Zürich University of Zürich

The ability to sequentially learn multiple tasks without forgetting is a key skill of biological brains, whereas it represents a major challenge to the field of deep learning. To avoid catastrophic forgetting, various continual learning (CL) approaches have been devised. However, these usually require discrete task boundaries. This requirement seems biologically implausible and often limits the application of CL methods in the real world where tasks are not always well defined. Here, we take inspiration from neuroscience, where sparse, non-overlapping neuronal representations have been suggested to prevent catastrophic forgetting. As in the brain, we argue that these sparse representations should be chosen on the basis of feed forward (stimulus-specific) as well as top-down (context-specific) information. To implement such selective sparsity, we use a bio-plausible form of hierarchical credit assignment known as Deep Feedback Control (DFC) and combine it with a winner-take-all sparsity mechanism. In addition to sparsity, we introduce lateral recurrent connections within each layer to further protect previously learned representations. We evaluate the new sparse-recurrent version of DFC on the split-MNIST computer vision benchmark and show that only the combination of sparsity and intra-layer recurrent connections improves CL performance with respect to standard backpropagation. Our method achieves similar performance to well-known CL methods, such as Elastic Weight Consolidation and Synaptic Intelligence, without requiring information about task boundaries. Overall, we showcase the idea of adopting computational principles from the brain to derive new, task-free learning algorithms for CL.

05 Jun 2024

computer-science artificial-intelligence computer-vision-and-pattern-recognition

Text-to-Events: Synthetic Event Camera Streams from Conditional Text Input

ETH Zürich University of Zürich

Event cameras are advantageous for tasks that require vision sensors with low-latency and sparse output responses. However, the development of deep network algorithms using event cameras has been slow because of the lack of large labelled event camera datasets for network training. This paper reports a method for creating new labelled event datasets by using a text-to-X model, where X is one or multiple output modalities; in this work, X is events. Our proposed text-to-events model produces synthetic event frames directly from text prompts. It uses an autoencoder which is trained to produce sparse event frames representing event camera outputs. By combining the pretrained autoencoder with a diffusion model architecture, the new text-to-events model is able to generate smooth synthetic event streams of moving objects. The autoencoder was first trained on an event camera dataset of diverse scenes. In the combined training with the diffusion model, the DVS gesture dataset was used. We demonstrate that the model can generate realistic event sequences of human gestures prompted by different text statements. The classification accuracy of the generated sequences, using a classifier trained on the real dataset, ranges from 42% to 92%, depending on the gesture group. The results demonstrate the capability of this method in synthesizing event datasets.

07 Sep 2025

computer-science artificial-intelligence computer-vision-and-pattern-recognition

Can Machines Imitate Humans? Integrative Turing-like tests for Language and Vision Demonstrate a Narrowing Gap

Harvard University

ETH Zürich IBM Research Harvard Medical School

Johns Hopkins University University of Turin

MIT Agency for Science, Technology and Research Birla Institute of Technology and Science University of Zürich

As AI becomes increasingly embedded in daily life, ascertaining whether an agent is human is critical. We systematically benchmark AI's ability to imitate humans in three language tasks (image captioning, word association, conversation) and three vision tasks (color estimation, object detection, attention prediction), collecting data from 636 humans and 37 AI agents. Next, we conducted 72,191 Turing-like tests with 1,916 human judges and 10 AI judges. Current AIs are approaching the ability to convincingly impersonate humans and deceive human judges in both language and vision. Even simple AI judges outperformed humans in distinguishing AI from human responses. Imitation ability showed minimal correlation with conventional AI performance metrics, suggesting that passing as human is an important independent evaluation criterion. The large-scale Turing datasets and metrics introduced here offer valuable benchmarks for assessing human-likeness in AI and highlight the importance of rigorous, quantitative imitation tests for AI development.

09 May 2025

adversarial-attacks computer-science artificial-intelligence

From Models to Network Topologies: A Topology Inference Attack in Decentralized Federated Learning

University of Zürich armasuisse Science & Technology

Federated Learning (FL) is widely recognized as a privacy-preserving machine learning paradigm due to its model-sharing mechanism that avoids direct data exchange. Nevertheless, model training leaves exploitable traces that can be used to infer sensitive information. In Decentralized FL (DFL), the topology, defining how participants are connected, plays a crucial role in shaping the model's privacy, robustness, and convergence. However, the topology introduces an unexplored vulnerability: attackers can exploit it to infer participant relationships and launch targeted attacks. This work uncovers the hidden risks of DFL topologies by proposing a novel Topology Inference Attack that infers the topology solely from model behavior. A taxonomy of topology inference attacks is introduced, categorizing them by the attacker's capabilities and knowledge. Practical attack strategies are designed for various scenarios, and experiments are conducted to identify key factors influencing attack success. The results demonstrate that analyzing only the model of each node can accurately infer the DFL topology, highlighting a critical privacy risk in DFL systems. These findings offer valuable insights for improving privacy preservation in DFL environments.

05 Feb 2025

Distributed Quantum Dynamics on Near-Term Quantum Processors

Haiqu, Inc. University of Zürich

Simulations of quantum dynamics are a key application of near-term quantum computing, but are hindered by the twin challenges of noise and small device scale, which limit the executable circuit depths and the number of qubits the algorithm can be run on. Towards overcoming these obstacles, we develop and implement a distributed variant of the projected Variational Quantum Dynamics, which we dub dp-VQD, and which alleviates circuit depth and width limitations simultaneously. We employ the wire cutting technique, which can be executed on existing devices without quantum or classical communication. We demonstrate the full variational training on noisy simulators, and execute and perform the reconstruction on real IBM quantum devices. The algorithm makes it possible to execute Hamiltonian evolution simulations for problem sizes exceeding devices' nominal qubit counts, and to combine multiple small devices in a distributed computation. We test our approach on the Heisenberg and Hubbard model dynamics.

07 Feb 2025

adversarial-attacks computer-science artificial-intelligence

DMPA: Model Poisoning Attacks on Decentralized Federated Learning for Model Differences

armasuisse University of Zürich

Federated learning (FL) has garnered significant attention as a prominent privacy-preserving Machine Learning (ML) paradigm. Decentralized FL (DFL) eschews traditional FL's centralized server architecture, enhancing the system's robustness and scalability. However, these advantages of DFL also create new vulnerabilities for malicious participants to execute adversarial attacks, especially model poisoning attacks. In model poisoning attacks, malicious participants aim to diminish the performance of benign models by creating and disseminating compromised models. Existing research on model poisoning attacks has predominantly concentrated on undermining global models within the Centralized FL (CFL) paradigm, while attacks on DFL remain under-explored. To fill this research gap, this paper proposes an innovative model poisoning attack called DMPA. This attack calculates the differential characteristics of multiple malicious client models and obtains the most effective poisoning strategy, thereby orchestrating a collusive attack by multiple participants. The effectiveness of this attack is validated across multiple datasets, with results indicating that the DMPA approach consistently surpasses existing state-of-the-art FL model poisoning attack strategies.
