alphaXiv

History

Papers Benchmarks

Lucerne University of Applied Sciences and Arts

02 Oct 2025

computer-science cryptography-and-security physics

Reproducible Builds for Quantum Computing

ETH Zurich Lucerne University of Applied Sciences and Arts

Reproducible builds are a set of software development practices that establish an independently verifiable path from source code to binary artifacts, helping to detect and mitigate certain classes of supply chain attacks. Although quantum computing is a rapidly evolving field of research, it can already benefit from adopting reproducible builds. This paper aims to bridge the gap between the quantum computing and reproducible builds communities. We propose a generalization of the definition of reproducible builds in the quantum setting, motivated by two threat models: one targeting the confidentiality of end users' data during circuit preparation and submission to a quantum computer, and another compromising the integrity of quantum computation results. This work presents three examples that show how classical information can be hidden in transpiled quantum circuits, and two cases illustrating how even minimal modifications to these circuits can lead to incorrect quantum computation results. Our work provides initial steps towards a framework for reproducibility in quantum software toolchains.

16 May 2025

active-learning computer-science artificial-intelligence

CleanPatrick: A Benchmark for Image Data Cleaning

Northwestern University

University of Basel Medical University of Vienna Lucerne University of Applied Sciences and Arts Banner Health University Hospital of Basel Northeast Dermatology Associates

Robust machine learning depends on clean data, yet current image data cleaning benchmarks rely on synthetic noise or narrow human studies, limiting comparison and real-world relevance. We introduce CleanPatrick, the first large-scale benchmark for data cleaning in the image domain, built upon the publicly available Fitzpatrick17k dermatology dataset. We collect 496,377 binary annotations from 933 medical crowd workers, identify off-topic samples (4%), near-duplicates (21%), and label errors (22%), and employ an aggregation model inspired by item-response theory followed by expert review to derive high-quality ground truth. CleanPatrick formalizes issue detection as a ranking task and adopts typical ranking metrics mirroring real audit workflows. Benchmarking classical anomaly detectors, perceptual hashing, SSIM, Confident Learning, NoiseRank, and SelfClean, we find that, on CleanPatrick, self-supervised representations excel at near-duplicate detection, classical methods achieve competitive off-topic detection under constrained review budgets, and label-error detection remains an open challenge for fine-grained medical classification. By releasing both the dataset and the evaluation framework, CleanPatrick enables a systematic comparison of image-cleaning strategies and paves the way for more reliable data-centric artificial intelligence.

08 Nov 2024

ai-for-health computer-science artificial-intelligence

Towards Scalable Foundation Models for Digital Dermatology

Johns Hopkins University

University of Basel Lucerne University of Applied Sciences and Arts University Hospital of Basel

Fabian Gröger

The growing demand for accurate and equitable AI models in digital dermatology faces a significant challenge: the lack of diverse, high-quality labeled data. In this work, we investigate the potential of domain-specific foundation models for dermatology in addressing this challenge. We utilize self-supervised learning (SSL) techniques to pre-train models on a dataset of over 240,000 dermatological images from public and private collections. Our study considers several SSL methods and compares the resulting foundation models against domain-agnostic models like those pre-trained on ImageNet and state-of-the-art models such as MONET across 12 downstream tasks. Unlike previous research, we emphasize the development of smaller models that are more suitable for resource-limited clinical settings, facilitating easier adaptation to a broad range of use cases. Results show that models pre-trained in this work not only outperform general-purpose models but also approach the performance of models 50 times larger on clinically relevant diagnostic tasks. To promote further research in this direction, we publicly release both the training code and the foundation models, which can benefit clinicians in dermatological applications.

28 Oct 2024

ai-for-health computer-science computer-vision-and-pattern-recognition

Intrinsic Self-Supervision for Data Quality Audits

MIT Media Lab

University of Basel Lucerne University of Applied Sciences and Arts University Hospital of Basel

Fabian Gröger

Benchmark datasets in computer vision often contain off-topic images, near duplicates, and label errors, leading to inaccurate estimates of model performance. In this paper, we revisit the task of data cleaning and formalize it as either a ranking problem, which significantly reduces human inspection effort, or a scoring problem, which allows for automated decisions based on score distributions. We find that a specific combination of context-aware self-supervised representation learning and distance-based indicators is effective in finding issues without annotation biases. This methodology, which we call SelfClean, surpasses state-of-the-art performance in detecting off-topic images, near duplicates, and label errors within widely-used image datasets, such as ImageNet-1k, Food-101N, and STL-10, both for synthetic issues and real contamination. We apply the detailed method to multiple image benchmarks, identify up to 16% of issues, and confirm an improvement in evaluation reliability upon cleaning. The official implementation can be found at: this https URL.

22 Oct 2024

computer-science computation-and-language model-interpretation

Holmes: A Benchmark to Assess the Linguistic Competence of Language Models

MIT MIT-IBM Watson AI Lab Technical University of Darmstadt IBM Research AI Lucerne University of Applied Sciences and Arts IBM Research Europe - Ireland

Leshem Choshen

Yotam Perlitz

We introduce Holmes, a new benchmark designed to assess language models (LMs) linguistic competence - their unconscious understanding of linguistic phenomena. Specifically, we use classifier-based probing to examine LMs' internal representations regarding distinct linguistic phenomena (e.g., part-of-speech tagging). As a result, we meet recent calls to disentangle LMs' linguistic competence from other cognitive abilities, such as following instructions in prompting-based evaluations. Composing Holmes, we review over 270 probing studies and include more than 200 datasets to assess syntax, morphology, semantics, reasoning, and discourse phenomena. Analyzing over 50 LMs reveals that, aligned with known trends, their linguistic competence correlates with model size. However, surprisingly, model architecture and instruction tuning also significantly influence performance, particularly in morphology and syntax. Finally, we propose FlashHolmes, a streamlined version that reduces the computation load while maintaining high-ranking precision.

04 Jun 2021

computation statistics methodology

varycoef: An R Package for Gaussian Process-based Spatially Varying Coefficient Models

University of Zurich Lucerne University of Applied Sciences and Arts

Gaussian processes (GPs) are well-known tools for modeling dependent data with applications in spatial statistics, time series analysis, or econometrics. In this article, we present the R package varycoef that implements estimation, prediction, and variable selection of linear models with spatially varying coefficients (SVC) defined by GPs, so called GP-based SVC models. Such models offer a high degree of flexibility while being relatively easy to interpret. Using varycoef, we show versatile applications of (spatially) varying coefficient models on spatial and time series data. This includes model and coefficient estimation with predictions and variable selection. The package uses state-of-the-art computational statistics techniques like parallelization, model-based optimization, and covariance tapering. This allows the user to work with (S)VC models in a computationally efficient manner, i.e., model estimation on large data sets is possible in a feasible amount of time.

25 Sep 2024

computer-science computer-vision-security computer-vision-and-pattern-recognition

Hyperbolic Metric Learning for Visual Outlier Detection

University of Basel University of Plymouth Lucerne University of Applied Sciences and Arts

Fabian Gröger

Researchers developed HOD, a Hyperbolic metric learning framework for visual Out-Of-Distribution (OOD) detection that projects feature embeddings into Hyperbolic space. The framework demonstrated improved OOD detection performance, reducing the average False Positive Rate at 95% recall on CIFAR-100 from 49.8% to 28.5%, and maintained effectiveness even with low-dimensional embeddings.

07 Mar 2024

clustering-algorithms computer-science computation-and-language

Automating the Information Extraction from Semi-Structured Interview Transcripts

Lucerne University of Applied Sciences and Arts

This paper explores the development and application of an automated system designed to extract information from semi-structured interview transcripts. Given the labor-intensive nature of traditional qualitative analysis methods, such as coding, there exists a significant demand for tools that can facilitate the analysis process. Our research investigates various topic modeling techniques and concludes that the best model for analyzing interview texts is a combination of BERT embeddings and HDBSCAN clustering. We present a user-friendly software prototype that enables researchers, including those without programming skills, to efficiently process and visualize the thematic structure of interview data. This tool not only facilitates the initial stages of qualitative analysis but also offers insights into the interconnectedness of topics revealed, thereby enhancing the depth of qualitative analysis.

05 Nov 2024

bayesian-optimization computer-science machine-learning

Gaussian Process Boosting

Lucerne University of Applied Sciences and Arts

We introduce a novel way to combine boosting with Gaussian process and mixed effects models. This allows for relaxing, first, the zero or linearity assumption for the prior mean function in Gaussian process and grouped random effects models in a flexible non-parametric way and, second, the independence assumption made in most boosting algorithms. The former is advantageous for prediction accuracy and for avoiding model misspecifications. The latter is important for efficient learning of the fixed effects predictor function and for obtaining probabilistic predictions. Our proposed algorithm is also a novel solution for handling high-cardinality categorical variables in tree-boosting. In addition, we present an extension that scales to large data using a Vecchia approximation for the Gaussian process model relying on novel results for covariance parameter inference. We obtain increased prediction accuracy compared to existing approaches on multiple simulated and real-world data sets.

591

07 Jun 2020

high-energy-physics-phenomenology physics

MasterTwo: A Mathematica Package for the Automated Calculation of Two Loop Diagrams in the Standard Model and Beyond

ETH Zurich University of Zurich Lucerne University of Applied Sciences and Arts

The calculation of rare loop decays in the Standard Model of Particle Physics and its extensions is an extremely tedious work. The Mathematica package MasterTwo facilitates this task. It automatically calculates all loop integrals reducible to scalar integrals depending on up to two different masses independent of external momenta. MasterTwo consists of two sub packages, Fermions and Integrals. Whereas Fermions covers the standard Dirac Algebra, Integrals performs the Taylor expansion, partial fraction, tensor reduction and the integration of the thus achieved scalar integrals. The package works completely inside Mathematica and can be easily customised for both educational and research purposes.

23 Jul 2025

computer-science machine-learning model-interpretation

A Spatio-Temporal Machine Learning Model for Mortgage Credit Risk: Default Probabilities and Loan Portfolios

ETH Zurich

University of Basel Lucerne University of Applied Sciences and Arts

We introduce a novel machine learning model for credit risk by combining tree-boosting with a latent spatio-temporal Gaussian process model accounting for frailty correlation. This allows for modeling non-linearities and interactions among predictor variables in a flexible data-driven manner and for accounting for spatio-temporal variation that is not explained by observable predictor variables. We also show how estimation and prediction can be done in a computationally efficient manner. In an application to a large U.S. mortgage credit risk data set, we find that both predictive default probabilities for individual loans and predictive loan portfolio loss distributions obtained with our novel approach are more accurate compared to conventional independent linear hazard models and also linear spatio-temporal models. Using interpretability tools for machine learning models, we find that the likely reasons for this outperformance are strong interaction and non-linear effects in the predictor variables and the presence of spatio-temporal frailty effects.

05 Mar 2025

ai-for-health computer-science computer-vision-and-pattern-recognition

Implicit U-KAN2.0: Dynamic, Efficient and Interpretable Medical Image Segmentation

University of Illinois at Urbana-Champaign

University of Cambridge

Tsinghua University Lucerne University of Applied Sciences and Arts

Image segmentation is a fundamental task in both image analysis and medical applications. State-of-the-art methods predominantly rely on encoder-decoder architectures with a U-shaped design, commonly referred to as U-Net. Recent advancements integrating transformers and MLPs improve performance but still face key limitations, such as poor interpretability, difficulty handling intrinsic noise, and constrained expressiveness due to discrete layer structures, often lacking a solid theoretical foundation.In this work, we introduce Implicit U-KAN 2.0, a novel U-Net variant that adopts a two-phase encoder-decoder structure. In the SONO phase, we use a second-order neural ordinary differential equation (NODEs), called the SONO block, for a more efficient, expressive, and theoretically grounded modeling approach. In the SONO-MultiKAN phase, we integrate the second-order NODEs and MultiKAN layer as the core computational block to enhance interpretability and representation power. Our contributions are threefold. First, U-KAN 2.0 is an implicit deep neural network incorporating MultiKAN and second order NODEs, improving interpretability and performance while reducing computational costs. Second, we provide a theoretical analysis demonstrating that the approximation ability of the MultiKAN block is independent of the input dimension. Third, we conduct extensive experiments on a variety of 2D and a single 3D dataset, demonstrating that our model consistently outperforms existing segmentation networks.

16 May 2023

attention-mechanisms computer-science computer-vision-security

Blind Image Quality Assessment via Transformer Predicted Error Map and Perceptual Quality Token

Nanjing University of Aeronautics and Astronautics Lucerne University of Applied Sciences and Arts

Image quality assessment is a fundamental problem in the field of image processing, and due to the lack of reference images in most practical scenarios, no-reference image quality assessment (NR-IQA), has gained increasing attention recently. With the development of deep learning technology, many deep neural network-based NR-IQA methods have been developed, which try to learn the image quality based on the understanding of database information. Currently, Transformer has achieved remarkable progress in various vision tasks. Since the characteristics of the attention mechanism in Transformer fit the global perceptual impact of artifacts perceived by a human, Transformer is thus well suited for image quality assessment tasks. In this paper, we propose a Transformer based NR-IQA model using a predicted objective error map and perceptual quality token. Specifically, we firstly generate the predicted error map by pre-training one model consisting of a Transformer encoder and decoder, in which the objective difference between the distorted and the reference images is used as supervision. Then, we freeze the parameters of the pre-trained model and design another branch using the vision Transformer to extract the perceptual quality token for feature fusion with the predicted error map. Finally, the fused features are regressed to the final image quality score. Extensive experiments have shown that our proposed method outperforms the current state-of-the-art in both authentic and synthetic image databases. Moreover, the attentional map extracted by the perceptual quality token also does conform to the characteristics of the human visual system.

23 Sep 2025

computer-science computation-and-language domain-adaptation

A Pipeline to Assess Merging Methods via Behavior and Internals

Technical University of Darmstadt Lucerne University of Applied Sciences and Arts

Merging methods combine the weights of multiple language models (LMs) to leverage their capacities, such as for domain adaptation. While existing studies investigate merged models from a solely behavioral perspective, we offer the first comprehensive view by assessing and connecting their behavior and internals. We present a novel evaluation pipeline that first merges multiple parent LMs, and then evaluates the merged models in comparison to the initial ones based on their behavior on downstream tasks, like MMLU, and the internal encoded linguistic competence. We showcase this pipeline by assessing the merging of instruction fine-tuned with math- and code-adapted LMs from the Qwen2.5 family. Our results show that merging methods impacts behavior and internals differently. While the performance of merged models is typically between that of the two parent models, their encoded information about linguistic phenomena, particularly in morphology and syntax, can surpass the parent models. Moreover, we find weak ranking correlation between this behavior and internal evaluation. With our pipeline and initial results, we emphasize the need for more comprehensive evaluations of model merging methods to gain a faithful understanding of their capabilities and reliability, beyond potential superficial behavioral advances.

21 Sep 2023

computer-science machine-learning statistical-learning

Shedding Light on the Ageing of Extra Virgin Olive Oil: Probing the Impact of Temperature with Fluorescence Spectroscopy and Machine Learning Techniques

Zurich University of Applied Sciences Lucerne University of Applied Sciences and Arts IDSIA Dalle Molle Institute for Artificial Intelligence TOELT LLC

This work systematically investigates the oxidation of extra virgin olive oil (EVOO) under accelerated storage conditions with UV absorption and total fluorescence spectroscopy. With the large amount of data collected, it proposes a method to monitor the oil's quality based on machine learning applied to highly-aggregated data. EVOO is a high-quality vegetable oil that has earned worldwide reputation for its numerous health benefits and excellent taste. Despite its outstanding quality, EVOO degrades over time owing to oxidation, which can affect both its health qualities and flavour. Therefore, it is highly relevant to quantify the effects of oxidation on EVOO and develop methods to assess it that can be easily implemented under field conditions, rather than in specialized laboratories. The following study demonstrates that fluorescence spectroscopy has the capability to monitor the effect of oxidation and assess the quality of EVOO, even when the data are highly aggregated. It shows that complex laboratory equipment is not necessary to exploit fluorescence spectroscopy using the proposed method and that cost-effective solutions, which can be used in-field by non-scientists, could provide an easily-accessible assessment of the quality of EVOO.

08 Feb 2021

computer-science machine-learning ensemble-methods

KTBoost: Combined Kernel and Tree Boosting

Lucerne University of Applied Sciences and Arts

We introduce a novel boosting algorithm called `KTBoost' which combines kernel boosting and tree boosting. In each boosting iteration, the algorithm adds either a regression tree or reproducing kernel Hilbert space (RKHS) regression function to the ensemble of base learners. Intuitively, the idea is that discontinuous trees and continuous RKHS regression functions complement each other, and that this combination allows for better learning of functions that have parts with varying degrees of regularity such as discontinuities and smooth parts. We empirically show that KTBoost significantly outperforms both tree and kernel boosting in terms of predictive accuracy in a comparison on a wide array of data sets.

14 May 2025

ai-for-health computer-science machine-learning

Scalable Computations for Generalized Mixed Effects Models with Crossed Random Effects Using Krylov Subspace Methods

ETH Zurich

University of Basel Lucerne University of Applied Sciences and Arts

Mixed effects models are widely used for modeling data with hierarchically grouped structures and high-cardinality categorical predictor variables. However, for high-dimensional crossed random effects, current standard computations relying on Cholesky decompositions can become prohibitively slow. In this work, we present novel Krylov subspace-based methods that address several existing computational bottlenecks. Among other things, we theoretically analyze and empirically evaluate various preconditioners for the conjugate gradient and stochastic Lanczos quadrature methods, derive new convergence results, and develop computationally efficient methods for calculating predictive variances. Extensive experiments using simulated and real-world data sets show that our proposed methods scale much better than Cholesky-based computations, for instance, achieving a runtime reduction of approximately two orders of magnitudes for both estimation and prediction. Moreover, our software implementation is up to 10'000 times faster and more stable than state-of-the-art implementations such as lme4 and glmmTMB when using default settings. Our methods are implemented in the free C++ software library GPBoost with high-level Python and R packages.

10 Jan 2023

computer-science machine-learning quantitative-methods

Dataset of Fluorescence Spectra and Chemical Parameters of Olive Oils

University of Ljubljana University of Granada Politecnico di Torino Zurich University of Applied Sciences Lucerne University of Applied Sciences and Arts TOELT LLC

This dataset encompasses fluorescence spectra and chemical parameters of 24 olive oil samples from the 2019-2020 harvest provided by the producer Conde de Benalua, Granada, Spain. The oils are characterized by different qualities: 10 extra virgin olive oil (EVOO), 8 virgin olive oil (VOO), and 6 lampante olive oil (LOO) samples. For each sample, the dataset includes fluorescence spectra obtained with two excitation wavelengths, oil quality, and five chemical parameters necessary for the quality assessment of olive oil. The fluorescence spectra were obtained by exciting the samples at 365 nm and 395 nm under identical conditions. The dataset includes the values of the following chemical parameters for each olive oil sample: acidity, peroxide value, K270, K232, ethyl esters, and the quality of the samples (EVOO, VOO, or LOO). The dataset offers a unique possibility for researchers in food technology to develop machine learning models based on fluorescence data for the quality assessment of olive oil due to the availability of both spectroscopic and chemical data. The dataset can be used, for example, to predict one or multiple chemical parameters or to classify samples based on their quality from fluorescence spectra.

17 Mar 2025

computer-science computation-and-language

Aligned Probing: Relating Toxic Behavior and Model Internals

Saarland University Technical University of Darmstadt University of Hamburg Lucerne University of Applied Sciences and Arts

Andreas Waldis

We introduce aligned probing, a novel interpretability framework that aligns the behavior of language models (LMs), based on their outputs, and their internal representations (internals). Using this framework, we examine over 20 OLMo, Llama, and Mistral models, bridging behavioral and internal perspectives for toxicity for the first time. Our results show that LMs strongly encode information about the toxicity level of inputs and subsequent outputs, particularly in lower layers. Focusing on how unique LMs differ offers both correlative and causal evidence that they generate less toxic output when strongly encoding information about the input toxicity. We also highlight the heterogeneity of toxicity, as model behavior and internals vary across unique attributes such as Threat. Finally, four case studies analyzing detoxification, multi-prompt evaluations, model quantization, and pre-training dynamics underline the practical impact of aligned probing with further concrete insights. Our findings contribute to a more holistic understanding of LMs, both within and beyond the context of toxicity.

12 Nov 2022

computer-science machine-learning model-interpretation

On the High Symmetry of Neural Network Functions

Lucerne University of Applied Sciences and Arts TOELT LLC

Training neural networks means solving a high-dimensional optimization problem. Normally the goal is to minimize a loss function that depends on what is called the network function, or in other words the function that gives the network output given a certain input. This function depends on a large number of parameters, also known as weights, that depends on the network architecture. In general the goal of this optimization problem is to find the global minimum of the network function. In this paper it is discussed how due to how neural networks are designed, the neural network function present a very large symmetry in the parameter space. This work shows how the neural network function has a number of equivalent minima, in other words minima that give the same value for the loss function and the same exact output, that grows factorially with the number of neurons in each layer for feed forward neural network or with the number of filters in a convolutional neural networks. When the number of neurons and layers is large, the number of equivalent minima grows extremely fast. This will have of course consequences for the study of how neural networks converges to minima during training. This results is known, but in this paper for the first time a proper mathematical discussion is presented and an estimate of the number of equivalent minima is derived.

There are no more papers matching your filters at the moment.

Events

Personalize Your Feed

Install Browser Extension

We're hiring

alphaXiv

Explore

State of the Art

Sign In

Labs

Feedback

Dark mode

Reproducible Builds for Quantum Computing

CleanPatrick: A Benchmark for Image Data Cleaning

Towards Scalable Foundation Models for Digital Dermatology

Intrinsic Self-Supervision for Data Quality Audits

Holmes: A Benchmark to Assess the Linguistic Competence of Language Models

varycoef: An R Package for Gaussian Process-based Spatially Varying Coefficient Models

Hyperbolic Metric Learning for Visual Outlier Detection

Automating the Information Extraction from Semi-Structured Interview Transcripts

Gaussian Process Boosting

MasterTwo: A Mathematica Package for the Automated Calculation of Two Loop Diagrams in the Standard Model and Beyond

A Spatio-Temporal Machine Learning Model for Mortgage Credit Risk: Default Probabilities and Loan Portfolios

Implicit U-KAN2.0: Dynamic, Efficient and Interpretable Medical Image Segmentation

Blind Image Quality Assessment via Transformer Predicted Error Map and Perceptual Quality Token

A Pipeline to Assess Merging Methods via Behavior and Internals

Shedding Light on the Ageing of Extra Virgin Olive Oil: Probing the Impact of Temperature with Fluorescence Spectroscopy and Machine Learning Techniques

KTBoost: Combined Kernel and Tree Boosting

Scalable Computations for Generalized Mixed Effects Models with Crossed Random Effects Using Krylov Subspace Methods

Dataset of Fluorescence Spectra and Chemical Parameters of Olive Oils

Aligned Probing: Relating Toxic Behavior and Model Internals

On the High Symmetry of Neural Network Functions

Events

AI for Law

Personalize Your Feed