Expectation Propagation (EP) is a widely used message-passing algorithm that decomposes a global inference problem into multiple local ones. It approximates marginal distributions (beliefs) using intermediate functions (messages). While beliefs must be proper probability distributions that integrate to one, messages may have infinite integral values. In Gaussian-projected EP, such messages take a Gaussian form and appear as if they have "negative" variances. Although allowed within the EP framework, these negative-variance messages can impede algorithmic progress. In this paper, we investigate EP in linear models and analyze the relationship between the corresponding beliefs. Based on the analysis, we propose both non-persistent and persistent approaches that prevent the algorithm from being blocked by messages with infinite integral values. Furthermore, by examining the relationship between the EP messages in linear models, we develop an additional approach that avoids the occurrence of messages with infinite integral values.
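As a brief illustration of how such messages arise, here is the standard Gaussian EP update written in generic notation (not the paper's specific linear-model setup). The refined message is the ratio of the projected belief to the cavity distribution,
\[
m_{i \to x}(x) \propto \frac{q_{\mathrm{new}}(x)}{q_{\setminus i}(x)}, \qquad
q_{\mathrm{new}}(x) = \mathcal{N}(x \mid \mu, \sigma^2), \quad
q_{\setminus i}(x) = \mathcal{N}(x \mid \mu_{\setminus i}, \sigma_{\setminus i}^2),
\]
so in natural parameters the message precision is
\[
\tilde{\lambda}_i = \sigma^{-2} - \sigma_{\setminus i}^{-2},
\]
which is negative whenever the updated belief is wider than the cavity ($\sigma^2 > \sigma_{\setminus i}^2$). The resulting Gaussian-form message then has a "negative variance" and an infinite integral, even though the belief itself remains a proper distribution.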
Nonstationarity in spatial and spatio-temporal processes is ubiquitous in environmental datasets, but is not often addressed in practice, due to a scarcity of statistical software packages that implement nonstationary models. In this article, we introduce the R software package deepspat, which allows for modeling, fitting and prediction with nonstationary spatial and spatio-temporal models applied to Gaussian and extremes data. The nonstationary models in our package are constructed using a deep multi-layered deformation of the original spatial or spatio-temporal domain, and are straightforward to implement. Model parameters are estimated using gradient-based optimization of customized loss functions with tensorflow, which implements automatic differentiation. The functionalities of the package are illustrated through simulation studies and an application to Nepal temperature data.
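A sketch of the deformation idea in generic notation (the package's exact parameterization may differ): the observed process is modeled as a stationary process evaluated on a warped domain,
\[
Y(\mathbf{s}) = Z\big(\mathbf{f}(\mathbf{s})\big), \qquad
\mathbf{f} = \mathbf{f}_K \circ \cdots \circ \mathbf{f}_1,
\]
where $Z(\cdot)$ is a stationary (Gaussian or max-stable) process and each layer $\mathbf{f}_k$ is a simple parameterized warping unit. Nonstationarity on the original domain is induced entirely by the composed deformation, whose parameters are estimated jointly with those of $Z$ by gradient-based optimization.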
There has been significant progress in Bayesian inference based on sparsity-inducing (e.g., spike-and-slab and horseshoe-type) priors for high-dimensional regression models. The resulting posteriors, however, in general do not possess desirable frequentist properties, and the credible sets thus cannot serve as valid confidence sets even asymptotically. We introduce a novel debiasing approach that corrects the bias for the entire Bayesian posterior distribution. We establish a new Bernstein-von Mises theorem that guarantees the frequentist validity of the debiased posterior. We demonstrate the practical performance of our proposal through Monte Carlo simulations and two empirical applications in economics.
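For context, a sketch of the familiar point-estimate version of debiasing (the debiased lasso); the paper applies an analogous correction to the entire posterior rather than to a single estimator:
\[
\widehat{\beta}^{\,\mathrm{deb}} = \widehat{\beta} + \tfrac{1}{n}\,\widehat{\Theta}\, X^{\top}\big(y - X\widehat{\beta}\big),
\]
where $\widehat{\beta}$ is a sparsity-regularized estimator and $\widehat{\Theta}$ approximates the inverse of the Gram matrix $X^{\top}X/n$. The correction removes the shrinkage bias so that the resulting distribution can be asymptotically Gaussian and support valid confidence statements.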
We propose geodesic-based optimization methods on dually flat spaces, where the geometric structure of the parameter manifold is closely related to the form of the objective function. A primary application is maximum likelihood estimation in statistical models, especially exponential families, whose model manifolds are dually flat. We show that an m-geodesic update, which directly optimizes the log-likelihood, can theoretically reach the maximum likelihood estimator in a single step. In contrast, an e-geodesic update has a practical advantage in cases where the parameter space is geodesically complete, allowing optimization without explicitly handling parameter constraints. We establish the theoretical properties of the proposed methods and validate their effectiveness through numerical experiments.
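A minimal worked example, in standard exponential-family notation, of why a single m-geodesic step can suffice: for $p(x \mid \theta) = \exp\{\theta^{\top}t(x) - \psi(\theta)\}$ with expectation parameters $\eta = \nabla\psi(\theta) = \mathbb{E}_\theta[t(x)]$, the average log-likelihood of a sample $x_1,\dots,x_n$ has gradient
\[
\nabla_\theta \ell(\theta) = \bar{t} - \eta(\theta), \qquad \bar{t} = \tfrac{1}{n}\sum_{i=1}^n t(x_i),
\]
so the MLE is characterized by $\eta(\widehat{\theta}) = \bar{t}$. Since m-geodesics are straight lines in the $\eta$-coordinates, the update $\eta(\alpha) = (1-\alpha)\eta_0 + \alpha\,\bar{t}$ starting from the current point $\eta_0$ reaches the maximizer exactly at $\alpha = 1$, i.e., in one step (provided $\bar{t}$ lies in the interior of the mean-parameter space).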
Accurate and efficient surrogate modeling is essential for modern computational science, and there are a staggering number of emulation methods to choose from. With new methods being developed all the time, comparing the relative strengths and weaknesses of different methods remains a challenge due to inconsistent benchmarking practices and (sometimes) limited reproducibility and transparency. In this work, we present a large-scale, fully reproducible comparison of 2929 distinct emulators across 6060 canonical test functions and 4040 real emulation datasets. To facilitate rigorous, apples-to-apples comparisons, we introduce the R package \texttt{duqling}, which streamlines reproducible simulation studies using a consistent, simple syntax, and automatic internal scaling of inputs. This framework allows researchers to compare emulators in a unified environment and makes it possible to replicate or extend previous studies with minimal effort, even across different publications. Our results provide detailed empirical insight into the strengths and weaknesses of state-of-the-art emulators and offer guidance for both method developers and practitioners selecting a surrogate for new data. We discuss best practices for emulator comparison and highlight how \texttt{duqling} can accelerate research in emulator design and application.
The atmospheric boundary layer (ABL) plays a critical role in governing turbulent exchanges of momentum, heat, moisture, and trace gases between the Earth's surface and the free atmosphere, thereby influencing meteorological phenomena, air quality, and climate processes. Accurate and temporally continuous characterization of the ABL structure and height evolution is crucial for both scientific understanding and practical applications. High-resolution retrieval of the ABL height from vertical velocity measurements is challenging because the height is often estimated using empirical thresholds applied to profiles of vertical velocity variance or related turbulence diagnostics at each measurement altitude, an approach that can suffer from limited sampling and sensitivity to noise. To address these limitations, this work employs nonstationary Gaussian process (GP) modeling to more effectively capture the spatio-temporal dependence structure in the data, enabling high-quality -- and, if desired, high-resolution -- estimates of the ABL height without reliance on ad-hoc parameter tuning. By leveraging Vecchia approximations, the proposed method can be applied to large-scale datasets, and example applications using full-day vertical velocity profiles comprising approximately 55M measurements are presented.
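For reference, the Vecchia approximation that makes GP inference tractable at this scale, in generic form (the choice of ordering and conditioning sets is an implementation detail):
\[
p(y_1,\dots,y_n) = \prod_{i=1}^{n} p\big(y_i \mid y_1,\dots,y_{i-1}\big)
\;\approx\; \prod_{i=1}^{n} p\big(y_i \mid y_{c(i)}\big),
\]
where $c(i) \subset \{1,\dots,i-1\}$ is a small set of (typically nearest-neighbor) previously ordered observations, so each factor involves only an $|c(i)| \times |c(i)|$ covariance matrix and the overall cost grows linearly in $n$.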
Classification of functional data, where observations are curves or trajectories, poses unique challenges, particularly under severe class imbalance. Traditional Random Forest algorithms, while robust for tabular data, often fail to capture the intrinsic structure of functional observations and struggle with minority-class detection. This paper introduces Functional Random Forest with Adaptive Cost-Sensitive Splitting (FRF-ACS), a novel ensemble framework designed for imbalanced functional data classification. The proposed method leverages basis expansions and Functional Principal Component Analysis (FPCA) to represent curves efficiently, enabling trees to operate on low-dimensional functional features. To address imbalance, we incorporate a dynamic cost-sensitive splitting criterion that adjusts class weights locally at each node, combined with a hybrid sampling strategy integrating functional SMOTE and weighted bootstrapping. Additionally, curve-specific similarity metrics replace traditional Euclidean measures to preserve functional characteristics during leaf assignment. Extensive experiments on synthetic and real-world datasets, including biomedical signals and sensor trajectories, demonstrate that FRF-ACS significantly improves minority-class recall and overall predictive performance compared to existing functional classifiers and imbalance-handling techniques. This work provides a scalable, interpretable solution for high-dimensional functional data analysis in domains where minority-class detection is critical.
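A minimal, hypothetical Python baseline in the same spirit (principal-component scores of the curves followed by a class-weighted forest). This is not the FRF-ACS algorithm itself, which additionally uses functional SMOTE, node-local cost-sensitive splitting, and curve-specific similarity metrics; all names, settings, and the toy data below are illustrative.

    # Sketch: represent curves by leading principal-component scores,
    # then fit a cost-sensitive random forest on those scores.
    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.ensemble import RandomForestClassifier

    rng = np.random.default_rng(0)

    # Toy imbalanced functional data: 500 curves on a common grid of 100 points.
    t = np.linspace(0, 1, 100)
    n_major, n_minor = 450, 50
    X_major = np.sin(2 * np.pi * t) + 0.3 * rng.standard_normal((n_major, t.size))
    X_minor = np.sin(2 * np.pi * t + 0.5) + 0.3 * rng.standard_normal((n_minor, t.size))
    X = np.vstack([X_major, X_minor])
    y = np.array([0] * n_major + [1] * n_minor)

    # Step 1: low-dimensional functional representation
    # (PCA on the discretized curves as a stand-in for FPCA).
    scores = PCA(n_components=5).fit_transform(X)

    # Step 2: cost-sensitive ensemble; class_weight="balanced" upweights the minority class.
    clf = RandomForestClassifier(n_estimators=300, class_weight="balanced", random_state=0)
    clf.fit(scores, y)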
In this paper, we investigate a second-order stochastic algorithm for solving large-scale binary classification problems. We propose to make use of a new hybrid stochastic Newton algorithm that includes two weighted components in the Hessian matrix estimation: the first coming from the natural Hessian estimate and the second associated with the stochastic gradient information. Our motivation comes from the fact that both parts, evaluated at the true parameter of the logistic regression model, are equal to the Hessian matrix. This new formulation has several advantages, and it enables us to prove the almost sure convergence of our stochastic algorithm to the true parameter. Moreover, we significantly improve the almost sure rate of convergence to the Hessian matrix. Furthermore, we establish a central limit theorem for our hybrid stochastic Newton algorithm. Finally, we show a surprising result on the almost sure convergence of the cumulative excess risk.
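The identity motivating the hybrid estimate, written for logistic regression in standard notation (this is the well-known information matrix equality, not the paper's specific weighting scheme): with $\pi_\theta(x) = (1+e^{-x^{\top}\theta})^{-1}$, response $y \in \{0,1\}$, and per-observation log-likelihood $\ell(\theta; x, y)$,
\[
-\nabla^2_\theta\, \ell(\theta; x, y) = \pi_\theta(x)\big(1-\pi_\theta(x)\big)\, x x^{\top},
\qquad
\mathbb{E}\big[\nabla_\theta \ell\,\nabla_\theta \ell^{\top}\big]\Big|_{\theta=\theta^*} = \mathbb{E}\Big[\pi_{\theta^*}(x)\big(1-\pi_{\theta^*}(x)\big)\, x x^{\top}\Big],
\]
so at the true parameter $\theta^*$ the expected negative Hessian and the expected outer product of the score coincide, which is what justifies combining the two stochastic estimates of the Hessian.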
Reliable and timely dengue predictions provide actionable lead time for targeted vector control and clinical preparedness, reducing preventable disease and health-system costs in at-risk communities. Dengue forecasting often relies on site-specific covariates and entomological data, limiting generalizability in data-sparse settings. We propose a data-parsimonious (DP) framework based on the incidence versus cumulative cases (ICC) curve, extending it from a basic SIR to a two-population SEIR model for dengue. Our DP model uses only the target season's incidence time series and estimates only two parameters, reducing noise and computational burden. A Bayesian extension quantifies case-reporting and fitting uncertainty to produce calibrated predictive intervals. We evaluate the performance of the DP model on the 2022-2023 outbreaks in Florida, where standardized clinical tests and reporting support accurate case determination. The DP framework demonstrates competitive predictive performance at substantially lower computational cost than more elaborate models, making it suitable for dengue early detection where dense surveillance or long historical records are unavailable.
SVEMnet is an R package for fitting Self-Validated Ensemble Models (SVEM) with elastic-net base learners and for performing multi-response optimization in small-sample mixture--process design-of-experiments (DOE) studies with numeric, categorical, and mixture factors. SVEMnet wraps elastic-net and relaxed elastic-net models for Gaussian and binomial responses from glmnet in a fractional random-weight (FRW) resampling scheme with anti-correlated train/validation weights; penalties are selected by validation-weighted AIC- and BIC-type criteria, and predictions are averaged across replicates to stabilize fits near the interpolation boundary. In addition to the core SVEM engine, the package provides deterministic high-order formula expansion, a permutation-based whole-model test heuristic, and a mixture-constrained random-search optimizer that combines Derringer--Suich desirability functions, bootstrap-based uncertainty summaries, and optional mean-level specification-limit probabilities to generate scored candidate tables and diverse exploitation and exploration medoids for sequential fit--score--run--refit workflows. A simulated lipid nanoparticle (LNP) formulation study illustrates these tools in a small-sample mixture--process DOE setting, and simulation experiments based on sparse quadratic response surfaces benchmark SVEMnet against repeated cross-validated elastic-net baselines.
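A sketch of the fractional random-weight idea with anti-correlated train/validation weights, in plain numpy; the weight construction shown here is an assumption for illustration, and the exact distribution and selection criteria used by SVEMnet may differ.

    # Sketch: anti-correlated fractional random weights for one SVEM-style replicate.
    # A single uniform draw per observation yields a training weight and a validation
    # weight that are counter-monotonic: observations downweighted for training are
    # upweighted for validation, and vice versa. Construction is illustrative only.
    import numpy as np

    rng = np.random.default_rng(1)
    n = 40                      # small-sample DOE setting
    u = rng.uniform(size=n)
    w_train = -np.log(u)        # exponentially distributed fractional weights
    w_valid = -np.log(1.0 - u)  # anti-correlated counterpart

    # Each candidate penalty would be fit with w_train and scored with a
    # validation-weighted criterion; predictions are averaged over replicates.
    print(np.corrcoef(w_train, w_valid)[0, 1])  # negative (about -0.64 in expectation)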
This work introduces COMET, a robust convex clustering method that integrates the Median of Means (MoM) estimator and a clipped fusion penalty to enhance resilience against outliers and noise. It consistently achieves higher clustering accuracy and more stable cluster number estimation compared to existing methods, particularly in high-dimensional and contaminated datasets.
We develop a unified statistical framework for softmax-gated Gaussian mixture of experts (SGMoE) that addresses three long-standing obstacles in parameter estimation and model selection: (i) non-identifiability of gating parameters up to common translations, (ii) intrinsic gate-expert interactions that induce coupled differential relations in the likelihood, and (iii) the tight numerator-denominator coupling in the softmax-induced conditional density. Our approach introduces Voronoi-type loss functions aligned with the gate-partition geometry and establishes finite-sample convergence rates for the maximum likelihood estimator (MLE). In over-specified models, we reveal a link between the MLE's convergence rate and the solvability of an associated system of polynomial equations characterizing near-nonidentifiable directions. For model selection, we adapt dendrograms of mixing measures to SGMoE, yielding a consistent, sweep-free selector of the number of experts that attains pointwise-optimal parameter rates under overfitting while avoiding multi-size training. Simulations on synthetic data corroborate the theory, accurately recovering the expert count and achieving the predicted rates for parameter estimation while closely approximating the regression function. Under model misspecification (e.g., $\epsilon$-contamination), the dendrogram selection criterion is robust, recovering the true number of mixture components, while the Akaike information criterion, the Bayesian information criterion, and the integrated completed likelihood tend to overselect as sample size grows. On a maize proteomics dataset of drought-responsive traits, our dendrogram-guided SGMoE selects two experts, exposes a clear mixing-measure hierarchy, stabilizes the likelihood early, and yields interpretable genotype-phenotype maps, outperforming standard criteria without multi-size training.
We propose to perform mean-field variational inference (MFVI) in a rotated coordinate system that reduces correlations between variables. The rotation is determined by principal component analysis (PCA) of a cross-covariance matrix involving the target's score function. Compared with standard MFVI along the original axes, MFVI in this rotated system often yields substantially more accurate approximations at negligible additional cost. Each such step defines a rotation and a coordinatewise map that together move the target closer to a Gaussian, and iterating the procedure yields a sequence of transformations that progressively Gaussianizes the target. The resulting algorithm provides a computationally efficient way to construct flow-like transport maps: it requires only MFVI subproblems, avoids large-scale optimization, and yields transformations that are easy to invert and evaluate. In Bayesian inference tasks, we demonstrate that the proposed method achieves higher accuracy than standard MFVI, while maintaining much lower computational cost than conventional normalizing flows.
The introduction of machine learning methods has led to significant advances in automation, optimization, and discoveries in various fields of science and technology. However, their widespread application faces a fundamental limitation: the transfer of models between data domains generally lacks a rigorous mathematical justification. The key problem is the lack of formal criteria to guarantee that a model trained on one type of data will retain its properties on another. This paper proposes a solution to this problem by formalizing the concept of analogy between data sets and models using first-order logic and Hoare logic. We formulate and rigorously prove a theorem that sets out the necessary and sufficient conditions for analogy in the task of knowledge transfer between machine learning models. Practical verification of the analogy theorem on synthetic data generated using the Monte Carlo method, as well as on the MNIST and USPS datasets, allows us to achieve F1 scores of 0.84 and 0.88 for convolutional neural networks and random forests, respectively. The proposed approach not only allows us to justify the correctness of transfer between domains but also provides tools for comparing the applicability of models to different types of data. The main contribution of the work is a rigorous formalization of analogy at the level of program logic, providing verifiable guarantees of the correctness of knowledge transfer, which opens new opportunities for both theoretical research and the practical use of machine learning models in previously inaccessible areas.
Markov chain Monte Carlo (MCMC) methods are foundational algorithms for Bayesian inference and probabilistic modeling. However, most MCMC algorithms are inherently sequential and their time complexity scales linearly with the sequence length. Previous work on adapting MCMC to modern hardware has therefore focused on running many independent chains in parallel. Here, we take an alternative approach: we propose algorithms to evaluate MCMC samplers in parallel across the chain length. To do this, we build on recent methods for parallel evaluation of nonlinear recursions that formulate the state sequence as a solution to a fixed-point problem and solve for the fixed-point using a parallel form of Newton's method. We show how this approach can be used to parallelize Gibbs, Metropolis-adjusted Langevin, and Hamiltonian Monte Carlo sampling across the sequence length. In several examples, we demonstrate the simulation of up to hundreds of thousands of MCMC samples with only tens of parallel Newton iterations. Additionally, we develop two new parallel quasi-Newton methods to evaluate nonlinear recursions with lower memory costs and reduced runtime. We find that the proposed parallel algorithms accelerate MCMC sampling across multiple examples, in some cases by more than an order of magnitude compared to sequential evaluation.
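In generic notation (not the paper's exact implementation), the fixed-point formulation these methods build on: writing the sampler as a recursion $s_t = f(s_{t-1}, u_t)$ with $u_t$ the pre-generated randomness, the whole trajectory solves $F(s_{1:T}) = 0$ with $F_t(s_{1:T}) = s_t - f(s_{t-1}, u_t)$. Newton's method iterates
\[
s^{(k+1)} = s^{(k)} - J_F\big(s^{(k)}\big)^{-1} F\big(s^{(k)}\big),
\]
and because $J_F$ is block lower bidiagonal, each Newton step is equivalent to the affine recursion
\[
s_t^{(k+1)} = f\big(s_{t-1}^{(k)}, u_t\big) + J_t^{(k)}\big(s_{t-1}^{(k+1)} - s_{t-1}^{(k)}\big),
\qquad
J_t^{(k)} = \frac{\partial f}{\partial s_{t-1}}\Big|_{\big(s_{t-1}^{(k)},\, u_t\big)},
\]
which can be evaluated for all $t$ simultaneously with a parallel associative scan, giving $O(\log T)$ depth per Newton iteration instead of $T$ sequential steps.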
This paper investigates a perceptron, a simple neural network model, with ReLU activation and a ridge-regularized mean squared error (RR-MSE). Our approach leverages the fact that the RR-MSE of a ReLU perceptron is piecewise polynomial, enabling a systematic analysis using tools from computational algebra. In particular, we develop a Divide-Enumerate-Merge strategy that exhaustively enumerates all local minima of the RR-MSE. By virtue of the algebraic formulation, our approach can identify not only the typical zero-dimensional minima (i.e., isolated points) obtained by numerical optimization, but also higher-dimensional minima (i.e., connected sets such as curves, surfaces, or hypersurfaces). Although computational algebraic methods are highly demanding for perceptrons of practical size, as a proof of concept, we apply the proposed approach to minimal perceptrons with a few hidden units.
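To make the key structural fact concrete, here is the objective in generic notation (the paper's exact architecture and regularization may differ slightly): for a perceptron $\hat{y}(x) = \sum_{j=1}^{h} v_j \,\mathrm{ReLU}(w_j^{\top}x)$ and data $(x_i, y_i)_{i=1}^n$, the objective
\[
L(w, v) \;=\; \frac{1}{n}\sum_{i=1}^{n}\Big(\sum_{j=1}^{h} v_j \max\{0, w_j^{\top}x_i\} - y_i\Big)^2 \;+\; \lambda\big(\|w\|^2 + \|v\|^2\big)
\]
is polynomial on each region of parameter space determined by a fixed activation pattern $\big(\mathbb{1}\{w_j^{\top}x_i > 0\}\big)_{i,j}$, since every $\max$ term is then either $0$ or $w_j^{\top}x_i$. Enumerating the finitely many patterns, solving the polynomial stationarity conditions on each region, and merging the results is what makes an exhaustive algebraic treatment of all local minima possible.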
RANSAC+, an enhanced algorithm from the University of Michigan, effectively recovers low-dimensional subspaces from data contaminated with both Gaussian noise and adversarial outliers. This method offers improved computational efficiency and automatically estimates the subspace dimension, addressing limitations of classic RANSAC.
These lecture notes cover advanced topics in linear regression, with an in-depth exploration of the existence, uniqueness, relations, computation, and non-asymptotic properties of the most prominent estimators in this setting. The covered estimators include least squares, ridgeless, ridge, and lasso. The content follows a proposition-proof structure, making it suitable for students seeking a formal and rigorous understanding of the statistical theory underlying machine learning methods.
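For concreteness, the four estimators in standard notation for a design matrix $X \in \mathbb{R}^{n \times p}$ and response $y \in \mathbb{R}^n$:
\[
\widehat{\beta}^{\mathrm{LS}} \in \arg\min_{\beta}\ \|y - X\beta\|_2^2, \qquad
\widehat{\beta}^{\mathrm{ridgeless}} = X^{+}y = \lim_{\lambda \to 0^{+}} \widehat{\beta}^{\mathrm{ridge}}_{\lambda},
\]
\[
\widehat{\beta}^{\mathrm{ridge}}_{\lambda} = \big(X^{\top}X + \lambda I\big)^{-1}X^{\top}y, \qquad
\widehat{\beta}^{\mathrm{lasso}}_{\lambda} \in \arg\min_{\beta}\ \tfrac{1}{2}\|y - X\beta\|_2^2 + \lambda\|\beta\|_1,
\]
where $X^{+}$ is the Moore-Penrose pseudoinverse, so the ridgeless estimator is the minimum-$\ell_2$-norm least-squares solution (and an interpolant when $p > n$ and $X$ has full row rank).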
This research quantifies how linear preconditioning impacts the condition number in Markov Chain Monte Carlo, demonstrating its effectiveness for target distributions with additive or multiplicative Hessian structures and showing it can provably accelerate Random Walk Metropolis. Crucially, it finds that diagonal preconditioning can sometimes increase the condition number and degrade sampler performance, while optimal preconditioning reduces HMC computational cost by a factor of approximately $\sqrt{\kappa}$.
Researchers at Google DeepMind provide practical guidance on accelerating Markov Chain Monte Carlo (MCMC) methods using modern parallel hardware like GPUs and TPUs, integrated with deep learning software frameworks such as JAX. The work details how to leverage chain, data, and model parallelism, achieving up to 12x speedups over CPUs while addressing critical numerical and hardware-specific implementation challenges.
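A minimal Python sketch of chain parallelism using the JAX library: a single random-walk Metropolis step is vectorized over many chains with jax.vmap and compiled with jax.jit. The target, step size, and function names are illustrative placeholders, not the paper's code.

    # Sketch: many independent RWM chains advanced in lockstep on an accelerator.
    import jax
    import jax.numpy as jnp

    def log_prob(x):
        # Placeholder target: standard Gaussian in d dimensions.
        return -0.5 * jnp.sum(x**2)

    def rwm_step(key, x, step_size=0.5):
        key_prop, key_acc = jax.random.split(key)
        prop = x + step_size * jax.random.normal(key_prop, x.shape)
        log_ratio = log_prob(prop) - log_prob(x)
        accept = jnp.log(jax.random.uniform(key_acc)) < log_ratio
        return jnp.where(accept, prop, x)

    n_chains, d = 4096, 10
    keys = jax.random.split(jax.random.PRNGKey(0), n_chains)
    x0 = jnp.zeros((n_chains, d))

    # vmap maps the single-chain kernel over the leading chain axis; jit compiles
    # the batched kernel into one fused accelerator program.
    batched_step = jax.jit(jax.vmap(rwm_step))
    x1 = batched_step(keys, x0)
    print(x1.shape)  # (4096, 10)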