Forward gradients have recently been introduced to bypass backpropagation in automatic differentiation while retaining unbiased estimators of true gradients. We derive an optimality condition for obtaining the best approximating forward gradients, which leads us to mathematical insights suggesting that optimization in high dimension is challenging with forward gradients. Our extensive experiments on test functions support this claim.
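As a minimal illustration of the forward-gradient estimator discussed above: sample a random tangent v, compute the directional derivative f'(x; v), and return (f'(x; v))·v, which is unbiased for the true gradient. The toy quadratic objective and the finite-difference stand-in for forward-mode automatic differentiation are illustrative assumptions, not the paper's code.

```python
import numpy as np

def f(x):
    return np.sum(x ** 2)  # toy objective; exact gradient is 2x

def forward_gradient(f, x, rng, eps=1e-6):
    v = rng.standard_normal(x.shape)                           # tangent v ~ N(0, I)
    dirderiv = (f(x + eps * v) - f(x - eps * v)) / (2 * eps)   # ~ <grad f(x), v>
    return dirderiv * v                                        # unbiased estimate of grad f(x)

rng = np.random.default_rng(0)
x = np.ones(5)
est = np.mean([forward_gradient(f, x, rng) for _ in range(10_000)], axis=0)
print(est)        # averages toward the exact gradient...
print(2 * x)      # ...which is 2x for this toy function
```

Because E[v vᵀ] = I for Gaussian tangents, the estimator is unbiased, but its variance grows with the dimension, which is the difficulty the abstract alludes to.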
We present a new experimental facility to investigate the nucleation and growth of liquid droplets and ice particles under controlled conditions and characterize processes relevant to cloud microphysics: the rapid expansion aerosol chamber (REACh). REACh is an intermediate-size chamber (~0.14 m$^3$) combining the principle of an expansion chamber with the ability to probe the influence of turbulent flows. Nucleation is achieved via a sudden pressure drop accompanied by a temperature drop, which can cause humid air to condense into a cloud of droplets under the appropriate thermodynamic conditions. REACh features tight control and monitoring of the initial saturation ratio of water vapor, identity and concentration of seeding aerosol particles, temperature, pressure, and air flow mixing, together with high-speed, real-time measurements of aerosol and droplet size and number. Here, we demonstrate that the minimum temperature reached during each expansion can be reasonably described by the thermodynamics of dry or moist adiabats, for a wide range of initial relative humidity. The size and number of droplets formed, and the overall lifetime of the cloud, are characterized as a function of the aerosol concentration and initial water vapor saturation ratio. The total droplet concentration scales linearly with the seeding aerosol concentration, suggesting that all injected aerosol particles serve as condensation nuclei. While the total number of droplets formed increases with aerosol concentration, the mean droplet size decreases with the concentration of seeding aerosols as a result of competition for the available water vapor. Theoretical considerations provide a quantitative prediction for the mean droplet size for a wide range of conditions.
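A back-of-the-envelope sketch of the dry-adiabat estimate mentioned above (the dry case only): for an adiabatic expansion from pressure p1 to p2, T2 = T1·(p2/p1)^(R/cp). The initial temperature and pressure drop below are illustrative numbers, not REACh data.

```python
R_OVER_CP = 0.286  # R/cp for dry air

def dry_adiabat_temperature(T1_kelvin, p1, p2):
    """Temperature after an adiabatic pressure drop from p1 to p2 (same pressure units)."""
    return T1_kelvin * (p2 / p1) ** R_OVER_CP

T1 = 293.15                                                  # 20 degC before expansion (assumed)
print(dry_adiabat_temperature(T1, p1=1000.0, p2=800.0))      # roughly 275 K after a 20% pressure drop
```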
We investigate the high-dimensional properties of robust regression estimators in the presence of heavy-tailed contamination of both the covariates and response functions. In particular, we provide a sharp asymptotic characterisation of M-estimators trained on a family of elliptical covariate and noise data distributions including cases where second and higher moments do not exist. We show that, despite being consistent, the Huber loss with optimally tuned location parameter $\delta$ is suboptimal in the high-dimensional regime in the presence of heavy-tailed noise, highlighting the necessity of further regularisation to achieve optimal performance. This result also uncovers the existence of a transition in $\delta$ as a function of the sample complexity and contamination. Moreover, we derive the decay rates for the excess risk of ridge regression. We show that, while it is both optimal and universal for covariate distributions with finite second moment, its decay rate can be considerably faster when the covariates' second moment does not exist. Finally, we show that our formulas readily generalise to a richer family of models and data distributions, such as generalised linear estimation with arbitrary convex regularisation trained on mixture models.
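For readers who want a concrete instance of the estimator being analysed, here is a minimal sketch of a ridge-regularised Huber M-estimator fit by plain gradient descent on synthetic heavy-tailed data. The data-generating model, the values of delta and lambda, and the optimiser are illustrative assumptions, not the paper's setting.

```python
import numpy as np

def huber_grad(r, delta):
    # derivative of the Huber loss rho_delta at residual r
    return np.clip(r, -delta, delta)

def fit_huber_ridge(X, y, delta=1.0, lam=0.1, lr=0.01, n_iter=2000):
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(n_iter):
        r = y - X @ w
        grad = -X.T @ huber_grad(r, delta) / n + lam * w   # grad of (1/n) sum rho(r_i) + (lam/2)||w||^2
        w -= lr * grad
    return w

rng = np.random.default_rng(0)
X = rng.standard_normal((500, 5))
w_true = np.array([1.0, -2.0, 0.5, 0.0, 3.0])
y = X @ w_true + rng.standard_t(df=2, size=500)            # heavy-tailed (infinite-variance) noise
print(fit_huber_ridge(X, y))
```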
Many of the ordinal regression models that have been proposed in the literature can be seen as methods that minimize a convex surrogate of the zero-one, absolute, or squared loss functions. A key property that allows one to study the statistical implications of such approximations is that of Fisher consistency. Fisher consistency is a desirable property for surrogate loss functions and implies that in the population setting, i.e., if the probability distribution that generates the data were available, then optimization of the surrogate would yield the best possible model. In this paper we will characterize the Fisher consistency of a rich family of surrogate loss functions used in the context of ordinal regression, including support vector ordinal regression, ORBoosting and least absolute deviation. We will see that, for a family of surrogate loss functions that subsumes support vector ordinal regression and ORBoosting, consistency can be fully characterized by the derivative of a real-valued function at zero, as happens for convex margin-based surrogates in binary classification. We also derive excess risk bounds for a surrogate of the absolute error that generalize existing risk bounds for binary classification. Finally, our analysis suggests a novel surrogate of the squared error loss. We compare this novel surrogate with competing approaches on 9 different datasets. Our method proves to be highly competitive in practice, outperforming the least squares loss on 7 out of 9 datasets.
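To make the kind of surrogate in this family concrete, here is a minimal sketch of an "all thresholds" hinge surrogate in the style of support vector ordinal regression. The threshold values and scores below are illustrative assumptions, not the paper's experiments.

```python
import numpy as np

def all_threshold_hinge(score, label, thetas):
    """Surrogate loss for an ordinal label in {0, ..., K-1} given a real-valued score.

    For each threshold j the score should fall on the correct side of thetas[j]:
    below it if label <= j, above it if label > j; violations pay a hinge penalty.
    """
    loss = 0.0
    for j, theta in enumerate(thetas):
        sign = 1.0 if label > j else -1.0      # which side of the threshold the score should be on
        loss += max(0.0, 1.0 - sign * (score - theta))
    return loss

thetas = np.array([-1.0, 0.0, 1.5])            # K = 4 ordered classes
print(all_threshold_hinge(score=0.7, label=2, thetas=thetas))
```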
University of Cambridge and UC Berkeley researchers formalize and extend Sun's construction for Diophantine equations over the integers using the Isabelle/HOL theorem prover. They establish the first non-trivial universal pair (11, 1.16258·10^63) for Diophantine equations over the integers, proving that any such equation can be reduced to an equivalent equation with at most 11 variables, and develop custom metaprogramming tools and formal libraries that extend Isabelle's multivariate polynomial capabilities.
In this work, we study the problem of detecting periodic trends in strings. While detecting exact periodicity has been studied extensively, real-world data is often noisy, where small deviations or mismatches occur between repetitions. This work focuses on a generalized approach to period detection that efficiently handles noise. Given a string $S$ of length $n$, the task is to identify integers $p$ such that the prefix and the suffix of $S$, each of length $n-p+1$, are similar under a given distance measure. Ergün et al. [APPROX-RANDOM 2017] were the first to study this problem in the streaming model under the Hamming distance. In this work, we combine, in a non-trivial way, the Hamming distance sketch of Clifford et al. [SODA 2019] and the structural description of the $k$-mismatch occurrences of a pattern in a text by Charalampopoulos et al. [FOCS 2020] to present a more efficient streaming algorithm for period detection under the Hamming distance. As a corollary, we derive a streaming algorithm for detecting periods of strings which may contain wildcards, a special symbol that matches any character of the alphabet. Our algorithm is not only more efficient than that of Ergün et al. [TCS 2020], but it also operates without their assumption that the string must be free of wildcards in its final characters. Additionally, we introduce the first two-pass streaming algorithm for computing periods under the edit distance by leveraging and extending Bhattacharya and Koucký's grammar decomposition technique [STOC 2023].
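As a simple offline reference point for the task described above (not the streaming algorithm): for each shift p, count the Hamming mismatches between S and its own shift by p, and report p as an approximate period when the count is at most k. The example string and k are illustrative assumptions.

```python
def approximate_periods(S, k):
    """Return all p such that S[i] == S[i + p] for all i, up to at most k mismatches."""
    n = len(S)
    periods = []
    for p in range(1, n + 1):
        mismatches = sum(1 for i in range(n - p) if S[i] != S[i + p])
        if mismatches <= k:
            periods.append(p)
    return periods

print(approximate_periods("abcabcabcabd", k=1))   # includes 3, 6 and 9 despite the trailing 'd'
```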
Given a probability measure with density, Fermat distances and density-driven metrics are conformal transformation of the Euclidean metric that shrink distances in high density areas and enlarge distances in low density areas. Although they have been widely studied and have shown to be useful in various machine learning tasks, they are limited to measures with density (with respect to Lebesgue measure, or volume form on manifold). In this paper, by replacing the density with the Distance-to-Measure, we introduce a new metric, the Fermat Distance-to-Measure, defined for any probability measure in R^d. We derive strong stability properties for the Fermat Distance-to-Measure with respect to the measure and propose an estimator from random sampling of the measure, featuring an explicit bound on its convergence speed.
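For context, here is a minimal sketch of the empirical Distance-to-Measure that the Fermat Distance-to-Measure builds on: the root-mean-squared distance from a query point to its k = m·n nearest sample points. The sample, the mass parameter m, and the query points are illustrative assumptions.

```python
import numpy as np

def dtm(query, sample, m=0.1):
    """Empirical distance-to-measure of `query` w.r.t. the empirical measure on `sample`."""
    k = max(1, int(np.ceil(m * len(sample))))
    sq_dists = np.sum((sample - query) ** 2, axis=1)
    nearest = np.sort(sq_dists)[:k]
    return np.sqrt(np.mean(nearest))

rng = np.random.default_rng(0)
sample = rng.standard_normal((1000, 2))        # points drawn from the measure
print(dtm(np.array([0.0, 0.0]), sample))       # small: near the bulk of the mass
print(dtm(np.array([5.0, 5.0]), sample))       # large: far from the mass
```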
This thesis concerns the application of statistical physics methods and inference to sparse linear estimation problems. The main tools are graphical models and the approximate message-passing algorithm, together with the cavity method. We also use the replica method from the statistical physics of disordered systems, which allows one to associate with the studied problems a cost function, referred to in physics as the free-entropy potential. It allows one to predict the different phases of typical complexity of the problem as a function of external parameters such as the noise level or the number of measurements one has about the signal: inference can typically be easy, hard, or impossible. We will see that the hard phase corresponds to a regime where the actual solution coexists with another, unwanted solution of the message-passing equations; in this phase the unwanted solution is a metastable state, not the true equilibrium one. This phenomenon can be linked to supercooled water blocked in the liquid state below its freezing temperature. We use a method that overcomes this metastability by mimicking the strategy adopted by nature itself for supercooled water: nucleation and spatial coupling. In supercooled water, a weak localized perturbation is enough to create a crystal nucleus that propagates through the whole medium thanks to the physical couplings between nearby atoms. The same process helps the algorithm find the signal, thanks to the introduction of a nucleus containing local information about the signal, which then spreads as a "reconstruction wave" similar to the crystal in the water. After an introduction to statistical inference and sparse linear estimation, we introduce the necessary tools, and then move to applications of these notions to signal processing and coding theory problems.
We present a light-weight body-terrain clearance evaluation algorithm for the automated path planning of NASA's Mars 2020 rover. Extraterrestrial path planning is challenging due to the combination of terrain roughness and severe limitations in computational resources. Path planning on cluttered and/or uneven terrains requires repeated safety checks on all the candidate paths at small intervals. Predicting the future rover state requires simulating the vehicle settling on the terrain, which involves an inverse-kinematics problem with iterative nonlinear optimization under geometric constraints. However, such expensive computation is intractable for slow spacecraft computers, such as the RAD750 used by the Curiosity Mars rover and the upcoming Mars 2020 rover. We propose the Approximate Clearance Evaluation (ACE) algorithm, which obtains conservative bounds on vehicle clearance, attitude, and suspension angles without iterative computation. It obtains those bounds by estimating the lowest and highest heights that each wheel may reach given the underlying terrain, and calculating the worst-case vehicle configuration associated with those extreme wheel heights. The bounds are guaranteed to be conservative, hence ensuring vehicle safety during autonomous navigation. ACE is planned to be used as part of the new onboard path planner of the Mars 2020 rover. This paper describes the algorithm in detail and validates our claims of conservatism and fast computation through experiments.
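A heavily simplified, hypothetical sketch of the idea behind ACE: propagate per-wheel terrain-height intervals through a toy suspension model to obtain a conservative lower bound on body-ground clearance, with no iterative settling computation. The averaging suspension model, the function names, and the numbers are illustrative assumptions, not the Mars 2020 flight algorithm.

```python
def clearance_lower_bound(wheel_height_intervals, nominal_ride_height, belly_terrain_max):
    """Conservative (worst-case) clearance bound from wheel-height intervals.

    wheel_height_intervals: list of (lo, hi) terrain heights each wheel may reach.
    nominal_ride_height: body height above the mean wheel contact height (toy suspension model).
    belly_terrain_max: highest terrain point under the body footprint.
    """
    # Worst case for clearance: every wheel settles as low as its interval allows.
    lows = [lo for lo, _ in wheel_height_intervals]
    body_height_lower = sum(lows) / len(lows) + nominal_ride_height
    return body_height_lower - belly_terrain_max

wheels = [(0.00, 0.10), (0.02, 0.12), (-0.05, 0.05), (0.00, 0.08)]   # metres
print(clearance_lower_bound(wheels, nominal_ride_height=0.60, belly_terrain_max=0.35))
```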
The possibility that a classical space-time and quantum matter cohabit at the deepest level, i.e. the possibility of having a fundamental and not phenomenological semiclassical gravity, is often disregarded for lack of a good candidate theory. The standard semiclassical theory suffers from fundamental inconsistencies (e.g., Schrödinger-cat sources, faster-than-light communication, and violation of the Born rule) which can only be ignored in simple typical situations. We harness the power of spontaneous localization models, historically constructed to solve the measurement problem in quantum mechanics, to build a consistent theory of (stochastic) semiclassical gravity in the Newtonian limit. Our model makes quantitative and potentially testable predictions: we recover the Newtonian pair potential up to a short-distance cut-off (hence we predict no one-particle self-interaction) and uncover an additional gravitational decoherence term which depends on the specifics of the underlying spontaneous localization model considered. We hint at a possible program to go past the Newtonian limit, towards a consistent general relativistic semiclassical gravity.
Palindromes are non-empty strings that read the same forward and backward. The problem of recognizing strings that can be represented as the concatenation of even-length palindromes, the concatenation of palindromes of length greater than one, and the concatenation of exactly $k$ palindromes was introduced in the seminal paper of Knuth, Morris, and Pratt [SIAM J. Comput., 1977]. In this work, we study the problem of recognizing so-called $k$-palindromic strings, which can be represented as the concatenation of exactly $k$ palindromes. It was shown that the problem is solvable in linear space and time [Rubinchik and Schur, MFCS'2020]. We aim to develop a sublinear-space solution, and show the following results: (1) First, we show a structural characterization of the set of all $k$-palindromic prefixes of a string by representing it as a union of a small number of highly structured string sets, called affine prefix sets. We show that the size of this representation is of the right asymptotic form by constructing an almost matching lower bound. (2) Secondly, we derive a read-only algorithm that, given a string $T$ of length $n$ and an integer $k$, computes a compact representation of the $i$-palindromic prefixes of $T$, for all $1 \le i \le k$. (3) Finally, we also give a read-only algorithm for computing the palindromic length of $T$, which is the smallest $\ell$ such that $T$ is $\ell$-palindromic, given that $\ell \le k$. The algorithms use $\mathcal{O}(n \cdot 6^{k^2} \cdot \log^k n)$ time and $\mathcal{O}(6^{k^2} \cdot \log^k n)$ space. Our work is the first step toward a streaming algorithm for the recognition of $k$-palindromic prefixes.
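A naive baseline for the palindromic-length problem discussed above (not the sublinear-space algorithm): a dynamic program where dp[i] is the least number of palindromes whose concatenation equals the prefix T[:i]. The example strings are illustrative.

```python
def palindromic_length(T):
    n = len(T)
    INF = float("inf")
    dp = [INF] * (n + 1)
    dp[0] = 0
    for i in range(1, n + 1):
        for j in range(i):
            piece = T[j:i]
            if piece == piece[::-1]:          # T[j:i] is a palindrome
                dp[i] = min(dp[i], dp[j] + 1)
    return dp[n]

print(palindromic_length("abaaba"))   # 1: the whole string is a palindrome
print(palindromic_length("abaca"))    # 3: e.g. "aba" + "c" + "a"
```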
In the Euclidean $k$-traveling salesman problem ($k$-TSP), we are given $n$ points in the $d$-dimensional Euclidean space, for some fixed constant $d \geq 2$, and a positive integer $k$. The goal is to find a shortest tour visiting at least $k$ points. We give an approximation scheme for the Euclidean $k$-TSP in time $n \cdot 2^{O(1/\varepsilon^{d-1})} \cdot (\log n)^{2d^2 \cdot 2^d}$. This improves Arora's approximation scheme of running time $n \cdot k \cdot (\log n)^{(O(\sqrt{d}/\varepsilon))^{d-1}}$ [J. ACM 1998]. Our algorithm is Gap-ETH tight and can be derandomized by increasing the running time by a factor $O(n^d)$.
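To make the problem statement concrete, here is a tiny brute-force reference for Euclidean $k$-TSP: enumerate all $k$-point subsets and visiting orders and keep the shortest tour. It is exponential and only useful for checking small instances; the point of the paper is the near-linear-time approximation scheme. The point set below is an illustrative assumption.

```python
from itertools import combinations, permutations
from math import dist

def brute_force_k_tsp(points, k):
    """Length of the shortest closed tour through exactly k of the given points."""
    best = float("inf")
    for subset in combinations(points, k):
        for order in permutations(subset):
            tour = sum(dist(order[i], order[(i + 1) % k]) for i in range(k))
            best = min(best, tour)
    return best

points = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5)]
print(brute_force_k_tsp(points, k=3))   # the three mutually close corners give the best tour
```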
We study the Longest Common Extension (LCE) problem in a string containing wildcards. Wildcards (also called "don't cares" or "holes") are special characters that match any other character in the alphabet, similar to the character "?" in Unix commands or "." in regular expression engines. We consider the problem parametrized by $G$, the number of maximal contiguous groups of wildcards in the input string. Our main contribution is a simple data structure for this problem that can be built in $O(n (G/t) \log n)$ time, occupies $O(nG/t)$ space, and answers queries in $O(t)$ time, for any $t \in [1..G]$. Up to the $O(\log n)$ factor, this interpolates smoothly between the data structure of Crochemore et al. [JDA 2015], which has $O(nG)$ preprocessing time and space, and $O(1)$ query time, and a simple solution based on the "kangaroo jumping" technique [Landau and Vishkin, STOC 1986], which has $O(n)$ preprocessing time and space, and $O(G)$ query time. By establishing a connection between this problem and Boolean matrix multiplication, we show that our solution is optimal up to subpolynomial factors when $G = \Omega(n)$ under a widely believed hypothesis. In addition, we develop a new simple, deterministic and combinatorial algorithm for sparse Boolean matrix multiplication. Finally, we show that our data structure can be used to obtain efficient algorithms for approximate pattern matching and structural analysis of strings with wildcards.
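To fix ideas, here is a minimal wildcard-aware LCE query answered naively in time proportional to the extension length (neither the kangaroo-jumping solution nor the trade-off data structure above). Using '?' as the wildcard and the example string below are illustrative assumptions.

```python
def lce_with_wildcards(S, i, j, wildcard="?"):
    """Length of the longest common extension of S starting at positions i and j,
    where the wildcard matches any character."""
    n, ell = len(S), 0
    while i + ell < n and j + ell < n:
        a, b = S[i + ell], S[j + ell]
        if a != b and a != wildcard and b != wildcard:
            break
        ell += 1
    return ell

S = "ab?aabcaab"
print(lce_with_wildcards(S, 0, 4))   # 6: the '?' at position 2 matches the 'c' at position 6
```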
We propose and axiomatize a new model of incomplete preferences under uncertainty, which we call hope-and-prepare preferences. Act f is considered more desirable than act g when, and only when, both an optimistic evaluation, computed as the welfare level attained in a best-case scenario, and a pessimistic one, computed as the welfare level attained in a worst-case scenario, rank f above g. Our comparison criterion involves multiple priors, as best and worst cases are determined among sets of probability distributions, and is, generically, less conservative than Bewley preferences and twofold multi-prior preferences, the two ambiguity models that are closest to ours.
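One natural reading of the comparison criterion above, sketched minimally: act f is ranked above act g only if f beats g both in the best case and in the worst case, where best and worst are taken as the maximum and minimum expected utility over a finite set of priors. The priors, state-contingent utilities, and acts below are illustrative assumptions.

```python
import numpy as np

def best_and_worst(act_utilities, priors):
    """Best-case and worst-case expected utility of an act over a set of priors."""
    evs = [float(np.dot(p, act_utilities)) for p in priors]
    return max(evs), min(evs)

def hope_and_prepare_prefers(f, g, priors):
    best_f, worst_f = best_and_worst(f, priors)
    best_g, worst_g = best_and_worst(g, priors)
    return best_f > best_g and worst_f > worst_g     # both evaluations must agree

priors = [np.array([0.8, 0.2]), np.array([0.3, 0.7])]   # two candidate beliefs over two states
f = np.array([10.0, 2.0])    # state-contingent utilities of act f
g = np.array([6.0, 3.0])     # state-contingent utilities of act g
print(hope_and_prepare_prefers(f, g, priors))            # True here: f wins in both the best and worst case
```

When the optimistic and pessimistic rankings disagree, the two acts are simply incomparable, which is how the incompleteness of the preference shows up.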
In this paper, we build a double theory capturing the idea of nondeterministic behaviours and trajectories. Following Jaz Myers' Double Categorical Systems Theory, we construct a monoidal double category of systems and interfaces, which then yields (co)representable behaviours. We use conditional products in Markov categories to get compositional trajectories and behaviours, and represent nondeterministic systems and lenses using parametric deterministic maps. The resulting theory can also represent imprecise probability via naming Knightian choices, à la Liell-Cock and Staton.
We provide a formal framework accounting for a widespread idea in the theory of economic design: analytically established incompatibilities between given axioms should be qualified by the likelihood of their violation. We define the degree to which rules satisfy an axiom, as well as several axioms, on the basis of a probability measure over the inputs of the rules. Armed with this notion of degree, we propose and characterize i) a criterion to evaluate and compare rules given a set of axioms, allowing the importance of each combination of axioms to differ, and ii) a criterion to measure the compatibility between given axioms, building on an analogy with cooperative game theory.
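A minimal Monte Carlo sketch of the degree notion above: estimate, under a probability measure over the rule's inputs, how often a rule satisfies an axiom. The toy claims problem, the equal-division rule, the claims-boundedness axiom, and the instance distribution are illustrative assumptions, not the paper's setup.

```python
import numpy as np

rng = np.random.default_rng(0)

def rule(claims, estate):
    """Toy allocation rule: equal division of the estate, ignoring claims."""
    return np.full_like(claims, estate / len(claims))

def axiom_holds(claims, estate):
    """Toy axiom (claims boundedness): no agent receives more than their claim."""
    return bool(np.all(rule(claims, estate) <= claims + 1e-12))

def degree(n_agents=3, n_samples=100_000):
    hits = 0
    for _ in range(n_samples):
        claims = rng.uniform(0.1, 1.0, size=n_agents)    # random problem instance
        estate = rng.uniform(0.0, claims.sum())
        hits += axiom_holds(claims, estate)
    return hits / n_samples

print(degree())   # strictly between 0 and 1: equal division sometimes over-pays small claimants
```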
We describe a class of coadjoint orbits of the group of Hamiltonian diffeomorphisms of a symplectic manifold $(M,\omega)$ by implementing symplectic reduction for the dual pair associated to the Hamiltonian description of ideal fluids. The description is given in terms of nonlinear Grassmannians (manifolds of submanifolds) with additional geometric structures. Reduction at zero momentum yields the identification of coadjoint orbits with Grassmannians of isotropic volume submanifolds, slightly generalizing the results in Weinstein [1990] and Lee [2009]. At the other extreme, the case of a nondegenerate momentum recovers the identification of connected components of the nonlinear symplectic Grassmannian with coadjoint orbits, thereby recovering the result of Haller and Vizman [2004]. We also comment on the intermediate cases which correspond to new classes of coadjoint orbits. The description of these coadjoint orbits as well as their orbit symplectic form is obtained in a systematic way by exploiting the general properties of dual pairs of momentum maps. We also show that whenever the symplectic manifold $(M,\omega)$ is prequantizable, the coadjoint orbits that consist of isotropic submanifolds with total volume $a\in\mathbb{Z}$ are prequantizable. The prequantum bundle is constructed explicitly and, in the Lagrangian case, recovers the Berry bundle constructed in Weinstein [1990].
We present a new idea to adapt relational abstract domains to the analysis of IEEE 754-compliant floating-point numbers in order to statically detect, through abstract interpretation-based static analyses, potential floating-point run-time exceptions such as overflows or invalid operations. In order to take the non-linearity of rounding into account, expressions are modeled as linear forms with interval coefficients. We show how to extend already existing numerical abstract domains, such as the octagon abstract domain, to efficiently abstract transfer functions based on interval linear forms. We discuss specific fixpoint stabilization techniques and give some experimental results.
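A minimal sketch of the "linear form with interval coefficients" idea above: rounding turns each coefficient into a small interval rather than a point, and the form is evaluated over variable ranges with interval arithmetic. Real analysers perform outward (directed) rounding on every operation, which this toy omits; the coefficients and ranges are illustrative assumptions.

```python
def interval_mul(a, b):
    products = [a[0] * b[0], a[0] * b[1], a[1] * b[0], a[1] * b[1]]
    return (min(products), max(products))

def interval_add(a, b):
    return (a[0] + b[0], a[1] + b[1])

def eval_interval_linear_form(coeffs, const, var_ranges):
    """Range of  sum_i [coeffs_i] * x_i + [const]  when each x_i lies in var_ranges[i]."""
    acc = const
    for c, x in zip(coeffs, var_ranges):
        acc = interval_add(acc, interval_mul(c, x))
    return acc

# 0.1 has no exact binary representation, so its coefficient becomes a tiny interval.
coeffs = [(0.0999999, 0.1000001), (2.0, 2.0)]
const = (0.0, 0.0)
print(eval_interval_linear_form(coeffs, const, [(-1.0, 1.0), (0.0, 10.0)]))
```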
This article presents a new numerical abstract domain for static analysis by abstract interpretation. It extends a former numerical abstract domain based on Difference-Bound Matrices and allows us to represent invariants of the form ($\pm x \pm y \le c$), where x and y are program variables and c is a real constant. We focus on giving an efficient representation based on Difference-Bound Matrices - $O(n^2)$ memory cost, where n is the number of variables - and graph-based algorithms for all common abstract operators - $O(n^3)$ time cost. This includes a normal form algorithm to test equivalence of representation and a widening operator to compute least fixpoint approximations.
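A minimal sketch of the Difference-Bound Matrix machinery this domain builds on: constraints x_j - x_i <= c are stored in an n x n matrix, and the normal form is a Floyd-Warshall-style shortest-path closure in O(n^3) time. The full domain described above also handles sums of the form ±x ± y <= c; this toy sticks to differences, and the constraints are illustrative.

```python
INF = float("inf")

def dbm_closure(m):
    """Shortest-path closure: tightest bound on x_j - x_i implied by the constraints."""
    n = len(m)
    for k in range(n):
        for i in range(n):
            for j in range(n):
                m[i][j] = min(m[i][j], m[i][k] + m[k][j])
    return m

# Constraints over (x0, x1, x2): x1 - x0 <= 3 and x2 - x1 <= 4 (others unconstrained).
m = [[0,   3,   INF],
     [INF, 0,   4],
     [INF, INF, 0]]
closed = dbm_closure(m)
print(closed[0][2])   # 7: the implied bound x2 - x0 <= 3 + 4
```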