Weierstrass Institute
A nonparametric estimator for path-valued data integrates the Nadaraya--Watson framework with semi-metrics derived from the signature transform. The method achieves Euclidean-type nonparametric convergence rates, enabling faster learning from the data, and shows superior accuracy and computational efficiency in SDE learning and time series classification tasks.
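To make the construction concrete, here is a minimal Python sketch of a Nadaraya--Watson estimator with a signature semi-metric, assuming a level-2 truncated signature of piecewise-linear paths and a Gaussian kernel; the names and the truncation level are illustrative, not the paper's implementation.

```python
import numpy as np

def sig_level2(path):
    """Truncated signature (levels 1 and 2) of a piecewise-linear path.
    path: (n+1, d) array of sample points."""
    inc = np.diff(path, axis=0)                 # (n, d) increments
    lev1 = path[-1] - path[0]
    # exact iterated integrals for the piecewise-linear interpolation:
    # S^{ij} = sum_k (X_{t_k}-X_0)^i dX_k^j + (1/2) dX_k^i dX_k^j
    prefix = np.cumsum(inc, axis=0) - inc       # X_{t_k} - X_0 at left endpoints
    lev2 = prefix.T @ inc + 0.5 * inc.T @ inc
    return np.concatenate([lev1, lev2.ravel()])

def nw_signature(train_paths, train_y, query_path, h=1.0):
    """Nadaraya-Watson regression with a signature semi-metric
    (Euclidean distance between truncated signatures)."""
    sq = sig_level2(query_path)
    d = np.array([np.linalg.norm(sig_level2(p) - sq) for p in train_paths])
    w = np.exp(-(d / h) ** 2)                   # Gaussian kernel weights
    return float(w @ np.asarray(train_y) / w.sum())
```

For (e.g. time-augmented) two-dimensional paths, the semi-metric compares truncated signatures rather than pointwise path values; the bandwidth `h` plays the usual kernel-smoothing role.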
We study the strong rate of convergence of the Euler--Maruyama scheme for a multidimensional stochastic differential equation (SDE) $dX_t = b(X_t)\,dt + dL_t$, with irregular $\beta$-Hölder drift, $\beta > 0$, driven by a Lévy process with exponent $\alpha \in (0, 2]$. For $\alpha \in [2/3, 2]$, we obtain strong $L_p$ and almost sure convergence rates in the entire range $\beta > 1 - \alpha/2$, where the SDE is known to be strongly well-posed. This significantly improves the current state of the art, both in terms of convergence rate and the range of $\alpha$. Notably, the obtained convergence rate does not depend on $p$, which is a novelty even in the case of smooth drifts. As a corollary of the obtained moment-independent error rate, we show that the Euler--Maruyama scheme for such SDEs converges almost surely and obtain an explicit convergence rate. Additionally, as a byproduct of our results, we derive strong $L_p$ convergence rates for approximations of nonsmooth additive functionals of a Lévy process. Our technique is based on a new extension of stochastic sewing arguments and Lê's quantitative John--Nirenberg inequality.
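As an illustration of the scheme being analysed, a minimal Python sketch of the Euler--Maruyama iteration with symmetric $\alpha$-stable driving noise, sampled by the Chambers--Mallows--Stuck method; the function names and drift are our own illustration, and the paper's irregular-drift analysis is of course not reproduced here.

```python
import numpy as np

def stable_increments(alpha, dt, n, rng):
    """Symmetric alpha-stable increments over steps of size dt
    (Chambers-Mallows-Stuck; for alpha=2 this is Gaussian with variance 2*dt)."""
    u = rng.uniform(-np.pi / 2, np.pi / 2, n)
    e = rng.exponential(1.0, n)
    s = (np.sin(alpha * u) / np.cos(u) ** (1 / alpha)
         * (np.cos((alpha - 1) * u) / e) ** ((1 - alpha) / alpha))
    return dt ** (1 / alpha) * s

def euler_maruyama(b, x0, increments, dt):
    """Euler-Maruyama for dX_t = b(X_t) dt + dL_t, given noise increments."""
    x = np.empty(len(increments) + 1)
    x[0] = x0
    for k, dl in enumerate(increments):
        x[k + 1] = x[k] + b(x[k]) * dt + dl
    return x
```

With zero noise increments the iteration reduces to the explicit Euler scheme for the ODE $\dot x = b(x)$, which is a convenient sanity check.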
We study a multidimensional stochastic differential equation with additive noise: $dX_t = b(t, X_t)\,dt + d\xi_t$, where the drift $b$ is integrable in space and time, and $\xi$ is either a fractional Brownian motion or an $\alpha$-stable process. We show weak existence of solutions to this equation under the optimal condition on the integrability indices of $b$, going beyond the subcritical Krylov--Röckner (Prodi--Serrin--Ladyzhenskaya) regime. This extends the recent results of Krylov (2020) to the fractional Brownian and Lévy cases. We also construct a counterexample to demonstrate the optimality of this condition. Our methods are built upon a version of the stochastic sewing lemma of Lê and the John--Nirenberg inequality.
We present a fit-for-purpose introduction to tensors and their operations, envisaged to help the reader become acquainted with the concepts underpinning the study of path signatures. The text includes exercises, solutions, and many intuitive explanations. The material discusses direct sums and tensor products as two possible operations that make the Cartesian product of vector spaces a vector space. The difference lies in linear vs. multilinear structures -- the latter being the suitable one for dealing with path signatures. The presentation aims at an understanding of tensors deeper than just a multidimensional array. The text concludes with the prime example of an algebra in relation to path signatures: the 'tensor algebra'. This manuscript is the extended version (with two extra sections) of a chapter to appear in Open Access in a forthcoming Springer volume ``Signatures Methods in Finance: An Introduction with Computational Applications''. The two additional sections discuss the factoring of tensor product expressions into a minimal number of terms. This problem is relevant for path signature theory but not necessary for what is presented in the book. Tensor factorization is an elegant way of becoming familiar with the language of tensors and tensor products. A GitHub repository is attached.
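The distinction between the two operations can be seen numerically in coordinates; a small NumPy sketch (our illustration, making no claim about the chapter's notation): the direct sum of $\mathbb{R}^2$ and $\mathbb{R}^3$ has dimension $2+3$, while the tensor product has dimension $2\cdot 3$ and is bilinear in its arguments.

```python
import numpy as np

v = np.array([1.0, 2.0])        # vector in R^2
w = np.array([3.0, 4.0, 5.0])   # vector in R^3

# direct sum: concatenation, dimension 2 + 3 = 5
direct_sum = np.concatenate([v, w])

# tensor product: all pairwise products, dimension 2 * 3 = 6
tensor = np.outer(v, w).ravel()

def tp(a, b):
    """Coordinate tensor product a (x) b, flattened to a vector."""
    return np.outer(a, b).ravel()
```

For vectors, `np.kron(v, w)` gives the same flattened tensor product; the bilinearity `tp(2*v, w) == 2*tp(v, w)` is exactly the multilinear structure the text contrasts with the (jointly linear) direct sum.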
Optimal Transport (OT) problems are a cornerstone of many applications, but solving them is computationally expensive. To address this problem, we propose UNOT (Universal Neural Optimal Transport), a novel framework capable of accurately predicting (entropic) OT distances and plans between discrete measures for a given cost function. UNOT builds on Fourier Neural Operators, a universal class of neural networks that map between function spaces and that are discretization-invariant, which enables our network to process measures of variable resolutions. The network is trained adversarially using a second, generating network and a self-supervised bootstrapping loss. We ground UNOT in an extensive theoretical framework. Through experiments on Euclidean and non-Euclidean domains, we show that our network not only accurately predicts OT distances and plans across a wide range of datasets, but also captures the geometry of the Wasserstein space correctly. Furthermore, we show that our network can be used as a state-of-the-art initialization for the Sinkhorn algorithm with speedups of up to $7.4\times$, significantly outperforming existing approaches.
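For reference, the entropic OT problems that UNOT predicts are classically solved by the Sinkhorn algorithm; a minimal sketch of those iterations (not UNOT itself, which learns to predict or initialise them):

```python
import numpy as np

def sinkhorn(mu, nu, C, eps=0.1, n_iter=500):
    """Entropic OT between discrete measures mu, nu with cost matrix C.
    Returns the transport plan and the (non-regularised) transport cost."""
    K = np.exp(-C / eps)            # Gibbs kernel
    u = np.ones_like(mu)
    for _ in range(n_iter):
        v = nu / (K.T @ u)          # alternating marginal projections
        u = mu / (K @ v)
    P = u[:, None] * K * v[None, :]
    return P, float((P * C).sum())
```

A learned initialisation, as in the paper, replaces the `u = ones` start with a network prediction, so that far fewer iterations are needed to reach a given marginal accuracy.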
This note presents sharp inequalities for the deviation probability of a general quadratic form of a random vector $\xi$ with finite exponential moments. The obtained deviation bounds are similar to those in the case of a Gaussian random vector. The results are stated under general conditions and do not assume any special structure of the vector $\xi$. The obtained bounds are exact (non-asymptotic), all constants are explicit, and the leading terms in the bounds are sharp.
Consider the stochastic heat equation \begin{equation*} \partial_t u_t(x)=\frac12 \partial^2_{xx}u_t(x) +b(u_t(x))+\dot{W}_{t}(x),\quad t\in(0,T],\, x\in D, \end{equation*} where $b$ is a generalized function, $D$ is either $[0,1]$ or $\mathbb{R}$, and $\dot W$ is space-time white noise on $\mathbb{R}_+\times D$. If the drift $b$ is a sufficiently regular function, then it is well known that any analytically weak solution to this equation is also analytically mild, and vice versa. We extend this result to drifts that are generalized functions, with an appropriate adaptation of the notions of mild and weak solutions. As a corollary of our results, we show that for $b\in L_p(\mathbb{R})$, $p\ge1$, this equation has a unique analytically weak and mild solution, thus extending the classical results of Gyöngy and Pardoux (1993).
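A standard way to simulate such an equation is an explicit finite-difference Euler scheme; the following sketch (our illustration, not part of the paper, and assuming a regular drift $b$) discretises $D=[0,1]$ with Dirichlet boundary conditions and approximates space-time white noise cell-wise by $N(0, dt/dx)$ variables.

```python
import numpy as np

def stoch_heat_step(u, b, dt, dx, rng, noise_scale=1.0):
    """One explicit Euler / finite-difference step for
    du = (1/2) u_xx dt + b(u) dt + dW on [0,1], Dirichlet boundary.
    Space-time white noise is approximated by N(0, dt/dx) per grid cell."""
    lap = (np.roll(u, -1) - 2 * u + np.roll(u, 1)) / dx ** 2
    noise = noise_scale * rng.normal(0.0, np.sqrt(dt / dx), len(u))
    un = u + dt * (0.5 * lap + b(u)) + noise
    un[0] = un[-1] = 0.0   # Dirichlet boundary (overrides the periodic roll)
    return un
```

With the noise switched off, the step reduces to the deterministic heat equation, for which $\sin(\pi x)$ decays at rate $\pi^2/2$, a convenient sanity check.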
22 Apr 2025
The recent paper \cite{GSZ2023} on estimation and inference for the top-ranking problem in the Bradley--Terry--Luce (BTL) model presented a surprising result: component-wise estimation and inference can be done under much weaker conditions on the number of comparisons than is required for full-dimensional estimation. The present paper revisits this finding from a completely different viewpoint. Namely, we show how a theoretical study of \emph{estimation in sup-norm} can be reduced to the analysis of \emph{plug-in semiparametric estimation}. For the latter, we adopt and extend the general approach of \cite{Sp2024} to high-dimensional estimation and inference. The main tool of the analysis is a theory of \emph{perturbed marginal optimization}, where an objective function depends on a low-dimensional target parameter along with a high-dimensional nuisance parameter. A particular focus of the study is the critical dimension condition. Full-dimensional estimation requires in general the condition \( \mathbbmsl{N} \gg \mathbb{p} \) between the effective parameter dimension \( \mathbb{p} \) and the effective sample size \( \mathbbmsl{N} \) corresponding to the smallest eigenvalue of the Fisher information matrix \( \mathbbmsl{F} \). Inference on the estimated parameter is even more demanding: the condition \( \mathbbmsl{N} \gg \mathbb{p}^{2} \) cannot generally be avoided; see \cite{Sp2024}. However, for sup-norm estimation, the critical dimension condition can be reduced to \( \mathbbmsl{N} \geq C \log p \).
We consider the stochastic differential equation $dX_t = b(X_t)\,dt + dW_t^H$, where the drift $b$ is either a measure or an integrable function, and $W^H$ is a $d$-dimensional fractional Brownian motion with Hurst parameter $H\in(0,1)$, $d\in\mathbb{N}$. For the case where $b\in L_p(\mathbb{R}^d)$, $p\in[1,\infty]$, we show weak existence of solutions to this equation under the condition $\frac{d}{p}<\frac{1}{H}-1$, which is an extension of the Krylov--Röckner condition (2005) to the fractional case. We construct a counterexample showing the optimality of this condition. If $b$ is a Radon measure, in particular the delta measure, we prove weak existence of solutions to this equation under the optimal condition $H<\frac{1}{d+1}$. We also show strong well-posedness of solutions to this equation under certain conditions. To establish these results, we utilize the stochastic sewing technique and develop a new version of the stochastic sewing lemma.
Motivated by the challenges related to the calibration of financial models, we consider the problem of numerically solving a singular McKean--Vlasov equation $dX_t = \sigma(t,X_t)\, X_t \frac{\sqrt{v_t}}{\sqrt{E[v_t|X_t]}}\, dW_t$, where $W$ is a Brownian motion and $v$ is an adapted diffusion process. This equation can be considered as a singular local stochastic volatility model. While such models are quite popular among practitioners, their well-posedness has unfortunately not been fully understood yet and, in general, is possibly not guaranteed at all. We develop a novel regularization approach based on the reproducing kernel Hilbert space (RKHS) technique and show that the regularized model is well-posed. Furthermore, we prove propagation of chaos. We demonstrate numerically that the thus regularized model is able to perfectly replicate option prices generated by typical local volatility models. Our results are also applicable to more general McKean--Vlasov equations.
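A particle approximation of such an equation must estimate the conditional expectation $E[v_t|X_t]$ from the particle cloud. The sketch below uses a simple Nadaraya--Watson kernel estimate as a stand-in for the paper's RKHS regularisation, with a constant $\sigma$ for simplicity; all names and the kernel choice are illustrative.

```python
import numpy as np

def particle_step(x, v, sigma, dt, rng, h=0.3):
    """One Euler step of the particle system for
    dX = sigma X sqrt(v / E[v|X]) dW, where E[v|X] is replaced by a
    kernel (Nadaraya-Watson) estimate over the particle cloud.
    Returns the updated particles and the estimated conditional expectation."""
    w = np.exp(-((x[:, None] - x[None, :]) / h) ** 2)
    cond = (w @ v) / w.sum(axis=1)            # E[v | X = x_i], estimated
    dw = rng.normal(0.0, np.sqrt(dt), len(x))
    return x + sigma * x * np.sqrt(v / cond) * dw, cond
```

When $v$ is constant across particles the estimated conditional expectation equals that constant, the volatility ratio is one, and the step reduces to a plain geometric-Brownian Euler step, which is the natural consistency check.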
We consider the optimal control of a PDE with random source term subject to probabilistic or almost sure state constraints. In the main theoretical result, we provide an exact formula for the Clarke subdifferential of the probability function without a restrictive assumption made in an earlier paper. The focus of the paper is on numerical solution algorithms. As for probabilistic constraints, we apply the method of spherical radial decomposition. Almost sure constraints are dealt with via a Moreau--Yosida smoothing of the constraint function, accompanied by Monte Carlo sampling of the given distribution, of its support, or even just of the boundary of its support. Moreover, one can understand the almost sure constraint as a probabilistic constraint with safety level one, which offers yet another perspective. Finally, robust optimization can be applied efficiently when the support is sufficiently simple. A comparative study of these five different methodologies is carried out and illustrated.
We propose LeAP-SSN (Levenberg--Marquardt Adaptive Proximal Semismooth Newton method), a semismooth Newton-type method with a simple, parameter-free globalisation strategy that guarantees convergence in Hilbert spaces from arbitrary starting points: to stationary points in nonconvex settings and, under a Polyak--Łojasiewicz condition, to a global minimum. The method employs an adaptive Levenberg--Marquardt regularisation for the Newton steps, combined with backtracking, and does not require knowledge of problem-specific constants. We establish global nonasymptotic rates: $\mathcal{O}(1/k)$ for convex problems in terms of objective values, $\mathcal{O}(1/\sqrt{k})$ under nonconvexity in terms of subgradients, and linear convergence under a Polyak--Łojasiewicz condition. The algorithm achieves superlinear convergence under mild semismoothness and Dennis--Moré or partial smoothness conditions, even for non-isolated minimisers. By combining strong global guarantees with superlinear local rates in a fully parameter-agnostic framework, LeAP-SSN bridges the gap between globally convergent algorithms and the fast asymptotics of Newton's method. The practical efficiency of the method is illustrated on representative problems from imaging, contact mechanics, and machine learning.
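The interplay of Levenberg--Marquardt damping and step acceptance can be illustrated on a smooth finite-dimensional problem. The toy sketch below (our illustration: plain gradients and Hessians, no proximal or semismooth machinery, and a simple gradient-norm acceptance test in place of the paper's globalisation strategy) shows the adaptive-damping pattern.

```python
import numpy as np

def leap_newton(grad, hess, x0, n_iter=50, tol=1e-10):
    """Levenberg-Marquardt-regularised Newton iteration with a
    parameter-free accept/reject rule on the gradient norm."""
    x, lam = np.asarray(x0, dtype=float), 1.0
    for _ in range(n_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:
            break
        H = hess(x) + lam * np.eye(len(x))      # LM regularisation
        step = np.linalg.solve(H, g)
        if np.linalg.norm(grad(x - step)) < np.linalg.norm(g):
            x, lam = x - step, lam / 2          # accept, relax damping
        else:
            lam *= 2                            # reject, increase damping
    return x
```

As the damping parameter is driven to zero near a solution, the iteration approaches the undamped Newton step, which is the mechanism behind the fast local rates.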
15 Jun 2014
The classical parametric and semiparametric Bernstein--von Mises (BvM) results are reconsidered in a non-classical setup allowing finite samples and model misspecification. In the case of a finite-dimensional nuisance parameter we obtain an upper bound on the error of Gaussian approximation of the posterior distribution for the target parameter which is explicit in the dimension of the nuisance and target parameters. This helps to identify the so-called \emph{critical dimension} \( p \) of the full parameter for which the BvM result is applicable. In the important i.i.d. case, we show that the condition ``\( p^{3}/n \) is small'' is sufficient for the BvM result to be valid under general assumptions on the model. We also provide an example of a model with a phase transition effect: the statement of the BvM theorem fails when the dimension \( p \) approaches \( n^{1/3} \). The results are extended to the case of infinite-dimensional parameters with the nuisance parameter from a Sobolev class. In particular, we show near normality of the posterior if the smoothness parameter \( s \) exceeds 3/2.
16 Apr 2025
Many statistical problems can be reduced to a linear inverse problem in which only a noisy version of the operator is available. Particular examples include random design regression, deconvolution, instrumental variable regression, functional data analysis, errors-in-variables regression, drift estimation for stochastic diffusions, and many others. The pragmatic plug-in approach can be well justified in the classical asymptotic setup with a growing sample size. However, recent developments in high-dimensional inference reveal some new features of this problem. In high-dimensional linear regression with a random design, the plug-in approach is questionable, but the use of a simple ridge penalization yields a benign overfitting phenomenon; see \cite{baLoLu2020}, \cite{ChMo2022}, \cite{NoPuSp2024}. This paper revisits the general Error-in-Operator problem for finite samples and high dimension of the source and image spaces. A particular focus is on the choice of a proper regularization. We show that a simple ridge penalty (Tikhonov regularization) works properly in the case when the operator is more regular than the signal. In the opposite case, a model reduction technique such as spectral truncation should be applied.
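The two regularisation strategies contrasted here are easy to state in the finite-dimensional linear case; a small sketch of ridge (Tikhonov) versus spectral truncation (an illustration of the generic estimators, not the paper's analysis):

```python
import numpy as np

def ridge_estimate(A, y, lam):
    """Tikhonov/ridge solution of A x ~ y: argmin ||Ax-y||^2 + lam ||x||^2."""
    d = A.shape[1]
    return np.linalg.solve(A.T @ A + lam * np.eye(d), A.T @ y)

def truncated_estimate(A, y, k):
    """Spectral truncation: pseudo-inverse keeping the top-k singular values."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return Vt[:k].T @ ((U[:, :k].T @ y) / s[:k])
```

Ridge shrinks every spectral component smoothly, while truncation discards the small singular values entirely; which behaviour is preferable depends, as in the paper, on whether the operator or the signal is the more regular object.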
We study evolution equations on metric graphs with reservoirs, that is, graphs where a one-dimensional interval is associated with each edge and, in addition, the vertices are able to store and exchange mass with these intervals. Focusing on the case where the dynamics are driven by an entropy functional defined both on the metric edges and vertices, we provide a rigorous understanding of such systems of coupled ordinary and partial differential equations as (generalized) gradient flows in continuity equation format. Approximating the edges by a sequence of vertices, which yields a fully discrete system, we are able to establish existence of solutions in this formalism. Furthermore, we study several scaling limits using the recently developed framework of EDP convergence with embeddings to rigorously show convergence to gradient flows on reduced metric and combinatorial graphs. Finally, numerical studies confirm our theoretical findings and provide additional insights into the dynamics under rescaling.
The concept of signatures and expected signatures is vital in data science, especially for sequential data analysis. The signature transform, a Cartan type development, translates paths into high-dimensional feature vectors, capturing their intrinsic characteristics. Under natural conditions, the expectation of the signature determines the law of the signature, providing a statistical summary of the data distribution. This property facilitates robust modeling and inference in machine learning and stochastic processes. Building on previous work by the present authors [Unified signature cumulants and generalized Magnus expansions, FoM Sigma '22] we here revisit the actual computation of expected signatures, in a general semimartingale setting. Several new formulae are given. A log-transform of (expected) signatures leads to log-signatures (signature cumulants), offering a significant reduction in complexity.
07 Aug 2008
We use the fitted Pareto law to construct an accompanying approximation of the excess distribution function. A selection rule for the location of the excess distribution function is proposed, based on a stagewise lack-of-fit testing procedure. Our main result is an oracle-type inequality for the Kullback--Leibler loss.
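For a concrete flavour, the tail index of a Pareto law fitted to the excesses above a given location can be estimated as follows (a standard Hill-type sketch; the paper's stagewise lack-of-fit selection of the location is not reproduced here).

```python
import numpy as np

def hill_estimator(x, u):
    """Hill estimator of the Pareto tail index from excesses above location u:
    the MLE of alpha for the Pareto law P(X > t) = (t/u)^(-alpha), t >= u."""
    exc = x[x > u]
    return len(exc) / np.sum(np.log(exc / u))
```

For exact Pareto($\alpha$) data above $u$, the log-relative excesses $\log(X/u)$ are exponential with rate $\alpha$, so the estimator is consistent for $\alpha$.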
We consider a framework for approximating the obstacle problem through a penalty approach by nonlinear PDEs. By using tools from capacity theory, we show that derivatives of the solution maps of the penalised problems converge in the weak operator topology to an element of the strong-weak Bouligand subdifferential. We are able to treat smooth penalty terms as well as nonsmooth ones involving, for example, the positive part function $\max(0,\cdot)$. Our abstract framework applies to several specific choices of penalty functions which are omnipresent in the literature. We conclude with consequences for the theory of optimal control of the obstacle problem.
We explore the algebraic properties of a generalized version of the iterated-sums signature, inspired by previous work of F.~Kir\'aly and H.~Oberhauser. In particular, we show how to recover the character property of the associated linear map over the tensor algebra by considering a deformed quasi-shuffle product of words on the latter. We introduce three non-linear transformations on iterated-sums signatures, close in spirit to Machine Learning applications, and show some of their properties.
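An iterated-sums signature entry for a composition (word) of positive integer letters can be computed by nested cumulative sums; a minimal sketch for scalar sequences (our illustration, not the paper's generalized construction):

```python
import numpy as np

def iss(x, word):
    """Iterated-sums signature entry <word, ISS(x)> of a scalar sequence x:
    for word = (p1,...,pk), the sum over i1 < ... < ik of
    dx_{i1}^p1 * ... * dx_{ik}^pk, where dx are the increments of x."""
    dx = np.diff(np.asarray(x, dtype=float))
    f = np.ones_like(dx)
    for k, p in enumerate(word):
        # prefix sums over strictly earlier indices (free index for k = 0)
        prev = f if k == 0 else np.concatenate([[0.0], np.cumsum(f)[:-1]])
        f = prev * dx ** p
    return float(f.sum())
```

These entries satisfy a quasi-shuffle identity in the spirit of the deformed quasi-shuffle product discussed in the abstract, e.g. $\langle 1\rangle^2 = 2\langle 11\rangle + \langle 2\rangle$, since squaring the single sum splits into ordered pairs and diagonal terms.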
Least squares Monte Carlo methods are a popular numerical approximation method for solving stochastic control problems. Based on dynamic programming, their key feature is the approximation of the conditional expectation of future rewards by linear least squares regression. Hence, the choice of basis functions is crucial for the accuracy of the method. Earlier work by some of us [Belomestny, Schoenmakers, Spokoiny, Zharkynbay. Commun.~Math.~Sci., 18(1):109-121, 2020](arXiv:1808.02341) proposes to reinforce the basis functions in the case of optimal stopping problems by already computed value functions for later times, thereby considerably improving the accuracy with limited additional computational cost. We extend the reinforced regression method to a general class of stochastic control problems, while considerably improving the method's efficiency, as demonstrated by substantial numerical examples as well as theoretical analysis.
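The regression step at the heart of these methods is easiest to see in the plain Longstaff--Schwartz form for optimal stopping. The sketch below uses a fixed polynomial basis; the reinforced-regression method discussed in the abstract would additionally include value functions already computed for later times in the basis (not reproduced here).

```python
import numpy as np

def lsmc(paths, payoff, df=1.0):
    """Least squares Monte Carlo (Longstaff-Schwartz) for optimal stopping.
    paths: (n_paths, n_steps+1) array of simulated states;
    payoff: exercise value as a function of the state;
    df: one-step discount factor."""
    n, m = paths.shape
    value = payoff(paths[:, -1])                      # exercise at maturity
    for t in range(m - 2, 0, -1):                     # backward induction
        x = paths[:, t]
        basis = np.column_stack([np.ones(n), x, x ** 2])
        # regress discounted future value on the basis -> continuation value
        coef, *_ = np.linalg.lstsq(basis, df * value, rcond=None)
        cont = basis @ coef
        exercise = payoff(x)
        value = np.where(exercise > cont, exercise, df * value)
    return float(df * value.mean())
```

The accuracy of the estimated continuation value `cont` is exactly where the choice of basis matters, which is what the reinforced basis is designed to improve at little extra cost.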