Radial Research and Development
In this paper we study the problem of comparing the means of a single observation and a reference sample in the presence of a common data covariance matrix, where the data dimension $p$ grows linearly with the number of samples $n$ and $p/n$ converges to a number between 0 and 1. The approach we take is to replace the sample covariance matrix with a nonlinear shrinkage estimator -- i.e., a matrix with the same eigenvectors -- in Hotelling's $T^2$ test. Current approaches of this sort typically assume that the data covariance matrix has a condition number or spiked rank that increases slowly with dimension. However, this assumption is ill-suited to data sets containing many strongly correlated background covariates, as often found in finance, genetics, and remote sensing. To address this problem we construct, using variational methods and new local random-matrix laws, a nonlinear covariance shrinkage method tailored to optimize detection performance across a broad range of spiked ranks and condition numbers. We then demonstrate, via both simulated and real-world data, that our method outperforms existing approaches.
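The general structure described in this abstract, keeping the sample eigenvectors and replacing only the eigenvalues inside a Hotelling-type statistic, can be illustrated with a minimal sketch. The function names `shrunk_t2` and `linear_shrink` are hypothetical, and the shrinker shown is a simple placeholder that pulls eigenvalues toward their mean. It is not the variational shrinker constructed in the paper, which the abstract does not specify.

```python
import numpy as np

def shrunk_t2(x, reference, shrink):
    """T^2-style statistic for one observation against a reference sample.

    x         : (p,) observation to test.
    reference : (n, p) array of reference samples.
    shrink    : hypothetical callable mapping sample eigenvalues to
                shrunk eigenvalues (assumed to return positive values).
    """
    n, _ = reference.shape
    mean = reference.mean(axis=0)
    # Sample covariance of the reference data (p x p).
    cov = np.cov(reference, rowvar=False)
    # Eigendecomposition: eigenvectors are kept, only eigenvalues change.
    eigvals, eigvecs = np.linalg.eigh(cov)
    shrunk = shrink(eigvals)
    # Coordinates of the mean difference in the sample eigenbasis.
    coords = eigvecs.T @ (x - mean)
    # Quadratic form d^T V diag(1/shrunk) V^T d, scaled by n/(n+1)
    # because x - mean has covariance (1 + 1/n) times the population one.
    return float(n / (n + 1) * np.sum(coords**2 / shrunk))

def linear_shrink(eigvals, alpha=0.5):
    # Placeholder shrinker (an assumption for illustration only):
    # a convex combination of each eigenvalue and the mean eigenvalue.
    return (1 - alpha) * eigvals + alpha * eigvals.mean()

rng = np.random.default_rng(0)
reference = rng.normal(size=(50, 5))
x = rng.normal(size=5)
stat = shrunk_t2(x, reference, linear_shrink)
```

Passing the identity map as `shrink` recovers the classical statistic built from the inverse sample covariance, which is well defined here only because $p < n$.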
We investigate covariance shrinkage for Hotelling's $T^2$ in the regime where the data dimension $p$ and the sample size $n$ grow in a fixed ratio -- without assuming that the population covariance matrix is spiked or well-conditioned. When $p/n \to \phi \in (0,1)$, we propose a practical finite-sample shrinker that, for any maximum-entropy signal prior and any fixed significance level, (a) asymptotically maximizes power under Gaussian data, and (b) asymptotically saturates the Hanson--Wright lower bound on power in the more general sub-Gaussian case. Our approach is to formulate and solve a variational problem characterizing the optimal limiting shrinker, and to show that our finite-sample method consistently approximates this limit by extending recent local random matrix laws. Empirical studies on simulated and real-world data, including the Crawdad UMich/RSS data set, demonstrate up to a 50% gain in power over leading linear and nonlinear competitors at a significance level of $10^{-4}$.