Basic Research of Artificial Intelligence Laboratory (BRAIn Lab)
Adaptive Regularized Newton Method with Inexact Hessian
Newton's method is the most widespread high-order method; it requires the gradient and the Hessian of the objective function. However, two of its main disadvantages are its lack of global convergence and its high per-iteration cost. Both drawbacks are critical for modern optimization, driven primarily by current applications in machine learning. In this paper, we introduce a novel algorithm that addresses both issues. Our method can be implemented with various Hessian approximations, including ones that use only first-order information, so computational costs can be drastically reduced. It can also be adapted to a problem's geometry through the use of different Bregman divergences. The proposed method converges globally for both convex and nonconvex problems, and it achieves the same rates as well-known methods that lack these properties. We present experiments confirming that our method performs according to the theoretical bounds and is competitive with other Newton-based methods.
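To illustrate the general shape of such a scheme, here is a minimal Python sketch of an adaptively regularized Newton iteration with an inexact Hessian. This is not the paper's algorithm: the quadratic model, the acceptance test, the update rule for the regularization weight, and the `hess_approx` interface are all assumptions, and the Euclidean regularizer stands in for a general Bregman divergence.

```python
import numpy as np

def adaptive_reg_newton(f, grad, hess_approx, x0, m0=1.0, tol=1e-6, max_iter=200):
    """Sketch of an adaptively regularized Newton method with an inexact Hessian.

    `hess_approx(x)` may return any symmetric approximation of the Hessian,
    e.g., one built from finite differences of gradients (purely first-order
    information). The regularization weight `m` grows on rejected steps and
    shrinks on accepted ones -- an assumed rule, not the paper's exact update.
    """
    x, m = np.asarray(x0, dtype=float), m0
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:
            break
        H = hess_approx(x)
        # Take steps from the regularized quadratic model
        # g^T d + 0.5 d^T H d + (m/2)||d||^2, increasing m until f decreases.
        while True:
            d = np.linalg.solve(H + m * np.eye(x.size), -g)
            if f(x + d) < f(x):
                x, m = x + d, max(0.5 * m, 1e-12)
                break
            m *= 2.0  # model was too optimistic; regularize more strongly
    return x
```

With a finite-difference `hess_approx`, such a scheme touches only gradient evaluations, which is one way to realize the abstract's point about first-order Hessian approximations.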
Unified Theory of Adaptive Variance Reduction
Variance reduction is a family of powerful mechanisms for stochastic optimization that has proved helpful in many machine learning tasks. It is based on estimating the exact gradient with recursive sequences. Prior work demonstrated that methods with unbiased variance-reduction estimators can be described within a single framework. We generalize this approach and show that the unbiasedness assumption is unnecessary; hence, we include biased estimators in the analysis. The main contribution of our work is a set of new variance-reduction methods with adaptive step sizes that are adjusted throughout the iterations and require no hyperparameter tuning. Our analysis covers finite-sum problems, distributed optimization, and coordinate methods. Numerical experiments on various tasks validate the effectiveness of our methods.
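As an illustration of the kind of estimator and adaptive step size involved, below is a minimal Python sketch combining an SVRG-style variance-reduced estimator with an AdaGrad-norm step size. This is an assumed instance, not the paper's method; the `grads`/`full_grad` interface and the step-size rule are hypothetical.

```python
import numpy as np

def adaptive_svrg(grads, full_grad, x0, n, epochs=20, inner=None, c=1.0):
    """Sketch of a variance-reduced method with an adaptive step size.

    `grads(i, x)` returns the gradient of the i-th component function of a
    finite sum; `full_grad(x)` returns the full gradient. The step size
    shrinks as the accumulated squared norm of the estimators grows, so no
    learning-rate tuning is needed.
    """
    x = np.asarray(x0, dtype=float)
    acc = 1e-8            # accumulator for squared estimator norms
    m = inner or n        # inner-loop length per snapshot
    rng = np.random.default_rng(0)
    for _ in range(epochs):
        w, mu = x.copy(), full_grad(x)   # snapshot point and its full gradient
        for _ in range(m):
            i = rng.integers(n)
            # SVRG estimator: unbiased correction of the stochastic gradient.
            v = grads(i, x) - grads(i, w) + mu
            acc += v @ v
            x = x - (c / np.sqrt(acc)) * v   # AdaGrad-norm adaptive step
    return x
```

Swapping the estimator line for a biased recursion such as SARAH's `v = grads(i, x) - grads(i, x_prev) + v` would stay within the biased case the abstract describes.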