Institute for Computer Science and Control, Eötvös Loránd Research Network (ELKH)
Stochastic multi-armed bandits (MABs) provide a fundamental reinforcement learning model to study sequential decision making in uncertain environments. The upper confidence bounds (UCB) algorithm gave birth to the renaissance of bandit algorithms, as it achieves near-optimal regret rates under various moment assumptions. Up until recently most UCB methods relied on concentration inequalities leading to confidence bounds which depend on moment parameters, such as the variance proxy, that are usually unknown in practice. In this paper, we propose a new distribution-free, data-driven UCB algorithm for symmetric reward distributions, which needs no moment information. The key idea is to combine a refined, one-sided version of the recently developed resampled median-of-means (RMM) method with UCB. We prove a near-optimal regret bound for the proposed anytime, parameter-free RMM-UCB method, even for heavy-tailed distributions.
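The exact one-sided RMM construction is specified in the paper; purely as an illustration of the two ingredients being combined, the sketch below (not the authors' algorithm) plugs a median-of-means mean estimate into a generic UCB index. The block count and exploration bonus are heuristic assumptions.

```python
import numpy as np

def median_of_means(x, n_blocks):
    """Median of the block means of the samples in x."""
    n = len(x)
    n_blocks = max(1, min(n_blocks, n))
    blocks = np.array_split(np.asarray(x, dtype=float), n_blocks)
    return np.median([b.mean() for b in blocks])

def mom_ucb(bandit, n_arms, horizon, delta=0.05):
    """Generic median-of-means UCB loop (illustrative, not the RMM-UCB of the paper)."""
    rewards = [[] for _ in range(n_arms)]
    history = []
    for t in range(horizon):
        if t < n_arms:                                   # pull every arm once first
            arm = t
        else:
            indices = []
            k = int(np.ceil(8 * np.log(1.0 / delta)))    # heuristic block count
            for a in range(n_arms):
                n_a = len(rewards[a])
                bonus = np.sqrt(2 * np.log(t + 1) / n_a)  # heuristic exploration bonus
                indices.append(median_of_means(rewards[a], k) + bonus)
            arm = int(np.argmax(indices))
        r = bandit(arm)                                  # environment returns a (possibly heavy-tailed) reward
        rewards[arm].append(r)
        history.append((arm, r))
    return history
```

For instance, `mom_ucb(lambda a: np.random.standard_t(3) + (0.5 if a == 1 else 0.0), n_arms=2, horizon=1000)` runs the loop on a two-armed bandit with heavy-tailed Student-t rewards.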
This paper presents a multi-step procedure to construct the dynamic motion model of an autonomous quadcopter, identify the model parameters, and design a model-based nonlinear trajectory tracking controller. The aim of the proposed method is to speed up the commissioning of a new quadcopter design, i.e., to enable the drone to perform agile maneuvers with high precision in the shortest time possible. After a brief introduction to the theoretical background of the modelling and control design, the steps of the proposed method are presented using the example of a self-developed quadcopter platform. The performance of the method is tested and evaluated by real flight experiments.
Tournaments are a widely used mechanism to rank alternatives in a noisy environment. This paper investigates a fundamental issue of economics in tournament design: what is the best usage of limited resources, that is, how should the alternatives be compared pairwise to best approximate their true but latent ranking. We consider various formats including knockout tournaments, multi-stage championships consisting of round-robin groups followed by single elimination, and the Swiss-system. They are evaluated via Monte-Carlo simulations under six different assumptions on winning probabilities. Comparing the same pair of alternatives multiple times turns out to be an inefficacious policy. While seeding can increase the efficacy of the knockout and group-based designs, its influence remains marginal unless one has an unrealistically good estimation on the true ranking of the players. The Swiss-system is found to be the most accurate among all these tournament formats, especially in its ability to rank all participants. A possible explanation is that it does not eliminate a player after a single loss, while it takes the history of the comparisons into account. The results can be especially interesting for emerging esports, where the tournament designs are not yet solidified.
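The paper's Monte-Carlo study covers several formats and six winning-probability models; as a minimal sketch of the simulation idea only (the logistic win-probability model, the unseeded random bracket, and the accuracy measure are illustrative assumptions, not the paper's setup), one can estimate how often the strongest of 16 players wins a knockout tournament:

```python
import random

def win_prob(strength_i, strength_j, scale=0.3):
    """Probability that player i beats player j (illustrative logistic model)."""
    return 1.0 / (1.0 + 10 ** (-(strength_i - strength_j) / scale))

def knockout(players, strengths):
    """Play one single-elimination bracket; players is a list in bracket order."""
    current = list(players)
    while len(current) > 1:
        nxt = []
        for i in range(0, len(current), 2):
            a, b = current[i], current[i + 1]
            winner = a if random.random() < win_prob(strengths[a], strengths[b]) else b
            nxt.append(winner)
        current = nxt
    return current[0]

def best_player_win_rate(n_players=16, n_sims=10_000):
    """Fraction of simulated knockouts won by the strongest player (player 0)."""
    strengths = {p: 1.0 - p / n_players for p in range(n_players)}
    players = list(range(n_players))
    hits = 0
    for _ in range(n_sims):
        random.shuffle(players)            # random, unseeded bracket
        hits += knockout(players, strengths) == 0
    return hits / n_sims

# Example: print(best_player_win_rate())
```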
The Koopman framework is a popular approach to transform a finite-dimensional nonlinear system into an infinite-dimensional, but linear, model through a lifting process, using so-called observable functions. While there is an extensive theory on infinite-dimensional representations in the operator sense, there are few constructive results on how to select the observables to realize them. When it comes to the possibility of finite Koopman representations, which are highly important from a practical point of view, there is no constructive theory. Hence, in practice, a data-based method with an ad-hoc choice of the observable functions is often used. When truncating to a finite number of basis functions, there is also no clear indication of the introduced approximation error. In this paper, we propose a systematic method to compute the finite-dimensional Koopman embedding of a specific class of polynomial nonlinear systems in continuous time such that the embedding, without approximation, can fully represent the dynamics of the nonlinear system.
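For intuition about exact finite embeddings of polynomial systems, a standard textbook example (not taken from the paper) is the system below, which becomes exactly linear after lifting with one extra polynomial observable:

```latex
% Classic example of an exact finite-dimensional Koopman embedding (not from the paper).
\dot{x}_1 = \mu x_1, \qquad \dot{x}_2 = \lambda\,(x_2 - x_1^2).
% Lifting with the observables z = (x_1,\; x_2,\; x_1^2) gives the exact linear model
\frac{d}{dt}\begin{pmatrix} z_1 \\ z_2 \\ z_3 \end{pmatrix}
= \begin{pmatrix} \mu & 0 & 0 \\ 0 & \lambda & -\lambda \\ 0 & 0 & 2\mu \end{pmatrix}
\begin{pmatrix} z_1 \\ z_2 \\ z_3 \end{pmatrix}.
```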
The Linear Parameter-Varying (LPV) framework provides a modeling and control design toolchain to address nonlinear (NL) system behavior via linear surrogate models. Despite major research effort on LPV data-driven modeling, a key shortcoming of the current identification theory is that the scheduling variable is often assumed to be a given, measured signal in the data set. When identifying an LPV model of an NL system, the selection of the scheduling map, which describes the relation to the measurable scheduling signal, is left to the user, with only limited supporting tools available. This choice, however, greatly affects the usability and complexity of the resulting LPV model. This paper presents a deep-learning-based approach for the joint estimation of a scheduling map and an LPV state-space model of an NL system from input-output data, with consistency guarantees under general innovation-type noise conditions. Its efficiency is demonstrated on a realistic identification problem.
The area of online machine learning in big data streams covers algorithms that are (1) distributed and (2) work from data streams with only a limited possibility to store past data. The first requirement mostly concerns software architectures and efficient algorithms. The second one also imposes nontrivial theoretical restrictions on the modeling methods: in the data stream model, older data is no longer available to revise earlier suboptimal modeling decisions as the fresh data arrives. In this article, we provide an overview of distributed software architectures and libraries as well as machine learning models for online learning. We highlight the most important ideas for classification, regression, recommendation, and unsupervised modeling from streaming data, and we show how they are implemented in various distributed data stream processing systems. This article is a reference material and not a survey. We do not attempt to be comprehensive in describing all existing methods and solutions; rather, we give pointers to the most important resources in the field. All related subfields (online algorithms, online learning, and distributed data processing) are highly active areas of current research and development, with conceptually new research results and software components emerging at the time of writing. In this article, we refer to several survey results, both for distributed data processing and for online machine learning. Compared to past surveys, our article differs in that it discusses recommender systems in greater detail.
We analyze how the divisibility restrictions in the famous equation a^n + b^n = c^n of Fermat's Last Theorem (FLT, 1637) exclude many triples from being Fermat triples (i.e. solutions) for n greater than 2, decreasing the cardinality of Fermat triples. In our analysis, the positive integer (PI) solutions (a,b,c,n) are restricted, up to the point where none remain, not via restrictions on the power n through the decreasing chain of sets {PI} ⊃ {odd} ⊃ {primes} ⊃ {regular primes}, etc., as in the literature, but via the exclusion of more and more values of c through the increasing chain of sets {primes p} ⊂ {p^k} ⊂ {PI}. The divisibility and co-primality properties in the Fermat equation are analyzed in relation to the exclusion of solutions, and the effect of the simultaneous values of gcd(a,b,c), gcd(a+b, c^n), gcd(c-a, b^n) and gcd(c-b, a^n) on the decrease of the cardinality of solutions is exhibited. Our derivation thus focuses mainly on the variable c rather than on the power n, in contrast to the literature, in which the FLT has historically been treated case by case according to the value of n. Among the most famous cases are the roughly 2500-year-old Pythagorean triples (a,b,c, n=2), which do exist, and the first milestones of non-existence, proved less than 400 years ago (n=3 by Gauss and later by Euler (1753), and n=4 by Fermat). As is known, Wiles proved the FLT in 1995 in an abstract, roundabout way. The cases n<0 and n:=1/m, as well as complex and quaternion (a,b,c), with a focus on Pythagorean triples, are also commented on. For odd powers, the FLT over the quaternions fails.
Using Artificial Neural Networks (ANN) for nonlinear system identification has proven to be a promising approach, but despite all recent research efforts, many practical and theoretical problems still remain open. Specifically, noise handling and noise models, together with issues of consistency and reliable estimation under minimisation of the prediction error, are the most severe problems. The latter comes with numerous practical challenges, such as the explosion of the computational cost in terms of the number of data samples and the occurrence of instabilities during optimization. In this paper, we aim to overcome these issues by proposing a method which uses a truncated prediction loss and a subspace encoder for state estimation. The truncated prediction loss is computed by selecting multiple truncated subsections from the time series and computing the average prediction loss. To obtain a computationally efficient estimation method that minimizes the truncated prediction loss, a subspace encoder represented by an artificial neural network is introduced. This encoder aims to approximate the state reconstructability map of the estimated model to provide an initial state for each truncated subsection given past inputs and outputs. By theoretical analysis, we show that, under mild conditions, the proposed method is locally consistent, increases optimization stability, and achieves increased data efficiency by allowing for overlap between the subsections. Lastly, we provide practical insights and user guidelines based on a numerical example and state-of-the-art benchmark results.
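As a rough sketch of the truncated-loss idea described above (the state-transition model, output map, and encoder are passed in as generic callables; the section length, number of sections, and past-window length are illustrative assumptions, not the settings of the paper):

```python
import numpy as np

def truncated_prediction_loss(u, y, f, h, encoder, T=50, n_past=10, n_sections=64, rng=None):
    """Average T-step-ahead prediction loss over randomly placed subsections.

    f(x, u)          -> next state        (stand-in for the state-transition network)
    h(x)             -> predicted output  (stand-in for the output network)
    encoder(up, yp)  -> initial state     (stand-in for the reconstructability-map network)
    """
    rng = rng or np.random.default_rng(0)
    N = len(y)
    starts = rng.integers(n_past, N - T, size=n_sections)   # subsections may overlap
    loss = 0.0
    for s in starts:
        x = encoder(u[s - n_past:s], y[s - n_past:s])        # initial state from past I/O
        sec_loss = 0.0
        for k in range(T):
            y_hat = h(x)
            sec_loss += np.mean((y[s + k] - y_hat) ** 2)     # per-step prediction error
            x = f(x, u[s + k])                               # roll the model forward
        loss += sec_loss / T
    return loss / n_sections
```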
Context. Recent observational data show that the Milky Way (MW) galaxy contains about 170 globular clusters (GCs). A fraction of them was likely formed in dwarf galaxies accreted onto the MW in the past, while the remaining clusters were formed in-situ. Therefore, the different parameters of the globular clusters, including their orbits, are a valuable tool for studying the evolution of the Milky Way. However, since the evolution of the 3D mass distribution of the MW is poorly constrained, the orbits of the clusters are usually calculated in static potentials. Aims. In this work, we study the evolution of the GCs in several external potentials, aiming to quantify the effects of an evolving galactic potential on the orbits of the GCs. Methods. For the orbit calculations we used five MW-like potentials from the IllustrisTNG-100 simulation. The orbits of 159 GCs were integrated using the high-order parallel N-body code phi-GPU, with initial conditions obtained from recent Gaia DR3 catalogues. Results. We provide a classification of the GC orbits according to their 3D shapes and their association with different components of the MW (disk, halo, bulge). We also found that the globular clusters in the external potentials have roughly similar energy-angular momentum distributions at the present time. However, both the total energy and the total angular momentum of the GCs are not conserved due to the time-varying nature of the potentials. In some extreme cases, the total energy can change by up to 40% (18 objects) over the last 5 Gyr of evolution. We found that the in-situ formed GCs are less affected by the evolution of the TNG potentials than the clusters which were likely formed ex-situ. Therefore, our results suggest that time-varying potentials significantly affect the orbits of the GCs, making it vital to take them into account for understanding the formation of the MW.
Automated driving systems are often used for lane keeping tasks. These systems plan a local path ahead of the vehicle. However, human drivers often find these paths unnatural. We propose a linear driver model that can calculate node points reflecting the preferences of human drivers; based on these node points, a motion path preferred by human drivers can be designed for autonomous driving. The model input is the road curvature. We apply this model to a self-developed Euler-curve-based curve fitting algorithm. Through a case study, we show that the model-based planned path can reproduce the average behavior of human curve path selection. We analyze the performance of the proposed model through a statistical analysis that shows the validity of the captured relations.
Identifying systems with high-dimensional inputs and outputs, such as systems measured by video streams, is a challenging problem with numerous applications in robotics, autonomous vehicles, and medical imaging. In this paper, we propose a novel non-linear state-space identification method starting from high-dimensional input and output data. Multiple computational and conceptual advances are combined to handle the high-dimensional nature of the data. An encoder function, represented by a neural network, is introduced to learn a reconstructability map to estimate the model states from past inputs and outputs. This encoder function is jointly learned with the dynamics. Furthermore, multiple computational improvements, such as an improved reformulation of multiple shooting and batch optimization, are proposed to keep the computational time under control when dealing with high-dimensional and large datasets. We apply the proposed method to a video stream of a simulated environment of a controllable ball in a unit box. The study shows a low simulation error and excellent long-term prediction capability of the model obtained using the proposed method.
Many real-world networks exhibit correlations between the node degrees. For instance, in social networks nodes tend to connect to nodes of similar degree. Conversely, in biological and technological networks, high-degree nodes tend to be linked with low-degree nodes. Degree correlations also affect the dynamics of processes supported by a network structure, such as the spread of opinions or epidemics. The proper modelling of these systems, i.e., without uncontrolled biases, requires the sampling of networks with a specified set of constraints. We present a solution to the sampling problem when the constraints imposed are the degree correlations. In particular, we develop an efficient and exact method to construct and sample graphs with a specified joint-degree matrix, which is a matrix providing the number of edges between all the sets of nodes of a given degree, for all degrees, thus completely specifying all pairwise degree correlations, and additionally, the degree sequence itself. Our algorithm always produces independent samples without backtracking. The complexity of the graph construction algorithm is O(NM) where N is the number of nodes and M is the number of edges.
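The construction and sampling algorithm itself is the subject of the paper; the sketch below only illustrates the constraint object, computing the joint-degree matrix of a given simple graph from its edge list (counting each edge once and mirroring it to both ordered degree pairs is an assumed convention):

```python
from collections import Counter, defaultdict

def joint_degree_matrix(edges):
    """J[k][l] = number of edges between a degree-k node and a degree-l node."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    J = defaultdict(Counter)
    for u, v in edges:
        k, l = deg[u], deg[v]
        J[k][l] += 1
        if k != l:
            J[l][k] += 1          # keep the matrix symmetric
    return {k: dict(row) for k, row in J.items()}

# Example: a path on four nodes 0-1-2-3
# joint_degree_matrix([(0, 1), (1, 2), (2, 3)]) -> {1: {2: 2}, 2: {1: 2, 2: 1}}
```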
Competitive balance, which refers to the level of control teams have over a sports competition, is a crucial indicator for tournament organisers. According to previous studies, competitive balance has significantly declined in the UEFA Champions League group stage over the recent decades. Our paper introduces alternative indices to investigate this issue. Two ex ante measures are based on Elo ratings, and four dynamic concentration indicators compare the final group ranking to reasonable benchmarks. Using these indices, we find no evidence of any long-run trend in the competitive balance of the UEFA Champions League group stage between the 2003/04 and 2023/24 seasons.
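For reference, the standard Elo expected-score formula on which such ex ante measures are typically built (the paper's exact index construction may differ) is

```latex
% Expected score of team A against team B under the standard Elo model.
E_A = \frac{1}{1 + 10^{(R_B - R_A)/400}}
```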
An important issue in model-based control design is that an accurate dynamic model of the system is generally nonlinear, complex, and costly to obtain. This limits achievable control performance in practice. Gaussian process (GP) based estimation of system models is an effective tool to learn unknown dynamics directly from input/output data. However, conventional GP-based control methods often ignore the computational cost associated with accumulating data during the operation of the system and how to handle forgetting during continuous adaptation. In this paper, we present a novel Dual Gaussian Process (DGP) based model predictive control (MPC) strategy that enables efficient use of online-learning-based predictive control without the danger of catastrophic forgetting. The bio-inspired DGP structure is a combination of a long-term GP and a short-term GP, where the long-term GP is used to keep the learned knowledge in memory and the short-term GP is employed to rapidly compensate for unknown dynamics during online operation. Furthermore, a novel recursive online update strategy for the short-term GP is proposed to successively improve the learnt model during online operation. The effectiveness of the proposed strategy is demonstrated via numerical simulations.
Variable accretion in young stellar objects reveals itself photometrically and spectroscopically over a continuum of timescales and amplitudes. Most dramatic are the large outbursts (e.g., FU Ori, V1647 Ori, and EX Lup type events), but more frequent are the less coherent, smaller burst-like variations in accretion rate. Improving our understanding of time-variable accretion directly addresses the fundamental question of how stars gain their masses. We review variability phenomena, as characterized from observations across the wavelength spectrum, and how those observations probe underlying physical conditions. The diversity of observed lightcurves and spectra at optical and infrared wavelengths defies a simple classification of outbursts and bursts into well-defined categories. Mid-infrared and submillimeter wavelengths are sensitive to lower-temperature phenomena and more embedded, younger sources, and it is currently unclear if observed flux variations probe similar or distinct physics relative to the shorter wavelengths. We highlight unresolved issues and emphasize the value of spectroscopy, multiwavelength studies, and ultimately patience in using variable accretion to understand stellar mass assembly.
The paper suggests a generalization of the Sign-Perturbed Sums (SPS) finite-sample system identification method for the identification of closed-loop observable stochastic linear systems in state-space form. The solution builds on the theory of matrix-variate regression and instrumental variable methods to construct distribution-free confidence regions for the state-space matrices. Both direct and indirect identification are studied, and the exactness as well as the strong consistency of the construction are proved. Furthermore, a new, computationally efficient ellipsoidal outer-approximation algorithm for the confidence regions is proposed. The new construction results in a semidefinite optimization problem with an order-of-magnitude smaller number of constraints than if one applied the ellipsoidal outer approximation after vectorization. The effectiveness of the approach is also demonstrated empirically via a series of numerical experiments.
The importance of proper data normalization for deep neural networks is well known. However, in continuous-time state-space model estimation, it has been observed that improper normalization of the hidden state or the hidden state derivative of the model estimate, or even of the time interval, can lead to numerical and optimization challenges with deep-learning-based methods. This results in reduced model quality. In this contribution, we show that these three normalization tasks are inherently coupled. Due to the existence of this coupling, we propose a solution to all three normalization challenges by introducing a normalization constant at the state derivative level. We show that the appropriate choice of the normalization constant is related to the dynamics of the to-be-identified system, and we derive multiple methods for obtaining an effective normalization constant. We compare and discuss all the normalization strategies on a benchmark problem based on experimental data from a cascaded tanks system, and we compare our results with other methods from the identification literature.
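Reflecting only the abstract's statement that the constant enters at the state-derivative level, a minimal sketch of a normalized rollout might look as follows (the network f_theta, the explicit Euler integrator, and the way tau is chosen are assumptions, not the paper's prescription):

```python
import numpy as np

def simulate_normalized(f_theta, x0, u_seq, dt, tau):
    """Forward-Euler rollout of x' = tau * f_theta(x, u).

    tau is a scalar normalization constant, assumed here to be chosen to match
    the dominant time scale of the system so that f_theta can operate on
    signals of order one (illustrative choice only).
    """
    x = np.asarray(x0, dtype=float)
    traj = [x.copy()]
    for u in u_seq:
        x = x + dt * tau * f_theta(x, u)
        traj.append(x.copy())
    return np.array(traj)
```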
The eigenvalue method, suggested by the developer of the extensively used Analytic Hierarchy Process methodology, exhibits right-left asymmetry: the priorities derived from the right eigenvector do not necessarily coincide with the priorities derived from the reciprocal left eigenvector. This paper offers a comprehensive numerical experiment to compare the two eigenvector-based weighting procedures and their reasonable alternative of the row geometric mean with respect to four measures. The underlying pairwise comparison matrices are constructed randomly with different dimensions and levels of inconsistency. The disagreement between the two eigenvectors turns out to be not always a monotonic function of these important characteristics of the matrix. The ranking contradictions can affect alternatives with relatively distant priorities. The row geometric mean is found to be almost at the midpoint between the right and inverse left eigenvectors, making it a straightforward compromise between them.
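As a concrete reading of the three weighting procedures being compared, a minimal numpy sketch (the paper's distance measures and random matrix generation are not reproduced here) is:

```python
import numpy as np

def right_eigenvector_weights(A):
    """Priorities from the principal right eigenvector of the pairwise comparison matrix."""
    vals, vecs = np.linalg.eig(A)
    w = np.abs(np.real(vecs[:, np.argmax(np.real(vals))]))
    return w / w.sum()

def inverse_left_eigenvector_weights(A):
    """Priorities from the elementwise reciprocal of the principal left eigenvector."""
    w = 1.0 / right_eigenvector_weights(A.T)   # left eigenvector of A = right eigenvector of A^T
    return w / w.sum()

def row_geometric_mean_weights(A):
    """Priorities from the row geometric means (logarithmic least squares)."""
    w = np.exp(np.log(A).mean(axis=1))
    return w / w.sum()

# Example on a slightly inconsistent 3x3 reciprocal matrix:
# A = np.array([[1, 2, 6], [1/2, 1, 2], [1/6, 1/2, 1]])
```

On a fully consistent matrix all three vectors coincide; on inconsistent matrices they generally differ, which is the asymmetry the numerical experiment quantifies.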
We present a new technique to generate the light curves of RRab stars in different photometric bands (the I and V bands) using Artificial Neural Networks (ANN). A pre-computed grid of models was used to train the ANN, and the architecture was tuned using the I-band light curves. The best-performing network was adopted to make the final interpolators in the I and V bands. The trained interpolators were used to predict the light curves of RRab stars in the Magellanic Clouds, and the distances to the LMC and SMC were determined based on the reddening-independent Wesenheit index. The estimated distances are in good agreement with the literature. The comparison of the predicted and observed amplitudes and Fourier amplitude ratios showed good agreement, but the Fourier phase parameters displayed a few discrepancies. To showcase the utility of the interpolators, the light curve of the RRab star EZ Cnc was generated and compared with the observed light curve from the Kepler mission. The reported distance to EZ Cnc was found to be in excellent agreement with the updated parallax measurement from Gaia EDR3. Our ANN interpolator provides a fast and efficient technique to generate a smooth grid of model light curves for a wide range of physical parameters, which is computationally expensive and time-consuming using stellar pulsation codes.
In this paper, we present a virtual control contraction metric (VCCM) based nonlinear parameter-varying (NPV) approach to design a state-feedback controller for a control moment gyroscope (CMG) to track a user-defined trajectory set. This VCCM-based nonlinear stabilization and performance synthesis approach, which is similar to linear parameter-varying (LPV) control approaches, makes it possible to achieve exact guarantees of exponential stability and L2-gain performance on nonlinear systems with respect to all trajectories from the predetermined set, which is not the case with conventional LPV methods. Simulation and experimental studies conducted in both the fully- and under-actuated operating modes of the CMG show the effectiveness of this approach compared to standard LPV control methods.