Institute for Computer Science and Control
Stochastic multi-armed bandits (MABs) provide a fundamental reinforcement learning model to study sequential decision making in uncertain environments. The upper confidence bounds (UCB) algorithm gave birth to the renaissance of bandit algorithms, as it achieves near-optimal regret rates under various moment assumptions. Up until recently most UCB methods relied on concentration inequalities leading to confidence bounds which depend on moment parameters, such as the variance proxy, that are usually unknown in practice. In this paper, we propose a new distribution-free, data-driven UCB algorithm for symmetric reward distributions, which needs no moment information. The key idea is to combine a refined, one-sided version of the recently developed resampled median-of-means (RMM) method with UCB. We prove a near-optimal regret bound for the proposed anytime, parameter-free RMM-UCB method, even for heavy-tailed distributions.
This paper presents a multi-step procedure to construct the dynamic motion model of an autonomous quadcopter, identify the model parameters, and design a model-based nonlinear trajectory tracking controller. The aim of the proposed method is to speed up the commissioning of a new quadcopter design, i.e., to enable the drone to perform agile maneuvers with high precision in the shortest time possible. After a brief introduction to the theoretical background of the modelling and control design, the steps of the proposed method are presented using the example of a self-developed quadcopter platform. The performance of the method is tested and evaluated by real flight experiments.
The area of online machine learning in big data streams covers algorithms that are (1) distributed and (2) work from data streams with only a limited possibility to store past data. The first requirement mostly concerns software architectures and efficient algorithms. The second one also imposes nontrivial theoretical restrictions on the modeling methods: In the data stream model, older data is no longer available to revise earlier suboptimal modeling decisions as the fresh data arrives. In this article, we provide an overview of distributed software architectures and libraries as well as machine learning models for online learning. We highlight the most important ideas for classification, regression, recommendation, and unsupervised modeling from streaming data, and we show how they are implemented in various distributed data stream processing systems. This article is a reference material and not a survey. We do not attempt to be comprehensive in describing all existing methods and solutions; rather, we give pointers to the most important resources in the field. All related sub-fields, online algorithms, online learning, and distributed data processing are hugely dominant in current research and development with conceptually new research results and software components emerging at the time of writing. In this article, we refer to several survey results, both for distributed data processing and for online machine learning. Compared to past surveys, our article is different because we discuss recommender systems in extended detail.
84
The Linear Parameter-Varying (LPV) framework provides a modeling and control design toolchain to address nonlinear (NL) system behavior via linear surrogate models. Despite major research effort on LPV data-driven modeling, a key shortcoming of the current identification theory is that often the scheduling variable is assumed to be a given measured signal in the data set. In case of identifying an LPV model of a NL system, the selection of the scheduling map, which describes the relation to the measurable scheduling signal, is put on the users' shoulder, with only limited supporting tools available. This choice however greatly affects the usability and complexity of the resulting LPV model. This paper presents a deep-learning-based approach to provide joint estimation of a scheduling map and an LPV state-space model of a NL system from input-output data, and has consistency guarantees under general innovation-type noise conditions. Its efficiency is demonstrated on a realistic identification problem.
The Koopman framework is a popular approach to transform a finite dimensional nonlinear system into an infinite dimensional, but linear model through a lifting process, using so-called observable functions. While there is an extensive theory on infinite dimensional representations in the operator sense, there are few constructive results on how to select the observables to realize them. When it comes to the possibility of finite Koopman representations, which are highly important form a practical point of view, there is no constructive theory. Hence, in practice, often a data-based method and ad-hoc choice of the observable functions is used. When truncating to a finite number of basis, there is also no clear indication of the introduced approximation error. In this paper, we propose a systematic method to compute the finite dimensional Koopman embedding of a specific class of polynomial nonlinear systems in continuous-time such that, the embedding, without approximation, can fully represent the dynamics of the nonlinear system.
Using Artificial Neural Networks (ANN) for nonlinear system identification has proven to be a promising approach, but despite of all recent research efforts, many practical and theoretical problems still remain open. Specifically, noise handling and models, issues of consistency and reliable estimation under minimisation of the prediction error are the most severe problems. The latter comes with numerous practical challenges such as explosion of the computational cost in terms of the number of data samples and the occurrence of instabilities during optimization. In this paper, we aim to overcome these issues by proposing a method which uses a truncated prediction loss and a subspace encoder for state estimation. The truncated prediction loss is computed by selecting multiple truncated subsections from the time series and computing the average prediction loss. To obtain a computationally efficient estimation method that minimizes the truncated prediction loss, a subspace encoder represented by an artificial neural network is introduced. This encoder aims to approximate the state reconstructability map of the estimated model to provide an initial state for each truncated subsection given past inputs and outputs. By theoretical analysis, we show that, under mild conditions, the proposed method is locally consistent, increases optimization stability, and achieves increased data efficiency by allowing for overlap between the subsections. Lastly, we provide practical insights and user guidelines employing a numerical example and state-of-the-art benchmark results.
Automated driving systems are often used for lane keeping tasks. By these systems, a local path is planned ahead of the vehicle. However, these paths are often found unnatural by human drivers. We propose a linear driver model, which can calculate node points that reflect the preferences of human drivers and based on these node points a human driver preferred motion path can be designed for autonomous driving. The model input is the road curvature. We apply this model to a self-developed Euler-curve-based curve fitting algorithm. Through a case study, we show that the model based planned path can reproduce the average behavior of human curve path selection. We analyze the performance of the proposed model through statistical analysis that shows the validity of the captured relations.
Competitive balance, which refers to the level of control teams have over a sports competition, is a crucial indicator for tournament organisers. According to previous studies, competitive balance has significantly declined in the UEFA Champions League group stage over the recent decades. Our paper introduces alternative indices to investigate this issue. Two ex ante measures are based on Elo ratings, and four dynamic concentration indicators compare the final group ranking to reasonable benchmarks. Using these indices, we find no evidence of any long-run trend in the competitive balance of the UEFA Champions League group stage between the 2003/04 and 2023/24 seasons.
The paper suggests a generalization of the Sign-Perturbed Sums (SPS) finite sample system identification method for the identification of closed-loop observable stochastic linear systems in state-space form. The solution builds on the theory of matrix-variate regression and instrumental variable methods to construct distribution-free confidence regions for the state-space matrices. Both direct and indirect identification are studied, and the exactness as well as the strong consistency of the construction are proved. Furthermore, a new, computationally efficient ellipsoidal outer-approximation algorithm for the confidence regions is proposed. The new construction results in a semidefinite optimization problem which has an order-of-magnitude smaller number of constraints, as if one applied the ellipsoidal outer-approximation after vectorization. The effectiveness of the approach is also demonstrated empirically via a series of numerical experiments.
An important issue in model-based control design is that an accurate dynamic model of the system is generally nonlinear, complex, and costly to obtain. This limits achievable control performance in practice. Gaussian process (GP) based estimation of system models is an effective tool to learn unknown dynamics directly from input/output data. However, conventional GP-based control methods often ignore the computational cost associated with accumulating data during the operation of the system and how to handle forgetting in continuous adaption. In this paper, we present a novel Dual Gaussian Process (DGP) based model predictive control (MPC) strategy that enables efficient use of online learning based predictive control without the danger of catastrophic forgetting. The bio-inspired DGP structure is a combination of a long-term GP and a short-term GP, where the long-term GP is used to keep the learned knowledge in memory and the short-term GP is employed to rapidly compensate unknown dynamics during online operation. Furthermore, a novel recursive online update strategy for the short-term GP is proposed to successively improve the learnt model during online operation. Effectiveness of the proposed strategy is demonstrated via numerical simulations.
In this paper, we present a virtual control contraction metric (VCCM) based nonlinear parameter-varying (NPV) approach to design a state-feedback controller for a control moment gyroscope (CMG) to track a user-defined trajectory set. This VCCM based nonlinear stabilization and performance synthesis approach, which is similar to linear parameter-varying (LPV) control approaches, allows to achieve exact guarantees of exponential stability and L2\mathcal{L}_2-gain performance on nonlinear systems with respect to all trajectories from the predetermined set, which is not the case with the conventional LPV methods. Simulation and experimental studies conducted in both fully- and under-actuated operating modes of the CMG show effectiveness of this approach compared to standard LPV control methods.
Many real-world networks exhibit correlations between the node degrees. For instance, in social networks nodes tend to connect to nodes of similar degree. Conversely, in biological and technological networks, high-degree nodes tend to be linked with low-degree nodes. Degree correlations also affect the dynamics of processes supported by a network structure, such as the spread of opinions or epidemics. The proper modelling of these systems, i.e., without uncontrolled biases, requires the sampling of networks with a specified set of constraints. We present a solution to the sampling problem when the constraints imposed are the degree correlations. In particular, we develop an efficient and exact method to construct and sample graphs with a specified joint-degree matrix, which is a matrix providing the number of edges between all the sets of nodes of a given degree, for all degrees, thus completely specifying all pairwise degree correlations, and additionally, the degree sequence itself. Our algorithm always produces independent samples without backtracking. The complexity of the graph construction algorithm is O(NM) where N is the number of nodes and M is the number of edges.
The importance of proper data normalization for deep neural networks is well known. However, in continuous-time state-space model estimation, it has been observed that improper normalization of either the hidden state or hidden state derivative of the model estimate, or even of the time interval can lead to numerical and optimization challenges with deep learning based methods. This results in a reduced model quality. In this contribution, we show that these three normalization tasks are inherently coupled. Due to the existence of this coupling, we propose a solution to all three normalization challenges by introducing a normalization constant at the state derivative level. We show that the appropriate choice of the normalization constant is related to the dynamics of the to-be-identified system and we derive multiple methods of obtaining an effective normalization constant. We compare and discuss all the normalization strategies on a benchmark problem based on experimental data from a cascaded tanks system and compare our results with other methods of the identification literature.
Identifying systems with high-dimensional inputs and outputs, such as systems measured by video streams, is a challenging problem with numerous applications in robotics, autonomous vehicles and medical imaging. In this paper, we propose a novel non-linear state-space identification method starting from high-dimensional input and output data. Multiple computational and conceptual advances are combined to handle the high-dimensional nature of the data. An encoder function, represented by a neural network, is introduced to learn a reconstructability map to estimate the model states from past inputs and outputs. This encoder function is jointly learned with the dynamics. Furthermore, multiple computational improvements, such as an improved reformulation of multiple shooting and batch optimization, are proposed to keep the computational time under control when dealing with high-dimensional and large datasets. We apply the proposed method to a video stream of a simulated environment of a controllable ball in a unit box. The study shows low simulation error with excellent long term prediction capability of the model obtained using the proposed method.
In this paper, we present a novel approach to combine data-driven non-parametric representations with model-based representations of dynamical systems. Based on a data-driven form of linear fractional transformations, we introduce a data-driven form of generalized plants. This form can be leveraged to accomplish performance characterizations, e.g., in the form of a mixed-sensitivity approach, and LMI-based conditions to verify finite-horizon dissipativity. In particular, we show how finite-horizon 2\ell_2-gain under weighting filter-based general performance specifications are verified for implemented controllers on systems for which only input-output data is available. The overall effectiveness of the proposed method is demonstrated by simulation examples.
Through the use of the Fundamental Lemma for linear systems, a direct data-driven state-feedback control synthesis method is presented for a rather general class of nonlinear (NL) systems. The core idea is to develop a data-driven representation of the so-called velocity-form, i.e., the time-difference dynamics, of the NL system, which is shown to admit a direct linear parameter-varying (LPV) representation. By applying the LPV extension of the Fundamental Lemma in this velocity domain, a state-feedback controller is directly synthesized to provide asymptotic stability and dissipativity of the velocity-form. By using realization theory, the synthesized controller is realized as a NL state-feedback law for the original unknown NL system with guarantees of universal shifted stability and dissipativity, i.e., stability and dissipativity w.r.t. any (forced) equilibrium point, of the closed-loop behavior. This is achieved by the use of a single sequence of data from the system and a predefined basis function set to span the scheduling map. The applicability of the results is demonstrated on a simulation example of an unbalanced disc.
In recent years, the Linear Parameter-Varying (LPV) framework has become increasingly useful for analysis and control of time-varying systems. Generally, LPV control synthesis is performed in the continuous-time (CT) domain due to significantly more intuitive performance shaping methods in CT. However, the main complication of CT synthesis approaches is the successive implementation of the resulting CT control solutions on physical hardware. In the literature, several discretization methods have been developed for LPV systems. However, most of these approaches necessitate heavy nonlinear operations introduced by the discretization of these time-varying matrices or can introduce significant approximation error, thereby severely limiting implementation capabilities of CT LPV control solutions. Alternatively, the ww' discretization approach has been introduced in the LTI case to allow for preservation of the CT control. Based on these observations, this paper aims at extending the ww' discretization approach to LPV systems, such that implementation of CT LPV control solutions on physical hardware is simplified.
Based on the Fundamental Lemma by Willems et al., the entire behaviour of a Linear Time-Invariant (LTI) system can be characterised by a single data sequence of the system as long the input is persistently exciting. This is an essential result for data-driven analysis and control. In this work, we aim to generalise this LTI result to Linear Parameter-Varying (LPV) systems. Based on the behavioural framework for LPV systems, we prove that one can obtain a result similar to Willems'. Based on an LPV representation, i.e., embedding, of nonlinear systems, this allows the application of the Fundamental Lemma for systems beyond the linear class.
Automated driving applications require accurate vehicle specific models to precisely predict and control the motion dynamics. However, modern vehicles have a wide array of digital and mechatronic components that are difficult to model, manufactures do not disclose all details required for modelling and even existing models of subcomponents require coefficient estimation to match the specific characteristics of each vehicle and their change over time. Hence, it is attractive to use data-driven modelling to capture the relevant vehicle dynamics and synthesise model-based control solutions. In this paper, we address identification of the steering system of an autonomous car based on measured data. We show that the underlying dynamics are highly nonlinear and challenging to be captured, necessitating the use of data-driven methods that fuse the approximation capabilities of learning and the efficiency of dynamic system identification. We demonstrate that such a neural network based subspace-encoder method can successfully capture the underlying dynamics while other methods fall short to provide reliable results.
In order to make data-driven models of physical systems interpretable and reliable, it is essential to include prior physical knowledge in the modeling framework. Hamiltonian Neural Networks (HNNs) implement Hamiltonian theory in deep learning and form a comprehensive framework for modeling autonomous energy-conservative systems. Despite being suitable to estimate a wide range of physical system behavior from data, classical HNNs are restricted to systems without inputs and require noiseless state measurements and information on the derivative of the state to be available. To address these challenges, this paper introduces an Output Error Hamiltonian Neural Network (OE-HNN) modeling approach to address the modeling of physical systems with inputs and noisy state measurements. Furthermore, it does not require the state derivatives to be known. Instead, the OE-HNN utilizes an ODE-solver embedded in the training process, which enables the OE-HNN to learn the dynamics from noisy state measurements. In addition, extending HNNs based on the generalized Hamiltonian theory enables to include external inputs into the framework which are important for engineering applications. We demonstrate via simulation examples that the proposed OE-HNNs results in superior modeling performance compared to classical HNNs.
There are no more papers matching your filters at the moment.