alphaXiv

History

Papers Benchmarks

Volkswagen Group

149

26 Aug 2021

computer-science disordered-systems-and-neural-networks machine-learning

TensorFlow Quantum: A Software Framework for Quantum Machine Learning

University of Waterloo

Google Research Fermi National Accelerator Laboratory USRA Ludwig-Maximilian-University NASA University Erlangen-Nürnberg Volkswagen Group X Volkswagen Group Advanced Technologies

We introduce TensorFlow Quantum (TFQ), an open source library for the rapid prototyping of hybrid quantum-classical models for classical or quantum data. This framework offers high-level abstractions for the design and training of both discriminative and generative quantum models under TensorFlow and supports high-performance quantum circuit simulators. We provide an overview of the software architecture and building blocks through several examples and review the theory of hybrid quantum-classical neural networks. We illustrate TFQ functionalities via several basic applications including supervised learning for quantum classification, quantum control, simulating noisy quantum circuits, and quantum approximate optimization. Moreover, we demonstrate how one can apply TFQ to tackle advanced quantum learning tasks including meta-learning, layerwise learning, Hamiltonian learning, sampling thermal states, variational quantum eigensolvers, classification of quantum phase transitions, generative adversarial networks, and reinforcement learning. We hope this framework provides the necessary tools for the quantum computing and machine learning research communities to explore models of both natural and artificial quantum systems, and ultimately discover new quantum algorithms which could potentially yield a quantum advantage.

18 Jul 2024

computer-science machine-learning robotics

LIMT: Language-Informed Multi-Task Visual World Models

ETH Zurich University of Zurich

Technical University of Munich Eötvös Loránd University Volkswagen Group

Most recent successes in robot reinforcement learning involve learning a specialized single-task agent. However, robots capable of performing multiple tasks can be much more valuable in real-world applications. Multi-task reinforcement learning can be very challenging due to the increased sample complexity and the potentially conflicting task objectives. Previous work on this topic is dominated by model-free approaches. The latter can be very sample inefficient even when learning specialized single-task agents. In this work, we focus on model-based multi-task reinforcement learning. We propose a method for learning multi-task visual world models, leveraging pre-trained language models to extract semantically meaningful task representations. These representations are used by the world model and policy to reason about task similarity in dynamics and behavior. Our results highlight the benefits of using language-driven task representations for world models and a clear advantage of model-based multi-task learning over the more common model-free paradigm.

04 Aug 2020

autonomous-vehicles computer-science artificial-intelligence

Learning to Fly via Deep Model-Based Reinforcement Learning

Max Planck Institute for Intelligent Systems Technical University of Darmstadt Volkswagen Group

Learning to control robots without requiring engineered models has been a long-term goal, promising diverse and novel applications. Yet, reinforcement learning has only achieved limited impact on real-time robot control due to its high demand of real-world interactions. In this work, by leveraging a learnt probabilistic model of drone dynamics, we learn a thrust-attitude controller for a quadrotor through model-based reinforcement learning. No prior knowledge of the flight dynamics is assumed; instead, a sequential latent variable model, used generatively and as an online filter, is learnt from raw sensory input. The controller and value function are optimised entirely by propagating stochastic analytic gradients through generated latent trajectories. We show that "learning to fly" can be achieved with less than 30 minutes of experience with a single drone, and can be deployed solely using onboard computational resources and sensors, on a self-built drone.

12 Aug 2025

computer-science artificial-intelligence machine-learning

TechOps: Technical Documentation Templates for the AI Act

Technical University of Munich Zalando SE Volkswagen Group ELTE University Budapest Capgemini Deutschland GmbH Foundation Robotics

Operationalizing the EU AI Act requires clear technical documentation to ensure AI systems are transparent, traceable, and accountable. Existing documentation templates for AI systems do not fully cover the entire AI lifecycle while meeting the technical documentation requirements of the AI Act. This paper addresses those shortcomings by introducing open-source templates and examples for documenting data, models, and applications to provide sufficient documentation for certifying compliance with the AI Act. These templates track the system status over the entire AI lifecycle, ensuring traceability, reproducibility, and compliance with the AI Act. They also promote discoverability and collaboration, reduce risks, and align with best practices in AI documentation and governance. The templates are evaluated and refined based on user feedback to enable insights into their usability and implementability. We then validate the approach on real-world scenarios, providing examples that further guide their implementation: the data template is followed to document a skin tones dataset created to support fairness evaluations of downstream computer vision models and human-centric applications; the model template is followed to document a neural network for segmenting human silhouettes in photos. The application template is tested on a system deployed for construction site safety using real-time video analytics and sensor data. Our results show that TechOps can serve as a practical tool to enable oversight for regulatory compliance and responsible AI development.

29 Apr 2024

computer-science machine-learning robotics

On the Role of the Action Space in Robot Manipulation Learning and Sim-to-Real Transfer

Volkswagen Group

We study the choice of action space in robot manipulation learning and sim-to-real transfer. We define metrics that assess the performance, and examine the emerging properties in the different action spaces. We train over 250 reinforcement learning~(RL) agents in simulated reaching and pushing tasks, using 13 different control spaces. The choice of spaces spans combinations of common action space design characteristics. We evaluate the training performance in simulation and the transfer to a real-world environment. We identify good and bad characteristics of robotic action spaces and make recommendations for future designs. Our findings have important implications for the design of RL algorithms for robot manipulation tasks, and highlight the need for careful consideration of action spaces when training and transferring RL agents for real-world robotics.

08 Aug 2025

ai-for-health computer-science artificial-intelligence

A dataset of primary nasopharyngeal carcinoma MRI with multi-modalities segmentation

Zhongshan Hospital, Fudan University Volkswagen Group The First People’s Hospital of Foshan ELTE University The Second Affiliated Hospital of Anhui Medical University

Multi-modality magnetic resonance imaging(MRI) data facilitate the early diagnosis, tumor segmentation, and disease staging in the management of nasopharyngeal carcinoma (NPC). The lack of publicly available, comprehensive datasets limits advancements in diagnosis, treatment planning, and the development of machine learning algorithms for NPC. Addressing this critical need, we introduce the first comprehensive NPC MRI dataset, encompassing MR axial imaging of 277 primary NPC patients. This dataset includes T1-weighted, T2-weighted, and contrast-enhanced T1-weighted sequences, totaling 831 scans. In addition to the corresponding clinical data, manually annotated and labeled segmentations by experienced radiologists offer high-quality data resources from untreated primary NPC.

06 Nov 2020

physics quantum-physics

Beating classical heuristics for the binary paint shop problem with the quantum approximate optimization algorithm

Leiden University Volkswagen Group University Erlangen-N ̈urnberg (FAU)

The binary paint shop problem (BPSP) is an APX-hard optimization problem of the automotive industry. In this work, we show how to use the Quantum Approximate Optimization Algorithm (QAOA) to find solutions of the BPSP and demonstrate that QAOA with constant depth is able to beat classical heuristics on average in the infinite size limit

n\rightarrow\infty

. For the BPSP, it is known that no classical algorithm can exist which approximates the problem in polynomial runtime. We introduce a BPSP instance which is hard to solve with QAOA, and numerically investigate its performance and discuss QAOA's ability to generate approximate solutions. We complete our studies by running first experiments of small-sized instances on a trapped-ion quantum computer through AWS Braket.

27 Jan 2019

computer-science machine-learning statistics

Bayesian Learning of Neural Network Architectures

Volkswagen Group argmax.ai

In this paper we propose a Bayesian method for estimating architectural parameters of neural networks, namely layer size and network depth. We do this by learning concrete distributions over these parameters. Our results show that regular networks with a learnt structure can generalise better on small datasets, while fully stochastic networks can be more robust to parameter initialisation. The proposed method relies on standard neural variational learning and, unlike randomised architecture search, does not require a retraining of the model, thus keeping the computational overhead at minimum.

13 Nov 2020

physics quantum-physics

Quantum algorithms with local particle number conservation: noise effects and error correction

Volkswagen Group USRA Research Institute for Advanced Computer Science (RIACS)NASA, Ames Research Center University Erlangen-N Hernberg (FAU)University Erlangen-N burnberg (FAU)

Quantum circuits with local particle number conservation (LPNC) restrict the quantum computation to a subspace of the Hilbert space of the qubit register. In a noiseless or fault-tolerant quantum computation, such quantities are preserved. In the presence of noise, however, the evolution's symmetry could be broken and non-valid states could be sampled at the end of the computation. On the other hand, the restriction to a subspace in the ideal case suggest the possibility of more resource efficient error mitigation techniques for circuits preserving symmetries that are not possible for general circuits. Here, we analyze the probability of staying in such symmetry-preserved subspaces under noise, providing an exact formula for local depolarizing noise. We apply our findings to benchmark, under depolarizing noise, the symmetry robustness of XY-QAOA, which has local particle number conserving symmetries, and is a special case of the Quantum Alternating Operator Ansatz. We also analyze the influence of the choice of encoding the problem on the symmetry robustness of the algorithm and discuss a simple adaption of the bit flip code to correct for symmetry-breaking errors with reduced resources.

136

12 May 2025

computer-science computational-engineering-finance-and-science discrete-mathematics

QUEST: QUantum-Enhanced Shared Transportation

Technical University of Munich RWTH-Aachen Volkswagen Group Forschungszentrum J",

Volkswagen Group and academic researchers develop QUEST, a quantum-enhanced framework for optimizing vehicle pairing in shared transportation through aerodynamic drafting, implementing the solution using QAOA on IBM's quantum hardware while demonstrating successful problem encoding and identifying key hardware limitations for practical deployment.

09 Nov 2024

computer-science artificial-intelligence computers-and-society

A Collaborative, Human-Centred Taxonomy of AI, Algorithmic, and Automation Harms

New York University University of Pavia Aristotle University of Thessaloniki Dublin City University Trinity College Dublin TU Wien Heriot-Watt University European University Institute WU Wien Algoma University Volkswagen Group University of Maine School of Law SciencesPo Paris

This paper introduces a collaborative, human-centred taxonomy of AI, algorithmic and automation harms. We argue that existing taxonomies, while valuable, can be narrow, unclear, typically cater to practitioners and government, and often overlook the needs of the wider public. Drawing on existing taxonomies and a large repository of documented incidents, we propose a taxonomy that is clear and understandable to a broad set of audiences, as well as being flexible, extensible, and interoperable. Through iterative refinement with topic experts and crowdsourced annotation testing, we propose a taxonomy that can serve as a powerful tool for civil society organisations, educators, policymakers, product teams and the general public. By fostering a greater understanding of the real-world harms of AI and related technologies, we aim to increase understanding, empower NGOs and individuals to identify and report violations, inform policy discussions, and encourage responsible technology development and deployment.

12 Aug 2020

computer-science machine-learning generative-models

Learning Flat Latent Manifolds with VAEs

Volkswagen Group Autonomous Intelligent Driving GmbH

Measuring the similarity between data points often requires domain knowledge, which can in parts be compensated by relying on unsupervised methods such as latent-variable models, where similarity/distance is estimated in a more compact latent space. Prevalent is the use of the Euclidean metric, which has the drawback of ignoring information about similarity of data stored in the decoder, as captured by the framework of Riemannian geometry. We propose an extension to the framework of variational auto-encoders allows learning flat latent manifolds, where the Euclidean metric is a proxy for the similarity between data points. This is achieved by defining the latent space as a Riemannian manifold and by regularising the metric tensor to be a scaled identity matrix. Additionally, we replace the compact prior typically used in variational auto-encoders with a recently presented, more expressive hierarchical one---and formulate the learning problem as a constrained optimisation problem. We evaluate our method on a range of data-sets, including a video-tracking benchmark, where the performance of our unsupervised approach nears that of state-of-the-art supervised approaches, while retaining the computational efficiency of straight-line-based approaches.

23 Apr 2025

computer-science machine-learning imitation-learning

Overcoming Knowledge Barriers: Online Imitation Learning from Visual Observation with Pretrained World Models

Technical University of Munich Volkswagen Group Eötvös Loránd University Budapest

Pretraining and finetuning models has become increasingly popular in decision-making. But there are still serious impediments in Imitation Learning from Observation (ILfO) with pretrained models. This study identifies two primary obstacles: the Embodiment Knowledge Barrier (EKB) and the Demonstration Knowledge Barrier (DKB). The EKB emerges due to the pretrained models' limitations in handling novel observations, which leads to inaccurate action inference. Conversely, the DKB stems from the reliance on limited demonstration datasets, restricting the model's adaptability across diverse scenarios. We propose separate solutions to overcome each barrier and apply them to Action Inference by Maximising Evidence (AIME), a state-of-the-art algorithm. This new algorithm, AIME-NoB, integrates online interactions and a data-driven regulariser to mitigate the EKB. Additionally, it uses a surrogate reward function to broaden the policy's supported states, addressing the DKB. Our experiments on vision-based control tasks from the DeepMind Control Suite and MetaWorld benchmarks show that AIME-NoB significantly improves sample efficiency and converged performance, presenting a robust framework for overcoming the challenges in ILfO with pretrained models. Code available at this https URL

08 Feb 2018

computer-science machine-learning generative-models

Metrics for Deep Generative Models

Volkswagen Group

Neural samplers such as variational autoencoders (VAEs) or generative adversarial networks (GANs) approximate distributions by transforming samples from a simple random source---the latent space---to samples from a more complex distribution represented by a dataset. While the manifold hypothesis implies that the density induced by a dataset contains large regions of low density, the training criterions of VAEs and GANs will make the latent space densely covered. Consequently points that are separated by low-density regions in observation space will be pushed together in latent space, making stationary distances poor proxies for similarity. We transfer ideas from Riemannian geometry to this setting, letting the distance between two points be the shortest path on a Riemannian manifold induced by the transformation. The method yields a principled distance measure, provides a tool for visual inspection of deep generative models, and an alternative to linear interpolation in latent space. In addition, it can be applied for robot movement generalization using previously learned skills. The method is evaluated on a synthetic dataset with known ground truth; on a simulated robot arm dataset; on human motion capture data; and on a generative model of handwritten digits.

167

03 Nov 2017

computer-science symbolic-computation statistics

Automatic Differentiation for Tensor Algebras

Technical University Munich Volkswagen Group

Kjolstad et. al. proposed a tensor algebra compiler. It takes expressions that define a tensor element-wise, such as $f_{ij}(a,b,c,d) = \exp\left[-\sum_{k=0}^4 \left((a_{ik}+b_{jk})^2\, c_{ii} + d_{i+k}^3 \right) \right]$, and generates the corresponding compute kernel code. For machine learning, especially deep learning, it is often necessary to compute the gradient of a loss function

l(a,b,c,d)=l(f(a,b,c,d))

with respect to parameters

a,b,c,d

. If tensor compilers are to be applied in this field, it is necessary to derive expressions for the derivatives of element-wise defined tensors, i.e. expressions for

(da)_{ik}=\partial l/\partial a_{ik}

. When the mapping between function indices and argument indices is not 1:1, special attention is required. For the function

f_{ij} (x) = x_i^2

, the derivative of the loss is $(dx)_i=\partial l/\partial x_i=\sum_j (df)_{ij}2x_i

; the sum is necessary because index

j$ does not appear in the indices of

f

. Another example is

f_{i}(x)=x_{ii}^2

, where

x

is a matrix; here we have

(dx)_{ij}=\delta_{ij}(df)_i2x_{ii}

; the Kronecker delta is necessary because the derivative is zero for off-diagonal elements. Another indexing scheme is used by

f_{ij}(x)=\exp x_{i+j}

; here the correct derivative is

(dx)_{k}=\sum_i (df)_{i,k-i} \exp x_{k}

, where the range of the sum must be chosen appropriately. In this publication we present an algorithm that can handle any case in which the indices of an argument are an arbitrary linear combination of the indices of the function, thus all the above examples can be handled. Sums (and their ranges) and Kronecker deltas are automatically inserted into the derivatives as necessary. Additionally, the indices are transformed, if required (as in the last example). The algorithm outputs a symbolic expression that can be subsequently fed into a tensor algebra compiler. Source code is provided.

29 May 2019

bayesian-deep-learning computer-science machine-learning

Switching Linear Dynamics for Variational Bayes Filtering

Technische Universität Darmstadt Volkswagen Group

System identification of complex and nonlinear systems is a central problem for model predictive control and model-based reinforcement learning. Despite their complexity, such systems can often be approximated well by a set of linear dynamical systems if broken into appropriate subsequences. This mechanism not only helps us find good approximations of dynamics, but also gives us deeper insight into the underlying system. Leveraging Bayesian inference, Variational Autoencoders and Concrete relaxations, we show how to learn a richer and more meaningful state space, e.g. encoding joint constraints and collisions with walls in a maze, from partial and high-dimensional observations. This representation translates into a gain of accuracy of learned dynamics showcased on various simulated tasks.

06 Dec 2022

autonomous-vehicles computer-science computer-vision-and-pattern-recognition

PRISM: Probabilistic Real-Time Inference in Spatial World Models

Technical University of Munich Volkswagen Group

We introduce PRISM, a method for real-time filtering in a probabilistic generative model of agent motion and visual perception. Previous approaches either lack uncertainty estimates for the map and agent state, do not run in real-time, do not have a dense scene representation or do not model agent dynamics. Our solution reconciles all of these aspects. We start from a predefined state-space model which combines differentiable rendering and 6-DoF dynamics. Probabilistic inference in this model amounts to simultaneous localisation and mapping (SLAM) and is intractable. We use a series of approximations to Bayesian inference to arrive at probabilistic map and state estimates. We take advantage of well-established methods and closed-form updates, preserving accuracy and enabling real-time capability. The proposed solution runs at 10Hz real-time and is similarly accurate to state-of-the-art SLAM in small to medium-sized indoor environments, with high-speed UAV and handheld camera agents (Blackbird, EuRoC and TUM-RGBD).

11 Mar 2025

computer-science artificial-intelligence machine-learning

M-HOF-Opt: Multi-Objective Hierarchical Output Feedback Optimization via Multiplier Induced Loss Landscape Scheduling

Helmholtz Munich Volkswagen Group U.S. FDA Center for Devices and Radiological Health KTH Royal Instiute of Technology

A probabilistic graphical model is proposed, modeling the joint model parameter and multiplier evolution, with a hypervolume based likelihood, promoting multi-objective descent in structural risk minimization. We address multi-objective model parameter optimization via a surrogate single objective penalty loss with time-varying multipliers, equivalent to online scheduling of loss landscape. The multi-objective descent goal is dispatched hierarchically into a series of constraint optimization sub-problems with shrinking bounds according to Pareto dominance. The bound serves as setpoint for the low-level multiplier controller to schedule loss landscapes via output feedback of each loss term. Our method forms closed loop of model parameter dynamic, circumvents excessive memory requirements and extra computational burden of existing multi-objective deep learning methods, and is robust against controller hyperparameter variation, demonstrated on domain generalization tasks with multi-dimensional regularization losses.

21 Mar 2024

computer-science machine-learning software-engineering

DomainLab: A modular Python package for domain generalization in deep learning

Technical University of Munich Helmholtz Munich German Cancer Research Center (DKFZ)Helmholtz AI Medical University of Innsbruck Volkswagen Group US Food and Drug Administration Roche

Poor generalization performance caused by distribution shifts in unseen domains often hinders the trustworthy deployment of deep neural networks. Many domain generalization techniques address this problem by adding a domain invariant regularization loss terms during training. However, there is a lack of modular software that allows users to combine the advantages of different methods with minimal effort for reproducibility. DomainLab is a modular Python package for training user specified neural networks with composable regularization loss terms. Its decoupled design allows the separation of neural networks from regularization loss construction. Hierarchical combinations of neural networks, different domain generalization methods, and associated hyperparameters, can all be specified together with other experimental setup in a single configuration file. Hierarchical combinations of neural networks, different domain generalization methods, and associated hyperparameters, can all be specified together with other experimental setup in a single configuration file. In addition, DomainLab offers powerful benchmarking functionality to evaluate the generalization performance of neural networks in out-of-distribution data. The package supports running the specified benchmark on an HPC cluster or on a standalone machine. The package is well tested with over 95 percent coverage and well documented. From the user perspective, it is closed to modification but open to extension. The package is under the MIT license, and its source code, tutorial and documentation can be found at this https URL

15 Jan 2025

computer-science artificial-intelligence machine-learning

Constrained Latent Action Policies for Model-Based Offline Reinforcement Learning

Technical University of Munich Volkswagen Group Eötvös Loránd University Budapest

In offline reinforcement learning, a policy is learned using a static dataset in the absence of costly feedback from the environment. In contrast to the online setting, only using static datasets poses additional challenges, such as policies generating out-of-distribution samples. Model-based offline reinforcement learning methods try to overcome these by learning a model of the underlying dynamics of the environment and using it to guide policy search. It is beneficial but, with limited datasets, errors in the model and the issue of value overestimation among out-of-distribution states can worsen performance. Current model-based methods apply some notion of conservatism to the Bellman update, often implemented using uncertainty estimation derived from model ensembles. In this paper, we propose Constrained Latent Action Policies (C-LAP) which learns a generative model of the joint distribution of observations and actions. We cast policy learning as a constrained objective to always stay within the support of the latent action distribution, and use the generative capabilities of the model to impose an implicit constraint on the generated actions. Thereby eliminating the need to use additional uncertainty penalties on the Bellman update and significantly decreasing the number of gradient steps required to learn a policy. We empirically evaluate C-LAP on the D4RL and V-D4RL benchmark, and show that C-LAP is competitive to state-of-the-art methods, especially outperforming on datasets with visual observations.

There are no more papers matching your filters at the moment.

Events

Personalize Your Feed

Install Browser Extension

We're hiring

alphaXiv

Explore

State of the Art

Sign In

Labs

Feedback

Dark mode

TensorFlow Quantum: A Software Framework for Quantum Machine Learning

LIMT: Language-Informed Multi-Task Visual World Models

Learning to Fly via Deep Model-Based Reinforcement Learning

TechOps: Technical Documentation Templates for the AI Act

On the Role of the Action Space in Robot Manipulation Learning and Sim-to-Real Transfer

A dataset of primary nasopharyngeal carcinoma MRI with multi-modalities segmentation

Beating classical heuristics for the binary paint shop problem with the quantum approximate optimization algorithm

Bayesian Learning of Neural Network Architectures

Quantum algorithms with local particle number conservation: noise effects and error correction

QUEST: QUantum-Enhanced Shared Transportation

A Collaborative, Human-Centred Taxonomy of AI, Algorithmic, and Automation Harms

Learning Flat Latent Manifolds with VAEs

Overcoming Knowledge Barriers: Online Imitation Learning from Visual Observation with Pretrained World Models

Metrics for Deep Generative Models

Automatic Differentiation for Tensor Algebras

Switching Linear Dynamics for Variational Bayes Filtering

PRISM: Probabilistic Real-Time Inference in Spatial World Models

M-HOF-Opt: Multi-Objective Hierarchical Output Feedback Optimization via Multiplier Induced Loss Landscape Scheduling

DomainLab: A modular Python package for domain generalization in deep learning

Constrained Latent Action Policies for Model-Based Offline Reinforcement Learning

Events

AI for Law

Personalize Your Feed