Aeronautics Institute of Technology
With the widespread adoption of deep learning, reinforcement learning (RL) has experienced a dramatic increase in popularity, scaling to previously intractable problems, such as playing complex games from pixel observations, sustaining conversations with humans, and controlling robotic agents. However, there is still a wide range of domains inaccessible to RL due to the high cost and danger of interacting with the environment. Offline RL is a paradigm that learns exclusively from static datasets of previously collected interactions, making it feasible to extract policies from large and diverse training datasets. Effective offline RL algorithms have a much wider range of applications than online RL, being particularly appealing for real-world applications, such as education, healthcare, and robotics. In this work, we contribute a unifying taxonomy to classify offline RL methods. Furthermore, we provide a comprehensive review of the latest algorithmic breakthroughs in the field using a unified notation as well as a review of existing benchmarks' properties and shortcomings. Additionally, we provide a figure that summarizes the performance of each method and class of methods on different dataset properties, equipping researchers with the tools to decide which type of algorithm is best suited for the problem at hand and identify which classes of algorithms look the most promising. Finally, we provide our perspective on open problems and propose future research directions for this rapidly growing field.
We introduce TartanAviation, an open-source multi-modal dataset focused on terminal-area airspace operations. TartanAviation provides a holistic view of the airport environment by concurrently collecting image, speech, and ADS-B trajectory data using setups installed inside airport boundaries. The datasets were collected at both towered and non-towered airfields across multiple months to capture diversity in aircraft operations, seasons, aircraft types, and weather conditions. In total, TartanAviation provides 3.1M images, 3374 hours of Air Traffic Control speech data, and 661 days of ADS-B trajectory data. The data was filtered, processed, and validated to create a curated dataset. In addition to the dataset, we also open-source the code-base used to collect and pre-process the dataset, further enhancing accessibility and usability. We believe this dataset has many potential use cases and would be particularly vital in allowing AI and machine learning technologies to be integrated into air traffic control systems and advance the adoption of autonomous aircraft in the airspace.
Training SER models in natural, spontaneous speech is especially challenging due to the subtle expression of emotions and the unpredictable nature of real-world audio. In this paper, we present a robust system for the INTERSPEECH 2025 Speech Emotion Recognition in Naturalistic Conditions Challenge, focusing on categorical emotion recognition. Our method combines state-of-the-art audio models with text features enriched by prosodic and spectral cues. In particular, we investigate the effectiveness of Fundamental Frequency (F0) quantization and the use of a pretrained audio tagging model. We also employ an ensemble model to improve robustness. On the official test set, our system achieved a Macro F1-score of 39.79% (42.20% on validation). Our results underscore the potential of these methods, and analysis of fusion techniques confirmed the effectiveness of Graph Attention Networks. Our source code is publicly available.
TSformer-VO, an end-to-end Transformer-based model, reframes monocular visual odometry as a video understanding task to estimate 6-DoF camera pose. It surpasses the DeepVO baseline and achieves competitive translational accuracy against ORB-SLAM3 on the KITTI dataset, operating in real-time.
This work contributes to developing an agent based on deep reinforcement learning capable of acting in a beyond visual range (BVR) air combat simulation environment. The paper presents an overview of building an agent representing a high-performance fighter aircraft that can learn and improve its role in BVR combat over time based on rewards calculated using operational metrics. Through self-play experiments, we also expect to generate previously unseen air combat tactics. Finally, we hope to examine a real pilot's ability, using virtual simulation, to interact in the same environment with the trained agent and compare their performances. This research will contribute to the air combat training context by developing agents that can interact with real pilots to improve their performance in air defense missions.
In this paper, we introduce an alternative approach to enhancing Multi-Agent Reinforcement Learning (MARL) through the integration of domain knowledge and attention-based policy mechanisms. Our methodology focuses on the incorporation of domain-specific expertise into the learning process, which simplifies the development of collaborative behaviors. This approach aims to reduce the complexity and learning overhead typically associated with MARL by enabling agents to concentrate on essential aspects of complex tasks, thus optimizing the learning curve. The utilization of attention mechanisms plays a key role in our model. It allows for the effective processing of dynamic context data and nuanced agent interactions, leading to more refined decision-making. Applied in standard MARL scenarios, such as the Stanford Intelligent Systems Laboratory (SISL) Pursuit and Multi-Particle Environments (MPE) Simple Spread, our method has been shown to improve both learning efficiency and the effectiveness of collaborative behaviors. The results indicate that our attention-based approach can be a viable way to improve the efficiency of the MARL training process by integrating domain-specific knowledge at the action level.
Solar tornadoes are believed to influence plasma dynamics and create conditions for heating, yet a direct quantitative link is lacking. Here, for the first time, we directly measure vortex-driven dynamics using information-theoretic diagnostics in a Bifrost simulation. By combining Shannon Entropy (SE) and Normalized Mutual Information (NMI), we track how vortices restructure plasma-magnetic interactions and channel energy into heat. The vortex flow originates in the upper photosphere and extends into the chromosphere and upper atmosphere. Relative to a control region dominated by shear flows and transient swirls, the coherent vortex shows stronger statistical interdependence between vorticity and other MHD variables. SE shows that vertical magnetic field entropy increases at the vortex onset as magnetic flux is redistributed, then decreases as the field becomes ordered. Magnetic shear and magnetic energy entropy peak during vortex development, reflecting current and energy build-up. In the upper atmosphere, low entropy in temperature and pressure alongside high entropy in density indicates ordered thermal and pressure fields but irregular mass distribution, a departure from ideal-gas behavior confirmed by weakened temperature-density coupling. NMI shows that temperature couples to different heating drivers with height: compression and vorticity (viscous heating, since vorticity traces velocity gradients) in the lower atmosphere, and vorticity plus magnetic shear in the upper atmosphere (viscous and current-driven heating). Although the simulation does not include explicit physical dissipation, hyperdiffusivity acts on sharp gradients and mimics these processes. Our results demonstrate that vortices drive multiscale coupling between flows and fields, locally shaping solar atmospheric dynamics and heating.
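As an illustration of the two diagnostics named in the abstract above, the following is a minimal sketch of Shannon entropy and normalized mutual information for discretized samples. It is not the authors' pipeline: the function names, the use of pre-binned integer labels, and the normalization by the geometric mean of the marginal entropies are all assumptions made here, since the abstract does not state which binning or normalization was used.

```python
import math
from collections import Counter


def shannon_entropy(labels):
    """Shannon entropy (in bits) of a sequence of discrete labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def normalized_mutual_information(x, y):
    """NMI of two equally long label sequences.

    Assumption: normalization by sqrt(H(X) * H(Y)); other conventions exist.
    """
    hx, hy = shannon_entropy(x), shannon_entropy(y)
    hxy = shannon_entropy(list(zip(x, y)))  # joint entropy H(X, Y)
    mi = hx + hy - hxy                      # mutual information I(X; Y)
    if hx == 0 or hy == 0:
        return 0.0                          # a constant variable carries no information
    return mi / math.sqrt(hx * hy)


# Hypothetical binned samples, e.g. vorticity and temperature bins per grid cell.
vorticity_bins = [0, 0, 1, 1]
temperature_bins = [0, 0, 1, 1]   # perfectly coupled to vorticity_bins
density_bins = [0, 1, 0, 1]       # statistically independent of vorticity_bins

nmi_coupled = normalized_mutual_information(vorticity_bins, temperature_bins)
nmi_independent = normalized_mutual_information(vorticity_bins, density_bins)
```

In this toy case the coupled pair yields an NMI of 1 and the independent pair yields 0, which is the sense in which higher NMI signals stronger statistical interdependence between two variables.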
Surface-to-Air Missiles (SAMs) are crucial in modern air defense systems. A critical aspect of their effectiveness is the Engagement Zone (EZ), the spatial region within which a SAM can effectively engage and neutralize a target. Notably, the EZ is intrinsically related to the missile's maximum range; it defines the furthest distance at which a missile can intercept a target. The accurate computation of this EZ is essential but challenging due to the dynamic and complex factors involved, which often lead to high computational costs and extended processing times when using conventional simulation methods. In light of these challenges, our study investigates the potential of machine learning techniques, proposing an approach that integrates machine learning with a custom-designed simulation tool to train supervised algorithms. We leverage a comprehensive dataset of pre-computed SAM EZ simulations, enabling our model to accurately predict the SAM EZ for new input parameters. It accelerates SAM EZ simulations, enhances air defense strategic planning, and provides real-time insights, improving SAM system performance. The study also includes a comparative analysis of machine learning algorithms, illuminating their capabilities and performance metrics and suggesting areas for future research, highlighting the transformative potential of machine learning in SAM EZ simulations.
Monocular visual odometry consists of the estimation of the position of an agent through images of a single camera, and it is applied in autonomous vehicles, medical robots, and augmented reality. However, monocular systems suffer from the scale ambiguity problem due to the lack of depth information in 2D frames. This paper contributes by showing an application of the dense prediction transformer model for scale estimation in monocular visual odometry systems. Experimental results show that the scale drift problem of monocular systems can be reduced through the accurate estimation of the depth map by this model, achieving competitive state-of-the-art performance on a visual odometry benchmark.
The modernization of an airline's fleet can reduce its operating costs, improve the perceived quality of service offered to passengers, and mitigate emissions. The present paper investigates the market incentives that airlines have to adopt technological innovation from manufacturers by acquiring new generation aircraft. We develop an econometric model of fleet modernization in the Brazilian commercial aviation over two decades. We examine the hypothesis of an inverted-U relationship between market concentration and fleet modernization and find evidence that both the extremes of competition and concentration may inhibit innovation adoption by carriers. We find limited evidence associating either hubbing activity or low-cost carriers with the more intense introduction of new types of aircraft models and variants in the industry. Finally, our results suggest that energy cost rises may provoke boosts in fleet modernization in the long term, with carriers possibly targeting more eco-efficient operations up to two years after an upsurge in fuel price.
Extracting single-cell information from microscopy data requires accurate instance-wise segmentations. Obtaining pixel-wise segmentations from microscopy imagery remains a challenging task, especially with the added complexity of microstructured environments. This paper presents a novel dataset for segmenting yeast cells in microstructures. We offer pixel-wise instance segmentation labels for both cells and trap microstructures. In total, we release 493 densely annotated microscopy images. To facilitate a unified comparison between novel segmentation algorithms, we propose a standardized evaluation strategy for our dataset. The aim of the dataset and evaluation strategy is to facilitate the development of new cell segmentation approaches. The dataset is publicly available at this https URL .
Since the introduction of Industry 4.0, digital twin technology has significantly evolved, laying the groundwork for a transition toward Industry 5.0 principles centered on human-centricity, sustainability, and resilience. Through digital twins, real-time connected production systems are anticipated to be more efficient, resilient, and sustainable, facilitating communication and connectivity between digital and physical systems. However, environmental performance and integration with virtual reality (VR) and artificial intelligence (AI) of such systems remain challenging. Further exploration of digital twin technologies is needed to validate the real-world impact and benefits. This paper investigates these challenges by implementing a real-time digital twin based on the ISO 23247 standard, connecting the physical factory and simulation software with VR capabilities. This digital twin system provides cognitive assistance and a user-friendly interface for operators, thereby improving cognitive ergonomics. The connection of the Internet of Things (IoT) platform allows the digital twin to have real-time bidirectional communication, collaboration, monitoring, and assistance. A lab-scale drone factory was used as the digital twin application to test and evaluate the ISO 23247 standard and its potential benefits. Additionally, AI integration and environmental performance Key Performance Indicators (KPIs) have been considered as the next stages in improving VR-integrated digital twins. With a solid theoretical foundation and a demonstration of the VR-integrated digital twins, this paper addresses integration issues between various technologies and advances the framework of digital twins based on ISO 23247.
At the current level of evolution of Soccer 3D, motion control is a key factor in a team's performance. Recent works take advantage of model-free approaches based on Machine Learning to exploit robot dynamics in order to obtain faster locomotion skills, achieving running policies and, therefore, opening a new research direction in the Soccer 3D environment. In this work, we present a methodology based on Deep Reinforcement Learning that learns running skills without any prior knowledge, using a neural network whose inputs are related to the robot's dynamics. Our results outperform the previous state-of-the-art sprint velocity reported in the Soccer 3D literature by a significant margin. The method also demonstrates improved sample efficiency, learning how to run in just a few hours. We report our results by analyzing the training procedure and evaluating the policies in terms of speed, reliability, and human similarity. Finally, we present the key factors that led us to improve on previous results and share some ideas for future work.
This paper explores the application of quantum-hydrodynamic models to study two-dimensional electron gases, with a focus on nonlocal plasmonics and nonlinear optics. We begin by reviewing the derivation of the Madelung equations from the Wigner distribution function. Using the Madelung equations in conjunction with Poisson's equation, we calculate the spectrum of magnetoplasmons and the magneto-optical conductivity in the electrostatic regime, incorporating nonlocal corrections due to the Fermi pressure. In the absence of a magnetic field, we analyze nonlinear and nonlocal second-harmonic generation, demonstrating how plasmon excitation enhances this process. We further discuss the emergence of self-modulation phenomena driven by nonlinearity, leading to the renormalization of the plasmon dispersion. Notably, we show that nonlinearity amplifies nonlocal effects and, leveraging the hydrodynamic formalism, derive a simple analytic expression for the renormalized spectra. Additionally, we examine the role of the quantum potential, interpreted as a gradient correction to the Thomas--Fermi kinetic energy. Our results provide new insights into quantum effects in plasmonic systems, with significant implications for future advances in nanophotonics through the lens of hydrodynamic theory.
Magnetized ferromagnetic disks or wires support strong inhomogeneous fields at their borders. Such magnetic fields create an effective potential, due to Zeeman and diamagnetic contributions, that can localize charge carriers. For the case of two-dimensional transition metal dichalcogenides, this potential can valley-localize excitons due to the Zeeman term, which breaks the valley symmetry. We show that the diamagnetic term is negligible when compared to the Zeeman term for monolayers of transition metal dichalcogenides. The latter is responsible for trapping excitons near the magnetized structure border with valley-dependent characteristics: for one of the valleys, the exciton is confined inside the disk, while for the other, it is outside. This spatial valley separation of excitons can be probed by circularly polarized light. Moreover, we show that the inhomogeneous magnetic field magnitude, the dielectric environment, and the magnetized structure parameters can tailor the spatial separation of the exciton wavefunctions.
This study proposes social navigation metrics for autonomous agents in air combat, aiming to facilitate their smooth integration into pilot formations. The absence of such metrics poses challenges to safety and effectiveness in mixed human-autonomous teams. The proposed metrics prioritize naturalness and comfort. We suggest validating them through a user study involving military pilots in simulated air combat scenarios alongside autonomous loyal wingmen. The experiment will involve setting up simulations, designing scenarios, and evaluating performance using feedback from questionnaires and data analysis. These metrics aim to enhance the operational performance of autonomous loyal wingmen, thereby contributing to safer and more strategic air combat.
We investigate the influence of the finite Larmor radius on the dynamics of guiding-center test particles subjected to an E×B drift in a large aspect-ratio tokamak. For that, we adopt the drift-wave test particle transport model presented by W. Horton [Physics of Plasmas 5, 3910 (1998)] and introduce a second-order gyro-averaged extension, which accounts for the finite Larmor radius effect that arises from a spatially varying electric field. Using this extended model, we numerically examine the influence of the finite Larmor radius on chaotic transport and the formation of transport barriers. For non-monotonic plasma profiles, we show that the twist condition of the dynamical system, i.e., the KAM theorem's non-degeneracy condition for the Hamiltonian, is violated along a special curve, which, under non-equilibrium conditions, exhibits significant resilience to destruction, thereby inhibiting chaotic transport. This curve acts as a robust barrier to transport and is usually called a shearless transport barrier. While varying the amplitude of the electrostatic perturbations, we analyze bifurcation diagrams of the shearless barriers and escape rates of orbits to explore the impact of the finite Larmor radius on controlling chaotic transport. Our findings show that increasing the Larmor radius enhances the robustness of transport barriers, as larger electrostatic perturbation amplitudes are required to disrupt them. Additionally, as the Larmor radius increases, even in the absence of transport barriers, we observe a reduction in the escape rates, indicating a decrease in chaotic transport.
By harnessing the unique properties of bilayer graphene, we present a flexible platform for achieving electrically tunable exciton polaritons within a microcavity. Using a semiclassical approach, we solve Maxwell's equations within the cavity, approximating the optical conductivity of bilayer graphene through its excitonic response as described by the Elliott formula. Transitioning to a quantum mechanical framework, we diagonalize the Hamiltonian governing excitons and cavity photons, revealing the resulting polariton dispersions, Hopfield coefficients and Rabi splittings. Our analysis predicts that, under realistic exciton lifetimes, the exciton-photon interaction reaches the strong coupling regime. Furthermore, we explore the integration of an epsilon-near-zero material within the cavity, demonstrating that such a configuration can further enhance the light-matter interaction.
This paper presents five different statistical methods for ground scene prediction (GSP) in wavelength-resolution synthetic aperture radar (SAR) images. The GSP image can be used as a reference image in a change detection algorithm yielding a high probability of detection and low false alarm rate. The predictions are based on image stacks, which are composed of images from the same scene acquired at different instants with the same flight geometry. The considered methods for obtaining the ground scene prediction include (i) autoregressive models; (ii) trimmed mean; (iii) median; (iv) intensity mean; and (v) mean. It is expected that the predicted image presents the true ground scene without change and preserves the ground backscattering pattern. The study indicates that the median method provided the most accurate representation of the true ground. To show the applicability of the GSP, a change detection algorithm was considered using the median ground scene as a reference image. As a result, the median method achieved a probability of detection of 97% and a false alarm rate of 0.11/km², when considering military vehicles concealed in a forest.
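The median-based prediction described in the abstract above can be sketched as a pixel-wise median over a stack of co-registered images, followed by a simple difference test against that reference. This is only an illustrative sketch: the function names, the tiny 2×2 stack, and the fixed-threshold change test are hypothetical and stand in for the paper's actual change detection algorithm, which the abstract does not specify.

```python
from statistics import median


def predict_ground_scene(stack):
    """Pixel-wise median over a stack of same-sized images (lists of rows)."""
    rows, cols = len(stack[0]), len(stack[0][0])
    return [[median(img[r][c] for img in stack) for c in range(cols)]
            for r in range(rows)]


def detect_changes(image, reference, threshold):
    """Flag pixels whose value exceeds the reference by more than threshold.

    Assumption: a plain difference threshold, used here only for illustration.
    """
    return [[(image[r][c] - reference[r][c]) > threshold
             for c in range(len(image[0]))]
            for r in range(len(image))]


# Hypothetical stack of three 2x2 amplitude images of the same scene.
stack = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[1.1, 2.1], [2.9, 9.0]],   # transient target at pixel (1, 1)
    [[0.9, 1.9], [3.1, 4.1]],
]

gsp = predict_ground_scene(stack)                    # reference image
mask = detect_changes(stack[1], gsp, threshold=2.0)  # changes in the second image
```

Because the median ignores values that appear in only a minority of the stack, the transient target does not contaminate the reference, so it stands out when the second image is compared against it.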
This work investigates the use of a Deep Neural Network (DNN) to perform an estimation of the Weapon Engagement Zone (WEZ) maximum launch range. The WEZ allows the pilot to identify an airspace in which the available missile has a more significant probability of successfully engaging a particular target, i.e., a hypothetical area surrounding an aircraft in which an adversary is vulnerable to a shot. We propose an approach to determine the WEZ of a given missile using 50,000 simulated launches under varied conditions. These simulations are used to train a DNN that can predict the WEZ when the aircraft finds itself under different firing conditions, with a coefficient of determination of 0.99. It offers an alternative to the procedures in preceding research, since it employs a non-discretized model, i.e., it considers all directions of the WEZ at once, which has not been done previously. Additionally, the proposed method uses an experimental design that allows for fewer simulation runs, providing faster model training.