AGH University of Science and Technology
Deep Reinforcement Learning (DRL) has shown a dramatic improvement in decision-making and automated control problems. Consequently, DRL represents a promising technique to efficiently solve many relevant optimization problems (e.g., routing) in self-driving networks. However, existing DRL-based solutions applied to networking fail to generalize, which means that they are not able to operate properly when applied to network topologies not observed during training. This lack of generalization capability significantly hinders the deployment of DRL technologies in production networks. This is because state-of-the-art DRL-based networking solutions use standard neural networks (e.g., fully connected, convolutional), which are not suited to learn from information structured as graphs. In this paper, we integrate Graph Neural Networks (GNN) into DRL agents and we design a problem specific action space to enable generalization. GNNs are Deep Learning models inherently designed to generalize over graphs of different sizes and structures. This allows the proposed GNN-based DRL agent to learn and generalize over arbitrary network topologies. We test our DRL+GNN agent in a routing optimization use case in optical networks and evaluate it on 180 and 232 unseen synthetic and real-world network topologies respectively. The results show that the DRL+GNN agent is able to outperform state-of-the-art solutions in topologies never seen during training.
Masked Image Modeling (MIM) has emerged as a promising approach for Self-Supervised Learning (SSL) of visual representations. However, the out-of-the-box performance of MIMs is typically inferior to competing approaches. Most users cannot afford fine-tuning due to the need for large amounts of data, high GPU consumption, and specialized user knowledge. Therefore, the practical use of MIM representations is limited. In this paper we ask what is the reason for the poor out-of-the-box performance of MIMs. Is it due to weaker features produced by MIM models, or is it due to suboptimal usage? Through detailed analysis, we show that attention in MIMs is spread almost uniformly over many patches, leading to ineffective aggregation by the [cls] token. Based on this insight, we propose Selective Aggregation to better capture the rich semantic information retained in patch tokens, which significantly improves the out-of-the-box performance of MIM.
This study investigates the potential of quantum machine learning (QML) to improve flood forecasting. We focus on daily flood events along Germany's Wupper River in 2023. Our approach combines classical machine learning techniques with QML techniques. This hybrid model leverages quantum properties like superposition and entanglement to achieve better accuracy and efficiency. Classical and QML models are compared based on training time, accuracy, and scalability. Results show that QML models offer competitive training times and improved prediction accuracy. This research signifies a step towards utilizing quantum technologies for climate change adaptation. We emphasize collaboration and continuous innovation to implement this model in real-world flood management, ultimately enhancing global resilience against floods.
Deep learning has contributed greatly to many successes in artificial intelligence in recent years. Today, it is possible to train models that have thousands of layers and hundreds of billions of parameters. Large-scale deep models have achieved great success, but their enormous computational complexity and storage requirements make them extremely difficult to deploy in real-time applications. At the same time, dataset size remains a real problem in many domains: data are often missing, too expensive, or impossible to obtain for other reasons. Ensemble learning partially addresses the problems of small datasets and overfitting. However, in its basic version, ensemble learning comes with a linear increase in computational complexity. We analyzed the impact of the ensemble decision-fusion mechanism and examined various methods of combining the decisions, including voting algorithms. We used a modified knowledge distillation framework as the decision-fusion mechanism, which additionally allows the entire ensemble to be compressed into the weight space of a single model. We showed that knowledge distillation can aggregate knowledge from multiple teachers into a single student model and, with the same computational complexity, obtain a better-performing model than one trained in the standard manner. We developed our own method for mimicking the responses of all teachers simultaneously. We tested these solutions on several benchmark datasets. Finally, we presented the broad applicability of the efficient multi-teacher knowledge distillation framework. In the first example, we used knowledge distillation to develop models that could automate corrosion detection on aircraft fuselage. The second example describes the detection of smoke on observation cameras in order to counteract forest wildfires.
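As a rough illustration of how several teachers can be distilled into one student, the sketch below averages the temperature-softened teacher distributions and measures the KL divergence to the student's distribution. This is a generic averaging variant chosen for illustration; the abstract does not specify the authors' modified framework, and the function names and the default temperature are assumptions.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def average_teacher_probs(teacher_logits, temperature):
    """Average the softened class distributions of all teachers (assumed fusion rule)."""
    probs = [softmax(logits, temperature) for logits in teacher_logits]
    n_teachers = len(probs)
    n_classes = len(probs[0])
    return [sum(p[i] for p in probs) / n_teachers for i in range(n_classes)]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(avg_teachers || student) on softened outputs, scaled by T^2."""
    target = average_teacher_probs(teacher_logits, temperature)
    student = softmax(student_logits, temperature)
    kl = sum(t * math.log(t / s) for t, s in zip(target, student) if t > 0)
    return kl * temperature ** 2
```

In this sketch the loss is zero when the student reproduces the averaged teacher distribution and positive otherwise; in practice it would typically be combined with a standard supervised loss on the ground-truth labels.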
Reinforcement learning is of increasing importance in the field of robot control, and simulation plays a key role in this process. In the field of unmanned aerial vehicles (UAVs, drones), the number of published scientific papers involving this approach is also increasing. In this work, an autonomous drone control system was prepared to fly forward (according to its coordinate system) and pass the trees encountered in the forest based on the data from a rotating LiDAR sensor. The Proximal Policy Optimization (PPO) algorithm, an example of reinforcement learning (RL), was used to prepare it. A custom simulator in the Python language was developed for this purpose. The Gazebo environment, integrated with the Robot Operating System (ROS), was also used to test the resulting control algorithm. Finally, the prepared solution was implemented on the Nvidia Jetson Nano eGPU and verified in real test scenarios. During these tests, the drone successfully completed the set task and was able to repeatably avoid trees and fly through the forest.
Integrating modern machine learning and clinical decision-making has great promise for mitigating healthcare's increasing cost and complexity. We introduce the Enhanced Transformer for Health Outcome Simulation (ETHOS), a novel application of the transformer deep-learning architecture for analyzing high-dimensional, heterogeneous, and episodic health data. ETHOS is trained using Patient Health Timelines (PHTs), which are detailed, tokenized records of health events, to predict future health trajectories, leveraging a zero-shot learning approach. ETHOS represents a significant advancement in foundation model development for healthcare analytics, eliminating the need for labeled data and model fine-tuning. Its ability to simulate various treatment pathways and consider patient-specific factors positions ETHOS as a tool for care optimization and addressing biases in healthcare delivery. Future developments will expand ETHOS' capabilities to incorporate a wider range of data types and data sources. Our work demonstrates a pathway toward accelerated AI development and deployment in healthcare.
Network modeling is a fundamental tool in network research, design, and operation. Arguably the most popular method for modeling is Queuing Theory (QT). Its main limitation is that it imposes strong assumptions on the packet arrival process, which typically do not hold in real networks. In the field of Deep Learning, Graph Neural Networks (GNN) have emerged as a new technique to build data-driven models that can learn complex and non-linear behavior. In this paper, we present RouteNet-Erlang, a pioneering GNN architecture designed to model computer networks. RouteNet-Erlang supports complex traffic models, multi-queue scheduling policies, routing policies and can provide accurate estimates in networks not seen in the training phase. We benchmark RouteNet-Erlang against a state-of-the-art QT model, and our results show that it outperforms QT in all the network scenarios.
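To make the queuing-theory assumption concrete, the following minimal sketch computes the mean time a packet spends in a textbook M/M/1 queue, which assumes Poisson arrivals and exponentially distributed service times. It is only an illustration of the kind of closed-form QT model the abstract refers to, not the baseline used in the paper; the function name and parameters are hypothetical.

```python
def mm1_mean_delay(arrival_rate, service_rate):
    """Mean sojourn time (queueing + service) of an M/M/1 queue.

    Valid only under Poisson arrivals and exponential service times,
    and only when the queue is stable (arrival_rate < service_rate).
    """
    if arrival_rate < 0 or service_rate <= 0:
        raise ValueError("rates must be non-negative (service rate positive)")
    if arrival_rate >= service_rate:
        raise ValueError("unstable queue: arrival rate must be below service rate")
    return 1.0 / (service_rate - arrival_rate)
```

For example, with 8 packets/s arriving at a link that serves 10 packets/s, the formula gives a mean delay of 0.5 s. When real traffic is bursty or correlated, the Poisson assumption fails and this estimate can be far off, which is the limitation the abstract points to.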
In recent years, a series of Transformer-based models unlocked major improvements in general natural language understanding (NLU) tasks. Such a fast pace of research would not be possible without general NLU benchmarks, which allow for a fair comparison of the proposed methods. However, such benchmarks are available only for a handful of languages. To alleviate this issue, we introduce a comprehensive multi-task benchmark for Polish language understanding, accompanied by an online leaderboard. It consists of a diverse set of tasks, adopted from existing datasets for named entity recognition, question-answering, textual entailment, and others. We also introduce a new sentiment analysis task for the e-commerce domain, named Allegro Reviews (AR). To ensure a common evaluation scheme and promote models that generalize to different NLU tasks, the benchmark includes datasets from varying domains and applications. Additionally, we release HerBERT, a Transformer-based model trained specifically for the Polish language, which has the best average performance and obtains the best results for three out of nine tasks. Finally, we provide an extensive evaluation, including several standard baselines and recently proposed, multilingual Transformer-based models.
CERN, in collaboration with AGH University, developed a conceptual framework to integrate an LSTM-based quench prediction algorithm into its ELQA web-based data analysis platform, aiming to provide an additional monitoring layer for LHC superconducting magnets. This work establishes the architectural feasibility of combining advanced deep learning with an existing operational data environment, noting that current data resolution limits precise prediction.
Network models are an essential block of modern networks. For example, they are widely used in network planning and optimization. However, as networks increase in scale and complexity, some models present limitations, such as the assumption of Markovian traffic in queuing theory models, or the high computational cost of network simulators. Recent advances in machine learning, such as Graph Neural Networks (GNN), are enabling a new generation of network models that are data-driven and can learn complex non-linear behaviors. In this paper, we present RouteNet-Fermi, a custom GNN model that shares the same goals as Queuing Theory, while being considerably more accurate in the presence of realistic traffic models. The proposed model predicts accurately the delay, jitter, and packet loss of a network. We have tested RouteNet-Fermi in networks of increasing size (up to 300 nodes), including samples with mixed traffic profiles -- e.g., with complex non-Markovian models -- and arbitrary routing and queue scheduling configurations. Our experimental results show that RouteNet-Fermi achieves similar accuracy as computationally-expensive packet-level simulators and scales accurately to larger networks. Our model produces delay estimates with a mean relative error of 6.24% when applied to a test dataset of 1,000 samples, including network topologies one order of magnitude larger than those seen during training. Finally, we have also evaluated RouteNet-Fermi with measurements from a physical testbed and packet traces from a real-life network.
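The accuracy figure above is reported as a mean relative error. A minimal sketch of how such a metric is commonly computed is shown below; the exact definition used in the paper (for example, absolute versus signed error) is not stated in the abstract, so the absolute-value form here is an assumption, as is the function name.

```python
def mean_relative_error(predicted, actual):
    """Mean of |prediction - truth| / |truth| over all samples (assumed definition)."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("relative error is undefined for zero ground-truth values")
    total = sum(abs(p - a) / abs(a) for p, a in zip(predicted, actual))
    return total / len(actual)
```

As a usage example, predicted delays of 1.1 and 1.8 against measured delays of 1.0 and 2.0 each deviate by 10%, so the mean relative error is about 0.1.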
A collision between a proton and a heavy nucleus at ultrarelativistic energy creates particles whose rapidity distribution is asymmetric, with more particles emitted in the direction of the nucleus than in the direction of the proton. This asymmetry becomes more pronounced as the centrality estimator, defined from the energy deposited in a calorimeter, increases. We argue that for high-multiplicity collisions, the variation of the impact parameter plays a negligible role, and that the fluctuations of the multiplicity and of the centrality estimator are dominated by quantum fluctuations, whose probability distribution can be well approximated by a correlated gamma distribution. We show that this simple model reproduces existing data, and we make quantitative predictions for collisions in the 0--0.1\% and 0--0.01\% centrality windows. We argue that by repeating the same analysis with a different centrality estimator, one can obtain direct information about the rapidity decorrelation in particle production.
Network modeling is a key enabler to achieve efficient network operation in future self-driving Software-Defined Networks. However, we still lack functional network models able to produce accurate predictions of Key Performance Indicators (KPI) such as delay, jitter or loss at limited cost. In this paper we propose RouteNet, a novel network model based on Graph Neural Network (GNN) that is able to understand the complex relationship between topology, routing, and input traffic to produce accurate estimates of the per-source/destination per-packet delay distribution and loss. RouteNet leverages the ability of GNNs to learn and model graph-structured information and as a result, our model is able to generalize over arbitrary topologies, routing schemes and traffic intensity. In our evaluation, we show that RouteNet is able to predict accurately the delay distribution (mean delay and jitter) and loss even in topologies, routing and traffic unseen in the training (worst case MRE=15.4%). Also, we present several use cases where we leverage the KPI predictions of our GNN model to achieve efficient routing optimization and network planning.
Action scene understanding in soccer is a challenging task due to the complex and dynamic nature of the game, as well as the interactions between players. This article provides a comprehensive overview of this task divided into action recognition, spotting, and spatio-temporal action localization, with a particular emphasis on the modalities used and multimodal methods. We explore the publicly available data sources and metrics used to evaluate models' performance. The article reviews recent state-of-the-art methods that leverage deep learning techniques and traditional methods. We focus on multimodal methods, which integrate information from multiple sources, such as video and audio data, and also those that represent one source in various ways. The advantages and limitations of methods are discussed, along with their potential for improving the accuracy and robustness of models. Finally, the article highlights some of the open research questions and future directions in the field of soccer action recognition, including the potential for multimodal methods to advance this field. Overall, this survey provides a valuable resource for researchers interested in the field of action scene understanding in soccer.
The Deferred Acceptance (DA) algorithm is an elegant procedure for finding a stable matching in two-sided matching markets. It ensures that no pair of agents prefers each other to their matched partners. In this work, we initiate the study of two-sided manipulations in matching markets as non-cooperative games. We introduce the accomplice manipulation game, where a man misreports to help a specific woman obtain a better partner, whenever possible. We provide a polynomial time algorithm for finding a pure strategy Nash equilibrium (NE) and show that our algorithm always yields a stable matching - although not every Nash equilibrium corresponds to a stable matching. Additionally, we show how our analytical techniques for the accomplice manipulation game can be applied to other manipulation games in matching markets, such as one-for-many and the standard self-manipulation games. We complement our theoretical findings with empirical evaluations of different properties of the resulting NE, such as the welfare of the agents.
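For readers unfamiliar with Deferred Acceptance, the following is a minimal sketch of the classic proposer-side procedure with truthful preferences. It assumes complete, strict preference lists on both sides and does not model the accomplice manipulation studied in the paper; the function and variable names are illustrative.

```python
from collections import deque


def deferred_acceptance(proposer_prefs, receiver_prefs):
    """Proposer-optimal stable matching (Gale-Shapley deferred acceptance).

    proposer_prefs: dict mapping each proposer to an ordered list of receivers.
    receiver_prefs: dict mapping each receiver to an ordered list of proposers.
    Returns a dict mapping each matched proposer to a receiver.
    """
    # rank[r][p] is the position of proposer p in receiver r's list (lower is better).
    rank = {r: {p: i for i, p in enumerate(prefs)}
            for r, prefs in receiver_prefs.items()}
    next_choice = {p: 0 for p in proposer_prefs}  # index of the next receiver to try
    held = {}                                     # receiver -> tentatively accepted proposer
    free = deque(proposer_prefs)

    while free:
        p = free.popleft()
        if next_choice[p] >= len(proposer_prefs[p]):
            continue  # p has been rejected by everyone and stays unmatched
        r = proposer_prefs[p][next_choice[p]]
        next_choice[p] += 1
        current = held.get(r)
        if current is None:
            held[r] = p                # r tentatively accepts the first proposal
        elif rank[r][p] < rank[r][current]:
            held[r] = p                # r trades up and releases the previous proposer
            free.append(current)
        else:
            free.append(p)             # r rejects p, who will propose again later
    return {p: r for r, p in held.items()}
```

With two men who both rank w1 first, and w1 preferring m2, the procedure matches m2 with w1 and m1 with w2. A misreport in the sense of the abstract would correspond to passing a modified list for one proposer and comparing the resulting matchings.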
Object detection is an essential component of many vision systems. For example, pedestrian detection is used in advanced driver assistance systems (ADAS) and advanced video surveillance systems (AVSS). Currently, most detectors use deep convolutional neural networks (e.g., the YOLO -- You Only Look Once -- family), which, however, due to their high computational complexity, are not able to process a very high-resolution video stream in real-time, especially within a limited energy budget. In this paper we present a hardware implementation of the well-known pedestrian detector with HOG (Histogram of Oriented Gradients) feature extraction and SVM (Support Vector Machine) classification. Our system, running on an AMD Xilinx Zynq UltraScale+ MPSoC (Multiprocessor System on Chip) device, allows real-time processing of 4K resolution (UHD -- Ultra High Definition, 3840 x 2160 pixels) video at 60 frames per second. The system is capable of detecting a pedestrian in a single scale. The results obtained confirm the high suitability of reprogrammable devices in the real-time implementation of embedded vision systems.
We perform a search for light sterile neutrinos using the data from the T2K far detector at a baseline of 295 km, with an exposure of 14.7 (7.6)$\times 10^{20}$ protons on target in neutrino (antineutrino) mode. A selection of neutral current interaction samples is also used to enhance the sensitivity to sterile mixing. No evidence of sterile neutrino mixing in the 3+1 model was found from a simultaneous fit to the charged-current muon, electron and neutral current neutrino samples. We set the most stringent limit on the sterile oscillation amplitude $\sin^2\theta_{24}$ for the sterile neutrino mass splitting $\Delta m^2_{41} < 3\times 10^{-3}$ eV$^2/c^4$.
This is the third out of five chapters of the final report [1] of the Workshop on Physics at HL-LHC, and perspectives on HE-LHC [2]. It is devoted to the study of the potential, in the search for Beyond the Standard Model (BSM) physics, of the High Luminosity (HL) phase of the LHC, defined as $3~\mathrm{ab}^{-1}$ of data taken at a centre-of-mass energy of $14~\mathrm{TeV}$, and of a possible future upgrade, the High Energy (HE) LHC, defined as $15~\mathrm{ab}^{-1}$ of data at a centre-of-mass energy of $27~\mathrm{TeV}$. We consider a large variety of new physics models, both in a simplified model fashion and in a more model-dependent one. A long list of contributions from the theory and experimental (ATLAS, CMS, LHCb) communities has been collected and merged together to give a complete, wide, and consistent view of future prospects for BSM physics at the considered colliders. On top of the usual standard candles, such as supersymmetric simplified models and resonances, considered for the evaluation of future collider potentials, this report contains results on dark matter and dark sectors, long lived particles, leptoquarks, sterile neutrinos, axion-like particles, heavy scalars, vector-like quarks, and more. Particular attention is paid, especially in the study of the HL-LHC prospects, to the detector upgrades, the assessment of the future systematic uncertainties, and new experimental techniques. The general conclusion is that the HL-LHC, on top of extending the present LHC mass and coupling reach by $20-50\%$ in most new physics scenarios, will also be able to constrain, and potentially discover, new physics that is presently unconstrained. Moreover, compared to the HL-LHC, the reach in most observables will generally more than double at the HE-LHC, which may represent a good candidate future facility for a final test of TeV-scale new physics.
Artificial intelligence has contributed to advancements across various industries. However, the rapid growth of artificial intelligence technologies also raises concerns about their environmental impact, due to the carbon footprint associated with training computational models. Fetal brain segmentation in medical imaging is challenging due to the small size of the fetal brain and the limited image quality of fast 2D sequences. Deep neural networks are a promising method to overcome this challenge. In this context, the construction of larger models requires extensive data and computing power, leading to high energy consumption. Our study aims to explore model architectures and compression techniques that promote energy efficiency by optimizing the trade-off between accuracy and energy consumption through various strategies such as lightweight network design, architecture search, and optimized distributed training tools. We have identified several effective strategies, including optimized data loading, modern optimizers, distributed training, and reduced-precision floating-point operations with light model architectures, while tuning parameters according to the available computing resources. Our findings demonstrate that these methods lead to satisfactory model performance with low energy consumption during deep neural network training for medical image segmentation.
The adsorption of MgO molecules on a Fe(001) surface was studied using density functional theory (DFT) and projector augmented wave methods. The energetically most favored configurations for the different adsorption sites considered were identified. The most preferable adsorption geometry is when the MgO molecules are parallel to the surface, with Mg in the interstitial site and O on top of the Fe atom. During the adsorption of subsequent MgO molecules in this geometry, a sharp, non-oxidized interface is formed between the MgO adlayer and the Fe(001) surface. The adsorption of MgO perpendicular to the surface, with oxygen incorporated in the topmost Fe layer, is less probable, but may lead to the formation of an FeO layer when stabilized with an excess of oxygen atoms. Structural, electronic and magnetic properties of both interface types were examined for MgO coverage from 1/9 to 1 monolayer (ML). Electronic and magnetic properties are sensitive to the MgO coverage. For lower coverage of MgO, clear hybridization between the Fe 3d and O 2p states is shown. The average magnetic moment of the surface Fe atoms is reduced with coverage, reaching 2.78 $\mu_{\rm B}$ for 1 ML of MgO.
We report a new measurement of $D^0$-meson production at mid-rapidity ($|y| < 1$) in Au+Au collisions at $\sqrt{s_{\rm NN}} = 200$ GeV utilizing the Heavy Flavor Tracker, a high resolution silicon detector at the STAR experiment. Invariant yields of $D^0$-mesons with transverse momentum $p_{T} \lesssim 9$ GeV/$c$ are reported in various centrality bins (0--10\%, 10--20\%, 20--40\%, 40--60\% and 60--80\%). Blast-Wave thermal models are used to fit the $D^0$-meson $p_{T}$ spectra to study $D^0$ hadron kinetic freeze-out properties. The average radial flow velocity extracted from the fit is considerably smaller than that of light hadrons ($\pi$, $K$ and $p$), but comparable to that of hadrons containing multiple strange quarks ($\phi$, $\Xi^-$), indicating that $D^0$ mesons kinetically decouple from the system earlier than light hadrons. The calculated $D^0$ nuclear modification factors re-affirm that charm quarks suffer a large amount of energy loss in the medium, similar to those of light quarks for $p_{T} > 4$ GeV/$c$ in central 0--10\% Au+Au collisions. At low $p_{T}$, the nuclear modification factors show a characteristic structure qualitatively consistent with the expectation from model predictions that charm quarks gain sizable collective motion during the medium evolution. The improved measurements are expected to offer new constraints to model calculations and help gain further insights into the hot and dense medium created in these collisions.