New Technologies Research CentreUniversity of West Bohemia
The Kalman filter (KF) and its variants are among the most celebrated algorithms in signal processing. These methods are used for state estimation of dynamic systems by relying on mathematical representations in the form of simple state-space (SS) models, which may be crude and inaccurate descriptions of the underlying dynamics. Emerging data-centric artificial intelligence (AI) techniques tackle these tasks using deep neural networks (DNNs), which are model-agnostic. Recent developments illustrate the possibility of fusing DNNs with classic Kalman-type filtering, obtaining systems that learn to track in partially known dynamics. This article provides a tutorial-style overview of design approaches for incorporating AI in aiding KF-type algorithms. We review both generic and dedicated DNN architectures suitable for state estimation, and provide a systematic presentation of techniques for fusing AI tools with KFs and for leveraging partial SS modeling and data, categorizing design approaches into task-oriented and SS model-oriented. The usefulness of each approach in preserving the individual strengths of model-based KFs and data-driven DNNs is investigated in a qualitative and quantitative study, whose code is publicly available, illustrating the gains of hybrid model-based/data-driven designs. We also discuss existing challenges and future research directions that arise from fusing AI and Kalman-type algorithms.
Perovskite oxides are promising for energy and quantum technologies, but wide-gap hosts such as NaAlO3 suffer from deep-UV absorption and limited carrier transport. Using first-principles GGA+U+SOC calculations, we investigate Eu3+-, Gd3+-, and Tb3+-doped NaAlO3 and evaluate their electronic, optical, elastic, and thermoelectric properties. Rare-earth substitution is thermodynamically favorable (formation energies 1.2-1.6 eV) and induces strong f-p hybridization, reducing the pristine band gap (about 6.2 eV) to about 3.1 eV for Tb. Spin-resolved band structures reveal Gd-driven half-metallicity, Eu-induced spin-selective metallicity, and Tb-stabilized p-type semiconducting behavior. The optical spectra show a red-shifted absorption edge (about 2.0-2.2 eV), a large static dielectric response (epsilon1(0) about 95 for Eu), and plasmonic resonances near 4 eV, enabling visible-light harvesting. Elastic analysis indicates mild lattice softening with preserved ductility (Pugh ratio B/G about 1.56-1.57). Thermoelectric performance is enhanced, with Seebeck coefficients greater than 210 uV/K for Eu and Tb and ZT about 0.45 at 500 K. These results identify rare-earth-doped NaAlO3 as a multifunctional perovskite platform for photovoltaics, photocatalysis, thermoelectrics, and spintronics.
The remarkable success of Large Language Models (LLMs) in generative tasks has raised fundamental questions about the nature of their acquired capabilities, which often appear to emerge unexpectedly without explicit training. This paper examines the emergent properties of Deep Neural Networks (DNNs) through both theoretical analysis and empirical observation, addressing the epistemological challenge of "creation without understanding" that characterises contemporary AI development. We explore how the neural approach's reliance on nonlinear, stochastic processes fundamentally differs from symbolic computational paradigms, creating systems whose macro-level behaviours cannot be analytically derived from micro-level neuron activities. Through analysis of scaling laws, grokking phenomena, and phase transitions in model capabilities, I demonstrate that emergent abilities arise from the complex dynamics of highly sensitive nonlinear systems rather than simply from parameter scaling alone. My investigation reveals that current debates over metrics, pre-training loss thresholds, and in-context learning miss the fundamental ontological nature of emergence in DNNs. I argue that these systems exhibit genuine emergent properties analogous to those found in other complex natural phenomena, where systemic capabilities emerge from cooperative interactions among simple components without being reducible to their individual behaviours. The paper concludes that understanding LLM capabilities requires recognising DNNs as a new domain of complex dynamical systems governed by universal principles of emergence, similar to those operating in physics, chemistry, and biology. This perspective shifts the focus from purely phenomenological definitions of emergence to understanding the internal dynamic transformations that enable these systems to acquire capabilities that transcend their individual components.
The emergence of Large Language Models (LLMs) with increasingly sophisticated natural language understanding and generative capabilities has sparked interest in the Agent-based Modelling (ABM) community. With their ability to summarize, generate, analyze, categorize, transcribe and translate text, answer questions, propose explanations, sustain dialogue, extract information from unstructured text, and perform logical reasoning and problem-solving tasks, LLMs have a good potential to contribute to the modelling process. After reviewing the current use of LLMs in ABM, this study reflects on the opportunities and challenges of the potential use of LLMs in ABM. It does so by following the modelling cycle, from problem formulation to documentation and communication of model results, and holding a critical stance.
This paper introduces WildlifeReID-10k, a new large-scale re-identification benchmark with more than 10k animal identities of around 33 species across more than 140k images, re-sampled from 37 existing datasets. WildlifeReID-10k covers diverse animal species and poses significant challenges for SoTA methods, ensuring fair and robust evaluation through its time-aware and similarity-aware split protocol. The latter is designed to address the common issue of training-to-test data leakage caused by visually similar images appearing in both training and test sets. The WildlifeReID-10k dataset and benchmark are publicly available on Kaggle, along with strong baselines for both closed-set and open-set evaluation, enabling fair, transparent, and standardized evaluation of not just multi-species animal re-identification models.
This paper focuses on the multi-target tracking using the Stone Soup framework. In particular, we aim at evaluation of two multi-target tracking scenarios based on the simulated class-B dataset and ADS-B class-A dataset provided by OpenSky Network. The scenarios are evaluated w.r.t. selection of a local state estimator using a range of the Stone Soup metrics. Source code with scenario definitions and Stone Soup set-up are provided along with the paper.
WildFusion is a method for individual animal identification that combines global deep learning features with local feature matching through a calibrated fusion mechanism. It achieved a mean accuracy of 84.0% across 17 diverse wildlife datasets, outperforming the previous state-of-the-art by 8.5 percentage points, while also demonstrating strong zero-shot generalization capabilities to new species.
50
We introduce a new, challenging benchmark and a dataset, FungiTastic, based on fungal records continuously collected over a twenty-year span. The dataset is labelled and curated by experts and consists of about 350k multimodal observations of 6k fine-grained categories (species). The fungi observations include photographs and additional data, e.g., meteorological and climatic data, satellite images, and body part segmentation masks. FungiTastic is one of the few benchmarks that include a test set with DNA-sequenced ground truth of unprecedented label reliability. The benchmark is designed to support (i) standard closed-set classification, (ii) open-set classification, (iii) multi-modal classification, (iv) few-shot learning, (v) domain shift, and many more. We provide tailored baselines for many use cases, a multitude of ready-to-use pre-trained models on this https URL, and a framework for model training. The documentation and the baselines are available at this https URL and this https URL
This paper presents the final results of the ICDAR 2021 Competition on Historical Map Segmentation (MapSeg), encouraging research on a series of historical atlases of Paris, France, drawn at 1/5000 scale between 1894 and 1937. The competition featured three tasks, awarded separately. Task~1 consists in detecting building blocks and was won by the L3IRIS team using a DenseNet-121 network trained in a weakly supervised fashion. This task is evaluated on 3 large images containing hundreds of shapes to detect. Task~2 consists in segmenting map content from the larger map sheet, and was won by the UWB team using a U-Net-like FCN combined with a binarization method to increase detection edge accuracy. Task~3 consists in locating intersection points of geo-referencing lines, and was also won by the UWB team who used a dedicated pipeline combining binarization, line detection with Hough transform, candidate filtering, and template matching for intersection refinement. Tasks~2 and~3 are evaluated on 95 map sheets with complex content. Dataset, evaluation tools and results are available under permissive licensing at \url{this https URL}.
3
While large language models (LLMs) show promise for various tasks, their performance in compound aspect-based sentiment analysis (ABSA) tasks lags behind fine-tuned models. However, the potential of LLMs fine-tuned for ABSA remains unexplored. This paper examines the capabilities of open-source LLMs fine-tuned for ABSA, focusing on LLaMA-based models. We evaluate the performance across four tasks and eight English datasets, finding that the fine-tuned Orca~2 model surpasses state-of-the-art results in all tasks. However, all models struggle in zero-shot and few-shot scenarios compared to fully fine-tuned ones. Additionally, we conduct error analysis to identify challenges faced by fine-tuned models.
The increasing complexity of natural disaster incidents demands innovative technological solutions to support first responders in their efforts. This paper introduces the TRIFFID system, a comprehensive technical framework that integrates unmanned ground and aerial vehicles with advanced artificial intelligence functionalities to enhance disaster response capabilities across wildfires, urban floods, and post-earthquake search and rescue missions. By leveraging state-of-the-art autonomous navigation, semantic perception, and human-robot interaction technologies, TRIFFID provides a sophisticated system composed of the following key components: hybrid robotic platform, centralized ground station, custom communication infrastructure, and smartphone application. The defined research and development activities demonstrate how deep neural networks, knowledge graphs, and multimodal information fusion can enable robots to autonomously navigate and analyze disaster environments, reducing personnel risks and accelerating response times. The proposed system enhances emergency response teams by providing advanced mission planning, safety monitoring, and adaptive task execution capabilities. Moreover, it ensures real-time situational awareness and operational support in complex and risky situations, facilitating rapid and precise information collection and coordinated actions.
In this paper, we describe our method for the detection of lexical semantic change, i.e., word sense changes over time. We examine semantic differences between specific words in two corpora, chosen from different time periods, for English, German, Latin, and Swedish. Our method was created for the SemEval 2020 Task 1: \textit{Unsupervised Lexical Semantic Change Detection.} We ranked 1st1^{st} in Sub-task 1: binary change detection, and 4th4^{th} in Sub-task 2: ranked change detection. Our method is fully unsupervised and language independent. It consists of preparing a semantic vector space for each corpus, earlier and later; computing a linear transformation between earlier and later spaces, using Canonical Correlation Analysis and Orthogonal Transformation; and measuring the cosines between the transformed vector for the target word from the earlier corpus and the vector for the target word in the later corpus.
Rare-earth-doped nitride phosphors are promising materials for solid-state lighting and photonic applications due to their thermal stability, sharp emission lines, and strong UV-blue absorption. In this work, we present a first-principles density functional theory (DFT) study, using the GGA+U approach, of pristine and Eu3+-doped CaAlSiN3 at doping levels of 8.5% and 17%. Electronic structure calculations show that Eu incorporation introduces localized 4f states within the band gap, leading to band-gap narrowing and enabling red photoluminescence through the 5D0 -> 7F2 transition. Spin-polarized density of states and spin density mapping confirm the magnetic nature of Eu3+, while charge density, Bader analysis, and electron localization function (ELF) indicate mixed ionic-covalent bonding and charge transfer from Eu to neighboring N and Al atoms, stabilizing the doped lattice. Optical spectra, including dielectric function, absorption, refractive index, and reflectivity, reveal red-shifted absorption edges and enhanced visible-range light-matter interactions, consistent with experimental red to near-infrared emission. Formation energy analysis confirms the thermodynamic feasibility of Eu substitution, while elastic constants and Pugh's ratio indicate mechanical robustness and ductility. Thermoelectric transport properties, obtained using WIEN2k and BoltzTraP, suggest that moderate Eu3+ doping improves the power factor and reduces lattice thermal conductivity through disorder scattering. These results establish Eu-doped CaAlSiN3 as a stable and efficient red-emitting phosphor for white light-emitting diodes (WLEDs) and provide theoretical insights for crystal site engineering in advanced optoelectronic materials.
This paper presents a novel automatic face recognition approach based on local binary patterns. This descriptor considers a local neighbourhood of a pixel to compute the feature vector values. This method is not very robust to handle image noise, variances and different illumination conditions. We address these issues by proposing a novel descriptor which considers more pixels and different neighbourhoods to compute the feature vector values. The proposed method is evaluated on two benchmark corpora, namely UFI and FERET face datasets. We experimentally show that our approach outperforms state-of-the-art methods and is efficient particularly in the real conditions where the above mentioned issues are obvious. We further show that the proposed method handles well one training sample issue and is also robust to the image resolution.
Layered 2D van der Waals materials, such as transition metal dichalcogenides, are promising for nanoscale spintronic and optoelectronic applications. Harnessing their full potential requires understanding both intrinsic transport and the dynamics of optically excited spin and charge carriers -- particularly the transition between excited spin polarization and the conduction band's intrinsic spin texture. Here, we investigate the spin polarization of the conduction bands of bulk WSe2_2 using static and time-resolved spin-resolved photoemission spectroscopy, complemented by photocurrent calculations. Electron doping reveals the intrinsic spin polarization, while time-resolved measurements trace the evolution of excited spin carriers. We find that intervalley scattering is spin-conserving, with spin transport initially governed by photoexcited carriers and aligning with the intrinsic conduction band polarization after \sim150 fs.
This paper deals with the state estimation of stochastic systems and examines the possible employment of tensor decompositions in grid-based filtering routines, in particular, the tensor-train decomposition. The aim is to show that these techniques can lead to a massive reduction in both the computational and storage complexity of grid-based filtering algorithms without considerable tradeoffs in accuracy. This claim is supported by an algorithm descriptions and numerical illustrations.
This paper introduces the first public large-scale, long-span dataset with sea turtle photographs captured in the wild -- \href{this https URL}{SeaTurtleID2022}. The dataset contains 8729 photographs of 438 unique individuals collected within 13 years, making it the longest-spanned dataset for animal re-identification. All photographs include various annotations, e.g., identity, encounter timestamp, and body parts segmentation masks. Instead of standard "random" splits, the dataset allows for two realistic and ecologically motivated splits: (i) a \textit{time-aware closed-set} with training, validation, and test data from different days/years, and (ii) a \textit{time-aware open-set} with new unknown individuals in test and validation sets. We show that time-aware splits are essential for benchmarking re-identification methods, as random splits lead to performance overestimation. Furthermore, a baseline instance segmentation and re-identification performance over various body parts is provided. Finally, an end-to-end system for sea turtle re-identification is proposed and evaluated. The proposed system based on Hybrid Task Cascade for head instance segmentation and ArcFace-trained feature-extractor achieved an accuracy of 86.8\%.
First-principles theory and atomistic modeling predict emergent B2 chemical orderings in AlTiVNb and AlTiCrMo refractory high-entropy superalloys. The study identifies specific ordering temperatures and elemental sublattice preferences, demonstrating a counter-intuitive increase in residual electrical resistivity upon B2 ordering due to electronic structure changes.
Tram-human interaction safety is an important challenge, given that trams frequently operate in densely populated areas, where collisions can range from minor injuries to fatal outcomes. This paper addresses the issue from the perspective of designing a solution leveraging digital image processing, deep learning, and artificial intelligence to improve the safety of pedestrians, drivers, cyclists, pets, and tram passengers. We present RailSafeNet, a real-time framework that fuses semantic segmentation, object detection and a rule-based Distance Assessor to highlight track intrusions. Using only monocular video, the system identifies rails, localises nearby objects and classifies their risk by comparing projected distances with the standard 1435mm rail gauge. Experiments on the diverse RailSem19 dataset show that a class-filtered SegFormer B3 model achieves 65% intersection-over-union (IoU), while a fine-tuned YOLOv8 attains 75.6% mean average precision (mAP) calculated at an intersection over union (IoU) threshold of 0.50. RailSafeNet therefore delivers accurate, annotation-light scene understanding that can warn drivers before dangerous situations escalate. Code available at this https URL.
In this paper, we present WildlifeDatasets (this https URL) - an open-source toolkit intended primarily for ecologists and computer-vision / machine-learning researchers. The WildlifeDatasets is written in Python, allows straightforward access to publicly available wildlife datasets, and provides a wide variety of methods for dataset pre-processing, performance analysis, and model fine-tuning. We showcase the toolkit in various scenarios and baseline experiments, including, to the best of our knowledge, the most comprehensive experimental comparison of datasets and methods for wildlife re-identification, including both local descriptors and deep learning approaches. Furthermore, we provide the first-ever foundation model for individual re-identification within a wide range of species - MegaDescriptor - that provides state-of-the-art performance on animal re-identification datasets and outperforms other pre-trained models such as CLIP and DINOv2 by a significant margin. To make the model available to the general public and to allow easy integration with any existing wildlife monitoring applications, we provide multiple MegaDescriptor flavors (i.e., Small, Medium, and Large) through the HuggingFace hub (this https URL).
91
There are no more papers matching your filters at the moment.