University of Lübeck
Medical image challenges have played a transformative role in advancing the field, catalyzing algorithmic innovation and establishing new performance standards across diverse clinical applications. Image registration, a foundational task in neuroimaging pipelines, has similarly benefited from the Learn2Reg initiative. Building on this foundation, we introduce the Large-scale Unsupervised Brain MRI Image Registration (LUMIR) challenge, a next-generation benchmark designed to assess and advance unsupervised brain MRI registration. Distinct from prior challenges that leveraged anatomical label maps for supervision, LUMIR removes this dependency by providing over 4,000 preprocessed T1-weighted brain MRIs for training without any label maps, encouraging biologically plausible deformation modeling through self-supervision. In addition to evaluating performance on 590 held-out test subjects, LUMIR introduces a rigorous suite of zero-shot generalization tasks, spanning out-of-domain imaging modalities (e.g., FLAIR, T2-weighted, T2*-weighted), disease populations (e.g., Alzheimer's disease), acquisition protocols (e.g., 9.4T MRI), and species (e.g., macaque brains). A total of 1,158 subjects and over 4,000 image pairs were included for evaluation. Performance was assessed using both segmentation-based metrics (Dice coefficient, 95th percentile Hausdorff distance) and landmark-based registration accuracy (target registration error). Across both in-domain and zero-shot tasks, deep learning-based methods consistently achieved state-of-the-art accuracy while producing anatomically plausible deformation fields. The top-performing deep learning-based models demonstrated diffeomorphic properties and inverse consistency, outperforming several leading optimization-based methods, and showing strong robustness to most domain shifts, the exception being a drop in performance on out-of-domain contrasts.
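As a quick illustration of the two metric families named above (segmentation overlap and landmark accuracy), a minimal sketch with toy data is shown below; the function names, toy masks, and the assumed isotropic 1 mm spacing are illustrative and not the challenge's official evaluation code.

    import numpy as np

    def dice(mask_a, mask_b):
        """Dice coefficient between two binary segmentation masks."""
        a, b = mask_a.astype(bool), mask_b.astype(bool)
        denom = a.sum() + b.sum()
        return 2.0 * np.logical_and(a, b).sum() / denom if denom > 0 else 1.0

    def target_registration_error(landmarks_fixed, landmarks_warped, spacing=(1.0, 1.0, 1.0)):
        """Per-landmark Euclidean distance (TRE) after registration, in mm."""
        diff = (landmarks_fixed - landmarks_warped) * np.asarray(spacing)
        return np.linalg.norm(diff, axis=1)

    # toy usage with synthetic masks and random landmark coordinates
    fixed = np.zeros((32, 32, 32), bool); fixed[8:20, 8:20, 8:20] = True
    moving = np.zeros((32, 32, 32), bool); moving[9:21, 8:20, 8:20] = True
    print(dice(fixed, moving))
    print(target_registration_error(np.random.rand(5, 3) * 32, np.random.rand(5, 3) * 32).mean())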
Urinalysis is a standard diagnostic test to detect urinary system-related problems. Automating urinalysis can reduce the overall diagnostic time. Recent studies used urine microscopy datasets to design deep learning based algorithms for classifying and detecting urine cells, but these datasets are not publicly available for further research. To alleviate the need for publicly available urine datasets, we prepare our urine sediment microscopic image (UMID) dataset comprising around 3,700 cell annotations and three categories of cells, namely RBC, pus, and epithelial cells. We discuss the several challenges involved in preparing the dataset and the annotations, and we make the dataset publicly available.
Key to our project flora robotica is the idea of creating a bio-hybrid system of tightly coupled natural plants and distributed robots to grow architectural artifacts and spaces. Our motivation with this basic-research project is to lay a principled foundation for the design and implementation of living architectural systems that provide functionalities beyond those of orthodox building practice, such as self-repair, material accumulation and self-organization. Plants and robots work together to create a living organism that is inhabited by human beings. User-defined design objectives help to steer the directional growth of the plants, but the system's interactions with its inhabitants also determine locations where growth is prohibited or desired (e.g., partitions, windows, occupiable space). We report our plant species selection process and aspects of living architecture. A leitmotif of our project is the rich concept of braiding: braids are produced by robots from continuous material and serve as both scaffolds and initial architectural artifacts before plants take over and grow the desired architecture. We use light and hormones as attraction stimuli and far-red light as a repelling stimulus to influence the plants. Applied sensors range from simple proximity sensing to detect the presence of plants to sophisticated sensing technology, such as electrophysiology and measurements of sap flow. We conclude by discussing our anticipated final demonstrator that integrates key features of flora robotica, such as the continuous growth process of architectural artifacts and self-repair of living architecture.
A critical factor in trustworthy machine learning is developing robust representations of the training data. Only under this guarantee is it legitimate to artificially generate data, for example to counteract imbalanced datasets or to provide counterfactual explanations for black-box decision-making systems. In recent years, Generative Adversarial Networks (GANs) have shown considerable success in forming stable representations and generating realistic data. While many applications focus on generating image data, less effort has been made in generating time series data, especially multivariate signals. In this work, a Transformer-based autoencoder is proposed that is regularized using an adversarial training scheme to generate artificial multivariate time series signals. The representation is evaluated using t-SNE visualizations, Dynamic Time Warping (DTW) and Entropy scores. Our results indicate that the generated signals exhibit higher similarity to an exemplary dataset than those obtained with a convolutional network approach.
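For context, Dynamic Time Warping, one of the evaluation scores mentioned above, measures similarity between sequences of possibly different lengths via an optimal monotone alignment. A minimal univariate sketch follows (multivariate signals would replace the absolute difference with a per-step vector norm); this is an illustration, not the paper's evaluation code.

    import numpy as np

    def dtw_distance(x, y):
        """Classic O(n*m) dynamic-programming DTW between two 1-D sequences."""
        n, m = len(x), len(y)
        cost = np.full((n + 1, m + 1), np.inf)
        cost[0, 0] = 0.0
        for i in range(1, n + 1):
            for j in range(1, m + 1):
                d = abs(x[i - 1] - y[j - 1])
                cost[i, j] = d + min(cost[i - 1, j],      # insertion
                                     cost[i, j - 1],      # deletion
                                     cost[i - 1, j - 1])  # match
        return float(cost[n, m])

    print(dtw_distance(np.sin(np.linspace(0, 6, 50)), np.sin(np.linspace(0, 6, 60))))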
In modern cancer research, the vast volume of medical data generated is often underutilised due to challenges related to patient privacy. The OncoReg Challenge addresses this issue by enabling researchers to develop and validate image registration methods through a two-phase framework that ensures patient privacy while fostering the development of more generalisable AI models. Phase one involves working with a publicly available dataset, while phase two focuses on training models on a private dataset within secure hospital networks. OncoReg builds upon the foundation established by the Learn2Reg Challenge by incorporating the registration of interventional cone-beam computed tomography (CBCT) with standard planning fan-beam CT (FBCT) images in radiotherapy. Accurate image registration is crucial in oncology, particularly for dynamic treatment adjustments in image-guided radiotherapy, where precise alignment is necessary to minimise radiation exposure to healthy tissues while effectively targeting tumours. This work details the methodology and data behind the OncoReg Challenge and provides a comprehensive analysis of the competition entries and results. Findings reveal that feature extraction plays a pivotal role in this registration task. A new method emerging from this challenge demonstrated its versatility, while established approaches continue to perform comparably to newer techniques. Both deep learning and classical approaches still play significant roles in image registration, with the combination of methods, particularly in feature extraction, proving most effective.
Quantum annealing is a meta-heuristic approach tailored to solve combinatorial optimization problems with quantum annealers. In this tutorial, we provide a fundamental and comprehensive introduction to quantum annealing and modern data management systems and show quantum annealing's potential benefits and applications in the realm of database optimization. We demonstrate how to apply quantum annealing for selected database optimization problems, which are critical challenges in many data management platforms. The demonstrations include solving join order optimization problems in relational databases, optimizing sophisticated transaction scheduling, and allocating virtual machines within cloud-based architectures with respect to sustainability metrics. On the one hand, the demonstrations show how to apply quantum annealing on key problems of database management systems (join order selection, transaction scheduling), and on the other hand, they show how quantum annealing can be integrated as a part of larger and dynamic optimization pipelines (virtual machine allocation). The goal of our tutorial is to provide a centralized and condensed source regarding theories and applications of quantum annealing technology for database researchers, practitioners, and everyone who wants to understand how to potentially optimize data management with quantum computing in practice. In addition, we identify the advantages, limitations, and potential of quantum computing for future database and data management research.
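To make the annealing workflow concrete, the sketch below builds a tiny QUBO for a made-up transaction-scheduling instance (each transaction must get exactly one slot; conflicting transactions must not share a slot) and solves it by brute force; on an actual quantum annealer the same Q matrix would be handed to the sampler instead. The instance, penalty weights, and variable layout are illustrative assumptions, not taken from the tutorial.

    import itertools
    import numpy as np

    # Binary variables x[t, s] = 1 if transaction t runs in slot s.
    n_tx, n_slots = 3, 2
    conflicts = {(0, 1)}          # transactions 0 and 1 touch the same data (toy assumption)
    A, B = 2.0, 1.0               # penalty weights (toy assumption)

    idx = lambda t, s: t * n_slots + s
    n = n_tx * n_slots
    Q = np.zeros((n, n))          # upper-triangular QUBO matrix
    for t in range(n_tx):
        for s in range(n_slots):
            Q[idx(t, s), idx(t, s)] += -A             # linear part of A*(sum_s x[t,s] - 1)^2
            for s2 in range(s + 1, n_slots):
                Q[idx(t, s), idx(t, s2)] += 2 * A     # quadratic part of the same constraint
    for (t1, t2) in conflicts:
        for s in range(n_slots):
            Q[idx(t1, s), idx(t2, s)] += B            # penalize sharing a slot

    # Brute-force the 2^n assignments; an annealer would sample low-energy states of Q.
    best = min(itertools.product([0, 1], repeat=n),
               key=lambda x: np.asarray(x) @ Q @ np.asarray(x))
    print(np.reshape(best, (n_tx, n_slots)))          # rows: transactions, columns: slots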
We study formal languages which are capable of fully expressing quantitative probabilistic reasoning and do-calculus reasoning for causal effects, from a computational complexity perspective. We focus on satisfiability problems whose instance formulas allow expressing many tasks in probabilistic and causal inference. The main contribution of this work is establishing the exact computational complexity of these satisfiability problems. We introduce a new natural complexity class, named succ∃R, which can be viewed as a succinct variant of the well-studied class ∃R, and show that the problems we consider are complete for succ∃R. Our results imply even stronger algorithmic limitations than were proven by Fagin, Halpern, and Megiddo (1990) and Mossé, Ibeling, and Icard (2022) for some variants of the standard languages used commonly in probabilistic and causal inference.
The stationary velocity field (SVF) approach makes it possible to build parametrizations of invertible deformation fields, which is often a desirable property in image registration. Its expressiveness is particularly attractive when used as a block following a machine learning-inspired network. However, it can struggle with large deformations. We extend the SVF approach to matrix groups, in particular SE(3). This moves Euclidean transformations into the low-frequency part, towards which network architectures are often naturally biased, so that larger motions can be recovered more easily. This requires an extension of the flow equation, for which we provide sufficient conditions for existence. We further prove a decomposition condition that allows us to apply a scaling-and-squaring approach for efficient numerical integration of the flow equation. We numerically validate the approach on inter-patient registration of 3D MRI images of the human brain.
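As an illustration of scaling and squaring on a matrix group, the sketch below exponentiates a single se(3) element (a constant velocity) by scaling it down, taking a first-order step, and squaring repeatedly; in the registration setting this would be applied to a spatially varying field, and the step count here is an arbitrary choice.

    import numpy as np

    def se3_hat(omega, v):
        """Map (rotation axis*angle, translational velocity) to a 4x4 se(3) matrix."""
        wx, wy, wz = omega
        M = np.zeros((4, 4))
        M[:3, :3] = [[0, -wz, wy], [wz, 0, -wx], [-wy, wx, 0]]   # skew-symmetric block
        M[:3, 3] = v
        return M

    def expm_scaling_squaring(M, steps=8):
        """exp(M) ~ (I + M / 2**steps)**(2**steps): scale down, then square repeatedly."""
        A = np.eye(4) + M / 2.0**steps    # first-order step for the scaled-down velocity
        for _ in range(steps):
            A = A @ A                     # each squaring doubles the integration time
        return A

    M = se3_hat(omega=[0.0, 0.0, np.pi / 4], v=[1.0, 0.0, 0.0])
    T = expm_scaling_squaring(M)
    print(np.round(T, 3))   # rigid transform: top-left 3x3 block ~ rotation, last column ~ translation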
This study examines the effectiveness of regional lockdown strategies in mitigating pathogen spread across regional units, termed cities hereinafter. We develop simplified models to analyze infection spread across cities within a country during an epidemic wave. Isolation of a city is initiated when infection numbers within the city surpass defined thresholds. We compare two strategies: under strategy (P), thresholds are prescribed proportionally to city sizes, while under strategy (U) the same threshold is used for all cities. Given the heavy-tailed distribution of city sizes, strategy (P) may result in more secondary infections from larger cities than strategy (U). Random graph models are constructed to represent infection spread as a percolation process. In particular, we consider a model in which mobility between cities depends only on city sizes. We assess the relative efficiency of the two strategies by comparing the ratios of the number of individuals under isolation to the total number of infections by the end of the epidemic wave under strategies (P) and (U). Additionally, we derive analytical formulas for disease prevalence and basic reproduction numbers. Our models are calibrated using mobility data from France, Poland and Japan and validated through simulation. The findings indicate that mobility between cities in France and Poland is mainly determined by city sizes. However, a poor fit was observed with Japanese data, highlighting the importance of including other factors, such as geography, when modeling some countries. Our analysis suggests similar effectiveness for both strategies in France and Japan, while strategy (U) demonstrates distinct merits in Poland.
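A toy numerical contrast of the two threshold rules (not the paper's percolation model) is sketched below: city sizes are drawn from a heavy-tailed distribution, infection counts are simulated naively, and the isolated population is compared under a size-proportional threshold (P) and a uniform threshold (U). All parameter values are made up for illustration.

    import numpy as np

    rng = np.random.default_rng(0)
    sizes = (1e4 * (1 + rng.pareto(1.5, size=500))).astype(int)   # heavy-tailed city sizes
    infections = rng.binomial(sizes, 0.01)                        # toy infection counts

    tau_P = 0.002 * sizes             # strategy (P): threshold proportional to city size
    tau_U = np.full_like(sizes, 50)   # strategy (U): the same threshold for every city

    for name, tau in [("P", tau_P), ("U", tau_U)]:
        locked = infections > tau
        print(name, "cities isolated:", locked.sum(),
              "population isolated:", sizes[locked].sum())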
L-BFGS is the state-of-the-art optimization method for many large-scale inverse problems. It has a small memory footprint and achieves superlinear convergence. The method approximates the Hessian based on an initial approximation and an update rule that models current local curvature information. The initial approximation greatly affects the scaling of a search direction and the overall convergence of the method. We propose a novel, simple, and effective way to initialize the Hessian. Typically, the objective function is a sum of a data-fidelity term and a regularizer. Often, the Hessian of the data-fidelity term is computationally challenging, while the regularizer's Hessian is easy to compute. We replace the Hessian of the data-fidelity term with a scalar and keep the Hessian of the regularizer to initialize the Hessian approximation at every iteration. The scalar satisfies the secant equation in the sense of ordinary and total least squares and geometric mean regression. Our new strategy not only leads to faster convergence; the quality of the numerical solutions is also generally superior to that of simple scaling-based strategies. Specifically, the proposed schemes based on the ordinary least squares formulation and geometric mean regression outperform the state-of-the-art schemes. Implementing our strategy requires only a small change to a standard L-BFGS code. Our experiments on convex quadratic problems and non-convex image registration problems confirm the effectiveness of the proposed approach.
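A hedged sketch of the initialization idea: with an initial approximation of the form B0 = sigma*I + H_reg, the secant equation B0 s ≈ y is solved for the scalar sigma, here in the ordinary least-squares sense (the total least squares and geometric mean regression variants mentioned above fit sigma differently). The names and the dense solve are illustrative; a practical implementation would apply the inverse of sigma*I + H_reg matrix-free.

    import numpy as np

    def init_scalar_ols(s, y, H_reg):
        """Scalar sigma minimizing || sigma * s - (y - H_reg @ s) ||_2."""
        r = y - H_reg @ s
        return float(s @ r) / float(s @ s)

    def apply_H0_inverse(g, sigma, H_reg):
        """Initial L-BFGS step: solve (sigma * I + H_reg) d = g."""
        n = len(g)
        return np.linalg.solve(sigma * np.eye(n) + H_reg, g)

    # toy usage: quadratic data term plus a Laplacian-like regularizer Hessian
    n = 5
    H_reg = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
    s = np.random.rand(n)
    y = 3.0 * s + H_reg @ s                  # pretend curvature along the step s
    sigma = init_scalar_ols(s, y, H_reg)
    print(sigma)                             # ~3.0 for this toy example
    print(apply_H0_inverse(y, sigma, H_reg)) # ~s, as expected from the secant equation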
The majority of current research in deep learning based image registration addresses inter-patient brain registration with moderate deformation magnitudes. The recent Learn2Reg medical registration benchmark has demonstrated that single-scale U-Net architectures that directly employ a spatial transformer loss, such as VoxelMorph, often do not generalise well beyond the cranial vault and fall short of state-of-the-art performance for abdominal or intra-patient lung registration. Here, we propose two straightforward steps that greatly reduce this gap in accuracy. First, we employ keypoint self-supervision with a novel network head that predicts a discretised heatmap and robustly reduces large deformations. Second, we replace multiple learned fine-tuning steps by a single instance optimisation with hand-crafted features and the Adam optimiser. In contrast to other related work, including FlowNet or PDD-Net, our approach does not require a fully discretised architecture with a correlation layer. Our ablation study demonstrates the importance of keypoints in both self-supervised and unsupervised (using only a MIND metric) settings. On a multi-centric inspiration-exhale lung CT dataset, including very challenging COPD scans, our method outperforms VoxelMorph, improving nonlinear alignment by 77% compared to 19% and reaching target registration errors of 2 mm that are better than all but one of the learning-based methods published to date. Extending the method to semantic features sets new state-of-the-art performance on inter-subject abdominal CT registration.
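A minimal sketch of the second step (instance optimisation with Adam on a dense displacement field) is given below, assuming fixed and moving feature volumes, e.g. hand-crafted MIND descriptors, are already computed; the similarity and smoothness terms, iteration count, and learning rate are illustrative placeholders rather than the paper's exact configuration.

    import torch
    import torch.nn.functional as F

    def instance_optimise(feat_fixed, feat_moving, iters=100, lam=0.1, lr=0.1):
        """feat_*: (1, C, D, H, W) feature volumes; returns a displacement field (1, 3, D, H, W)."""
        disp = torch.zeros(1, 3, *feat_fixed.shape[2:], requires_grad=True)
        opt = torch.optim.Adam([disp], lr=lr)
        # identity sampling grid in normalized [-1, 1] coordinates for grid_sample
        grid = F.affine_grid(torch.eye(3, 4).unsqueeze(0), feat_fixed.shape, align_corners=False)
        for _ in range(iters):
            opt.zero_grad()
            warped = F.grid_sample(feat_moving, grid + disp.permute(0, 2, 3, 4, 1),
                                   align_corners=False)
            sim = ((warped - feat_fixed) ** 2).mean()               # feature similarity term
            reg = (disp[..., 1:] - disp[..., :-1]).pow(2).mean()    # simple smoothness penalty
            (sim + lam * reg).backward()
            opt.step()
        return disp.detach()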
Is it possible to integrate a humanoid social robot into the work processes or customer care of an official environment, e.g. in municipal offices? If so, what could such an application scenario look like, and what skills would the robot need when interacting with human customers? What are the requirements for this kind of interaction? In a project together with the Kiel City Council, we have devised an application scenario for such a case, determined the necessary or desirable capabilities of the robot, developed a corresponding robot application, and carried out initial tests and evaluations. One of the most important insights gained in the project was that a humanoid robot with natural language processing capabilities based on large language models, as well as human-like gestures and posture changes (animations), was clearly preferred by users over standard browser-based solutions on tablets for an information system in the City Council. Furthermore, we propose connecting the ACT-R cognitive architecture with the robot, where an ACT-R model is used in interaction with the robot application to cognitively process and enhance a dialogue between human and robot.
Magnetic Particle Imaging (MPI) is a highly sensitive imaging method that enables the visualization of magnetic tracer materials at a temporal resolution of more than 46 volumes per second. In MPI, the size of the field of view scales with the strengths of the applied magnetic fields. In clinical applications those strengths are limited by peripheral nerve stimulation, specific absorption rates, and the requirement to acquire images of high spatial resolution. Therefore, the size of the field of view is usually a few cubic centimeters. To bypass this limitation, additional focus fields and/or external object movements can be applied. In this work, the latter approach is investigated. An object is moved through the scanner bore one step at a time, while the MPI scanner continuously acquires data from its static field of view. Using a 3D phantom and dynamic 3D in vivo data, it is shown that the data from such a moving-table experiment can be jointly reconstructed after reordering the data with respect to the stepwise object shifts and heartbeat phases.
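The reordering step described above can be pictured as a simple grouping of acquired data frames by table-shift step and cardiac-phase bin before joint reconstruction; the sketch below covers only that bookkeeping step, and the number of phase bins as well as all names are assumptions.

    from collections import defaultdict

    def reorder_frames(frames, shift_indices, cardiac_phases, n_phase_bins=8):
        """frames: list of raw data frames; shift_indices: table step per frame;
        cardiac_phases: phase in [0, 1) per frame. Returns {(step, phase_bin): [frames]}."""
        groups = defaultdict(list)
        for frame, step, phase in zip(frames, shift_indices, cardiac_phases):
            phase_bin = int(phase * n_phase_bins) % n_phase_bins
            groups[(step, phase_bin)].append(frame)
        return groups   # each group feeds one block of the joint reconstruction (not shown)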
As humanoid service robots become more and more visible in public service settings, for instance as guides that welcome visitors or explain a procedure to follow, it is desirable to improve the comprehensibility of complex issues for human customers and to adapt both the level of difficulty of the information provided and the language used to individual requirements. This work examines a case study in which the humanoid social robot Pepper supports customers in a public service environment by offering advice and information. An application architecture is proposed that improves the intelligibility of the information received by making it possible to translate this information into easy language and/or into another spoken language.
Human sleep is cyclical with a period of approximately 90 minutes, implying long temporal dependency in the sleep data. Yet, exploring this long-term dependency when developing sleep staging models has remained untouched. In this work, we show that while encoding the logic of a whole sleep cycle is crucial to improve sleep staging performance, the sequential modelling approaches in existing state-of-the-art deep learning models are inefficient for that purpose. We thus introduce a method for efficient long sequence modelling and propose a new deep learning model, L-SeqSleepNet, which takes into account whole-cycle sleep information for sleep staging. Evaluating L-SeqSleepNet on four distinct databases of various sizes, we demonstrate state-of-the-art performance obtained by the model over three different EEG setups, including scalp EEG in conventional Polysomnography (PSG), in-ear EEG, and around-the-ear EEG (cEEGrid), even with a single EEG channel input. Our analyses also show that L-SeqSleepNet is able to alleviate the predominance of N2 sleep (the major class in terms of classification) and thereby reduce errors in other sleep stages. Moreover, the network becomes much more robust: for all subjects on which the baseline method performed exceptionally poorly, performance is improved significantly. Finally, the computation time grows only at a sub-linear rate as the sequence length increases.
We present a novel computational approach to fast and memory-efficient deformable image registration. In the variational registration model, the computation of the objective function derivatives is the computationally most expensive operation, both in terms of runtime and memory requirements. In order to target this bottleneck, we analyze the matrix structure of gradient and Hessian computations for the case of the normalized gradient fields distance measure and curvature regularization. Based on this analysis, we derive equivalent matrix-free closed-form expressions for derivative computations, eliminating the need for storing intermediate results and the costs of sparse matrix arithmetic. This has further benefits: (1) matrix computations can be fully parallelized, (2) memory complexity for derivative computation is reduced from linear to constant, and (3) overall computation times are substantially reduced. In comparison with an optimized matrix-based reference implementation, the CPU implementation achieves speedup factors between 3.1 and 9.7, and we are able to handle substantially higher resolutions. Using a GPU implementation, we achieve an additional speedup factor of up to 9.2. Furthermore, we evaluated the approach on real-world medical datasets. On ten publicly available lung CT images from the DIR-Lab 4DCT dataset, we achieve the best mean landmark error of 0.93 mm compared to other submissions on the DIR-Lab website, with an average runtime of only 9.23 s. Complete non-rigid registration of full-size 3D thorax-abdomen CT volumes from oncological follow-up is achieved in 12.6 s. The experimental results show that the proposed matrix-free algorithm enables the use of variational registration models also in applications which were previously impractical due to memory or runtime restrictions.
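To illustrate the matrix-free principle for the curvature regularizer, the sketch below applies a discrete Laplacian with an on-the-fly stencil, so the Hessian-vector product of 0.5*||Laplacian(u)||^2 is evaluated per component without ever assembling a sparse matrix; boundary handling and grid scaling are simplified compared to the paper's closed-form expressions.

    import numpy as np

    def laplacian(u):
        """Sum of second differences along each axis, replicated boundaries, matrix-free."""
        out = np.zeros_like(u)
        for axis in range(u.ndim):
            n = u.shape[axis]
            fwd = np.take(u, list(range(1, n)) + [n - 1], axis=axis)   # u[i+1], replicate last
            bwd = np.take(u, [0] + list(range(0, n - 1)), axis=axis)   # u[i-1], replicate first
            out += fwd - 2 * u + bwd
        return out

    def curvature_hessian_vec(v):
        """Matrix-free Hessian-vector product of 0.5 * ||laplacian(u)||^2 (per component)."""
        return laplacian(laplacian(v))

    v = np.random.rand(16, 16, 16)
    print(curvature_hessian_vec(v).shape)   # same shape as v; no sparse matrix is ever built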
Existing generative adversarial networks (GANs) for speech enhancement solely rely on the convolution operation, which may obscure temporal dependencies across the sequence input. To remedy this issue, we propose a self-attention layer adapted from non-local attention, coupled with the convolutional and deconvolutional layers of a speech enhancement GAN (SEGAN) operating on raw signal input. Further, we empirically study the effect of placing the self-attention layer at the (de)convolutional layers with varying layer indices, as well as at all of them when memory allows. Our experiments show that introducing self-attention to SEGAN leads to consistent improvement across the objective evaluation metrics of enhancement performance. Furthermore, applying the layer at different (de)convolutional layers does not significantly alter performance, suggesting that it can be conveniently applied at the highest-level (de)convolutional layer with the smallest memory overhead.
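A hedged PyTorch sketch of such a non-local self-attention layer for 1-D feature maps is shown below; the channel-reduction factor and the zero-initialised residual gate follow common practice and are not necessarily the paper's exact configuration.

    import torch
    import torch.nn as nn

    class SelfAttention1d(nn.Module):
        def __init__(self, channels, reduction=8):
            super().__init__()
            self.query = nn.Conv1d(channels, channels // reduction, 1)
            self.key = nn.Conv1d(channels, channels // reduction, 1)
            self.value = nn.Conv1d(channels, channels, 1)
            self.gamma = nn.Parameter(torch.zeros(1))    # layer starts as an identity mapping

        def forward(self, x):                            # x: (batch, channels, time)
            q = self.query(x).transpose(1, 2)            # (B, T, C/r)
            k = self.key(x)                              # (B, C/r, T)
            attn = torch.softmax(q @ k, dim=-1)          # (B, T, T) pairwise attention weights
            out = self.value(x) @ attn.transpose(1, 2)   # (B, C, T) non-local aggregation
            return self.gamma * out + x                  # gated residual connection

    x = torch.randn(2, 64, 128)
    print(SelfAttention1d(64)(x).shape)                  # torch.Size([2, 64, 128])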
Gene network information is believed to be beneficial for disease module and pathway identification, but it has not been explicitly utilized in the standard random forest (RF) algorithm for gene expression data analysis. We investigate the performance of a network-guided RF in which the network information is summarized into a sampling probability of predictor variables, which is then used in the construction of the RF. Our results suggest that network-guided RF does not provide better disease prediction than the standard RF. In terms of disease gene discovery, if disease genes form module(s), network-guided RF identifies them more accurately. In addition, when disease status is independent of genes in the given network, spurious gene selection can occur when using network information, especially for hub genes. Our empirical analysis of two balanced microarray and RNA-Seq breast cancer datasets from The Cancer Genome Atlas (TCGA) for classification of progesterone receptor (PR) status also demonstrates that network-guided RF can identify genes from PGR-related pathways, leading to a better-connected module of identified genes.
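A hedged sketch of the network-guided sampling idea: each tree draws its candidate genes with probabilities derived from the network (here simply proportional to node degree, which may differ from the paper's weighting), and the trees are combined by voting. The function names, subset fraction, and binary-label voting are illustrative assumptions.

    import numpy as np
    from sklearn.tree import DecisionTreeClassifier

    def network_guided_forest(X, y, degrees, n_trees=100, subset_frac=0.3, seed=0):
        rng = np.random.default_rng(seed)
        p = degrees / degrees.sum()                    # sampling probability per gene
        n, d = X.shape
        forest = []
        for _ in range(n_trees):
            feats = rng.choice(d, size=int(subset_frac * d), replace=False, p=p)
            boot = rng.integers(0, n, size=n)          # bootstrap sample of subjects
            tree = DecisionTreeClassifier().fit(X[boot][:, feats], y[boot])
            forest.append((tree, feats))
        return forest

    def predict(forest, X):
        # assumes binary 0/1 labels; majority vote across trees
        votes = np.mean([t.predict(X[:, f]) for t, f in forest], axis=0)
        return (votes > 0.5).astype(int)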
In this work, we propose an approach that combines deep feature embedding learning and hierarchical classification with a triplet loss function for Acoustic Scene Classification (ASC). On the one hand, a deep convolutional neural network is first trained to learn a feature embedding from scene audio signals; the trained network maps an input into the embedding space and transforms it into a high-level feature vector for representation. On the other hand, in order to exploit the structure of the scene categories, the original scene classification problem is organised into a hierarchy where similar categories are grouped into meta-categories. Hierarchical classification is then accomplished using deep neural network classifiers trained with a triplet loss function. Our experiments show that the proposed system achieves good performance on both the DCASE 2018 Task 1A and 1B datasets, with accuracy gains of 15.6% and 16.6% absolute over the DCASE 2018 baseline on Task 1A and 1B, respectively.
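For reference, a minimal form of the triplet loss that shapes such an embedding space is sketched below: embeddings from the same (meta-)category are pulled towards the anchor while others are pushed away by at least a margin; the margin value and embedding size are illustrative, not the paper's settings.

    import torch
    import torch.nn.functional as F

    def triplet_loss(anchor, positive, negative, margin=0.3):
        """anchor/positive/negative: (batch, embedding_dim) L2-normalised embeddings."""
        d_pos = F.pairwise_distance(anchor, positive)
        d_neg = F.pairwise_distance(anchor, negative)
        return F.relu(d_pos - d_neg + margin).mean()   # hinge on the distance gap

    # toy usage with random normalised embeddings
    a, p, n = (F.normalize(torch.randn(8, 128), dim=1) for _ in range(3))
    print(triplet_loss(a, p, n))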
Polyphonic events are the main error source of audio event detection (AED) systems. In the deep-learning context, the most common approach to dealing with event overlaps is to treat the AED task as a multi-label classification problem. By doing this, we inherently consider multiple one-vs.-rest classification problems, which are jointly solved by a single (i.e. shared) network. In this work, to better handle polyphonic mixtures, we propose to frame the task as a multi-class classification problem by considering each possible label combination as one class. To circumvent the large number of classes arising from combinatorial explosion, we divide the event categories into multiple groups and construct a multi-task problem in a divide-and-conquer fashion, where each task is a multi-class classification problem. A network architecture is then devised for multi-class multi-task modelling. The network is composed of a backbone subnet and multiple task-specific subnets. The task-specific subnets are designed to learn time-frequency and channel attention masks to extract features for the task at hand from the common feature maps learned by the backbone. Experiments on TUT-SED-Synthetic-2016, which has a high degree of event overlap, show that the proposed approach results in more favorable performance than the common multi-label approach.
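A hedged sketch of the label-combination encoding described above: event categories are split into groups, and within each group every binary label combination is mapped to one integer class, yielding one multi-class target vector per task. The grouping here is arbitrary and only for illustration.

    import numpy as np

    def multilabel_to_multitask(Y, groups):
        """Y: (n_samples, n_events) binary label matrix; groups: list of lists of event indices.
        Returns one integer class vector per group (each label combination -> one class id)."""
        tasks = []
        for g in groups:
            sub = Y[:, g]                     # labels of this group's events
            powers = 2 ** np.arange(len(g))   # binary combination -> integer class id
            tasks.append(sub @ powers)
        return tasks

    Y = np.random.randint(0, 2, size=(4, 6))
    print(multilabel_to_multitask(Y, groups=[[0, 1, 2], [3, 4, 5]]))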