Bundeswehr University Munich
Deep generative models have the potential to fundamentally change the way we create high-fidelity digital content, but they are often hard to control. Prompting a generative model is a promising recent development that, in principle, enables end-users to creatively leverage zero-shot and few-shot learning to assign new tasks to an AI ad hoc, simply by writing them down. However, for the majority of end-users, writing effective prompts is currently largely a trial-and-error process. To address this, we discuss the key opportunities and challenges for interactive creative applications that use prompting as a new paradigm for human-AI interaction. Based on our analysis, we propose four design goals for user interfaces that support prompting. We illustrate these with concrete UI design sketches, focusing on the use case of creative writing. The research communities in HCI and AI can take these as starting points to develop adequate user interfaces for models capable of zero- and few-shot learning.
In most Vision-Language (VL) models, understanding of the image structure is enabled by injecting position information (PI) about objects in the image. In our case study of LXMERT, a state-of-the-art VL model, we probe the use of PI in the representation and study its effect on Visual Question Answering. We show that the model is not capable of leveraging PI for the image-text matching task on a challenge set where only position differs. Yet, our probing experiments confirm that PI is indeed present in the representation. We introduce two strategies to tackle this: (i) Positional Information Pre-training and (ii) Contrastive Learning on PI using Cross-Modality Matching. With these, the model can correctly classify whether images with detailed PI statements match. In addition to the 2D information from bounding boxes, we introduce the object's depth as a new feature for better object localization in space. Even though we were able to improve the model properties as defined by our probes, this has only a negligible effect on downstream performance. Our results thus highlight an important issue of multimodal modeling: the mere presence of information detectable by a probing classifier is no guarantee that the information is available in a cross-modal setup.
Masked Image Modeling (MIM) is a self-supervised learning technique that involves masking portions of an image, such as pixels, patches, or latent representations, and training models to predict the missing information using the visible context. This approach has emerged as a cornerstone in self-supervised learning, unlocking new possibilities in visual understanding by leveraging unannotated data for pre-training. In remote sensing, MIM addresses challenges such as incomplete data caused by cloud cover, occlusions, and sensor limitations, enabling applications like cloud removal, multi-modal data fusion, and super-resolution. By synthesizing and critically analyzing recent advancements, this survey (MIMRS) is a pioneering effort to chart the landscape of masked image modeling in remote sensing. We highlight state-of-the-art methodologies, applications, and future research directions, providing a foundational review to guide innovation in this rapidly evolving field.
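The masking step at the heart of MIM can be sketched in a few lines. The following is a generic illustration only: the patch size, mask ratio, and list-of-lists image representation are choices made here for clarity, not taken from any method surveyed.

```python
import random

def mask_patches(image, patch, mask_ratio, seed=0):
    """Split a 2D image into non-overlapping patch x patch tiles and
    zero out a random subset; return the masked image and the set of
    masked tile coordinates (top-left corners)."""
    h, w = len(image), len(image[0])
    coords = [(r, c) for r in range(0, h, patch) for c in range(0, w, patch)]
    rng = random.Random(seed)
    masked_coords = set(rng.sample(coords, int(len(coords) * mask_ratio)))
    out = [row[:] for row in image]  # copy, leave the input untouched
    for (r, c) in masked_coords:
        for i in range(r, min(r + patch, h)):
            for j in range(c, min(c + patch, w)):
                out[i][j] = 0
    return out, masked_coords
```

A model would then be trained to predict the original values at `masked_coords` from the visible context; in remote sensing, the "mask" may equally come from cloud cover rather than random sampling.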
About half of all optical observations collected via spaceborne satellites are affected by haze or clouds. Consequently, cloud coverage limits remote sensing practitioners' ability to monitor our planet continuously and seamlessly. This work addresses the challenge of optical satellite image reconstruction and cloud removal by proposing a novel multi-modal and multi-temporal data set called SEN12MS-CR-TS. We propose two models highlighting the benefits and use cases of SEN12MS-CR-TS: first, a multi-modal multi-temporal 3D convolutional neural network that predicts a cloud-free image from a sequence of cloudy optical and radar images; second, a sequence-to-sequence translation model that predicts a cloud-free time series from a cloud-covered time series. Both approaches are evaluated experimentally, with their respective models trained and tested on SEN12MS-CR-TS. The conducted experiments highlight the contribution of our data set to the remote sensing community as well as the benefits of multi-modal and multi-temporal information for reconstructing noisy information. Our data set is available at this https URL
Function names can greatly aid human reverse engineers, which has spurred the development of machine learning-based approaches to predicting function names in stripped binaries. Much current work in this area now uses transformers, applying a metaphor of machine translation from code to function names. Still, function naming models face challenges in generalizing to projects unrelated to the training set. In this paper, we take a completely new approach by transferring advances in automated image captioning to the domain of binary reverse engineering, such that different parts of a binary function can be associated with parts of its name. We propose BLens, which combines multiple binary function embeddings into a new ensemble representation, aligns it with the name representation latent space via a contrastive learning approach, and generates function names with a transformer architecture tailored for function names. Our experiments demonstrate that BLens significantly outperforms the state of the art. In the usual setting of splitting per binary, we achieve an F1 score of 0.79 compared to 0.70. In the cross-project setting, which emphasizes generalizability, we achieve an F1 score of 0.46 compared to 0.29. Finally, in an experimental setting reducing shared components across projects, we achieve an F1 score of 0.32 compared to 0.19.
We investigate how multiple sliders with and without feedforward visualizations influence users' control of generative models. In an online study (N=138), we collected a dataset of people interacting with a generative adversarial network (StyleGAN2) in an image reconstruction task. We found that more control dimensions (sliders) significantly increase task difficulty and user actions. Visual feedforward partly mitigates this by enabling more goal-directed interaction. However, we found no evidence of faster or more accurate task performance. This indicates a tradeoff between feedforward detail and implied cognitive costs, such as attention. Moreover, we found that visualizations alone are not always sufficient for users to understand individual control dimensions. Our study quantifies fundamental UI design factors and resulting interaction behavior in this context, revealing opportunities for improvement in the UI design for interactive applications of generative models. We close by discussing design directions and further aspects.
Reverse engineers benefit from the presence of identifiers such as function names in a binary, but usually these are removed for release. Training a machine learning model to predict function names automatically is promising but fundamentally hard: unlike words in natural language, most function names occur only once. In this paper, we address this problem by introducing eXtreme Function Labeling (XFL), an extreme multi-label learning approach to selecting appropriate labels for binary functions. XFL splits function names into tokens, treating each as an informative label akin to the problem of tagging texts in natural language. We relate the semantics of binary code to labels through DEXTER, a novel function embedding that combines static analysis-based features with local context from the call graph and global context from the entire binary. We demonstrate that XFL/DEXTER outperforms the state of the art in function labeling on a dataset of 10,047 binaries from the Debian project, achieving a precision of 83.5%. We also study combinations of XFL with alternative binary embeddings from the literature and show that DEXTER consistently performs best for this task. As a result, we demonstrate that binary function labeling can be effectively phrased in terms of multi-label learning, and that binary function embeddings benefit from including explicit semantic features.
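The label-extraction idea described above, splitting function names into informative tokens, can be illustrated as follows. The exact tokenization rules of XFL are not given here, so the splitting on underscores, digits, and camelCase boundaries below is a hypothetical sketch:

```python
import re

def name_to_labels(func_name, vocab=None):
    """Tokenize a binary function name into a set of candidate labels,
    e.g. 'xml_parse_file' -> {'xml', 'parse', 'file'}.  Splitting rules
    are illustrative assumptions, not XFL's actual tokenizer."""
    # first split on underscores and digits ...
    parts = re.split(r'[_\d]+', func_name)
    tokens = []
    for p in parts:
        # ... then split each part on camelCase boundaries
        tokens += re.findall(r'[A-Z]?[a-z]+|[A-Z]+(?![a-z])', p)
    labels = {t.lower() for t in tokens if t}
    if vocab is not None:
        labels &= vocab  # keep only labels seen in the training vocabulary
    return labels
```

Each resulting token then acts as one label in the multi-label learning problem, analogous to tagging texts in natural language.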
Liquid flow in sessile evaporating droplets of ultrapure water typically results from two main contributions: a capillary flow pushing the liquid from the bulk towards the contact line, and a thermal Marangoni flow pulling the drop's free surface towards the summit. Current analytical and numerical models are in good qualitative agreement with experimental observations; however, they overestimate the interfacial velocity values by 2-3 orders of magnitude. This discrepancy is generally ascribed to contamination of the water samples with non-soluble surfactants; however, an experimental confirmation of this assumption has not yet been provided. In this work, we show that a small "ionic contamination" can have a significant effect on the flow pattern inside the droplet. To provide the proof, we compare the flow in evaporating droplets of ultrapure water with that in commercially available bottled water of different mineralization levels. Mineral waters are bottled at natural springs, are microbiologically pure, and contain only traces of minerals (as well as traces of other possible contaminants); one would therefore expect a slower interfacial flow as the amount of "contaminants" increases. Surprisingly, our results show that the magnitude of the interfacial flow in mineral waters with a low content of minerals is practically the same as in ultrapure water. However, for waters with a larger content of minerals, the interfacial flow tends to slow down due to the presence of ionic concentration gradients. Our results reveal a much more complex scenario than typically suspected, and therefore a deeper and more comprehensive analysis of the large differences between numerical models and experiments is necessary.
Hydrogen embrittlement in 304L (18wt.% Cr, 8-10wt.% Ni) austenitic stainless steel (ASS) fabricated by laser powder-bed-fusion (LPBF) was investigated by tensile testing after electrochemical hydrogen pre-charging and compared to conventionally available 304L ASSs with two different processing histories, (i) casting plus annealing (CA) and (ii) CA plus thermomechanical treatment (TMT). It was revealed that hydrogen charging led to a significant reduction in ductility for the CA sample, but only a small effect of hydrogen was observed for the LPBF and CA-TMT samples. Hydrogen-assisted cracking behavior was found to be strongly linked to strain-induced martensitic transformation. In addition, the amount of alpha' martensite was much higher in the CA sample than in the other samples, suggesting that severe hydrogen embrittlement can be correlated with the low mechanical stability of austenite. Detailed microstructural characterization showed that the low austenite stability of the CA sample was mainly attributable to the retained delta ferrite content and the chemical inhomogeneity inside the gamma matrix (gamma close to delta has ~2 wt.% higher Cr but ~2 wt.% lower Ni), whereas TMT enhanced the chemical homogeneity and thus the austenite stability. By contrast, the LPBF process led directly, i.e. without any thermomechanical treatment, to a fully austenitic structure with homogeneous elemental distribution in the ASS. These results confirmed that the presence of delta ferrite and the chemical inhomogeneity inside the gamma matrix, which promoted the deformation-induced martensitic transformation and the associated H enrichment at the gamma-alpha' interface, were the primary reasons for the severe H-assisted failure.
A new premixed turbulent combustion model is proposed. It is based on one-dimensional (1D) filtering of density times progress variable and of the reaction source term of laminar premixed flame profiles, using a filter kernel which reflects the variation of the slicing area of planar flame fronts as they move through multidimensional filter volumes. It is shown that these multidimensional effects qualitatively change the relation between the filtered reaction source term and the Favre-filtered reaction progress variable compared to 1D filtering, particularly at large filter widths. Analytical results for the filtered quantities are obtained by approximating density times progress variable and the reaction source term by suitable Ansatz functions. Filtered data from Direct Numerical Simulations (DNS) of statistically planar turbulent premixed flames at different free stream turbulence levels and heat release parameters is used to develop and validate the model. Two wrinkling factor models as a function of filter width and subgrid turbulence level are proposed. For small filter widths up to two times the laminar flame thickness, minor effects from subgrid flame folding are observed. For larger filters, the filtered reaction source term rises linearly with filter width at a rate which increases with subgrid turbulence level. The modelled reaction source term as a function of Favre-averaged progress variable and filter width shows excellent agreement with filtered DNS data for all investigated free stream turbulence levels, filter widths and heat release parameters.
The ion-optic grid system is the essential part of electrostatic ion thrusters, governing performance and lifetime. Therefore, reliable measurements of the grid and aperture geometry over the lifetime are necessary to understand and predict the behavior of the system. Many different measurement methods have been introduced over the years to tackle the challenges encountered when diagnosing single electrodes or the whole assembly at once. Modern industrial X-ray micro-computer tomographs (uCT) offer the possibility to obtain a three-dimensional density map of a grid system or its components down to microscopic scales of precision. This information enables a spectrum of new diagnostic opportunities, such as complete verification of the manufactured parts against CAD models, detection of internal defects or density changes, or inspection of the assembled ion optics and its internal alignment, which is normally prohibited by the lack of optical access to all parts at once. Hence, uCT imaging is a promising tool to complement established methods and open up new experimental possibilities; however, it also has its own weaknesses and pitfalls. We discuss the methods developed for grid-erosion and grid-geometry measurement of a small state-of-the-art radio-frequency ion thruster, describe the obstacles encountered along the way, and demonstrate possible solutions.
Non-destructive X-ray imaging of thruster parts and assemblies down to the scale of several micrometers is a key technology for electric propulsion research and engineering. It allows for thorough product assurance, rapid state acquisition, and the implementation of more detailed simulation models to understand the physics of device wear and erosion. Being able to inspect parts as 3D density maps allows insight into inner structures hidden from observation. Generating these density maps, and constructing three-dimensional mesh objects for further processing, depends on the achievable quality of the reconstruction, i.e. the inverse of the Radon transform, which connects a stack of projections taken from different angles to the original object's structure. Reconstruction is currently hampered by strong mathematical artifacts induced by the many aligned parts and stark density contrasts commonly found in electric propulsion thrusters.
We present a systematic literature review of cryptocurrency and blockchain research in Human-Computer Interaction (HCI) published between 2014 and 2021. We aim to provide an overview of the field, consolidate existing knowledge, and chart paths for future research. Our analysis of 99 articles identifies six major themes: (1) the role of trust, (2) understanding motivation, risk, and perception of cryptocurrencies, (3) cryptocurrency wallets, (4) engaging users with blockchain, (5) using blockchain for application-specific use cases, and (6) support tools for blockchain. We discuss the focus of the existing research body and juxtapose it to the changing landscape of emerging blockchain technologies to highlight future research avenues for HCI and interaction design. With this review, we identify key aspects where interaction design is critical for the adoption of blockchain systems. Doing so, we provide a starting point for new scholars and designers and help them position future contributions.
In the presence of grouped covariates, we propose a boosting framework that allows enforcing sparsity within and between groups. By simultaneously using component-wise and group-wise gradient boosting with adjusted degrees of freedom, a model with properties similar to the sparse group lasso can be fitted through boosting. We show that within-group and between-group sparsity can be controlled by a mixing parameter and discuss similarities and differences to the mixing parameter in the sparse group lasso. Using simulations, gene data, and agricultural data, we show the effectiveness and predictive competitiveness of this estimator. The data and simulations suggest that, in the presence of grouped variables, the use of sparse group boosting is associated with less biased variable selection and higher predictive accuracy compared to component-wise boosting. Additionally, we propose a way of reducing bias in component-wise boosting through the degrees of freedom.
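The selection step in which single covariates and covariate groups compete can be sketched as below. This is a toy illustration: the complexity terms `1/alpha` and `len(group)/(1-alpha)` are a crude stand-in for the adjusted degrees of freedom mentioned above, not the calibration used in the actual estimator.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def fit_ols(X_cols, y):
    """Least-squares fit of y on the given columns (no intercept);
    returns coefficients and residual sum of squares."""
    p, n = len(X_cols), len(y)
    A = [[sum(X_cols[i][k] * X_cols[j][k] for k in range(n)) for j in range(p)]
         for i in range(p)]
    b = [sum(X_cols[i][k] * y[k] for k in range(n)) for i in range(p)]
    beta = solve(A, b)
    resid = [y[k] - sum(beta[j] * X_cols[j][k] for j in range(p)) for k in range(n)]
    return beta, sum(r * r for r in resid)

def boost_step(X_cols, groups, y, alpha, lam=1.0):
    """One selection step of a toy sparse group boosting update: the best
    single covariate and the best covariate group compete, with mixing
    parameter alpha in (0, 1) trading off their effective complexity
    (alpha -> 1 favours single covariates, as in the sparse group lasso)."""
    best = None
    for j in range(len(X_cols)):
        _, rss = fit_ols([X_cols[j]], y)
        score = rss + lam * 1.0 / alpha
        if best is None or score < best[2]:
            best = ('single', j, score)
    for g, idx in enumerate(groups):
        _, rss = fit_ols([X_cols[j] for j in idx], y)
        score = rss + lam * len(idx) / (1.0 - alpha)
        if score < best[2]:
            best = ('group', g, score)
    return best[:2]
```

In a full boosting run, this step would be repeated on the current residuals, adding a shrunken version of the winning base learner at each iteration.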
Ensuring fairness is essential for every education system. Machine learning increasingly supports the education system and the educational data science (EDS) domain, from decision support to educational activities and learning analytics. However, machine learning-based decisions can be biased because the algorithms may generate results based on students' protected attributes such as race or gender. Clustering is an important machine learning technique for exploring student data in order to support decision-makers, as well as educational activities such as group assignments. Therefore, ensuring high-quality clustering models that also satisfy fairness constraints is an important requirement. This chapter comprehensively surveys clustering models and their fairness in EDS. We focus in particular on fair clustering models applied in educational activities. These models are believed to be practical tools for analyzing students' data and ensuring fairness in EDS.
Machine learning and statistical modeling methods were used to analyze the impact of climate change on the financial wellbeing of fruit farmers in Tunisia and Chile. The analysis was based on face-to-face interviews with 801 farmers. Three research questions were investigated: first, whether climate change impacts affected how well the farm was doing financially; second, if climate change was not influential, which factors were important for predicting the financial wellbeing of the farm; and third, whether observed effects on the financial wellbeing of the farm resulted from interactions between predictor variables. This is the first report directly comparing climate change with other factors potentially impacting the financial wellbeing of farms. Certain climate change factors, namely increases in temperature and reductions in precipitation, can regionally impact the self-perceived financial wellbeing of fruit farmers. Specifically, increases in temperature and reductions in precipitation can have a measurable negative impact on the financial wellbeing of farms in Chile. This effect is less pronounced in Tunisia. Climate impact differences were observed within Chile but not within Tunisia. However, climate change is only of minor importance for predicting farm financial wellbeing, especially for farms already doing well financially. More important factors, mainly in Tunisia, included trust in information sources and prior farm ownership. Other important factors include farm size, the water management systems used, and the diversity of fruit crops grown. Moreover, some of the important factors identified differed between farms doing well and those not doing well financially. Interactions between factors may improve or worsen farm financial wellbeing.
AI-driven decision-making can lead to discrimination against certain individuals or social groups based on protected characteristics/attributes such as race, gender, or age. The domain of fairness-aware machine learning focuses on methods and algorithms for understanding, mitigating, and accounting for bias in AI/ML models. Still, thus far, the vast majority of the proposed methods assess fairness based on a single protected attribute, e.g. only gender or race. In reality, though, human identities are multi-dimensional, and discrimination can occur based on more than one protected characteristic, leading to the so-called "multi-dimensional discrimination" or "multi-dimensional fairness" problem. While well elaborated in the legal literature, the multi-dimensionality of discrimination is less explored in the machine learning community. Recent approaches in this direction mainly follow the so-called intersectional fairness definition from the legal domain, whereas other notions, like additive and sequential discrimination, are less studied or not considered thus far. In this work, we overview the different definitions of multi-dimensional discrimination/fairness in the legal domain as well as how they have been transferred and operationalized (if at all) in the fairness-aware machine learning domain. By juxtaposing these two domains, we draw the connections, identify the limitations, and point out open research directions.
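As a small illustration of the intersectional notion mentioned above, subgroup statistics can be computed over the cross product of protected attributes. The gap measure below is one simple choice among many possible fairness metrics, used here only to make the idea concrete:

```python
from itertools import product

def subgroup_rates(records, protected, outcome):
    """Positive-outcome rate for every intersectional subgroup defined
    by the cross product of the protected attributes."""
    values = {a: sorted({r[a] for r in records}) for a in protected}
    rates = {}
    for combo in product(*(values[a] for a in protected)):
        members = [r for r in records
                   if all(r[a] == v for a, v in zip(protected, combo))]
        if members:  # skip empty intersections
            rates[combo] = sum(r[outcome] for r in members) / len(members)
    return rates

def intersectional_gap(rates):
    """A simple unfairness measure: max minus min subgroup rate."""
    return max(rates.values()) - min(rates.values())
```

Note that a model can look fair along each attribute separately (gender alone, race alone) while a particular intersection, e.g. one gender-race combination, is still disadvantaged; computing rates over the cross product exposes exactly this case.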
We systematically study the cornerstones that must be addressed to define an air traffic control benchmarking system based on Data Envelopment Analysis (DEA). Primarily, we examine the appropriate decision-making units, what to consider and what to avoid when choosing inputs and outputs in the case that several countries are included, and how to identify and deal with outliers such as the Maastricht service provider. We argue that Air Navigation Service Providers would be a good choice of decision-making units within the European context. On this basis, we discuss candidates for DEA inputs and outputs and emphasize that monetary values should be excluded. We further suggest using super-efficiency DEA to eliminate outliers. In this context, we compare different DEA approaches and find that standard DEA performs well.
This paper presents a novel approach using multiple linear regression to process transient signals from silicon photomultipliers. The method provides excellent noise suppression and pulse detection in scenarios with a high pulse count rate and superimposed pulses. Insights into its implementation and benchmark results are presented. We also show how this approach can be used to automatically detect the pulse shape from a given transient signal, providing good detection for count rates up to 90 MHz. Experimental data are used to present an application where this algorithm improves charge spectrum resolution by an order of magnitude.
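A regression-based pulse fit of the kind alluded to above can be sketched as follows. This is an illustrative reconstruction, not the authors' algorithm: the template shape, the constant-baseline model, and the local-maximum detection rule are all assumptions made for the sketch.

```python
def fit_pulse(window, template):
    """Closed-form least-squares fit of window ~ a * template + b,
    where b models a constant baseline; returns (a, b)."""
    n = len(template)
    st = sum(template)
    ss = sum(window)
    stt = sum(t * t for t in template)
    sxt = sum(w * t for w, t in zip(window, template))
    denom = n * stt - st * st
    a = (n * sxt - st * ss) / denom
    b = (ss - a * st) / n
    return a, b

def detect_pulses(signal, template, threshold):
    """Slide the template over the signal, fit the amplitude at every
    offset, and report local maxima of the amplitude above threshold."""
    k = len(template)
    amps = [fit_pulse(signal[i:i + k], template)[0]
            for i in range(len(signal) - k + 1)]
    hits = [i for i in range(1, len(amps) - 1)
            if amps[i] > threshold
            and amps[i] >= amps[i - 1] and amps[i] > amps[i + 1]]
    return hits, amps
```

Because the amplitude and baseline are obtained in closed form per window, the fit is cheap enough to evaluate at every sample offset, which is what makes regression-style processing attractive at high count rates.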
Low-cost, easily integrable photodetectors (PDs) for silicon (Si) photonics are still a bottleneck for photonic integrated circuits (PICs), especially for wavelengths above 1.8 µm. Multilayered platinum diselenide (PtSe2) is a semi-metallic two-dimensional (2D) material that can be synthesized below 450 °C. We integrate PtSe2-based PDs directly by conformal growth on Si waveguides. The PDs operate at 1550 nm wavelength with a maximum responsivity of 11 mA/W and response times below 8.4 µs. Fourier transform infrared spectroscopy (FTIR) in the wavelength range from 1.25 µm to 28 µm indicates the suitability of PtSe2 for PDs far into the infrared wavelength range. Our PtSe2 PDs integrated by direct growth outperform PtSe2 PDs manufactured by standard 2D layer transfer. The combination of IR responsivity, chemical stability, selective and conformal growth at low temperatures, and the potential for high carrier mobility makes PtSe2 an attractive 2D material for optoelectronics and PICs.