Utrecht University
The brain's remarkable and efficient information processing capability is driving research into brain-inspired (neuromorphic) computing paradigms. Artificial aqueous ion channels are emerging as an exciting platform for neuromorphic computing, representing a departure from conventional solid-state devices by directly mimicking the brain's fluidic ion transport. Supported by a quantitative theoretical model, we present easy-to-fabricate tapered microchannels that embed a conducting network of fluidic nanochannels within a colloidal structure. Owing to transient salt concentration polarisation, our devices are volatile memristors (memory resistors) that are remarkably stable. The voltage-driven net salt flux and accumulation that underpin the concentration polarisation surprisingly combine into a diffusion-like quadratic dependence of the memory retention time on the channel length, allowing channels to be designed for a specific timescale. We implement our device as a synaptic element for neuromorphic reservoir computing. Individual channels distinguish various time series that together represent (handwritten) numbers, for subsequent in-silico classification with a simple readout function. Our results represent a significant step towards realising the promise of fluidic ion channels as a platform to emulate the rich aqueous dynamics of the brain.
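The diffusion-like quadratic length dependence of the retention time can be summarised in generic notation (the symbols below are ours, not the paper's):

```latex
% Retention time \tau grows quadratically with channel length L,
% as for diffusion over a distance L with effective diffusivity D:
\tau \sim \frac{L^{2}}{D}
```

This is the familiar diffusive timescale; it is why tuning the channel length lets one target a specific memory timescale.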
As the gravitational wave detector network is upgraded and the sensitivity of the detectors improves, novel scientific avenues open for exploration. For example, tests of general relativity will become more accurate as smaller deviations can be probed. Additionally, the detection of lensed gravitational waves becomes more likely. However, these new avenues could also interact with each other, and a gravitational wave event presenting deviations from general relativity could be mistaken for a lensed one. Here, we explore how phenomenological deviations from general relativity or binaries of exotic compact objects could impact lensing searches that focus on a single event. We consider strong lensing, millilensing, and microlensing and find that certain phenomenological deviations from general relativity may be mistaken for all of these types of lensing. Therefore, our study shows that future candidate lensing events will need to be examined carefully to avoid a false claim of lensing when a deviation from general relativity has instead been observed.
The introduction of more renewable energy sources into the energy system increases the variability and weather dependence of electricity generation. Power system simulations are used to assess the adequacy and reliability of the electricity grid over decades, but often become computationally intractable for such long simulation periods at high technical detail. To alleviate this computational burden, we investigate the use of outlier detection algorithms to find periods of extreme renewable energy generation, enabling detailed modelling of the performance of power systems under these circumstances. Specifically, we apply the Maximum Divergent Intervals (MDI) algorithm to power generation time series derived from the ERA5 historical climate reanalysis covering the period from 1950 through 2019. Applying the MDI algorithm to these time series, we identified intervals of extremely low and high energy production. Different divergence measures can be used to determine how anomalous an interval is: whereas the cross-entropy measure results in shorter, strongly peaking outliers, the unbiased Kullback-Leibler divergence tends to detect longer and more persistent intervals. Domain experts regard these intervals as potential risks for the electricity grid, showcasing the capability of the MDI algorithm to detect critical events in these time series. For the historical period analysed, we found no trend in outlier intensity, nor any shift or lengthening of the outliers, that could be attributed to climate change. By applying MDI to climate model output, power system modellers can investigate the adequacy of, and possible changes in risk for, the current and future electricity grid under a wider range of scenarios.
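As a rough, hedged illustration of divergence-based interval detection (the function name, smoothing, and search strategy below are our own simplifications; the actual MDI algorithm uses more careful estimators and an efficient interval search):

```python
import numpy as np

def interval_outlier_scores(series, min_len=24, max_len=168, bins=20):
    """Score candidate intervals of a time series by how much their
    empirical value distribution diverges from the whole series.

    A simplified sketch in the spirit of Maximum Divergent Intervals:
    slide windows of several lengths over the series and rank them by
    KL(interval || global) over histogram estimates.
    """
    edges = np.histogram_bin_edges(series, bins=bins)
    global_counts, _ = np.histogram(series, bins=edges)
    # Laplace smoothing keeps every bin probability strictly positive
    global_p = (global_counts + 1) / (global_counts.sum() + bins)
    scored = []
    for length in range(min_len, max_len + 1, min_len):
        for start in range(0, len(series) - length + 1, max(1, length // 2)):
            window = series[start:start + length]
            counts, _ = np.histogram(window, bins=edges)
            q = (counts + 1) / (counts.sum() + bins)
            kl = float(np.sum(q * np.log(q / global_p)))
            scored.append((kl, start, length))
    scored.sort(reverse=True)
    return scored[:5]  # top-scoring candidate intervals
```

On a series with an injected anomalous segment, the top-ranked interval overlaps that segment; the real algorithm additionally merges overlapping candidates and supports the unbiased KL variant discussed in the abstract.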
It is well-known that decision-making problems from stochastic control can be formulated by means of a forward-backward stochastic differential equation (FBSDE). Recently, the authors of Ji et al. 2022 proposed an efficient deep learning algorithm based on the stochastic maximum principle (SMP). In this paper, we provide a convergence result for this deep SMP-BSDE algorithm and compare its performance with other existing methods. In particular, by adopting a strategy as in Han and Long 2020, we derive an a-posteriori estimate and show that the total approximation error can be bounded by the value of the loss functional and the discretization error. We present numerical examples for high-dimensional stochastic control problems, with both drift and diffusion control, which showcase superior performance compared to existing algorithms.
This paper establishes a 9-dimensional classification framework for Multi-Agent Deep Reinforcement Learning with Communication (Comm-MADRL), systematically categorizing 41 existing models. It reveals prevailing trends and identifies underexplored areas to guide future research in designing intelligent multi-agent systems.
A comprehensive empirical study assesses the reliability of Large Language Models (LLMs) as automated evaluators across 20 diverse Natural Language Processing tasks. The research evaluates 11 different LLMs, including both proprietary and open-weight models, against human judgments, revealing that LLM performance varies substantially by task and property evaluated and is generally below human inter-annotator agreement.
SAM3-I introduces a method that extends the Segment Anything Model (SAM) family to directly interpret complex natural language instructions for visual segmentation. This integrated approach significantly outperforms existing agent-based methods in instruction-following performance for both simple and complex prompts, while operating in a more efficient single-pass inference pipeline.
A study explores how large language models reconcile memorizing incorrect labels with applying generalizable reasoning. It reveals that models retain correct intermediate computations even for noisy instances, employing "outlier heuristics" in specific neurons to override these results for memorized outputs.
This paper explores image modeling from the frequency space and introduces DCTdiff, an end-to-end diffusion generative paradigm that efficiently models images in the discrete cosine transform (DCT) space. We investigate the design space of DCTdiff and reveal the key design factors. Experiments on different frameworks (UViT, DiT), generation tasks, and various diffusion samplers demonstrate that DCTdiff outperforms pixel-based diffusion models regarding generative quality and training efficiency. Remarkably, DCTdiff can seamlessly scale up to 512×512 resolution without using the latent diffusion paradigm and beats latent diffusion (using SD-VAE) with only 1/4 of the training cost. Finally, we illustrate several intriguing properties of DCT image modeling. For example, we provide a theoretical proof of why 'image diffusion can be seen as spectral autoregression', bridging the gap between diffusion and autoregressive models. The effectiveness of DCTdiff and the introduced properties suggest a promising direction for image modeling in the frequency space. The code is this https URL
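To make the frequency-space setting concrete, here is a minimal, hedged illustration using SciPy's orthonormal DCT (not DCTdiff's actual pipeline): an image is mapped to DCT coefficients, a low-frequency block is kept, and the result is inverted.

```python
import numpy as np
from scipy.fft import dctn, idctn

# Toy stand-in for an image; DCTdiff itself operates on real image data.
rng = np.random.default_rng(0)
image = rng.random((32, 32))

coeffs = dctn(image, norm="ortho")           # 2-D type-II DCT (orthonormal)
mask = np.zeros_like(coeffs)
mask[:8, :8] = 1.0                           # keep only the low-frequency block
approx = idctn(coeffs * mask, norm="ortho")  # reconstruct from 1/16 of the coefficients

# The orthonormal DCT is an isometry, so the full round trip is exact
roundtrip = idctn(coeffs, norm="ortho")
```

Because the transform preserves energy (Parseval), modelling in DCT space loses nothing by itself; the low-frequency truncation above is a crude stand-in for the energy compaction that frequency-space generative modelling exploits.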
Many researchers have reached the conclusion that AI models should be trained to be aware of the possibility of variation and disagreement in human judgments, and evaluated according to their ability to recognize such variation. The LEWIDI series of shared tasks on Learning With Disagreements was established to promote this approach to training and evaluating AI models, by making suitable datasets more accessible and by developing evaluation methods. The third edition of the task builds on this goal by extending the LEWIDI benchmark to four datasets spanning paraphrase identification, irony detection, sarcasm detection, and natural language inference, with labeling schemes that include not only categorical judgments as in previous editions, but ordinal judgments as well. Another novelty is that we adopt two complementary paradigms to evaluate disagreement-aware systems: the soft-label approach, in which models predict population-level distributions of judgments, and the perspectivist approach, in which models predict the interpretations of individual annotators. Crucially, we moved beyond standard metrics such as cross-entropy, and tested new evaluation metrics for the two paradigms. The task attracted diverse participation, and the results provide insights into the strengths and limitations of methods for modeling variation. Together, these contributions strengthen LEWIDI as a framework and provide new resources, benchmarks, and findings to support the development of disagreement-aware technologies.
The ExPLAIND framework unifies attribution across model components, training data, and training dynamics by extending the Exact Path Kernel (EPK) to modern deep learning optimizers like AdamW. This framework derives additive influence scores, demonstrated to accurately replicate model predictions and uncover a refined multi-phase understanding of phenomena such as Grokking.
Language models can perform implicit multi-hop reasoning up to 4 hops, achieving high accuracy when provided with sufficient training data. This capability, however, incurs an exponential increase in data requirements which curriculum learning can substantially reduce.
In psychotherapy, therapeutic outcome assessment, or treatment outcome evaluation, is essential for enhancing mental health care by systematically evaluating therapeutic processes and outcomes. Existing large language model approaches often focus on therapist-centered, single-session evaluations, neglecting the client's subjective experience and longitudinal progress across multiple sessions. To address these limitations, we propose IPAEval, a client-Informed Psychological Assessment-based Evaluation framework that automates treatment outcome evaluations from the client's perspective using clinical interviews. IPAEval integrates cross-session client-contextual assessment and session-focused client-dynamics assessment to provide a comprehensive understanding of therapeutic progress. Experiments on our newly developed TheraPhase dataset demonstrate that IPAEval effectively tracks symptom severity and treatment outcomes over multiple sessions, outperforming previous single-session models and validating the benefits of items-aware reasoning mechanisms.
Diffusion models have demonstrated impressive generative capabilities, but their exposure bias problem, described as the input mismatch between training and sampling, lacks in-depth exploration. In this paper, we systematically investigate the exposure bias problem in diffusion models by first analytically modelling the sampling distribution, based on which we then identify the prediction error at each sampling step as the root cause of the exposure bias issue. Furthermore, we discuss potential solutions to this issue and propose an intuitive metric for it. Alongside the elucidation of exposure bias, we propose a simple yet effective, training-free method called Epsilon Scaling to alleviate it. We show that Epsilon Scaling explicitly moves the sampling trajectory closer to the vector field learned in the training phase by scaling down the network output, mitigating the input mismatch between training and sampling. Experiments on various diffusion frameworks (ADM, DDIM, EDM, LDM, DiT, PFGM++) verify the effectiveness of our method. Remarkably, our ADM-ES, as a state-of-the-art stochastic sampler, obtains 2.17 FID on CIFAR-10 under 100-step unconditional generation. The code is available at https://github.com/forever208/ADM-ES and https://github.com/forever208/EDM-ES.
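As a hedged sketch of the core idea, here is a toy deterministic DDIM-style update with Epsilon Scaling; the variable names, values, and the single constant scaling factor are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

def ddim_step(x_t, eps_pred, alpha_t, alpha_prev, scale=1.0):
    """One deterministic DDIM-style update with Epsilon Scaling.

    Dividing the predicted noise by a `scale` slightly above 1 shrinks the
    network output, which in the paper's account pulls the sampling
    trajectory back toward the vector field seen during training.
    """
    eps = eps_pred / scale                                    # Epsilon Scaling
    x0_hat = (x_t - np.sqrt(1.0 - alpha_t) * eps) / np.sqrt(alpha_t)
    return np.sqrt(alpha_prev) * x0_hat + np.sqrt(1.0 - alpha_prev) * eps
```

With `scale=1.0` this reduces to the plain update, so the method is training-free and drops into any existing sampler loop; the paper selects the scaling schedule empirically, which this sketch leaves as one constant.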
We present a comprehensive analysis of generic 5-dimensional Einstein-Maxwell-Dilaton-Axion (EMDA) holographic theories with exponential couplings. We find and classify exact, analytic, anisotropic solutions, both zero-temperature vacua and finite-temperature black brane backgrounds, with anisotropy sourced by scalar axions, magnetic fields, and charge densities, that can be interpreted as IR fixed points of renormalisation-group flows from UV-conformal fixed points. The resulting backgrounds feature a hyperscaling violation exponent and up to three independent Lifshitz-like exponents, generated by an equal number of independent coupling constants in the EMDA action. We derive the holographic stress-energy tensor and the corresponding equation of state, and discuss the behavior of the anisotropic speed of sound and butterfly velocity. We show that these theories can be consistently constrained by imposing several natural requirements, including energy conditions, thermodynamic stability, and causality. Additionally, we analyse hard probes in this class of theories, including Brownian motion, momentum broadening and jet quenching, and we demonstrate that a fully analytic treatment is possible, making their dependence on the underlying anisotropy explicit. We highlight the relevance of these models as benchmarks for strongly coupled anisotropic matter in nature, from the quark-gluon plasma created in heavy-ion collisions to dense QCD phases in neutron-star mergers and the cores of compact objects.
We formulate the inverse problem in a Bayesian framework and aim to train a generative model that allows us to simulate (i.e., sample from the likelihood) and do inference (i.e., sample from the posterior). We review the use of triangular normalizing flows for conditional sampling in this context and show how to combine two such triangular maps (an upper and a lower one) into one invertible mapping that can be used for simulation and inference. We work out several useful properties of this invertible generative model and propose a possible loss for training the map directly. We illustrate the workings of this new approach to conditional generative modeling numerically on a few stylized examples.
While both agent interaction and personalisation are vibrant topics in research on large language models (LLMs), there has been limited focus on the effect of language interaction on the behaviour of persona-conditioned LLM agents. Such an endeavour is important to ensure that agents remain consistent with their assigned traits yet are able to engage in open, naturalistic dialogues. In our experiments, we condition GPT-3.5 on personality profiles through prompting and create a two-group population of LLM agents using a simple variability-inducing sampling algorithm. We then administer personality tests and submit the agents to a collaborative writing task, finding that different profiles exhibit different degrees of personality consistency and linguistic alignment to their conversational partners. Our study seeks to lay the groundwork for a better understanding of dialogue-based interaction between LLMs and highlights the need for new approaches to crafting robust, more human-like LLM personas for interactive environments.
Researchers from Max Planck Institute for Informatics and Saarland University developed RAG-GESTURE, a system that synthesizes natural and semantically rich co-speech gestures by integrating explicit linguistic knowledge into a pre-trained diffusion model during inference. The method showed improved quantitative metrics and consistently higher user preference for both naturalness and appropriateness compared to existing neural and RAG-based gesture generation approaches.