Berlin School of Mind and Brain, Humboldt-Universität zu Berlin
This paper is concerned with the evolution dynamics of local times of a spectrally positive stable process in the spatial direction. The main results state that, conditioned on the finiteness of the first time at which the local time at zero exceeds a given value, the local times on the positive half-line are equal in distribution to the unique solution of a stochastic Volterra equation driven by a Poisson random measure whose intensity coincides with the Lévy measure. This representation provides not only a simple proof of the Hölder regularity, but also a uniform upper bound for all moments of the Hölder coefficient, as well as a maximal inequality for the local times. Moreover, based on this stochastic Volterra equation, we extend the method of duality to establish an exponential-affine representation of the Laplace functional in terms of the unique solution of a nonlinear Volterra integral equation associated with the Laplace exponent of the stable process.
The detection of sequential patterns in data is a basic functionality of modern data processing systems for complex event processing (CEP), OLAP, and retrieval-augmented generation (RAG). In practice, pattern matching is challenging, since common applications rely on a large set of patterns that shall be evaluated with tight latency bounds. At the same time, matching needs to maintain state, i.e., intermediate results, that grows exponentially in the input size. Hence, systems turn to best-effort processing, striving for maximal recall under a latency bound. Existing techniques, however, consider each pattern in isolation, neglecting the optimization potential induced by state sharing in pattern matching. In this paper, we present SHARP, a library that employs state reduction to achieve efficient best-effort pattern matching. To this end, SHARP incorporates state sharing between patterns through a new abstraction, coined pattern-sharing degree (PSD). At runtime, this abstraction facilitates the categorization and indexing of partial pattern matches. Based thereon, once a latency bound is exceeded, SHARP realizes best-effort processing by selecting a subset of partial matches for further processing in constant time. In experiments with real-world data, SHARP achieves a recall of 97%, 96% and 73% for pattern matching in CEP, OLAP, and RAG applications, under a bound of 50% of the average processing latency.
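To make the state-sharing idea concrete, here is a minimal Python sketch of bucketing partial matches by their pattern-sharing degree (PSD) and selecting a best-effort subset once the latency bound is hit. Class and method names are hypothetical illustrations, not SHARP's actual API, and the real system performs the selection in constant time via indexing rather than this simplified loop.

```python
from collections import defaultdict

class PartialMatchIndex:
    """Hypothetical sketch: index partial matches by their pattern-sharing
    degree (PSD), i.e., how many registered patterns they can contribute to."""

    def __init__(self):
        self.buckets = defaultdict(list)  # PSD -> partial matches

    def add(self, match, sharing_patterns):
        # A partial match shared by more patterns has a higher PSD.
        self.buckets[len(sharing_patterns)].append(match)

    def select_best_effort(self, budget):
        """Once the latency bound is exceeded, keep only `budget` partial
        matches, preferring those that serve the most patterns."""
        kept = []
        for psd in sorted(self.buckets, reverse=True):
            for match in self.buckets[psd]:
                if len(kept) == budget:
                    return kept
                kept.append(match)
        return kept
```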
We directly compare the persuasion capabilities of a frontier large language model (LLM; Claude 3.5 Sonnet) against incentivized human persuaders in an interactive, real-time conversational quiz setting. In this preregistered, large-scale incentivized experiment, participants (quiz takers) completed an online quiz where persuaders (either humans or LLMs) attempted to persuade quiz takers toward correct or incorrect answers. We find that LLM persuaders achieved significantly higher compliance with their directional persuasion attempts than incentivized human persuaders, demonstrating superior persuasive capabilities in both truthful (toward correct answers) and deceptive (toward incorrect answers) contexts. We also find that LLM persuaders significantly increased quiz takers' accuracy, leading to higher earnings, when steering quiz takers toward correct answers, and significantly decreased their accuracy, leading to lower earnings, when steering them toward incorrect answers. Overall, our findings suggest that AI's persuasion capabilities already exceed those of humans who have real-money bonuses tied to performance. Our findings of increasingly capable AI persuaders thus underscore the urgency of emerging alignment and governance frameworks.
Weeds are one of the major reasons for crop yield loss but current weeding practices fail to manage weeds in an efficient and targeted manner. Effective weed management is especially important for crops with high worldwide production such as maize, to maximize crop yield for meeting increasing global demands. Advances in near-sensing and computer vision enable the development of new tools for weed management. Specifically, state-of-the-art segmentation models, coupled with novel sensing technologies, can facilitate timely and accurate weeding and monitoring systems. However, learning-based approaches require annotated data and show a lack of generalization to aerial imaging for different crops. We present a novel dataset for semantic and instance segmentation of crops and weeds in agricultural maize fields. The multispectral UAV-based dataset contains images with RGB, red-edge, and near-infrared bands, a large number of plant instances, dense annotations for maize and four weed classes, and is multitemporal. We provide extensive baseline results for both tasks, including probabilistic methods to quantify prediction uncertainty, improve model calibration, and demonstrate the approach's applicability to out-of-distribution data. The results show the effectiveness of the two additional bands compared to RGB only, and better performance in our target domain than models trained on existing datasets. We hope our dataset advances research on methods and operational systems for fine-grained weed identification, enhancing the robustness and applicability of UAV-based weed management. The dataset and code are available at this https URL
A controversial test for Large Language Models concerns the ability to discern possible from impossible language. While some evidence attests to the models' sensitivity to what crosses the limits of grammatically impossible language, this evidence has been contested on the grounds of the soundness of the testing material. We use model-internal representations to tap directly into the way Large Language Models represent the 'grammatical-ungrammatical' distinction. In a novel benchmark, we elicit probabilities from 4 models and compute minimal-pair surprisal differences, juxtaposing probabilities assigned to grammatical sentences to probabilities assigned to (i) lower frequency grammatical sentences, (ii) ungrammatical sentences, (iii) semantically odd sentences, and (iv) pragmatically odd sentences. The prediction is that if string-probabilities can function as proxies for the limits of grammar, the ungrammatical condition will stand out among the conditions that involve linguistic violations, showing a spike in the surprisal rates. Our results do not reveal a unique surprisal signature for ungrammatical prompts, as the semantically and pragmatically odd conditions consistently show higher surprisal. We thus demonstrate that probabilities do not constitute reliable proxies for model-internal representations of syntactic knowledge. Consequently, claims about models being able to distinguish possible from impossible language need verification through a different methodology.
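A minimal sketch of the minimal-pair surprisal comparison described above, using Hugging Face transformers; the model and sentence pair are stand-ins, not the paper's benchmark items.

```python
import math
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")  # stand-in for the tested models
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

def surprisal(sentence: str) -> float:
    """Total surprisal -log2 p(sentence) under the model."""
    ids = tok(sentence, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = model(ids, labels=ids).loss  # mean NLL (nats) over predicted tokens
    return loss.item() * (ids.shape[1] - 1) / math.log(2)

# Minimal-pair difference: a positive value means the model assigns the
# violating sentence lower probability than its well-formed counterpart.
delta = surprisal("The keys to the cabinet is on the table.") \
      - surprisal("The keys to the cabinet are on the table.")
print(delta)
```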
Researchers at Humboldt-Universität zu Berlin introduced an enhancement for Retrieval-Augmented Generation (RAG) that decomposes complex queries into sub-queries using a zero-shot LLM and then refines retrieved passages with a cross-encoder reranker. This combined strategy improved retrieval recall and answer accuracy, achieving a 16.5% higher Hits@10 on MultiHop-RAG and an F1 score of 35.0 on HotpotQA for answer generation.
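A rough sketch of such a pipeline: zero-shot sub-query decomposition followed by cross-encoder reranking. The reranker checkpoint is a common public example, and `llm`/`retriever` are assumed callables, not the authors' actual components.

```python
from sentence_transformers import CrossEncoder

reranker = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")  # example checkpoint

def answer(question, retriever, llm, k=10):
    # 1) Zero-shot decomposition of the complex query into sub-queries.
    subs = llm(
        "Decompose the question into simple sub-questions, one per line:\n"
        + question
    ).splitlines()
    # 2) Retrieve per sub-query and pool the passages.
    pool = list({p for q in subs + [question] for p in retriever(q)})
    # 3) Refine: rerank the pooled passages against the original question.
    scores = reranker.predict([(question, p) for p in pool])
    top = [p for _, p in sorted(zip(scores, pool), reverse=True)[:k]]
    # 4) Generate the answer from the reranked context.
    return llm("Answer using only this context:\n" + "\n".join(top)
               + "\n\nQuestion: " + question)
```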
This thesis examines the correspondence between models of statistical physics and Feynman graphs of quantum field theories (QFTs) through a common property: integrability. We review integrable structures for periodic boundary conditions on both sides, focusing on the eight- and six-vertex models and the bi-scalar fishnet theory. The latter is a double-scaled $\gamma$-deformation of $\mathcal{N}=4$ super Yang-Mills theory. Among the applications of integrability in the literature that we reconsider are the computation of the free energy in the thermodynamic limit and its QFT counterpart, the critical coupling. In addition, we provide a detailed overview of the calculation of exact anomalous dimensions and operator product expansion (OPE) coefficients in the conformal bi-scalar fishnet theory. The original contributions of this work comprise the results of the critical coupling for models with fermions, the brick wall theory, and the fermionic fishnet theory. Additionally, we extend the study of integrable Feynman graphs to supersymmetric diagrams in superspace. By establishing an efficient graphical formalism, we obtain the critical coupling of double-scaled $\beta$-deformations of $\mathcal{N}=4$ super Yang-Mills theory and of Aharony-Bergman-Jafferis-Maldacena theory, the super brick wall and superfishnet theories, respectively. Moreover, we apply superspace methods to the superfishnet theory and find results for anomalous dimensions and an OPE coefficient, all of which are all-loop exact in the coupling. In addition, we study boundary integrability in the six-vertex model and for Feynman diagrams. We present new box-shaped boundary conditions for the six-vertex model and conjecture a closed form for its partition function at any lattice size. On the QFT side, we find integrable boundary scattering matrices in the form of generalized Feynman diagrams by graphical methods.
This paper from Humboldt-Universität zu Berlin introduces curriculum learning strategies to enable Multi-Token Prediction (MTP) for smaller language models, demonstrating that a Forward curriculum improves both performance and inference speed (1.2-1.7x) for models with 1.3B and 3B parameters. The work also found that byte-level tokenization consistently yields better results for MTP in these smaller architectures.
We review lattice results related to pion, kaon, $D$-meson, $B$-meson, and nucleon physics with the aim of making them easily accessible to the nuclear and particle physics communities. More specifically, we report on the determination of the light-quark masses, the form factor $f_+(0)$ arising in the semileptonic $K \to \pi$ transition at zero momentum transfer, as well as the decay-constant ratio $f_K/f_\pi$ and its consequences for the CKM matrix elements $V_{us}$ and $V_{ud}$. We review the determination of the $B_K$ parameter of neutral kaon mixing as well as the additional four $B$ parameters that arise in theories of physics beyond the Standard Model. For the heavy-quark sector, we provide results for $m_c$ and $m_b$ as well as those for the decay constants, form factors, and mixing parameters of charmed and bottom mesons and baryons. These are the heavy-quark quantities most relevant for the determination of CKM matrix elements and the global CKM unitarity-triangle fit. We review the status of lattice determinations of the strong coupling constant $\alpha_s$. We review the determinations of nucleon charges from the matrix elements of both isovector and flavour-diagonal axial, scalar and tensor local quark bilinears, and of the momentum fraction, helicity moment and transversity moment from one-link quark bilinears. We also review determinations of scale-setting quantities. Finally, in this review we have added a new section on the general definition of the low-energy limit of the Standard Model.
Computing partial differential equation (PDE) operators via nested backpropagation is expensive, yet popular, and severely restricts their utility for scientific machine learning. Recent advances, like the forward Laplacian and randomizing Taylor mode automatic differentiation (AD), propose forward schemes to address this. We introduce an optimization technique for Taylor mode that 'collapses' derivatives by rewriting the computational graph, and demonstrate how to apply it to general linear PDE operators, and randomized Taylor mode. The modifications simply require propagating a sum up the computational graph, which could -- or should -- be done by a machine learning compiler, without exposing complexity to users. We implement our collapsing procedure and evaluate it on popular PDE operators, confirming it accelerates Taylor mode and outperforms nested backpropagation.
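For context, a naive (uncollapsed) Taylor-mode Laplacian in JAX looks as follows: one second-order jet per coordinate direction, summed afterwards. The paper's collapsing instead propagates that sum up the computational graph; this sketch only shows the baseline being optimized, assuming jet's truncated-Taylor convention where the second output coefficient equals (1/2) v^T H v.

```python
import jax.numpy as jnp
from jax.experimental import jet

def naive_taylor_laplacian(f, x):
    """Sum of second directional derivatives along the coordinate axes."""
    def second_deriv(v):
        # f(x + t v) = y0 + y1 t + y2 t^2 + ..., so y2 = (1/2) v^T H v.
        _, (_, y2) = jet.jet(f, (x,), ((v, jnp.zeros_like(x)),))
        return 2.0 * y2
    basis = jnp.eye(x.shape[0])
    return sum(second_deriv(basis[i]) for i in range(x.shape[0]))

f = lambda x: jnp.sum(jnp.sin(x)) * jnp.cos(x[0])
print(naive_taylor_laplacian(f, jnp.ones(3)))
```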
We develop the on-shell action formalism within Worldline Quantum Field Theory (WQFT) to describe scattering of spinning compact bodies in General Relativity in the post-Minkowskian (PM) expansion. The real on-shell action is constructed from vacuum diagrams with causal (retarded) propagators from which scattering observables such as momentum impulse and spin kick follow via Poisson brackets of the initial scattering data. Furthermore, we explore the implications of unitarity at the level of the worldline and show how generalised unitarity techniques can be adapted to WQFT to efficiently compute multi-loop contributions. Our work establishes a concrete link between WQFT and amplitude-based methods, elucidating how unitarity cuts ensure equivalence between the on-shell action derived from either approach. Extending the state-of-the-art, we complete the full on-shell action -- including dissipative terms -- at (formal) 3PM order and up to quartic spin interactions on both massive bodies.
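Schematically, and in our notation rather than necessarily the paper's conventions, the impulse of body $i$ is generated by iterated Poisson brackets of the initial momentum with the on-shell action $S$,

$$\Delta p_i^\mu = \{ p_i^\mu, S \} + \tfrac{1}{2} \{ \{ p_i^\mu, S \}, S \} + \dots$$

with the spin kick obtained analogously from brackets of the initial spin data.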
Digital pathology has seen the advent of a wealth of foundation models (FMs), yet to date their performance on cell phenotyping has not been benchmarked in a unified manner. We therefore propose PhenoBench, a comprehensive benchmark for cell phenotyping on Hematoxylin and Eosin (H&E) stained histopathology images. We provide both PhenoCell, a new H&E dataset featuring 14 granular cell types identified using multiplexed imaging, and ready-to-use fine-tuning and benchmarking code that allows the systematic evaluation of multiple prominent pathology FMs in terms of dense cell phenotype predictions in different generalization scenarios. We perform extensive benchmarking of existing FMs, providing insights into their generalization behavior under technical vs. medical domain shifts. Furthermore, while FMs achieve macro F1 scores > 0.70 on previously established benchmarks such as Lizard and PanNuke, on PhenoCell we observe scores as low as 0.20. This indicates a much more challenging task not captured by previous benchmarks, establishing PhenoCell as a prime asset for future benchmarking of FMs and supervised models alike. Code and data are available on GitHub.
Improvements to the Adaptive Density Control (ADC) mechanism for 3D Gaussian Splatting (3DGS) are presented, featuring a corrected scene extent calculation, an exponentially ascending gradient threshold, and significance-aware pruning. This refined ADC results in enhanced rendering quality and faster training convergence for 3D scene reconstruction and novel view synthesis.
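A toy sketch of one of these ingredients, the exponentially ascending gradient threshold; the constants and function name are illustrative placeholders, not the paper's values.

```python
def densify_threshold(step, total_steps, tau0=2e-4, tau1=2e-3):
    """Exponentially ascending densification threshold (hypothetical schedule):
    interpolate from tau0 to tau1 on a log scale, so densification becomes
    stricter as training progresses."""
    frac = min(step / total_steps, 1.0)
    return tau0 * (tau1 / tau0) ** frac

# A Gaussian is split/cloned only if its accumulated view-space positional
# gradient exceeds the current threshold:
#   if grad_accum[i] > densify_threshold(step, total_steps): densify(i)
```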
Measuring the Higgs trilinear self-coupling $\lambda_{hhh}$ is experimentally demanding but fundamental for understanding the shape of the Higgs potential. We present a comprehensive analysis strategy for the HL-LHC using di-Higgs events in the four $b$-quark channel ($hh \to 4b$), extending current methods in several directions. We perform deep learning to suppress the formidable multijet background, with dedicated optimisation for BSM $\lambda_{hhh}$ scenarios. We compare the $\lambda_{hhh}$ constraining power of events using different multiplicities of large-radius jets with a two-prong structure that reconstruct boosted $h \to bb$ decays. We show that current uncertainties in the SM top Yukawa coupling $y_t$ can modify $\lambda_{hhh}$ constraints by $\sim 20\%$. For SM $y_t$, we find prospects of $-0.8 < \lambda_{hhh} / \lambda_{hhh}^\text{SM} < 6.6$ at 68% CL under simplified assumptions for 3000 fb$^{-1}$ of HL-LHC data. Our results provide a careful assessment of di-Higgs identification and machine learning techniques for all-hadronic measurements of the Higgs self-coupling and sharpen the requirements for future improvement.
Recommender systems have become an integral part of online services, helping users locate specific information in a sea of data. However, existing studies show that some recommender systems are vulnerable to poisoning attacks, particularly those that involve learning schemes. A poisoning attack is one in which an adversary injects carefully crafted data into the process of training a model, with the goal of manipulating the system's final recommendations. With recent advances in artificial intelligence, such attacks have gained renewed importance. While numerous countermeasures to poisoning attacks have been developed, they have not yet been systematically linked to the properties of the attacks. Consequently, assessing the respective risks and potential success of mitigation strategies is difficult, if not impossible. This survey aims to fill this gap by primarily focusing on poisoning attacks and their countermeasures, in contrast to prior surveys that mainly focus on attacks and their detection methods. Through an exhaustive literature review, we provide a novel taxonomy for poisoning attacks, formalise its dimensions, and accordingly organise 30+ attacks described in the literature. Further, we review 40+ countermeasures to detect and/or prevent poisoning attacks, evaluating their effectiveness against specific types of attacks. This comprehensive survey should serve as a point of reference for protecting recommender systems against poisoning attacks. The article concludes with a discussion on open issues in the field and impactful directions for future research. A rich repository of resources associated with poisoning attacks is available at this https URL.
This thesis introduces a novel methodology for the automated generation of knowledge graphs from user stories by leveraging the advanced capabilities of Large Language Models. Building on the LangChain framework, the User Story Graph Transformer module was developed to extract nodes and relationships from user stories with an LLM and construct accurate knowledge graphs. This technique was implemented in a script to fully automate the knowledge graph extraction process. Additionally, the evaluation was automated through a dedicated evaluation script, utilizing an annotated dataset for assessment. By enhancing the visualization and understanding of user requirements and domain concepts, this method fosters better alignment between software functionalities and user expectations, ultimately contributing to more effective and user-centric software development processes.
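A minimal sketch in the spirit of this approach, using LangChain's public experimental graph-transformer API; the thesis' User Story Graph Transformer is a custom module, so the names below follow the public LangChain interface, not the thesis code, and the model choice is an assumption.

```python
from langchain_core.documents import Document
from langchain_experimental.graph_transformers import LLMGraphTransformer
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o", temperature=0)  # example model choice
transformer = LLMGraphTransformer(llm=llm)

# A user story as input text; the LLM extracts graph nodes and relationships.
story = Document(page_content=(
    "As a customer, I want to reset my password so that I can regain "
    "access to my account."
))
graph_docs = transformer.convert_to_graph_documents([story])
for g in graph_docs:
    print(g.nodes)          # e.g. Customer, Password, Account
    print(g.relationships)  # e.g. Customer -WANTS_TO_RESET-> Password
```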
Training and evaluating language models increasingly requires the construction of meta-datasets: diverse collections of curated data with clear provenance. Natural language prompting has recently led to improved zero-shot generalization by transforming existing, supervised datasets into a diversity of novel pretraining tasks, highlighting the benefits of meta-dataset curation. While successful in general-domain text, translating these data-centric approaches to biomedical language modeling remains challenging, as labeled biomedical datasets are significantly underrepresented in popular data hubs. To address this challenge, we introduce BigBIO, a community library of 126+ biomedical NLP datasets, currently covering 12 task categories and 10+ languages. BigBIO facilitates reproducible meta-dataset curation via programmatic access to datasets and their metadata, and is compatible with current platforms for prompt engineering and end-to-end few- and zero-shot language model evaluation. We discuss our process for task schema harmonization, data auditing, and contribution guidelines, and outline two illustrative use cases: zero-shot evaluation of biomedical prompts and large-scale, multi-task learning. BigBIO is an ongoing community effort and is available at this https URL
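Programmatic access goes through the Hugging Face `datasets` library; a brief sketch is below, where the dataset and schema names are illustrative examples rather than a canonical entry point.

```python
from datasets import load_dataset

# BigBIO exposes each dataset in harmonized task schemas (e.g., *_bigbio_te
# for textual entailment, *_bigbio_kb for knowledge-base style tasks).
ds = load_dataset("bigbio/scitail", name="scitail_bigbio_te")
print(ds["train"][0])
```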
The IceCube Neutrino Observatory is a cubic-kilometer-scale high-energy neutrino detector built into the ice at the South Pole. Construction of IceCube, the largest neutrino detector built to date, was completed in 2011 and enabled the discovery of high-energy astrophysical neutrinos. We describe here the design, production, and calibration of the IceCube digital optical module (DOM), the cable systems, computing hardware, and our methodology for drilling and deployment. We also describe the online triggering and data filtering systems that select candidate neutrino and cosmic ray events for analysis. Due to a rigorous pre-deployment protocol, 98.4% of the DOMs in the deep ice are operating and collecting data. IceCube routinely achieves a detector uptime of 99% by emphasizing software stability and monitoring. Detector operations have been stable since construction was completed, and the detector is expected to operate at least until the end of the next decade.
Maximum mean discrepancy (MMD) flows suffer from high computational costs in large-scale computations. In this paper, we show that MMD flows with Riesz kernels $K(x,y) = -\|x-y\|^r$, $r \in (0,2)$, have exceptional properties which allow their efficient computation. We prove that the MMD of Riesz kernels, which is also known as the energy distance, coincides with the MMD of their sliced version. As a consequence, the computation of gradients of MMDs can be performed in the one-dimensional setting. Here, for $r=1$, a simple sorting algorithm can be applied to reduce the complexity from $O(MN+N^2)$ to $O((M+N)\log(M+N))$ for two measures with $M$ and $N$ support points. As another interesting follow-up result, the MMD of compactly supported measures can be estimated from above and below by the Wasserstein-1 distance. For the implementations we approximate the gradient of the sliced MMD by using only a finite number $P$ of slices. We show that the resulting error has complexity $O(\sqrt{d/P})$, where $d$ is the data dimension. These results enable us to train generative models by approximating MMD gradient flows by neural networks even for image applications. We demonstrate the efficiency of our model by image generation on MNIST, FashionMNIST and CIFAR10.
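The $r=1$ sorting trick is easy to state in code. A small NumPy sketch of the $O((M+N)\log(M+N))$ evaluation of the squared energy distance for one-dimensional samples, the per-slice workhorse of the sliced computation (a sketch, not the authors' exact implementation):

```python
import numpy as np

def sum_abs_diffs(z):
    """Sum of |z_i - z_j| over unordered pairs in O(n log n) via sorting."""
    z = np.sort(z)
    n = len(z)
    # After sorting, z_k is the larger element in k pairs and the smaller
    # in (n - 1 - k) pairs, giving coefficient 2k - n + 1.
    return float(np.dot(z, 2.0 * np.arange(n) - n + 1))

def energy_distance_1d(x, y):
    """Squared energy distance 2 E|X-Y| - E|X-X'| - E|Y-Y'| (V-statistic)."""
    m, n = len(x), len(y)
    sx, sy = sum_abs_diffs(x), sum_abs_diffs(y)
    cross = sum_abs_diffs(np.concatenate([x, y])) - sx - sy
    return 2.0 * cross / (m * n) - 2.0 * sx / m**2 - 2.0 * sy / n**2

# For d-dimensional data, the sliced version averages this quantity over P
# random projections x @ theta, with theta uniform on the unit sphere.
```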
The vast majority of materials science knowledge exists in unstructured natural language, yet structured data is crucial for innovative and systematic materials design. Traditionally, the field has relied on manual curation and partial automation for data extraction for specific use cases. The advent of large language models (LLMs) represents a significant shift, potentially enabling efficient extraction of structured, actionable data from unstructured text by non-experts. While applying LLMs to materials science data extraction presents unique challenges, domain knowledge offers opportunities to guide and validate LLM outputs. This review provides a comprehensive overview of LLM-based structured data extraction in materials science, synthesizing current knowledge and outlining future directions. We address the lack of standardized guidelines and present frameworks for leveraging the synergy between LLMs and materials science expertise. This work serves as a foundational resource for researchers aiming to harness LLMs for data-driven materials research. The insights presented here could significantly enhance how researchers across disciplines access and utilize scientific information, potentially accelerating the development of novel materials for critical societal needs.