L3S Research Center, Leibniz Universität Hannover
30 Jan 2024
We consider goal-oriented adaptive space-time finite-element discretizations of the parabolic heat equation on completely unstructured simplicial space-time meshes. In some applications, we are interested in an accurate computation of some possibly nonlinear functionals of the solution, so-called goal functionals. This motivates the use of adaptive mesh refinements driven by the dual-weighted residual (DWR) method. The DWR method requires the numerical solution of a linear adjoint problem that provides the sensitivities for the mesh refinement. This can be done by means of the same full space-time finite element discretization as used for the primal linear problem. The numerical experiment presented demonstrates that this goal-oriented, full space-time finite element solver efficiently provides accurate numerical results for a model problem with moving domains and a linear goal functional, where we know the exact value.
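The refinement indicators behind the DWR method come from the classical error representation, stated here in its generic form (not in the paper's specific space-time notation): for the primal solution $u$, its Galerkin approximation $u_h$, and the adjoint solution $z$,

```latex
J(u) - J(u_h) \;=\; \rho(u_h)(z - \varphi_h) \;+\; \mathcal{R},
\qquad \varphi_h \in V_h \ \text{arbitrary},
```

where $\rho(u_h)(\cdot)$ is the primal residual, Galerkin orthogonality permits subtracting any discrete function $\varphi_h$, and the remainder $\mathcal{R}$ vanishes for a linear problem with a linear goal functional, as in the model problem considered. Replacing $z$ by a computed discrete adjoint approximation and localizing the weighted residual over space-time elements yields element-wise refinement indicators.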
We present OpenReviewer, an open-source system for generating high-quality peer reviews of machine learning and AI conference papers. At its core is Llama-OpenReviewer-8B, an 8B parameter language model specifically fine-tuned on 79,000 expert reviews from top conferences. Given a PDF paper submission and review template as input, OpenReviewer extracts the full text, including technical content like equations and tables, and generates a structured review following conference-specific guidelines. Our evaluation on 400 test papers shows that OpenReviewer produces considerably more critical and realistic reviews compared to general-purpose LLMs like GPT-4 and Claude-3.5. While other LLMs tend toward overly positive assessments, OpenReviewer's recommendations closely match the distribution of human reviewer ratings. The system provides authors with rapid, constructive feedback to improve their manuscripts before submission, though it is not intended to replace human peer review. OpenReviewer is available as an online demo and open-source tool.
With increasing digitalization, Artificial Intelligence (AI) is becoming ubiquitous. AI-based systems to identify, optimize, automate, and scale solutions to complex economic and societal problems are being proposed and implemented. This has motivated regulation efforts, including the Proposal of an EU AI Act. This interdisciplinary position paper considers various concerns surrounding fairness and discrimination in AI, and discusses how AI regulations address them, focusing on (but not limited to) the Proposal. We first look at AI and fairness through the lenses of law, (AI) industry, sociotechnology, and (moral) philosophy, and present various perspectives. Then, we map these perspectives along three axes of interests: (i) Standardization vs. Localization, (ii) Utilitarianism vs. Egalitarianism, and (iii) Consequential vs. Deontological ethics, which leads us to identify a pattern of common arguments and tensions between these axes. Positioning the discussion within the axes of interest and with a focus on reconciling the key tensions, we identify and propose the roles AI Regulation should take to make the endeavor of the AI Act a success in terms of AI fairness concerns.
Researchers from L3S Research Center, TU Delft, and the University of Glasgow present a comprehensive survey classifying adaptive retrieval and ranking mechanisms that leverage test-time corpus feedback in Retrieval-Augmented Generation (RAG) systems. The work offers a structured overview of techniques that dynamically refine information retrieval during inference, moving beyond static retrieval to enhance RAG performance for complex tasks.
Privacy and interpretability are two important ingredients for achieving trustworthy machine learning. We study the interplay of these two aspects in graph machine learning through graph reconstruction attacks. The goal of the adversary here is to reconstruct the graph structure of the training data given access to model explanations. Based on the different kinds of auxiliary information available to the adversary, we propose several graph reconstruction attacks. We show that additional knowledge of post-hoc feature explanations substantially increases the success rate of these attacks. Further, we investigate in detail the differences between attack performance with respect to three different classes of explanation methods for graph neural networks: gradient-based, perturbation-based, and surrogate model-based methods. While gradient-based explanations reveal the most in terms of the graph structure, we find that these explanations do not always score high in utility. For the other two classes of explanations, privacy leakage increases with an increase in explanation utility. Finally, we propose a defense based on a randomized response mechanism for releasing the explanations, which substantially reduces the attack success rate. Our code is available at this https URL
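The randomized-response defense mentioned above can be sketched as follows. This is a minimal sketch for releasing a binary explanation mask under the standard bit-flipping mechanism; the paper's exact mechanism and parameterization may differ.

```python
import math
import random

def randomized_response(bits, epsilon, rng=random.Random(0)):
    """Release a binary vector under epsilon-local differential privacy:
    each bit is kept with probability e^eps / (1 + e^eps), flipped otherwise.
    Larger epsilon => less noise => more utility but weaker privacy."""
    p_keep = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    return [b if rng.random() < p_keep else 1 - b for b in bits]

# Hypothetical binary mask marking which edges an explanation highlights.
mask = [1, 0, 0, 1, 1, 0, 1, 0]
private_mask = randomized_response(mask, epsilon=1.0)
```

Tuning `epsilon` trades off exactly the utility-versus-leakage behavior the abstract describes: a small `epsilon` flips many bits and degrades the attacker's reconstruction, at the cost of explanation fidelity.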
Test-time scaling (TTS) has emerged as a new frontier for scaling the performance of Large Language Models (LLMs): by using more computational resources during inference, LLMs can improve their reasoning process and task performance. Several approaches have emerged for TTS, such as distilling reasoning traces from another model or exploring the vast decoding search space by employing a verifier. Verifiers serve as reward models that score the candidate outputs from the decoding process, diligently exploring the vast solution space to select the best outcome. This paradigm has emerged as a superior approach owing to parameter-free scaling at inference time and high performance gains. Verifiers can be prompt-based, or fine-tuned as discriminative or generative models, to verify process paths, outcomes, or both. Despite their widespread adoption, there is no detailed collection, clear categorization, and discussion of diverse verification approaches and their training mechanisms. In this survey, we cover the diverse approaches in the literature and present a unified view of verifier training, types, and their utility in test-time scaling. Our repository can be found at this https URL.
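The verifier-guided selection described above reduces, in its simplest outcome-verification form, to best-of-N sampling. A minimal sketch, with a toy scoring function standing in for a learned reward model:

```python
def best_of_n(candidates, verifier):
    """Best-of-N selection: score each sampled candidate with a verifier
    (a reward model in practice) and return the highest-scoring one."""
    return max(candidates, key=verifier)

# Toy stand-in verifier: prefers answers that end with the expected result.
def toy_verifier(answer: str) -> float:
    return float(answer.strip().endswith("42"))

samples = ["The answer is 41", "Therefore, the answer is 42", "I am unsure"]
best = best_of_n(samples, toy_verifier)
# best == "Therefore, the answer is 42"
```

Process-level verifiers extend this idea by scoring intermediate reasoning steps rather than only final outcomes, which the survey categorizes alongside outcome verification.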
Quantum networks are emerging as powerful platforms for sensing, communication, and fundamental tests of physics. We propose a programmable quantum sensing network based on entangled atomic ensembles, where optical clock qubits emulate mass superpositions in atom and atom-clock interferometry. Our approach uniquely combines scalability to large atom numbers with minimal control requirements, relying only on collective addressing of internal atomic states. This enables the creation of both non-local and local superpositions with spatial separations beyond those achievable in conventional interferometry. Starting from Bell-type seed states distributed via photonic channels, collective operations within atomic ensembles coherently build many-body mass superpositions sensitive to gravitational redshift. The resulting architecture realizes a non-local Ramsey interferometer, with gravitationally induced phase shifts observable in network-based interference patterns. Beyond extending the spatial reach of mass superpositions, our scheme establishes a scalable, programmable platform to probe the interface of quantum mechanics and gravity, and offers a new experimental pathway to test atom and atom-clock interferometer proposals in a network-based quantum laboratory.
To achieve peak predictive performance, hyperparameter optimization (HPO) is a crucial component of machine learning and its applications. In recent years, the number of efficient algorithms and tools for HPO has grown substantially. At the same time, the community still lacks realistic, diverse, computationally cheap, and standardized benchmarks. This is especially the case for multi-fidelity HPO methods. To close this gap, we propose HPOBench, which includes 7 existing and 5 new benchmark families, with a total of more than 100 multi-fidelity benchmark problems. HPOBench allows running this extendable set of multi-fidelity HPO benchmarks in a reproducible way by isolating and packaging the individual benchmarks in containers. It also provides surrogate and tabular benchmarks for computationally affordable yet statistically sound evaluations. To demonstrate HPOBench's broad compatibility with various optimization tools, as well as its usefulness, we conduct an exemplary large-scale study evaluating 13 optimizers from 6 optimization tools. We provide HPOBench here: https://github.com/automl/HPOBench.
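The multi-fidelity setting these benchmarks target can be illustrated with successive halving, a canonical multi-fidelity HPO method. This is a generic sketch with a toy objective, not HPOBench's actual API:

```python
import random

def successive_halving(configs, evaluate, min_budget=1, eta=2, rounds=3):
    """Successive halving: evaluate all configs at a low fidelity (budget),
    keep the best 1/eta fraction, then re-evaluate the survivors at
    eta-times the budget, repeating for the given number of rounds."""
    budget = min_budget
    survivors = list(configs)
    for _ in range(rounds):
        scores = {c: evaluate(c, budget) for c in survivors}
        survivors.sort(key=lambda c: scores[c])  # lower loss is better
        survivors = survivors[:max(1, len(survivors) // eta)]
        budget *= eta
    return survivors[0]

# Toy multi-fidelity objective: loss approaches |x - 0.7| as budget grows.
def toy_loss(x, budget):
    return abs(x - 0.7) + 1.0 / budget

rng = random.Random(0)
candidates = [rng.random() for _ in range(16)]
best = successive_halving(candidates, toy_loss)
```

Benchmarks like those in HPOBench expose exactly this kind of `(config, fidelity) -> loss` interface, which is what makes cheap, standardized comparisons of multi-fidelity optimizers possible.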
Building on the work of Eliashberg and Thurston, we associate to a taut foliation on a closed oriented 3-manifold $M$ a Liouville structure on the thickening $[-1,1] \times M$, under suitable hypotheses. Our main result shows that this Liouville structure is a topological invariant of the foliation: two such foliations which are topologically conjugate induce exact symplectomorphic Liouville structures. Specializing to the case of weak foliations of Anosov flows, we obtain that under natural orientability conditions, the Liouville structures originally introduced by Mitsumatsu are invariant under orbit equivalence. Our methods also imply that two orbit equivalent Anosov flows are deformation equivalent through projectively Anosov flows. The proofs combine two main technical ingredients: (1) a careful smoothing scheme for topological conjugacies between $C^1$-foliations, and (2) a refinement of a deep result of Vogel on the uniqueness of contact structures approximating a foliation. In an appendix, this smoothing scheme is used to construct new examples of collapsed Anosov flows, providing a key step to complete the classification of transitive partially hyperbolic diffeomorphisms in dimension three.
This work introduces EMONET-VOICE, a novel suite of open-access synthetic speech datasets and models for fine-grained Speech Emotion Recognition. It presents EMONET-VOICEBIG, a large-scale corpus for pre-training; EMONET-VOICEBENCH, an expert-verified benchmark with 40 emotion categories; and the EMPATHICINSIGHT-VOICE models, which demonstrate state-of-the-art alignment with human expert judgments.
Effective human-AI interaction relies on AI's ability to accurately perceive and interpret human emotions. Current benchmarks for vision and vision-language models are severely limited, offering a narrow emotional spectrum that overlooks nuanced states (e.g., bitterness, intoxication) and fails to distinguish subtle differences between related feelings (e.g., shame vs. embarrassment). Existing datasets also often use uncontrolled imagery with occluded faces and lack demographic diversity, risking significant bias. To address these critical gaps, we introduce EmoNet Face, a comprehensive benchmark suite. EmoNet Face features: (1) A novel 40-category emotion taxonomy, meticulously derived from foundational research to capture finer details of human emotional experiences. (2) Three large-scale, AI-generated datasets (EmoNet HQ, Binary, and Big) with explicit, full-face expressions and controlled demographic balance across ethnicity, age, and gender. (3) Rigorous, multi-expert annotations for training and high-fidelity evaluation. (4) We built EmpathicInsight-Face, a model achieving human-expert-level performance on our benchmark. The publicly released EmoNet Face suite - taxonomy, datasets, and model - provides a robust foundation for developing and evaluating AI systems with a deeper understanding of human emotions.
Researchers from L3S, University of Glasgow, and TU Delft present SlideGar, an adaptive retrieval algorithm enabling listwise Large Language Model (LLM) rerankers to dynamically expand the document pool beyond initial retrieval. It consistently improves nDCG@10 and Recall, particularly with sparse initial retrievers, by leveraging pre-built corpus graphs with minimal computational overhead.
This is an extended and corrected version of lecture notes originally written for a one-semester course at Leibniz University Hannover. The main aim of the notes is to give an introduction to the mathematical methods used in describing discrete quantum systems consisting of infinitely many sites. Such systems can be used, for example, to model the materials in condensed matter physics. The notes provide the necessary background material to access recent literature in the field. Some of these recent results are also discussed. The contents are roughly as follows: (1) quick recap of essentials from functional analysis, (2) introduction to operator algebra, (3) algebraic quantum mechanics, (4) infinite systems (quasilocal algebra), (5) KMS and ground states, (6) Lieb-Robinson bounds, (7) algebraic quantum field theory, (8) superselection sectors of the toric code, (9) Haag-Ruelle scattering theory in spin systems, (10) applications to gapped phases. The level is aimed at students who have at least had some exposure to (functional) analysis and have a certain mathematical "maturity".
Recent advances in generative AI, particularly in computer vision (CV), offer new opportunities to optimize workflows across industries, including logistics and manufacturing. However, many AI applications are limited by a lack of expertise and resources, which forces a reliance on general-purpose models. Success with these models often requires domain-specific data for fine-tuning, which can be costly and inefficient. Thus, using synthetic data for fine-tuning is a popular, cost-effective alternative to gathering real-world data. This work investigates the impact of synthetic data on the performance of object detection models, compared to models trained on real-world data only, specifically within the domain of warehouse logistics. To this end, we examined the impact of synthetic data generated using the NVIDIA Omniverse Replicator tool on the effectiveness of object detection models in real-world scenarios. The study comprises experiments focused on pallet detection in a warehouse setting, utilizing both real and various synthetic dataset generation strategies. Our findings provide valuable insights into the practical applications of synthetic image data in computer vision, suggesting that a balanced integration of synthetic and real data can lead to robust and efficient object detection models.
Retrieval-Augmented Generation (RAG) enriches Large Language Models (LLMs) by combining their internal, parametric knowledge with external, non-parametric sources, with the goal of improving factual correctness and minimizing hallucinations. The LiveRAG 2025 challenge explores RAG solutions to maximize accuracy on DataMorgana's QA pairs, which are composed of single-hop and multi-hop questions. The challenge provides access to sparse OpenSearch and dense Pinecone indices of the Fineweb 10BT dataset. It restricts model use to LLMs with up to 10B parameters and final answer generation with Falcon-3-10B. A judge-LLM assesses the submitted answers along with human evaluators. By exploring distinct retriever combinations and RAG solutions under the challenge conditions, our final solution emerged using InstructRAG in combination with a Pinecone retriever and a BGE reranker. Our solution achieved a correctness score of 1.13 and a faithfulness score of 0.55 in the non-human evaluation, placing it overall in third place in the SIGIR 2025 LiveRAG Challenge.
Despite the widespread adoption of Large Language Models (LLMs), their remarkable capabilities remain limited to a few high-resource languages. Additionally, many low-resource languages (e.g., African languages) are often evaluated only on basic text classification tasks due to the lack of appropriate or comprehensive benchmarks outside of high-resource languages. In this paper, we introduce IrokoBench -- a human-translated benchmark dataset for 17 typologically diverse low-resource African languages covering three tasks: natural language inference (AfriXNLI), mathematical reasoning (AfriMGSM), and multi-choice knowledge-based question answering (AfriMMLU). We use IrokoBench to evaluate zero-shot, few-shot, and translate-test settings (where test sets are translated into English) across 10 open and 6 proprietary LLMs. Our evaluation reveals a significant performance gap between high-resource languages (such as English and French) and low-resource African languages. We also observe a significant performance gap between open and proprietary models, with the best-performing open model, Gemma 2 27B, reaching only 63% of the performance of the best-performing proprietary model, GPT-4o. In addition, machine translating the test set to English before evaluation helped to close the gap for larger models that are English-centric, such as Gemma 2 27B and LLaMa 3.1 70B. These findings suggest that more efforts are needed to develop and adapt LLMs for African languages.
Graph Neural Networks (GNNs) have shown state-of-the-art improvements in node classification tasks on graphs. While these improvements have been largely demonstrated in a multi-class classification scenario, a more general and realistic scenario in which each node could have multiple labels has so far received little attention. The first challenge in conducting focused studies on multi-label node classification is the limited number of publicly available multi-label graph datasets. Therefore, as our first contribution, we collect and release three real-world biological datasets and develop a multi-label graph generator to generate datasets with tunable properties. While high label similarity (high homophily) is usually attributed to the success of GNNs, we argue that a multi-label scenario does not follow the usual semantics of homophily and heterophily so far defined for a multi-class scenario. As our second contribution, we define homophily and Cross-Class Neighborhood Similarity for the multi-label scenario and provide a thorough analysis of the collected 9 multi-label datasets. Finally, we perform a large-scale comparative study with 8 methods and 9 datasets and analyse the performance of the methods to assess the progress made by the current state of the art in the multi-label node classification scenario. We release our benchmark at this https URL
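One natural way to extend edge homophily to the multi-label setting is to average a set-similarity measure over the label sets of edge endpoints. This is a minimal sketch assuming Jaccard similarity; the paper's exact definition may differ:

```python
def multilabel_homophily(edges, labels):
    """Edge homophily for multi-label graphs: average Jaccard similarity
    between the label sets of each edge's two endpoints.
    `labels` maps node id -> set of labels; `edges` is a list of pairs."""
    def jaccard(a, b):
        return len(a & b) / len(a | b) if a | b else 1.0
    return sum(jaccard(labels[u], labels[v]) for u, v in edges) / len(edges)

# Tiny example graph: node 0 and 1 share label "A"; node 2 is disjoint.
labels = {0: {"A", "B"}, 1: {"A"}, 2: {"C"}}
edges = [(0, 1), (1, 2)]
h = multilabel_homophily(edges, labels)  # (1/2 + 0) / 2 = 0.25
```

Unlike the multi-class case, this measure is graded rather than binary per edge, which is precisely why multi-class homophily semantics do not carry over directly.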
(Author affiliations: University of Washington, CNRS, University of Toronto, California Institute of Technology, and numerous further institutions of the gravitational-wave detector collaborations.)
The ever-increasing number of detections of gravitational waves (GWs) from compact binaries by the Advanced LIGO and Advanced Virgo detectors allows us to perform ever-more sensitive tests of general relativity (GR) in the dynamical and strong-field regime of gravity. We perform a suite of tests of GR using the compact binary signals observed during the second half of the third observing run of those detectors. We restrict our analysis to the 15 confident signals that have false alarm rates $\leq 10^{-3}\,\mathrm{yr}^{-1}$. In addition to signals consistent with binary black hole (BH) mergers, the new events include GW200115_042309, a signal consistent with a neutron star--BH merger. We find the residual power, after subtracting the best fit waveform from the data for each event, to be consistent with the detector noise. Additionally, we find all the post-Newtonian deformation coefficients to be consistent with the predictions from GR, with an improvement by a factor of ~2 in the -1PN parameter. We also find that the spin-induced quadrupole moments of the binary BH constituents are consistent with those of Kerr BHs in GR. We find no evidence for dispersion of GWs, non-GR modes of polarization, or post-merger echoes in the events that were analyzed. We update the bound on the mass of the graviton, at 90% credibility, to $m_g \leq 2.42 \times 10^{-23}\,\mathrm{eV}/c^2$. The final mass and final spin as inferred from the pre-merger and post-merger parts of the waveform are consistent with each other. The studies of the properties of the remnant BHs, including deviations of the quasi-normal mode frequencies and damping times, show consistency with the predictions of GR. In addition to considering signals individually, we also combine results from the catalog of GW signals to calculate more precise population constraints. We find no evidence in support of physics beyond GR.
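The quoted graviton-mass bound is equivalent to a lower bound on the graviton Compton wavelength via the standard relation $\lambda_g = h/(m_g c)$ (a textbook conversion, not taken from the abstract itself):

```python
# Convert the bound m_g <= 2.42e-23 eV/c^2 into a Compton-wavelength
# lower bound: lambda_g = h / (m_g c) = h*c / (m_g c^2).
h = 4.135667696e-15   # Planck constant, eV*s
c = 2.99792458e8      # speed of light, m/s
m_g_c2 = 2.42e-23     # graviton mass-energy bound, eV

lambda_g = h * c / m_g_c2   # metres; roughly 5.1e16 m, i.e. light-year scale
```

A heavier graviton would disperse the GW signal during propagation, so the non-detection of dispersion is what tightens this bound.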
Translating a general quantum circuit onto a specific hardware topology with a reduced set of available gates, also known as transpilation, comes with a substantial increase in the length of the equivalent circuit. Due to decoherence, the quality of the computational outcome can degrade seriously with increasing circuit length. Thus, there is major interest in reducing a quantum circuit to an equivalent circuit with as small a gate count as possible. One method to address efficient transpilation is based on approaches known from stochastic optimization, e.g., random sampling and token replacement strategies. A core challenge is that these methods can suffer from poor sampling efficiency, causing long and energy-consuming optimization times. As a remedy, we propose 2D neural guided sampling: given a 2D representation of a quantum circuit, a neural network predicts groups of gates in the circuit that are likely reducible. This yields a sampling prior that can heavily reduce the compute time for quantum circuit reduction. In several experiments, we demonstrate that our method is superior to results obtained from different Qiskit or BQSKit optimization levels.
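The neural sampling prior can be sketched as weighted sampling over candidate gate windows, with a toy per-gate reducibility score standing in for the trained network's predictions (names and scores here are illustrative, not from the paper):

```python
import random

def guided_window_sampler(num_gates, scores, window=3, rng=random.Random(0)):
    """Sample a start index for a gate window to attempt reduction on,
    weighting each window by its summed predicted reducibility score
    (the scores stand in for the neural network's output)."""
    starts = list(range(num_gates - window + 1))
    weights = [sum(scores[s:s + window]) for s in starts]
    return rng.choices(starts, weights=weights, k=1)[0]

# Toy reducibility scores per gate: gates 4-6 look most reducible.
scores = [0.1, 0.1, 0.1, 0.2, 0.9, 0.9, 0.9, 0.1]
start = guided_window_sampler(len(scores), scores)
```

Compared to uniform random sampling, this prior concentrates reduction attempts on circuit regions the network flags as promising, which is the source of the claimed speed-up.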
10 Oct 2025
Attosecond light sources based on high-order harmonic generation (HHG) constitute to date the only table-top solution for producing coherent broadband radiation covering the spectral range from the extreme ultraviolet to the soft X-rays. The so-called emission cutoff can be extended towards higher photon energies by increasing the driving wavelength at the expense of conversion efficiency. An alternative route is to overdrive the process by using higher laser intensities, with the challenging requirement of interacting with higher plasma densities over short propagation distances. Here, we address this challenge by using a differentially pumped glass chip designed for optimal gas confinement over sub-mm lengths. By driving HHG with multicycle pulses at either 800 nm or 1500 nm, we demonstrate a cutoff extension by a factor of two compared to conventional phase matching approaches and surpassing the present record using multicycle fields. Our three-dimensional propagation simulations, in excellent agreement with the experiment, confirm that gas confinement is crucial since efficient phase matching of cutoff harmonics occurs only for short propagation lengths. Additionally, we show that the high photon energy component is not only temporally confined to the leading edge of the driving pulse, but also spatially confined in the near-field to an off-axis contribution due to reshaping of the driving field along propagation inside the medium. Our findings contribute to the fundamental understanding of HHG across different regimes.
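The intensity and wavelength scalings discussed above follow the semiclassical cutoff law $E_{\mathrm{cutoff}} \approx I_p + 3.17\,U_p$ with ponderomotive energy $U_p \propto I\lambda^2$. A quick numerical sketch of this standard relation (example values are illustrative, not the paper's experimental parameters):

```python
def ponderomotive_energy_eV(intensity_W_cm2, wavelength_um):
    """Ponderomotive energy U_p ~ 9.33e-14 * I[W/cm^2] * lambda[um]^2, in eV."""
    return 9.33e-14 * intensity_W_cm2 * wavelength_um ** 2

def hhg_cutoff_eV(ip_eV, intensity_W_cm2, wavelength_um):
    """Semiclassical HHG cutoff law: E_cutoff ~ I_p + 3.17 * U_p."""
    return ip_eV + 3.17 * ponderomotive_energy_eV(intensity_W_cm2, wavelength_um)

# Example: argon (I_p ~ 15.76 eV). Going from 800 nm to 1500 nm at the
# same intensity raises U_p, and hence the cutoff, by (1.5/0.8)^2 ~ 3.5x.
cutoff_800 = hhg_cutoff_eV(15.76, 2e14, 0.8)
cutoff_1500 = hhg_cutoff_eV(15.76, 2e14, 1.5)
```

The law makes the trade-off in the abstract explicit: longer wavelengths extend the cutoff quadratically but at reduced conversion efficiency, motivating the alternative of overdriving the process at higher intensity.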