Researchers at TU Wien developed a method to synthesize person-centric visual images directly from WiFi Channel State Information (CSI) obtained through walls. Their approach, using a multimodal Variational Autoencoder, produced perceptually clearer and more temporally coherent images, particularly with a concatenation and temporal encoding strategy.
Simple, low-cost biosensing solutions are well suited to point-of-care applications and help bridge the gap between scientific concepts and technological production. To compete with the sensitivity and selectivity of gold standards such as liquid chromatography, biosensor functionalization is continuously optimized to enhance the signal and improve performance, often leading to complex chemical assay development. In this work, we optimize the methodology for the electrochemical reduction of graphene oxide to produce thin-film-modified gold electrodes. Under the specific conditions employed, 20 cycles of cyclic voltammetry (CV) are shown to be optimal for the electrical activation of graphene oxide into electrochemically reduced graphene oxide (ERGO). This platform is then used to develop a matrix metalloproteinase-2 (MMP-2) biosensor in which specific anti-MMP-2 aptamers serve as the biorecognition element. MMP-2 is a protein typically overexpressed in tumor tissues, with important roles in tumor invasion, metastasis, and tumor angiogenesis. Based on impedimetric measurements, we were able to detect as little as 3.32 pg/mL of MMP-2 in PBS, with a dynamic range of 10 pg/mL - 10 ng/mL. Besides its high specificity, the ERGO-based aptasensor showed potential for reuse, as the signal was successfully restored after experimental detection of MMP-2.
Recent advances in AI -- including generative approaches -- have resulted in technology that can support humans in scientific discovery and decision-making, but may also disrupt democracies and target individuals. The responsible use of AI and its participation in human-AI teams increasingly call for AI alignment, that is, making AI systems act according to our preferences. A crucial yet often overlooked aspect of these interactions is the different ways in which humans and machines generalise. In cognitive science, human generalisation commonly involves abstraction and concept learning. In contrast, AI generalisation encompasses out-of-domain generalisation in machine learning, rule-based reasoning in symbolic AI, and abstraction in neurosymbolic AI. In this perspective paper, we combine insights from AI and cognitive science to identify key commonalities and differences across three dimensions: notions of, methods for, and evaluation of generalisation. We map the different conceptualisations of generalisation in AI and cognitive science along these three dimensions and consider their role in alignment for human-AI teaming. This results in interdisciplinary challenges across AI and cognitive science that must be tackled to provide a foundation for effective and cognitively supported alignment in human-AI teaming scenarios.
A collaborative white paper coordinated by the Quantum Community Network comprehensively analyzes the current status and future perspectives of Quantum Artificial Intelligence, categorizing its potential into "Quantum for AI" and "AI for Quantum" applications. It proposes a strategic research and development agenda to bolster Europe's competitive position in this rapidly converging technological domain.
This paper from Kerbl et al. introduces a hierarchical 3D Gaussian representation that facilitates real-time rendering of very large-scale environments. The approach handles kilometer-scale scenes with tens of thousands of input images by employing a chunk-based processing pipeline and an efficient level-of-detail system, consistently achieving over 30 FPS.
The REASSEMBLE dataset provides a comprehensive, multimodal resource for contact-rich, long-horizon robotic assembly and disassembly, utilizing the standardized NIST Assembly Task Board #1. It includes a unique combination of RGB, proprioceptive, force-torque, audio, and event camera data, along with detailed hierarchical annotations for multi-task learning and anomaly detection.
An embodied robotic agent is developed that uses Large Language Models and Retrieval-Augmented Generation (RAG) for long-horizon task planning in dynamic household environments. This system allows robots to autonomously manage objects, execute natural language commands, and track object locations through a natural language-driven memory system.
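As a rough illustration of the natural language-driven memory idea, the sketch below stores object-location statements as sentence embeddings and retrieves the most relevant ones for a planner query. The embedding model (`all-MiniLM-L6-v2` via `sentence-transformers`) and the `ObjectMemory` helper are illustrative assumptions, not the paper's implementation.

```python
# Hedged sketch: a natural-language object memory for retrieval-augmented planning.
import numpy as np
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # placeholder embedding model


class ObjectMemory:
    def __init__(self):
        self.entries: list[str] = []
        self.vectors: list[np.ndarray] = []

    def remember(self, statement: str) -> None:
        # e.g. "the red cup is on the kitchen counter"
        self.entries.append(statement)
        self.vectors.append(embedder.encode(statement, normalize_embeddings=True))

    def retrieve(self, query: str, k: int = 3) -> list[str]:
        # Return the k statements most similar to the query; these would be
        # injected into the LLM planner's prompt as retrieved context.
        if not self.entries:
            return []
        q = embedder.encode(query, normalize_embeddings=True)
        scores = np.stack(self.vectors) @ q          # cosine similarity (unit vectors)
        top = np.argsort(scores)[::-1][:k]
        return [self.entries[i] for i in top]
```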
Researchers from Inria, Université Côte d’Azur, and TU Wien introduced a method to reduce the memory footprint of 3D Gaussian Splatting by an average of 27 times while largely preserving visual quality and real-time rendering performance. This approach enables the practical deployment of high-fidelity 3D scenes on memory-constrained devices and for streaming applications.
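One ingredient that compact 3D Gaussian representations commonly rely on is vector-quantizing per-Gaussian attributes (for example, spherical-harmonics color coefficients) into a small shared codebook and storing only indices. The sketch below illustrates that idea with scikit-learn's KMeans; the codebook size and dtypes are assumptions, and the paper's full pipeline (which combines several reduction steps) is not reproduced here.

```python
# Hedged sketch: codebook quantization of per-Gaussian attributes.
import numpy as np
from sklearn.cluster import KMeans


def quantize_attributes(attributes: np.ndarray, codebook_size: int = 256):
    """attributes: (num_gaussians, attr_dim) float array."""
    kmeans = KMeans(n_clusters=codebook_size, n_init=10).fit(attributes)
    codebook = kmeans.cluster_centers_.astype(np.float16)   # small shared table
    index_dtype = np.uint8 if codebook_size <= 256 else np.uint16
    indices = kmeans.labels_.astype(index_dtype)             # one small index per Gaussian
    return codebook, indices                                  # decode with codebook[indices]
```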
Accurate surface geometry representation is crucial in 3D visual computing. Explicit representations, such as polygonal meshes, and implicit representations, like signed distance functions, each have distinct advantages, making efficient conversions between them increasingly important. Conventional surface extraction methods for implicit representations, such as the widely used Marching Cubes algorithm, rely on spatial decomposition and sampling, leading to inaccuracies due to fixed and limited resolution. We introduce a novel approach for analytically extracting surfaces from neural implicit functions. Our method operates natively in parallel and can navigate large neural architectures. By leveraging the fact that each neuron partitions the domain, we develop a depth-first traversal strategy to efficiently track the encoded surface. The resulting meshes faithfully capture the full geometric information from the network without ad-hoc spatial discretization, achieving unprecedented accuracy across diverse shapes and network architectures while maintaining competitive speed.
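To make the "each neuron partitions the domain" observation concrete, the sketch below writes down the affine restriction of a one-hidden-layer ReLU network for a fixed activation pattern: within such a region the zero level set is simply a hyperplane, available in closed form without any grid sampling, and a depth-first traversal can track it from region to region. Shapes and names are illustrative, not the paper's implementation.

```python
# Hedged sketch: closed-form zero set of a ReLU MLP inside one activation region.
import numpy as np


def affine_restriction(W1, b1, w2, b2, pattern):
    """Effective affine map of f(x) = w2 . relu(W1 x + b1) + b2 on the region
    where the hidden units' on/off states equal `pattern` (1 = active)."""
    mask = np.asarray(pattern, dtype=float)
    w_eff = (w2 * mask) @ W1                       # shape (input_dim,)
    b_eff = float((w2 * mask) @ b1 + b2)
    return w_eff, b_eff                            # surface in this region: w_eff . x + b_eff = 0


# The pattern realized at a point x0 is simply (W1 @ x0 + b1) > 0; a depth-first
# traversal over neighboring patterns follows the surface across region boundaries.
```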
This paper introduces "StopThePop," a refined rendering pipeline for 3D Gaussian Splatting that eliminates visual popping artifacts and view inconsistencies caused by approximate sorting. It achieves this with a novel hierarchical rasterization approach that maintains comparable image quality and near real-time performance, running only 4% slower than the original 3DGS on average and up to 1.6x faster with opacity decay.
We study the problem of zero-shot link prediction on knowledge graphs (KGs), which requires models to generalize over novel entities and novel relations. Knowledge graph foundation models (KGFMs) address this task by enforcing equivariance over both nodes and relations, learning from structural properties of nodes and relations that transfer to novel graphs with similar structural properties. However, the conventional notion of deterministic equivariance imposes inherent limits on the expressive power of KGFMs, preventing them from distinguishing structurally similar but semantically distinct relations. To overcome this limitation, we introduce probabilistic node-relation equivariance, which preserves equivariance in distribution while incorporating principled randomization to break symmetries during inference. Building on this principle, we present Flock, a KGFM that iteratively samples random walks, encodes them into sequences via a recording protocol, embeds them with a sequence model, and aggregates representations of nodes and relations via learned pooling. Crucially, Flock respects probabilistic node-relation equivariance and is a universal approximator for isomorphism-invariant link-level functions over KGs. Empirically, Flock perfectly solves our new diagnostic dataset Petals, where current KGFMs fail, and achieves state-of-the-art performance on entity- and relation-prediction tasks on 54 KGs from diverse domains.
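The sketch below illustrates one way the walk-sampling and recording steps could look: walks over (head, relation, tail) triples are recorded with node and relation identifiers replaced by their order of first appearance, so only structural information enters the sequence model. The exact recording protocol here is an assumption for illustration, not Flock's.

```python
# Hedged sketch: sample a random walk over a KG and record it anonymously.
import random
from collections import defaultdict


def sample_recorded_walk(triples, start, length, seed=None):
    rng = random.Random(seed)
    adj = defaultdict(list)
    for h, r, t in triples:
        adj[h].append((r, t))
        adj[t].append((r, h))          # treat edges as traversable in both directions

    node_ids, rel_ids, recording = {}, {}, []
    nid = lambda n: node_ids.setdefault(n, len(node_ids))   # first-occurrence index
    rid = lambda r: rel_ids.setdefault(r, len(rel_ids))

    current = start
    recording.append(("N", nid(current)))
    for _ in range(length):
        if not adj[current]:
            break
        rel, nxt = rng.choice(adj[current])
        recording.extend([("R", rid(rel)), ("N", nid(nxt))])
        current = nxt
    return recording   # anonymized sequence, ready to feed to a sequence model
```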
Bug reports often lack sufficient detail for developers to reproduce and fix the underlying defects. Bug Reproduction Tests (BRTs), tests that fail when the bug is present and pass when it has been resolved, are crucial for debugging, but they are rarely included in bug reports, both in open-source and in industrial settings. Thus, automatically generating BRTs from bug reports has the potential to accelerate the debugging process and reduce time to repair. This paper investigates automated BRT generation within an industry setting, specifically at Google, focusing on the challenges of a large-scale, proprietary codebase and considering real-world industry bugs extracted from Google's internal issue tracker. We adapt and evaluate a state-of-the-art BRT generation technique, LIBRO, and present our agent-based approach, BRT Agent, which makes use of a fine-tuned Large Language Model (LLM) for code editing. Our BRT Agent significantly outperforms LIBRO, achieving a 28% plausible BRT generation rate, compared to 10% by LIBRO, on 80 human-reported bugs from Google's internal issue tracker. We further investigate the practical value of generated BRTs by integrating them with an Automated Program Repair (APR) system at Google. Our results show that providing BRTs to the APR system results in 30% more bugs with plausible fixes. Additionally, we introduce Ensemble Pass Rate (EPR), a metric which leverages the generated BRTs to select the most promising fixes from all fixes generated by the APR system. Our evaluation of EPR for top-K and threshold-based fix selection demonstrates promising results and trade-offs. For example, EPR correctly selects a plausible fix from a pool of 20 candidates in 70% of cases, based on its top-1 ranking.
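A minimal sketch of EPR-style fix selection, under stated assumptions: each candidate fix is scored by the fraction of generated BRTs it makes pass, and the highest-scoring candidate above a threshold is selected. The function names and the test-runner hook are placeholders, not Google's internal tooling.

```python
# Hedged sketch: Ensemble Pass Rate (EPR) scoring and top-1 fix selection.
from typing import Callable, Optional, Sequence


def ensemble_pass_rate(fix: str,
                       brts: Sequence[str],
                       run_test: Callable[[str, str], bool]) -> float:
    """Fraction of generated BRTs that pass when `fix` is applied."""
    if not brts:
        return 0.0
    passed = sum(1 for test in brts if run_test(fix, test))
    return passed / len(brts)


def select_top_fix(candidate_fixes: Sequence[str],
                   brts: Sequence[str],
                   run_test: Callable[[str, str], bool],
                   threshold: float = 0.0) -> Optional[str]:
    """Return the candidate with the highest EPR, if it clears `threshold`."""
    if not candidate_fixes:
        return None
    scored = [(ensemble_pass_rate(f, brts, run_test), f) for f in candidate_fixes]
    best_score, best_fix = max(scored, key=lambda pair: pair[0])
    return best_fix if best_score > threshold else None
```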
FlowMP presents a conditional flow matching framework for robot motion planning, learning a direct mapping from a noise distribution to expert trajectories. This method explicitly models second-order dynamics, generating kinodynamically feasible paths up to 100 times faster than prior approaches while demonstrating superior scalability and smoothness for complex 3D robotic tasks compared to diffusion models.
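For context, a conditional flow matching training step in its simplest, straight-line-interpolant form might look like the sketch below; the network interface, conditioning variable, and trajectory shapes are illustrative assumptions, and FlowMP's explicit second-order (acceleration-level) modeling is not shown.

```python
# Hedged sketch: one conditional flow matching training step with a linear interpolant.
import torch
import torch.nn.functional as F


def flow_matching_loss(velocity_net: torch.nn.Module,
                       expert_traj: torch.Tensor,   # (batch, horizon, state_dim)
                       cond: torch.Tensor) -> torch.Tensor:
    noise = torch.randn_like(expert_traj)                         # x0 ~ N(0, I)
    t = torch.rand(expert_traj.shape[0], 1, 1,
                   device=expert_traj.device)                     # per-sample time in [0, 1)
    x_t = (1.0 - t) * noise + t * expert_traj                     # point on the straight path
    target_velocity = expert_traj - noise                         # d/dt of the interpolant
    pred_velocity = velocity_net(x_t, t.view(-1), cond)           # assumed network signature
    return F.mse_loss(pred_velocity, target_velocity)
```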
Petrović et al. introduce CURLY FLOW MATCHING (CURLY-FM), a simulation-free machine learning framework that learns complex, non-gradient field dynamics by solving a Schrödinger bridge problem with a non-zero drift reference process. The method reconstructs periodic trajectories in diverse scientific domains like ocean currents and cell cycles, outperforming existing flow-matching and simulation-based techniques in efficiency and accuracy.
Transformers excel across a large variety of tasks but remain susceptible to corrupted inputs, since standard self-attention treats all query-key interactions uniformly. Inspired by lateral inhibition in biological neural circuits and building on the Differential Transformer's recent use of subtracting two parallel softmax maps for noise cancellation, we propose Multihead Differential Gated Self-Attention (M-DGSA), which learns per-head, input-dependent gating to dynamically suppress attention noise. Each head splits into excitatory and inhibitory branches whose dual softmax maps are fused by a sigmoid gate predicted from the token embedding, yielding context-aware contrast enhancement. M-DGSA integrates seamlessly into existing Transformer stacks with minimal computational overhead. We evaluate on both vision and language benchmarks, demonstrating consistent robustness gains over vanilla Transformer, Vision Transformer, and Differential Transformer baselines. Our contributions are (i) a novel input-dependent gating mechanism for self-attention grounded in lateral inhibition, (ii) a principled synthesis of biological contrast enhancement and self-attention theory, and (iii) comprehensive experiments demonstrating noise resilience and cross-domain applicability.
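A minimal single-head sketch of the gating idea described above, under assumed shapes: two softmax attention maps (excitatory and inhibitory) are fused by a per-query sigmoid gate predicted from the token embedding. This illustrates the mechanism only and is not the authors' reference implementation.

```python
# Hedged sketch: one differential gated self-attention head.
import torch
import torch.nn as nn
import torch.nn.functional as F


class DiffGatedSelfAttention(nn.Module):
    def __init__(self, d_model: int):
        super().__init__()
        # Excitatory and inhibitory branches use separate query/key projections.
        self.q_exc = nn.Linear(d_model, d_model)
        self.k_exc = nn.Linear(d_model, d_model)
        self.q_inh = nn.Linear(d_model, d_model)
        self.k_inh = nn.Linear(d_model, d_model)
        self.v = nn.Linear(d_model, d_model)
        self.gate = nn.Linear(d_model, 1)      # input-dependent gate per query token
        self.scale = d_model ** -0.5

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model)
        attn_exc = F.softmax(self.q_exc(x) @ self.k_exc(x).transpose(-2, -1) * self.scale, dim=-1)
        attn_inh = F.softmax(self.q_inh(x) @ self.k_inh(x).transpose(-2, -1) * self.scale, dim=-1)
        g = torch.sigmoid(self.gate(x))        # (batch, seq_len, 1), gates the inhibitory map
        attn = attn_exc - g * attn_inh         # lateral-inhibition-style contrast enhancement
        return attn @ self.v(x)
```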
LettuceDetect is a framework designed to identify token-level hallucinations in Retrieval-Augmented Generation (RAG) systems by leveraging a ModernBERT backbone for token classification. The system achieves an F1 score of 79.22% for example-level hallucination detection and 58.93% for span-level detection on the RAGTruth benchmark, processing 30-60 examples per second.
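Framed as token classification, a hedged sketch of span-level hallucination detection with a Hugging Face encoder could look as follows; the checkpoint name, the two-label convention (1 = hallucinated token), and the untrained classification head are assumptions for illustration, not LettuceDetect's released model or API.

```python
# Hedged sketch: token-level hallucination detection as token classification.
import torch
from transformers import AutoTokenizer, AutoModelForTokenClassification

MODEL_NAME = "answerdotai/ModernBERT-base"   # placeholder backbone checkpoint

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForTokenClassification.from_pretrained(MODEL_NAME, num_labels=2)


def hallucinated_tokens(context: str, answer: str) -> list[str]:
    # Encode retrieved context and generated answer as a single sequence pair.
    inputs = tokenizer(context, answer, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits                  # (1, seq_len, 2)
    labels = logits.argmax(dim=-1)[0]
    tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
    # Tokens predicted as label 1 are flagged as unsupported by the context.
    return [tok for tok, lab in zip(tokens, labels) if lab == 1]
```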
This paper presents a method for on-the-fly large-scale novel view synthesis using 3D Gaussian Splatting, enabling real-time generation of 3D scene representations and camera poses as images are captured. It processes a 1.1km path dataset in 25 minutes, providing quality comparable to offline methods, thereby bridging the gap between high-quality offline reconstruction and fast real-time approaches.
This research from TU Wien presents a real-time deferred rendering pipeline that integrates JPEG-compressed textures directly onto modern GPUs by selectively decoding texture blocks on demand. The approach achieves a desktop rendering overhead of less than 0.3 ms and a VR overhead of 0.65 ms on an RTX 4090, demonstrating the feasibility of using variable-rate compression for high-fidelity graphics.
ReFineVLA enhances Vision-Language-Action models by integrating explicit multimodal reasoning through a teacher-guided fine-tuning framework. This approach improves robotic manipulation performance, particularly in complex tasks, and offers increased interpretability by enabling robots to provide structured rationales for their actions.