alphaXiv

History

Papers Benchmarks

Lancaster University

1,538

19 May 2024

other-condensed-matter physics instrumentation-and-detectors

QUEST-DMC: Background Modelling and Resulting Heat Deposit for a Superfluid Helium-3 Bolometer

University of Oxford Lancaster University University of Sussex Royal Holloway, University of London

We report the results of radioactivity assays and heat leak calculations for a range of common cryogenic materials, considered for use in the QUEST-DMC superfluid 3He dark matter detector. The bolometer, instrumented with nanomechanical resonators, will be sensitive to energy deposits from dark matter interactions. Events from radioactive decays and cosmic rays constitute a significant background and must be precisely modelled, using a combination of material screening and Monte Carlo simulations. However, the results presented here are of wider interest for experiments and quantum devices sensitive to minute heat leaks and spurious events, thus we present heat leak per unit mass or surface area for every material studied. This can inform material choices for other experiments, especially if underground operation is considered where the radiogenic backgrounds will dominate even at shallow depths.

419

20 Sep 2025

computer-science computation-and-language fine-tuning

Reinforcement Learning Meets Large Language Models: A Survey of Advancements and Applications Across the LLM Lifecycle

Tongji University

Fudan University

The Chinese University of Hong Kong Lancaster University The University of Toronto ByteDance SAIL Team

This survey offers a comprehensive review of how Reinforcement Learning (RL) is applied across the entire lifecycle of Large Language Models, from pre-training to alignment and reinforced reasoning. It particularly emphasizes the role of Reinforcement Learning with Verifiable Rewards (RLVR) in advancing LLM reasoning capabilities and compiles key datasets, benchmarks, and open-source tools for the field.

833

25 Aug 2025

agentic-frameworks agents ai-for-health

LLM-based Agentic Reasoning Frameworks: A Survey from Methods to Scenarios

University of Electronic Science and Technology of China

Beijing Jiaotong University Lancaster University Max-Planck Institute for Informatics

Researchers from multiple international institutions propose a unified methodological taxonomy and formal language for LLM-based agentic reasoning frameworks, systematically surveying their progress, application scenarios, and evaluation strategies across diverse domains like scientific research and healthcare. This work provides a structured view of single-agent, tool-based, and multi-agent approaches, clarifying the rapidly evolving landscape.

1,318

02 Dec 2015

computer-science computer-vision-security computer-vision-and-pattern-recognition

An Introduction to Convolutional Neural Networks

Lancaster University Aberystwyth University

This paper provides a concise yet comprehensive overview of Convolutional Neural Networks (CNNs), detailing their architecture, underlying principles, and practical design guidelines for image recognition tasks.

333

31 May 2024

computer-science computer-vision-security computer-vision-and-pattern-recognition

Deep Learning-Based Object Pose Estimation: A Comprehensive Survey

Tsinghua University Central South University Lancaster University Hunan University University of Trento The University of Western Australia

A comprehensive survey details the evolution of deep learning-based object pose estimation, covering instance-level, category-level, and unseen object methods. It analyzes current advancements, methodologies, and persistent challenges, while outlining future research directions to enhance generalization and robustness.

210

425

01 Apr 2025

computer-science computer-vision-and-pattern-recognition ensemble-methods

POPEN: Preference-Based Optimization and Ensemble for LVLM-Based Reasoning Segmentation

Zhejiang University

Peking University

Nanyang Technological University Lancaster University Singapore University of Technology and Design

Tencent

POPEN enhances LVLM-based reasoning segmentation by integrating human preferences into both training and inference stages, leading to reduced textual hallucinations and more precise segmentation masks. This framework achieves state-of-the-art performance across multiple datasets by optimizing for text fidelity and segmentation accuracy.

117

24 Oct 2024

computer-science computation-and-language multi-modal-learning

A Survey of Multimodal Sarcasm Detection

Michigan State University Lancaster University University of Surrey George Mason University

Sarcasm is a rhetorical device that is used to convey the opposite of the literal meaning of an utterance. Sarcasm is widely used on social media and other forms of computer-mediated communication motivating the use of computational models to identify it automatically. While the clear majority of approaches to sarcasm detection have been carried out on text only, sarcasm detection often requires additional information present in tonality, facial expression, and contextual images. This has led to the introduction of multimodal models, opening the possibility to detect sarcasm in multiple modalities such as audio, images, text, and video. In this paper, we present the first comprehensive survey on multimodal sarcasm detection - henceforth MSD - to date. We survey papers published between 2018 and 2023 on the topic, and discuss the models and datasets used for this task. We also present future research directions in MSD.

15 Jan 2025

computer-science computer-vision-and-pattern-recognition multi-modal-learning

Sports-QA: A Large-Scale Video Question Answering Benchmark for Complex and Professional Sports

Monash University

Sun Yat-Sen University University of Melbourne Lancaster University Singapore University of Technology and Design

University of Central Florida Max-Planck Institute for Informatics

Reasoning over sports videos for question answering is an important task with numerous applications, such as player training and information retrieval. However, this task has not been explored due to the lack of relevant datasets and the challenging nature it presents. Most datasets for video question answering (VideoQA) focus mainly on general and coarse-grained understanding of daily-life videos, which is not applicable to sports scenarios requiring professional action understanding and fine-grained motion analysis. In this paper, we introduce the first dataset, named Sports-QA, specifically designed for the sports VideoQA task. The Sports-QA dataset includes various types of questions, such as descriptions, chronologies, causalities, and counterfactual conditions, covering multiple sports. Furthermore, to address the characteristics of the sports VideoQA task, we propose a new Auto-Focus Transformer (AFT) capable of automatically focusing on particular scales of temporal information for question answering. We conduct extensive experiments on Sports-QA, including baseline studies and the evaluation of different methods. The results demonstrate that our AFT achieves state-of-the-art performance.

257

06 Aug 2024

active-learning bayesian-deep-learning computer-science

Position: Bayesian Deep Learning is Needed in the Age of Large-Scale AI

A consortium of 29 researchers from prominent academic and industry institutions argues that Bayesian Deep Learning is crucial for the future of large-scale AI. They contend that BDL inherently quantifies uncertainty, improves data efficiency and adaptability, and enhances model robustness, addressing key limitations of current deep learning models and facilitating more trustworthy AI systems.

167

26 Jun 2022

computer-science computer-vision-and-pattern-recognition efficient-transformers

UNetFormer: A UNet-like Transformer for Efficient Semantic Segmentation of Remote Sensing Urban Scene Imagery

Wuhan University Lancaster University University of Twente UK Centre for Ecology & Hydrology

Semantic segmentation of remotely sensed urban scene images is required in a wide range of practical applications, such as land cover mapping, urban change detection, environmental protection, and economic this http URL by rapid developments in deep learning technologies, the convolutional neural network (CNN) has dominated semantic segmentation for many years. CNN adopts hierarchical feature representation, demonstrating strong capabilities for local information extraction. However, the local property of the convolution layer limits the network from capturing the global context. Recently, as a hot topic in the domain of computer vision, Transformer has demonstrated its great potential in global information modelling, boosting many vision-related tasks such as image classification, object detection, and particularly semantic segmentation. In this paper, we propose a Transformer-based decoder and construct a UNet-like Transformer (UNetFormer) for real-time urban scene segmentation. For efficient segmentation, the UNetFormer selects the lightweight ResNet18 as the encoder and develops an efficient global-local attention mechanism to model both global and local information in the decoder. Extensive experiments reveal that our method not only runs faster but also produces higher accuracy compared with state-of-the-art lightweight models. Specifically, the proposed UNetFormer achieved 67.8% and 52.4% mIoU on the UAVid and LoveDA datasets, respectively, while the inference speed can achieve up to 322.4 FPS with a 512x512 input on a single NVIDIA GTX 3090 GPU. In further exploration, the proposed Transformer-based decoder combined with a Swin Transformer encoder also achieves the state-of-the-art result (91.3% F1 and 84.1% mIoU) on the Vaihingen dataset. The source code will be freely available at this https URL.

826

05 Jul 2025

ai-for-health bayesian-optimization computer-science

Return of the Latent Space COWBOYS: Re-thinking the use of VAEs for Bayesian Optimisation of Structured Spaces

University of Cambridge Lancaster University AstraZeneca

COWBOYS (Categorical Optimisation With Belief Of underlYing Structure), developed by researchers including those from AstraZeneca, introduces a Bayesian Optimization framework that re-thinks the use of Variational AutoEncoders (VAEs) by decoupling them from Gaussian Process (GP) surrogate modeling, allowing GPs to operate directly in the original structured data space. This approach achieves marked improvements in sample efficiency, particularly in low-data regimes, across various molecular design benchmarks compared to existing Latent Space Bayesian Optimization methods.

19 Sep 2023

computer-science artificial-intelligence computation-and-language

Model Leeching: An Extraction Attack Targeting LLMs

Lancaster University Mindgard

Researchers at Lancaster University and Mindgard developed "Model Leeching," a black-box extraction attack that distills task-specific knowledge from large, proprietary language models like ChatGPT-3.5-Turbo into smaller, local models. This method enabled high-fidelity replication of LLM behavior at a cost of only $50, and demonstrated an 11% increase in adversarial attack success against the target LLM by using the extracted model for attack staging.

216

31 Mar 2025

computer-science computer-vision-and-pattern-recognition domain-adaptation

FlexiMo: A Flexible Remote Sensing Foundation Model

Chinese Academy of Sciences Lancaster University Southeast University Helmholtz-Zentrum Dresden-Rossendorf Helmholtz-Institute Freiberg for Resource Technology

FlexiMo, a flexible foundation model developed by researchers including those from the Chinese Academy of Sciences, enables seamless processing of multi-source satellite imagery with varying spatial resolutions and spectral configurations. It achieves this by dynamically adapting to input characteristics without extensive parameter fine-tuning, demonstrating improved generalization across diverse tasks like scene classification, land cover classification, building segmentation, and cloud detection.

108

18 Jul 2024

computer-science computer-vision-security continual-learning

Recent Advances of Continual Learning in Computer Vision: An Overview

Lancaster University Singapore University of Technology and Design

In contrast to batch learning where all training data is available at once, continual learning represents a family of methods that accumulate knowledge and learn continuously with data available in sequential order. Similar to the human learning process with the ability of learning, fusing, and accumulating new knowledge coming at different time steps, continual learning is considered to have high practical significance. Hence, continual learning has been studied in various artificial intelligence tasks. In this paper, we present a comprehensive review of the recent progress of continual learning in computer vision. In particular, the works are grouped by their representative techniques, including regularization, knowledge distillation, memory, generative replay, parameter isolation, and a combination of the above techniques. For each category of these techniques, both its characteristics and applications in computer vision are presented. At the end of this overview, several subareas, where continuous knowledge accumulation is potentially helpful while continual learning has not been well studied, are discussed.

218

14 Jul 2025

adversarial-attacks adversarial-robustness ai-for-cybersecurity

Bypassing LLM Guardrails: An Empirical Analysis of Evasion Attacks against Prompt Injection and Jailbreak Detection Systems

Lancaster University Mindgard

Researchers empirically demonstrated that current Large Language Model (LLM) guardrail systems are highly vulnerable to both simple character injection and sophisticated adversarial machine learning evasion techniques. The study revealed that even widely used commercial and open-source guardrails can be bypassed with high success rates, up to 100% for certain methods like emoji smuggling, highlighting critical weaknesses in existing LLM protection mechanisms.

116

15 Jul 2024

computer-science computer-vision-and-pattern-recognition machine-learning

Learning to Unlearn for Robust Machine Unlearning

Lancaster University Singapore University of Technology and Design (SUTD)

The Learning-to-Unlearn (LTU) framework from the Singapore University of Technology and Design introduces a meta-learning approach for machine unlearning that efficiently removes data influence while preserving model utility. It achieves state-of-the-art performance across all unlearning metrics, notably outperforming baselines even when accessing only 30% of the original training data.

143

05 May 2025

computer-science artificial-intelligence computer-vision-and-pattern-recognition

TSTMotion: Training-free Scene-aware Text-to-motion Generation

Monash University University of Electronic Science and Technology of China Lancaster University Singapore University of Technology and Design

Text-to-motion generation has recently garnered significant research interest, primarily focusing on generating human motion sequences in blank backgrounds. However, human motions commonly occur within diverse 3D scenes, which has prompted exploration into scene-aware text-to-motion generation methods. Yet, existing scene-aware methods often rely on large-scale ground-truth motion sequences in diverse 3D scenes, which poses practical challenges due to the expensive cost. To mitigate this challenge, we are the first to propose a \textbf{T}raining-free \textbf{S}cene-aware \textbf{T}ext-to-\textbf{Motion} framework, dubbed as \textbf{TSTMotion}, that efficiently empowers pre-trained blank-background motion generators with the scene-aware capability. Specifically, conditioned on the given 3D scene and text description, we adopt foundation models together to reason, predict and validate a scene-aware motion guidance. Then, the motion guidance is incorporated into the blank-background motion generators with two modifications, resulting in scene-aware text-driven motion sequences. Extensive experiments demonstrate the efficacy and generalizability of our proposed framework. We release our code in \href{this https URL}{Project Page}.

25 Oct 2025

agents computer-science artificial-intelligence

Hierarchical Optimization via LLM-Guided Objective Evolution for Mobility-on-Demand Systems

Lancaster University Agency for Science, Technology and Research, Singapore Onto Innovation Inc.Morgan Stanley Asia Pte.

Online ride-hailing platforms aim to deliver efficient mobility-on-demand services, often facing challenges in balancing dynamic and spatially heterogeneous supply and demand. Existing methods typically fall into two categories: reinforcement learning (RL) approaches, which suffer from data inefficiency, oversimplified modeling of real-world dynamics, and difficulty enforcing operational constraints; or decomposed online optimization methods, which rely on manually designed high-level objectives that lack awareness of low-level routing dynamics. To address this issue, we propose a novel hybrid framework that integrates large language model (LLM) with mathematical optimization in a dynamic hierarchical system: (1) it is training-free, removing the need for large-scale interaction data as in RL, and (2) it leverages LLM to bridge cognitive limitations caused by problem decomposition by adaptively generating high-level objectives. Within this framework, LLM serves as a meta-optimizer, producing semantic heuristics that guide a low-level optimizer responsible for constraint enforcement and real-time decision execution. These heuristics are refined through a closed-loop evolutionary process, driven by harmony search, which iteratively adapts the LLM prompts based on feasibility and performance feedback from the optimization layer. Extensive experiments based on scenarios derived from both the New York and Chicago taxi datasets demonstrate the effectiveness of our approach, achieving an average improvement of 16% compared to state-of-the-art baselines.

27 Oct 2025

computer-science computer-vision-and-pattern-recognition domain-adaptation

Survey of Multimodal Geospatial Foundation Models: Techniques, Applications, and Challenges

Wuhan University Central South University

Peking University Lancaster University Hunan University Helmholtz-Zentrum Dresden-Rossendorf University of Extremadura Helmholtz-Institute Freiberg for Resource Technology

This survey provides a comprehensive review of multimodal geospatial foundation models (GFMs), detailing their core techniques, application domains, and persistent challenges. It establishes a modality-driven evaluation framework and charts future research directions to advance integrated Earth observation.

3,796

23 Mar 2025

computer-science computer-vision-and-pattern-recognition generative-models

LongDiff: Training-Free Long Video Generation in One Go

Monash University Lancaster University

A training-free framework enables generation of high-quality long videos from pre-trained short video diffusion models through position mapping and informative frame selection techniques, achieving superior performance across all VBench metrics while eliminating the need for additional training or data.

There are no more papers matching your filters at the moment.

Events

Personalize Your Feed

Install Browser Extension

We're hiring