alphaXiv

University of Stirling

1,260

19 Feb 2025

agent-based-systems agentic-frameworks agents

Multi-Agent Risks from Advanced AI

Joel Lehman

Lewis Hammond

A landmark collaborative study from 44 researchers across 30 major institutions establishes the first comprehensive framework for understanding multi-agent AI risks, identifying three critical failure modes and seven key risk factors while providing concrete evidence from both historical examples and novel experiments to guide future safety efforts.

04 Sep 2025

computer-science computer-science-and-game-theory

The evolution of trust as a cognitive shortcut in repeated interactions

University of Lausanne Teesside University University of Stirling Université Libre de Bruxelles Vrije Universiteit Brussel

Trust is often thought to increase cooperation. However, game-theoretic models often fail to distinguish between cooperative behaviour and trust. This makes it difficult to measure trust and determine its effect in different social dilemmas. We address this here by formalising trust as a cognitive shortcut in repeated games. This functions by avoiding checking a partner's actions once a threshold level of cooperativeness has been observed. We consider trust-based strategies that implement this heuristic, and systematically analyse their evolution across the space of two-player symmetric social dilemma games. We find that where it is costly to check whether another agent's actions were cooperative, as is the case in many real-world settings, then trust-based strategies can outcompete standard reciprocal strategies such as Tit-for-Tat in many social dilemmas. Moreover, the presence of trust increases the overall level of cooperation in the population, especially in cases where agents can make unintentional errors in their actions. This occurs even in the presence of strategies designed to build and then exploit trust. Overall, our results demonstrate the individual adaptive benefit to an agent of using a trust heuristic, and provide a formal theory for how trust can promote cooperation in different types of social interaction. We discuss the implications of this for interactions between humans and artificial intelligence agents.
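The trust heuristic described in this abstract can be sketched in a few lines. This is a simplified illustration, not the paper's model: the function name `trust_based_tft`, the `threshold` parameter, and the rule that the agent never resumes checking once it trusts are all assumptions made here for clarity.

```python
def trust_based_tft(partner_moves, threshold=3):
    """Play against a fixed sequence of partner moves ('C' or 'D').

    Hypothetical sketch: the agent behaves like Tit-for-Tat and pays to
    check the partner's move each round. After observing `threshold`
    cooperative moves in a row it "trusts": it stops checking and
    cooperates unconditionally.

    Returns (own_moves, checks), where `checks` counts the costly checks.
    """
    own_moves = []
    checks = 0        # number of costly checks performed
    streak = 0        # consecutive cooperative moves observed so far
    last_seen = "C"   # Tit-for-Tat opens with cooperation
    trusting = False

    for partner_move in partner_moves:
        # Trusting agents cooperate blindly; otherwise copy the last observed move.
        own_moves.append("C" if trusting else last_seen)
        if not trusting:
            checks += 1  # observing the partner's move is what costs something
            last_seen = partner_move
            streak = streak + 1 if partner_move == "C" else 0
            trusting = streak >= threshold
    return own_moves, checks
```

Against a partner who cooperates three times and then defects, this agent stops checking after round three and keeps cooperating. That saves the checking cost, and it is also the opening that a strategy designed to build and then exploit trust would use.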

13 Feb 2024

computer-science artificial-intelligence human-ai-interaction

Large Language Models for the Automated Analysis of Optimization Algorithms

Artificial Intelligence Research Institute (IIIA-CSIC) University of Stirling

Camilo Chacón Sartori

Researchers integrated Large Language Models into STNWeb, a web-based tool for generating Search Trajectory Networks, to automate the analysis of optimization algorithms. The system provides natural language reports explaining algorithm behavior, recommends optimal parameters for STN generation, and generates plots, making complex analyses more accessible.

11 Apr 2025

agent-based-systems computer-science artificial-intelligence

Do LLMs trust AI regulation? Emerging behaviour of game-theoretic LLM agents

University of Amsterdam Universidade de Lisboa Luxembourg Institute of Science and Technology

University of St Andrews University of Birmingham University of Trento Teesside University University of Stirling Université Libre de Bruxelles Vrije Universiteit Brussel

This research applies Large Language Models as strategic agents within an evolutionary game theory model to simulate interactions between users, developers, and regulators in AI governance. It systematically observes emergent behaviors, compares them to theoretical predictions, and offers insights into how AI agents might respond to regulatory guidelines.

02 Mar 2025

computer-science artificial-intelligence computer-vision-and-pattern-recognition

Cross Modality Medical Image Synthesis for Improving Liver Segmentation

Hamad Bin Khalifa University COMSATS University Islamabad University of Stirling

Deep learning-based computer-aided diagnosis (CAD) of medical images requires large datasets. However, the lack of large publicly available labeled datasets limits the development of deep learning-based CAD systems. Generative Adversarial Networks (GANs), in particular, CycleGAN, can be used to generate new cross-domain images without paired training data. However, most CycleGAN-based synthesis methods lack the potential to overcome alignment and asymmetry between the input and generated data. We propose a two-stage technique for the synthesis of abdominal MRI using cross-modality translation of abdominal CT. We show that the synthetic data can help improve the performance of the liver segmentation network. We increase the number of abdominal MRI images through cross-modality image transformation of unpaired CT images using a CycleGAN inspired deformation invariant network called EssNet. Subsequently, we combine the synthetic MRI images with the original MRI images and use them to improve the accuracy of the U-Net on a liver segmentation task. We train the U-Net on real MRI images and then on real and synthetic MRI images. Consequently, by comparing both scenarios, we achieve an improvement in the performance of U-Net. In summary, the improvement achieved in the Intersection over Union (IoU) is 1.17%. The results show potential to address the data scarcity challenge in medical imaging.
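The Intersection over Union (IoU) metric reported in this abstract can be computed as follows. This is a minimal sketch on binary masks stored as nested lists; the function name `iou` and the convention that two empty masks score 1.0 are assumptions made here, not details taken from the paper.

```python
def iou(mask_a, mask_b):
    """Intersection over Union of two equally sized binary masks.

    Masks are nested lists of 0/1 values (rows of pixels).
    """
    intersection = 0
    union = 0
    for row_a, row_b in zip(mask_a, mask_b):
        for a, b in zip(row_a, row_b):
            intersection += 1 if (a and b) else 0  # pixel set in both masks
            union += 1 if (a or b) else 0          # pixel set in at least one mask
    # Assumed convention: two empty masks count as a perfect match.
    return intersection / union if union else 1.0


predicted = [[1, 1, 0],
             [0, 1, 0]]
ground_truth = [[1, 0, 0],
                [0, 1, 1]]
score = iou(predicted, ground_truth)  # 2 shared pixels / 4 pixels in the union
```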

13 Mar 2025

active-learning computer-science computer-vision-and-pattern-recognition

Better, Not Just More: Data-Centric Machine Learning for Earth Observation

University of Bonn University of Sheffield University of Twente German Aerospace Center (DLR) UiT The Arctic University of Norway University of the Bundeswehr Munich Wageningen University University of Stirling CentraleSupélec Forschungszentrum Jülich École Polytechnique Fédérale de Lausanne

Recent developments and research in modern machine learning have led to substantial improvements in the geospatial field. Although numerous deep learning architectures and models have been proposed, the majority of them have been solely developed on benchmark datasets that lack strong real-world relevance. Furthermore, the performance of many methods has already saturated on these datasets. We argue that a shift from a model-centric view to a complementary data-centric perspective is necessary for further improvements in accuracy, generalization ability, and real impact on end-user applications. Furthermore, considering the entire machine learning cycle, from problem definition to model deployment with feedback, is crucial for enhancing machine learning models that can be reliable in unforeseen situations. This work presents a definition as well as a precise categorization and overview of automated data-centric learning approaches for geospatial data. It highlights the complementary role of data-centric learning with respect to model-centric in the larger machine learning deployment cycle. We review papers across the entire geospatial field and categorize them into different groups. A set of representative experiments shows concrete implementation examples. These examples provide concrete steps to act on geospatial data with data-centric machine learning approaches.

07 Jun 2019

instrumentation-and-methods-for-astrophysics image-and-video-processing electrical-engineering

Standardized spectral and radiometric calibration of consumer cameras

Leiden University Universidad Complutense de Madrid University of Stirling DDQ Apps VITO, Flemish Institute for Technological Research

Consumer cameras, particularly onboard smartphones and UAVs, are now commonly used as scientific instruments. However, their data processing pipelines are not optimized for quantitative radiometry and their calibration is more complex than that of scientific cameras. The lack of a standardized calibration methodology limits the interoperability between devices and, in the ever-changing market, ultimately the lifespan of projects using them. We present a standardized methodology and database (SPECTACLE) for spectral and radiometric calibrations of consumer cameras, including linearity, bias variations, read-out noise, dark current, ISO speed and gain, flat-field, and RGB spectral response. This includes golden standard ground-truth methods and do-it-yourself methods suitable for non-experts. Applying this methodology to seven popular cameras, we found high linearity in RAW but not JPEG data, inter-pixel gain variations >400% correlated with large-scale bias and read-out noise patterns, non-trivial ISO speed normalization functions, flat-field correction factors varying by up to 2.79 over the field of view, and both similarities and differences in spectral response. Moreover, these results differed wildly between camera models, highlighting the importance of standardization and a centralized database.

12 Mar 2025

computer-science artificial-intelligence computer-science-and-game-theory

Media and responsible AI governance: a game-theoretic and LLM analysis

University of Amsterdam Universidade de Lisboa Luxembourg Institute of Science and Technology

University of St Andrews University of Birmingham University of Trento Teesside University University of Stirling Université Libre de Bruxelles Vrije Universiteit Brussel

This paper presents a four-actor model for AI governance that incorporates media, or the 'commentariat,' as an independent actor, examining its role in shaping public perception and trust in AI systems. The research uses evolutionary game theory and LLM-based simulations to show that conditions incentivizing thorough media investigation can lead to safe AI development and user trust, even in the absence of robust formal regulation.

04 Jul 2025

agents computer-science artificial-intelligence

Behaviour Space Analysis of LLM-driven Meta-heuristic Discovery

Leiden University University of Stirling

We investigate the behaviour space of meta-heuristic optimisation algorithms automatically generated by Large Language Model driven algorithm discovery methods. Using the Large Language Evolutionary Algorithm (LLaMEA) framework with a GPT o4-mini LLM, we iteratively evolve black-box optimisation heuristics, evaluated on 10 functions from the BBOB benchmark suite. Six LLaMEA variants, featuring different mutation prompt strategies, are compared and analysed. We log dynamic behavioural metrics including exploration, exploitation, convergence and stagnation measures, for each run, and analyse these via visual projections and network-based representations. Our analysis combines behaviour-based projections, Code Evolution Graphs built from static code features, performance convergence curves, and behaviour-based Search Trajectory Networks. The results reveal clear differences in search dynamics and algorithm structures across LLaMEA configurations. Notably, the variant that employs both a code simplification prompt and a random perturbation prompt in a 1+1 elitist evolution strategy, achieved the best performance, with the highest Area Over the Convergence Curve. Behaviour-space visualisations show that higher-performing algorithms exhibit more intensive exploitation behaviour and faster convergence with less stagnation. Our findings demonstrate how behaviour-space analysis can explain why certain LLM-designed heuristics outperform others and how LLM-driven algorithm discovery navigates the open-ended and complex search space of algorithms. These findings provide insights to guide the future design of adaptive LLM-driven algorithm generators.

01 Aug 2023

ai-for-health computer-science machine-learning

Mining the contribution of intensive care clinical course to outcome after traumatic brain injury

University of Cambridge

Johns Hopkins University Karolinska Institutet Leiden University Medical Center Humanitas University University of Stirling Cambridge Centre for Artificial Intelligence in Medicine

Existing methods to characterise the evolving condition of traumatic brain injury (TBI) patients in the intensive care unit (ICU) do not capture the context necessary for individualising treatment. Here, we integrate all heterogeneous data stored in medical records (1,166 pre-ICU and ICU variables) to model the individualised contribution of clinical course to six-month functional outcome on the Glasgow Outcome Scale - Extended (GOSE). On a prospective cohort (n=1,550, 65 centres) of TBI patients, we train recurrent neural network models to map a token-embedded time series representation of all variables (including missing values) to an ordinal GOSE prognosis every two hours. The full range of variables explains up to 52% (95% CI: 50%-54%) of the ordinal variance in functional outcome. Up to 91% (95% CI: 90%-91%) of this explanation is derived from pre-ICU and admission information (i.e., static variables). Information collected in the ICU (i.e., dynamic variables) increases explanation (by up to 5% [95% CI: 4%-6%]), though not enough to counter poorer overall performance in longer-stay (>5.75 days) patients. Highest-contributing variables include physician-based prognoses, CT features, and markers of neurological function. Whilst static information currently accounts for the majority of functional outcome explanation after TBI, data-driven analysis highlights investigative avenues to improve dynamic characterisation of longer-stay patients. Moreover, our modelling strategy proves useful for converting large patient records into interpretable time series with missing data integration and minimal processing.

28 Jan 2023

computer-science conversational-ai artificial-intelligence

Truth Machines: Synthesizing Veracity in AI Language Models

University of Queensland Western Sydney University University of Stirling

Liam Magee

As AI technologies are rolled out into healthcare, academia, human resources, law, and a multitude of other domains, they become de-facto arbiters of truth. But truth is highly contested, with many different definitions and approaches. This article discusses the struggle for truth in AI systems and the general responses to date. It then investigates the production of truth in InstructGPT, a large language model, highlighting how data harvesting, model architectures, and social feedback mechanisms weave together disparate understandings of veracity. It conceptualizes this performance as an operationalization of truth, where distinct, often conflicting claims are smoothly synthesized and confidently presented into truth-statements. We argue that these same logics and inconsistencies play out in Instruct's successor, ChatGPT, reiterating truth as a non-trivial problem. We suggest that enriching sociality and thickening "reality" are two promising vectors for enhancing the truth-evaluating capacities of future language models. We conclude, however, by stepping back to consider AI truth-telling as a social practice: what kind of "truth" do we as listeners desire?

07 Mar 2020

bayesian-deep-learning computer-science contrastive-learning

The Variational InfoMax Learning Objective

University of Stirling

Bayesian Inference and Information Bottleneck are the two most popular objectives for neural networks, but they can be optimised only via a variational lower bound: the Variational Information Bottleneck (VIB). In this manuscript we show that the two objectives are actually equivalent to the InfoMax: maximise the information between the data and the labels. The InfoMax representation of the two objectives is relevant not only per se, since it helps to understand the role of the network capacity, but also because it allows us to derive a variational objective, the Variational InfoMax (VIM), that maximises them directly without resorting to any lower bound. The theoretical improvement of VIM over VIB is highlighted by the computational experiments, where the model trained by VIM improves on the VIB model in three different tasks: accuracy, robustness to noise and representation quality.

12 Feb 2019

computer-science computation-and-language computer-vision-and-pattern-recognition

Multimodal Sentiment Analysis: Addressing Key Issues and Setting up the Baselines

National University of Singapore

Nanyang Technological University NTU University of Stirling Instituto Politecnico Nacional

We compile baselines, along with dataset split, for multimodal sentiment analysis. In this paper, we explore three different deep-learning based architectures for multimodal sentiment classification, each improving upon the previous. Further, we evaluate these architectures with multiple datasets with fixed train/test partition. We also discuss some major issues, frequently ignored in multimodal sentiment analysis research, e.g., role of speaker-exclusive models, importance of different modalities, and generalizability. This framework illustrates the different facets of analysis to be considered while performing multimodal sentiment analysis and, hence, serves as a new benchmark for future research in this emerging field.

02 Mar 2019

computer-science computer-vision-and-pattern-recognition image-segmentation

Spatio-Temporal Vegetation Pixel Classification By Using Convolutional Networks

University of Campinas Universidade Federal de Minas Gerais Universidade Estadual Paulista University of Stirling

Plant phenology studies rely on long-term monitoring of life cycles of plants. High-resolution unmanned aerial vehicles (UAVs) and near-surface technologies have been used for plant monitoring, demanding the creation of methods capable of locating and identifying plant species through time and space. However, this is a challenging task given the high volume of data, the data constantly missing from temporal datasets, the heterogeneity of temporal profiles, the variety of plant visual patterns, and the unclear definition of individuals' boundaries in plant communities. In this letter, we propose a novel method, suitable for phenological monitoring, based on Convolutional Networks (ConvNets) to perform spatio-temporal vegetation pixel-classification on high resolution images. We conducted a systematic evaluation using high-resolution vegetation image datasets associated with the Brazilian Cerrado biome. Experimental results show that the proposed approach is effective, outperforming other spatio-temporal pixel-classification strategies.

03 Oct 2015

computer-science information-theory adaptation-and-self-organizing-systems

Partial Information Decomposition as a Unified Approach to the Specification of Neural Goal Functions

The University of Sydney University of Glasgow Max-Planck-Institute for Dynamics and Self-Organization Goethe-University University of Stirling

In many neural systems anatomical motifs are present repeatedly, but despite their structural similarity they can serve very different tasks. A prime example for such a motif is the canonical microcircuit of six-layered neo-cortex, which is repeated across cortical areas, and is involved in a number of different tasks (e.g. sensory, cognitive, or motor tasks). This observation has spawned interest in finding a common underlying principle, a 'goal function', of information processing implemented in this structure. By definition such a goal function, if universal, cannot be cast in processing-domain specific language (e.g. 'edge filtering', 'working memory'). Thus, to formulate such a principle, we have to use a domain-independent framework. Information theory offers such a framework. However, while the classical framework of information theory focuses on the relation between one input and one output (Shannon's mutual information), we argue that neural information processing crucially depends on the combination of multiple inputs to create the output of a processor. To account for this, we use a very recent extension of Shannon Information theory, called partial information decomposition (PID). PID allows us to quantify the information that several inputs provide individually (unique information), redundantly (shared information) or only jointly (synergistic information) about the output. First, we review the framework of PID. Then we apply it to reevaluate and analyze several earlier proposals of information theoretic neural goal functions (predictive coding, infomax, coherent infomax, efficient coding). We find that PID allows us to compare these goal functions in a common framework, and also provides a versatile approach to design new goal functions from first principles. Building on this, we design and analyze a novel goal function, called 'coding with synergy'. [...]
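For two inputs, the decomposition into unique, shared, and synergistic information mentioned in this abstract is usually written as the set of consistency equations below. The symbols ($Y$ for the output, $X_1$ and $X_2$ for the inputs, and the subscripts unq, shd, syn) are notation assumed here for illustration; the abstract itself does not fix any notation.

```latex
\begin{align}
I(Y; X_1, X_2) &= I_{\mathrm{unq}}(Y; X_1 \setminus X_2)
                + I_{\mathrm{unq}}(Y; X_2 \setminus X_1)
                + I_{\mathrm{shd}}(Y; X_1, X_2)
                + I_{\mathrm{syn}}(Y; X_1, X_2), \\
I(Y; X_1) &= I_{\mathrm{unq}}(Y; X_1 \setminus X_2) + I_{\mathrm{shd}}(Y; X_1, X_2), \\
I(Y; X_2) &= I_{\mathrm{unq}}(Y; X_2 \setminus X_1) + I_{\mathrm{shd}}(Y; X_1, X_2).
\end{align}
```

These three equations constrain four unknown terms, so classical mutual information alone does not determine the decomposition. One further definition is needed, for example a measure of shared or of unique information.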

12 Feb 2014

computer-science artificial-intelligence neural-and-evolutionary-computing

Local Optima Networks: A New Model of Combinatorial Fitness Landscapes

University of Lausanne University of Stirling University of Nice–Sophia Antipolis Inria Lille Nord Europe

This chapter overviews a recently introduced network-based model of combinatorial landscapes: Local Optima Networks (LON). The model compresses the information given by the whole search space into a smaller mathematical object that is a graph having as vertices the local optima and as edges the possible weighted transitions between them. Two definitions of edges have been proposed: basin-transition and escape-edges, which capture relevant topological features of the underlying search spaces. This network model brings a new set of metrics to characterize the structure of combinatorial landscapes, those associated with the science of complex networks. These metrics are described, and results are presented of local optima network extraction and analysis for two selected combinatorial landscapes: NK landscapes and the quadratic assignment problem. Network features are found to correlate with and even predict the performance of heuristic search algorithms operating on these problems.

16 Aug 2018

computer-science cryptography-and-security

Statistical Analysis Driven Optimized Deep Learning System for Intrusion Detection

Glasgow Caledonian University University Mediterranea of Reggio Calabria University of Stirling Institute of Applied Technology

Attackers have developed ever more sophisticated and intelligent ways to hack information and communication technology systems. The extent of damage an individual hacker can carry out upon infiltrating a system is well understood. A potentially catastrophic scenario can be envisaged where a nation-state intercepting encrypted financial data gets hacked. Thus, intelligent cybersecurity systems have become inevitably important for improved protection against malicious threats. However, as malware attacks continue to dramatically increase in volume and complexity, it has become ever more challenging for traditional analytic tools to detect and mitigate threats. Furthermore, a huge amount of data produced by large networks has made the recognition task even more complicated and challenging. In this work, we propose an innovative statistical analysis driven optimized deep learning system for intrusion detection. The proposed intrusion detection system (IDS) extracts optimized and more correlated features using big data visualization and statistical analysis methods (human-in-the-loop), followed by a deep autoencoder for potential threat detection. Specifically, a pre-processing module eliminates the outliers and converts categorical variables into one-hot-encoded vectors. The feature extraction module discards features with null values and selects the most significant features as input to the deep autoencoder model (trained in a greedy-wise manner). The NSL-KDD dataset from the Canadian Institute for Cybersecurity is used as a benchmark to evaluate the feasibility and effectiveness of the proposed architecture. Simulation results demonstrate the potential of our proposed system and its outperformance as compared to existing state-of-the-art methods and recently published novel approaches. Ongoing work includes further optimization and real-time evaluation of our proposed IDS.

27 Jan 2020

computer-science computer-vision-security computer-vision-and-pattern-recognition

Towards Open-Set Semantic Segmentation of Aerial Images

Universidade Federal de Minas Gerais University of Stirling

Classical and more recently deep computer vision methods are optimized for visible spectrum images, commonly encoded in grayscale or RGB colorspaces acquired from smartphones or cameras. A more uncommon source of images exploited in the remote sensing field are satellite and aerial images. However, the development of pattern recognition approaches for these data is relatively recent, mainly due to the limited availability of this type of images, as until recently they were used exclusively for military purposes. Access to aerial imagery, including spectral information, has been increasing mainly due to the low cost of drones, cheapening of imaging satellite launch costs, and novel public datasets. Usually remote sensing applications employ computer vision techniques strictly modeled for classification tasks in closed set scenarios. However, real-world tasks rarely fit into closed set contexts, frequently presenting previously unknown classes, characterizing them as open set scenarios. Focusing on this problem, this is the first paper to study and develop semantic segmentation techniques for open set scenarios applied to remote sensing images. The main contributions of this paper are: 1) a discussion of related works in open set semantic segmentation, showing evidence that these techniques can be adapted for open set remote sensing tasks; 2) the development and evaluation of a novel approach for open set semantic segmentation. Our method yielded competitive results when compared to closed set methods for the same dataset.

31 Jul 2018

computer-science computer-vision-and-pattern-recognition machine-learning

Lip-Reading Driven Deep Learning Approach for Speech Enhancement

University of Nottingham University of Stirling

This paper proposes a novel lip-reading driven deep learning framework for speech enhancement. The proposed approach leverages the complementary strengths of both deep learning and analytical acoustic modelling (filtering based approach) as compared to recently published, comparatively simpler benchmark approaches that rely only on deep learning. The proposed audio-visual (AV) speech enhancement framework operates at two levels. In the first level, a novel deep learning-based lip-reading regression model is employed. In the second level, lip-reading approximated clean-audio features are exploited, using an enhanced, visually-derived Wiener filter (EVWF), for the clean audio power spectrum estimation. Specifically, a stacked long-short-term memory (LSTM) based lip-reading regression model is designed for clean audio features estimation using only temporal visual features considering different numbers of prior visual frames. For clean speech spectrum estimation, a new filterbank-domain EVWF is formulated, which exploits estimated speech features. The proposed EVWF is compared with conventional Spectral Subtraction and Log-Minimum Mean-Square Error methods using both ideal AV mapping and LSTM driven AV mapping. The potential of the proposed speech enhancement framework is evaluated under different dynamic real-world commercially-motivated scenarios (e.g. cafe, public transport, pedestrian area) at different SNR levels (ranging from low to high SNRs) using benchmark Grid and ChiME3 corpora. For objective testing, perceptual evaluation of speech quality is used to evaluate the quality of restored speech. For subjective testing, the standard mean-opinion-score method is used with inferential statistics. Comparative simulation results demonstrate significant lip-reading and speech enhancement improvement in terms of both speech quality and speech intelligibility.

20 Aug 2024

computer-science artificial-intelligence machine-learning

An Overlooked Role of Context-Sensitive Dendrites

University of Oxford University of Wolverhampton University of Stirling

To date, most dendritic studies have predominantly focused on the apical zone of pyramidal two-point neurons (TPNs) receiving only feedback (FB) connections from higher perceptual layers and using them for learning. Recent cellular neurophysiology and computational neuroscience studies suggest that the apical input (context), coming from feedback and lateral connections, is multifaceted and far more diverse, with greater implications for ongoing learning and processing in the brain than previously realized. In addition to the FB, the apical tuft receives signals from neighboring cells of the same network as proximal (P) context, other parts of the brain as distal (D) context, and overall coherent information across the network as universal (U) context. The integrated context (C) amplifies and suppresses the transmission of coherent and conflicting feedforward (FF) signals, respectively. Specifically, we show that complex context-sensitive (CS)-TPNs flexibly integrate C moment-by-moment with the FF somatic current at the soma such that the somatic current is amplified when both feedforward (FF) and C are coherent; otherwise, it is attenuated. This generates the event only when the FF and C currents are coherent, which is then translated into a singlet or a burst based on the FB information. Spiking simulation results show that this flexible integration of somatic and contextual currents enables the propagation of more coherent signals (bursts), making learning faster with fewer neurons. Similar behavior is observed when this functioning is used in conventional artificial networks, where orders of magnitude fewer neurons are required to process vast amounts of heterogeneous real-world audio-visual (AV) data trained using backpropagation (BP). The computational findings presented here demonstrate the universality of CS-TPNs, suggesting a dendritic narrative that was previously overlooked.
