alphaXiv

London South Bank University

18 Sep 2025

computer-science computers-and-society

AI and the Future of Academic Peer Review

University of Cambridge

National University of Singapore

University of Oxford

University of Copenhagen Yonsei University Hong Kong Baptist University London South Bank University Chinese EQUATOR Centre

A comprehensive, ethics-informed framework outlines the integration of AI, especially Large Language Models, into academic peer review, systematically addressing challenges and opportunities. It demonstrates how AI can mitigate persistent issues like lengthy publication delays and reviewer burden, offering specific safeguards against concerns such as hallucination and bias, and suggesting advanced AI architectures for improved performance.

16 May 2025

agents ai-for-genomics chain-of-thought

34 Examples of LLM Applications in Materials Science and Chemistry: Towards Automation, Assistants, Agents, and Accelerated Scientific Discovery

Magdalena Lederbauer

This paper presents a comprehensive review of 34 distinct applications of Large Language Models (LLMs) in materials science and chemistry, developed during a hackathon. It demonstrates how LLMs can be utilized across the entire research lifecycle, from data management and knowledge extraction to molecular design and automation, showcasing their versatility when augmented with techniques like RAG and tool-calling.

23 Oct 2025

agent-based-systems agentic-frameworks agents

ComProScanner: A multi-agent based framework for composition-property structured data extraction from scientific literature

King’s College London London South Bank University

ComProScanner, a multi-agent framework, automates the extraction of structured chemical composition, property, and synthesis data from scientific literature, providing high accuracy for complex materials like piezoelectric ceramics and uncovering over 99% of materials not present in existing databases.

08 Jun 2023

computer-science computer-vision-and-pattern-recognition machine-learning

Sequence-to-Sequence Model with Transformer-based Attention Mechanism and Temporal Pooling for Non-Intrusive Load Monitoring

London South Bank University Qom University of Technology

This paper presents a novel Sequence-to-Sequence (Seq2Seq) model based on a transformer-based attention mechanism and temporal pooling for Non-Intrusive Load Monitoring (NILM) of smart buildings. The paper aims to improve the accuracy of NILM by using a deep learning-based method. The proposed method uses a Seq2Seq model with a transformer-based attention mechanism to capture the long-term dependencies of NILM data. Additionally, temporal pooling is used to improve the model's accuracy by capturing both the steady-state and transient behavior of appliances. The paper evaluates the proposed method on a publicly available dataset and compares the results with other state-of-the-art NILM techniques. The results demonstrate that the proposed method outperforms the existing methods in terms of both accuracy and computational efficiency.

10 Jun 2007

soft-condensed-matter physics biological-physics

Water's Hydrogen Bond Strength

London South Bank University

Water is necessary both for the evolution of life and its continuance. It possesses particular properties that cannot be found in other materials and that are required for life-giving processes. These properties are brought about by the hydrogen bonded environment particularly evident in liquid water. Each liquid water molecule is involved in about four hydrogen bonds with strengths considerably less than covalent bonds but considerably greater than the natural thermal energy. These hydrogen bonds are roughly tetrahedrally arranged such that when strongly formed the local clustering expands, decreasing the density. Such low density structuring naturally occurs at low and supercooled temperatures and gives rise to many physical and chemical properties that evidence the particular uniqueness of liquid water. If aqueous hydrogen bonds were actually somewhat stronger then water would behave similar to a glass, whereas if they were weaker then water would be a gas and only exist as a liquid at sub-zero temperatures. The overall conclusion of this investigation is that water's hydrogen bond strength is poised centrally within a narrow window of its suitability for life.

12 Feb 2025

ai-for-health computer-science artificial-intelligence

HistoSmith: Single-Stage Histology Image-Label Generation via Conditional Latent Diffusion for Enhanced Cell Segmentation and Classification

University of Padova London South Bank University

Precise segmentation and classification of cell instances are vital for analyzing the tissue microenvironment in histology images, supporting medical diagnosis, prognosis, treatment planning, and studies of brain cytoarchitecture. However, the creation of high-quality annotated datasets for training remains a major challenge. This study introduces a novel single-stage approach (HistoSmith) for generating image-label pairs to augment histology datasets. Unlike state-of-the-art methods that utilize diffusion models with separate components for label and image generation, our approach employs a latent diffusion model to learn the joint distribution of cellular layouts, classification masks, and histology images. This model enables tailored data generation by conditioning on user-defined parameters such as cell types, quantities, and tissue types. Trained on the CoNIC H&E histopathology dataset and the Nissl-stained CytoDArk0 dataset, the model generates realistic and diverse labeled samples. Experimental results demonstrate improvements in cell instance segmentation and classification, particularly for underrepresented cell types like neutrophils in the CoNIC dataset. These findings underscore the potential of our approach to address data scarcity challenges.

14 Jun 2024

computer-science distributed-parallel-and-cluster-computing machine-learning

Architectural Blueprint For Heterogeneity-Resilient Federated Learning

London South Bank University

This paper proposes a novel three-tier architecture for federated learning to optimize edge computing environments. The proposed architecture addresses the challenges associated with client data heterogeneity and computational constraints. It introduces a scalable, privacy-preserving framework that enhances the efficiency of distributed machine learning. Through experimentation, the paper demonstrates the architecture's capability to manage non-IID datasets more effectively than traditional federated learning models. Additionally, the paper highlights the potential of this approach to significantly improve model accuracy, reduce communication overhead, and facilitate broader adoption of federated learning technologies.

04 Feb 2025

ai-for-health computer-science computer-vision-security

Mind the Gap: Evaluating Patch Embeddings from General-Purpose and Histopathology Foundation Models for Cell Segmentation and Classification

University of Padova London South Bank University

Recent advancements in foundation models have transformed computer vision, driving significant performance improvements across diverse domains, including digital histopathology. However, the advantages of domain-specific histopathology foundation models over general-purpose models for specialized tasks such as cell analysis remain underexplored. This study investigates the representation learning gap between these two categories by analyzing multi-level patch embeddings applied to cell instance segmentation and classification. We implement an encoder-decoder architecture with a consistent decoder and various encoders. These include convolutional, vision transformer (ViT), and hybrid encoders pre-trained on ImageNet-22K or LVD-142M, representing general-purpose foundation models. These are compared against ViT encoders from the recently released UNI, Virchow2, and Prov-GigaPath foundation models, trained on patches extracted from hundreds of thousands of histopathology whole-slide images. The decoder integrates patch embeddings from different encoder depths via skip connections to generate semantic and distance maps. These maps are then post-processed to create instance segmentation masks where each label corresponds to an individual cell and to perform cell-type classification. All encoders remain frozen during training to assess their pre-trained feature extraction capabilities. Using the PanNuke and CoNIC histopathology datasets, and the newly introduced Nissl-stained CytoDArk0 dataset for brain cytoarchitecture studies, we evaluate instance-level detection, segmentation accuracy, and cell-type classification. This study provides insights into the comparative strengths and limitations of general-purpose vs. histopathology foundation models, offering guidance for model selection in cell-focused histopathology and brain cytoarchitecture analysis workflows.

08 Feb 2025

ai-for-health computer-science computer-vision-security

CISCA and CytoDArk0: a Cell Instance Segmentation and Classification method for histo(patho)logical image Analyses and a new, open, Nissl-stained dataset for brain cytoarchitecture studies

University of Padova London South Bank University

Delineating and classifying individual cells in microscopy tissue images is inherently challenging yet remains essential for advancements in medical and neuroscientific research. In this work, we propose a new deep learning framework, CISCA, for automatic cell instance segmentation and classification in histological slices. At the core of CISCA is a network architecture featuring a lightweight U-Net with three heads in the decoder. The first head classifies pixels into boundaries between neighboring cells, cell bodies, and background, while the second head regresses four distance maps along four directions. The outputs from the first and second heads are integrated through a tailored post-processing step, which ultimately produces the segmentation of individual cells. The third head enables the simultaneous classification of cells into relevant classes, if required. We demonstrate the effectiveness of our method using four datasets, including CoNIC, PanNuke, and MoNuSeg, which are publicly available H&E-stained datasets that cover diverse tissue types and magnifications. In addition, we introduce CytoDArk0, the first annotated dataset of Nissl-stained histological images of the mammalian brain, containing nearly 40k annotated neurons and glial cells, aimed at facilitating advancements in digital neuropathology and brain cytoarchitecture studies. We evaluate CISCA against other state-of-the-art methods, demonstrating its versatility, robustness, and accuracy in segmenting and classifying cells across diverse tissue types, magnifications, and staining techniques. This makes CISCA well-suited for detailed analyses of cell morphology and efficient cell counting in both digital pathology workflows and brain cytoarchitecture research.

08 Jun 2023

computer-science machine-learning signal-processing

Non-Intrusive Load Monitoring (NILM) using Deep Neural Networks: A Review

London South Bank University Qom University of Technology

Demand-side management now encompasses more residential loads. To efficiently apply demand response strategies, it's essential to periodically observe the contribution of various domestic appliances to total energy consumption. Non-intrusive load monitoring (NILM), also known as load disaggregation, is a method for decomposing the total energy consumption profile into individual appliance load profiles within the household. It has multiple applications in demand-side management, energy consumption monitoring, and analysis. Various methods, including machine learning and deep learning, have been used to implement and improve NILM algorithms. This paper reviews some recent NILM methods based on deep learning and introduces the most accurate methods for residential loads. It summarizes public databases for NILM evaluation and compares methods using standard performance metrics.

01 Nov 2024

ai-for-health computer-science computer-vision-and-pattern-recognition

Automated Classification of Cell Shapes: A Comparative Evaluation of Shape Descriptors

University of Padova London South Bank University

This study addresses the challenge of classifying cell shapes from noisy contours, such as those obtained through cell instance segmentation of histological images. We assess the performance of various features for shape classification, including Elliptical Fourier Descriptors, curvature features, and lower dimensional representations. Using an annotated synthetic dataset of noisy contours, we identify the most suitable shape descriptors and apply them to a set of real images for qualitative analysis. Our aim is to provide a comprehensive evaluation of descriptors for classifying cell shapes, which can support cell type identification and tissue characterization, both critical tasks in biological research and histopathological assessments.

11 Aug 2023

computer-science machine-learning neural-and-evolutionary-computing

Parametric Leaky Tanh: A New Hybrid Activation Function for Deep Learning

London South Bank University IST College

Activation functions (AFs) are crucial components of deep neural networks (DNNs), having a significant impact on their performance. An activation function in a DNN is typically a smooth, nonlinear function that transforms an input signal into an output signal for the subsequent layer. In this paper, we propose the Parametric Leaky Tanh (PLTanh), a novel hybrid activation function designed to combine the strengths of both the Tanh and Leaky ReLU (LReLU) activation functions. PLTanh is differentiable at all points and addresses the 'dying ReLU' problem by ensuring a non-zero gradient for negative inputs, consistent with the behavior of LReLU. By integrating the unique advantages of these two diverse activation functions, PLTanh facilitates the learning of more intricate nonlinear relationships within the network. This paper presents an empirical evaluation of PLTanh against established activation functions, namely ReLU, LReLU, and ALReLU utilizing five diverse datasets.

26 Nov 2023

computer-science computer-vision-and-pattern-recognition machine-learning

Revealing Cortical Layers In Histological Brain Images With Self-Supervised Graph Convolutional Networks Applied To Cell-Graphs

University of Padova London South Bank University

Identifying cerebral cortex layers is crucial for comparative studies of the cytoarchitecture aiming at providing insights into the relations between brain structure and function across species. The absence of extensive annotated datasets typically limits the adoption of machine learning approaches, leading to the manual delineation of cortical layers by neuroanatomists. We introduce a self-supervised approach to detect layers in 2D Nissl-stained histological slices of the cerebral cortex. It starts with the segmentation of individual cells and the creation of an attributed cell-graph. A self-supervised graph convolutional network generates cell embeddings that encode morphological and structural traits of the cellular environment and are exploited by a community detection algorithm for the final layering. Our method, the first self-supervised of its kind with no spatial transcriptomics data involved, holds the potential to accelerate cytoarchitecture analyses, sidestepping annotation needs and advancing cross-species investigation.

06 May 2025

ai-for-genomics computer-science machine-learning

Improving Omics-Based Classification: The Role of Feature Selection and Synthetic Data Generation

University of Padova London South Bank University I4 Consulting Srl

Given the increasing complexity of omics datasets, a key challenge is not only improving classification performance but also enhancing the transparency and reliability of model decisions. Effective model performance and feature selection are fundamental for explainability and reliability. In many cases, high-dimensional omics datasets suffer from a limited number of samples due to clinical constraints, patient conditions, phenotype rarity, and other conditions. Current omics-based classification models often suffer from narrow interpretability, making it difficult to discern meaningful insights where trust and reproducibility are critical. This study presents a machine-learning-based classification framework that integrates feature selection with data augmentation techniques to achieve high-standard classification accuracy while ensuring better interpretability. Using the publicly available dataset (E-MTAB-8026), we explore a bootstrap analysis in six binary classification scenarios to evaluate the proposed model's behaviour. We show that the proposed pipeline yields cross-validated performance on a small dataset that is conserved when the trained classifier is applied to a larger test set. Our findings emphasize the fundamental balance between accuracy and feature selection, highlighting the positive effect of introducing synthetic data for better generalization, even in scenarios with very limited sample availability.

09 Jan 2014

clustering-algorithms computer-science artificial-intelligence

Fighting Sample Degeneracy and Impoverishment in Particle Filters: A Review of Intelligent Approaches

Northwestern Polytechnical University University of Salamanca London South Bank University

During the last two decades there has been a growing interest in Particle Filtering (PF). However, PF suffers from two long-standing problems that are referred to as sample degeneracy and impoverishment. We are investigating methods that are particularly efficient at Particle Distribution Optimization (PDO) to fight sample degeneracy and impoverishment, with an emphasis on intelligent choices. These methods draw on techniques such as Markov Chain Monte Carlo methods, Mean-shift algorithms, artificial intelligence algorithms (e.g., Particle Swarm Optimization, Genetic Algorithm and Ant Colony Optimization), machine learning approaches (e.g., clustering, splitting and merging) and their hybrids, forming a coherent standpoint to enhance the particle filter. The working mechanism, interrelationship, pros and cons of these approaches are provided. In addition, approaches that are effective for dealing with high-dimensionality are reviewed. While improving the filter performance in terms of accuracy, robustness and convergence, it is noted that advanced techniques employed in PF often cause additional computational requirements that will in turn sacrifice the improvement obtained in real-life filtering. This fact, hidden in pure simulations, deserves the attention of the users and designers of new filters.
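To make the two problems named in this abstract concrete, here is a minimal sketch in plain Python. It is not taken from the paper: it uses the textbook effective sample size (ESS) as a degeneracy indicator and standard systematic resampling as the baseline remedy. The function names are illustrative assumptions. Note that resampling duplicates heavy particles, which is exactly what leads to impoverishment.

```python
import random


def effective_sample_size(weights):
    """ESS = 1 / sum(w_i^2) over normalized weights; a low ESS signals degeneracy."""
    total = sum(weights)
    normalized = [w / total for w in weights]
    return 1.0 / sum(w * w for w in normalized)


def systematic_resample(weights, rng=random):
    """Return particle indices drawn by systematic resampling (hypothetical helper)."""
    n = len(weights)
    total = sum(weights)
    # Build the cumulative distribution of the normalized weights.
    cumulative = []
    running = 0.0
    for w in weights:
        running += w / total
        cumulative.append(running)
    cumulative[-1] = 1.0  # guard against floating-point drift
    start = rng.random() / n  # one random offset in [0, 1/n)
    indices = []
    j = 0
    for i in range(n):
        position = start + i / n  # evenly spaced positions, always below 1
        while cumulative[j] < position:
            j += 1
        indices.append(j)
    return indices


if __name__ == "__main__":
    weights = [0.97, 0.01, 0.01, 0.01]  # one particle dominates: degenerate set
    print(effective_sample_size(weights))
    print(systematic_resample(weights))
```

After resampling a degenerate set like the one above, most surviving particles are copies of the same ancestor, so diversity is lost even though the weights are reset to uniform.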

04 Dec 2013

computer-science logic-in-computer-science software-engineering

Certifying Machine Code Safe from Hardware Aliasing: RISC is not necessarily risky

University of Birmingham London South Bank University

Sometimes machine code turns out to be a better target for verification than source code. RISC machine code is especially advantaged with respect to source code in this regard because it has only two instructions that access memory. That architecture forms the basis here for an inference system that can prove machine code safe against 'hardware aliasing', an effect that occurs in embedded systems. There are programming memes that ensure code is safe from hardware aliasing, but we want to certify that a given machine code is provably safe.

04 May 2022

computer-science artificial-intelligence explainable-ai

Visual Knowledge Discovery with Artificial Intelligence: Challenges and Future Directions

Central Washington University Darmstadt University of Applied Sciences London South Bank University Transilvania University ISEL-Instituto Superior de Engenharia de Lisboa NOVALINCS

This volume is devoted to the emerging field of Integrated Visual Knowledge Discovery that combines advances in Artificial Intelligence/Machine Learning (AI/ML) and Visualization/Visual Analytics. Chapters included are extended versions of the selected AI and Visual Analytics papers and related symposia at the recent International Information Visualization Conferences (IV2019 and IV2020). AI/ML face a long-standing challenge of explaining models to humans. Model explanation is fundamentally a human activity, not only an algorithmic one. In this chapter we aim to present challenges and future directions within the field of Visual Analytics, Visual Knowledge Discovery and AI/ML, and to discuss the role of visualization in visual AI/ML. In addition, we describe progress in emerging Full 2D ML, natural language processing, and AI/ML in multidimensional data aided by visual means.

29 Aug 2021

general-finance statistical-finance quantitative-finance

Stock index futures trading impact on spot price volatility. The CSI 300 studied with a TGARCH model

University of Leicester London South Bank University Bucharest University of Economic Studies

TGARCH modeling is argued to be the optimal basis for investigating the impact of index futures trading on spot price variability. We discuss the CSI-300 index (China-Shanghai-Shenzhen-300-Stock Index) as a test case. The results prove that the introduction of CSI-300 index futures (CSI-300-IF) trading significantly reduces the volatility in the corresponding spot market. It is also found that there is a stationary equilibrium relationship between the CSI-300 spot and CSI-300-IF markets. A bidirectional Granger causality is also detected. Finally, it is deduced that spot prices are predicted with greater accuracy over a 3 or 4 lag day time span.
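For readers unfamiliar with threshold GARCH, the sketch below shows a commonly used GJR-style TGARCH(1,1) variance recursion, in which negative shocks add an extra term `gamma` to the variance response. This is an assumption for illustration only: the abstract does not state the paper's exact specification (lag orders, variance versus standard-deviation form, or any additional regressors), and the parameter values in the usage line are made up.

```python
def tgarch_variance(shocks, omega, alpha, gamma, beta):
    """Conditional variances under a GJR-style threshold GARCH(1,1).

    sigma2[t] = omega + (alpha + gamma * 1[shock < 0]) * shock[t-1]^2 + beta * sigma2[t-1]
    """
    # Initialize with the sample variance of the (assumed zero-mean) shocks.
    sigma2 = [sum(s * s for s in shocks) / len(shocks)]
    for prev in shocks[:-1]:
        negative = 1.0 if prev < 0 else 0.0  # threshold indicator for bad news
        sigma2.append(omega + (alpha + gamma * negative) * prev * prev + beta * sigma2[-1])
    return sigma2


if __name__ == "__main__":
    # Hypothetical parameters, chosen only to show the asymmetry.
    print(tgarch_variance([0.01, -0.02, 0.015, 0.0], 1e-6, 0.05, 0.10, 0.90))
```

With `gamma > 0`, a negative shock raises next-period variance more than a positive shock of the same size, which is the asymmetry a threshold model is meant to capture.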

05 May 2025

ai-for-health computer-science machine-learning

Uncovering Population PK Covariates from VAE-Generated Latent Spaces

University of Padova London South Bank University

Population pharmacokinetic (PopPK) modelling is a fundamental tool for understanding drug behaviour across diverse patient populations and enabling personalized dosing strategies to improve therapeutic outcomes. A key challenge in PopPK analysis lies in identifying and modelling covariates that influence drug absorption, as these relationships are often complex and nonlinear. Traditional methods may fail to capture hidden patterns within the data. In this study, we propose a data-driven, model-free framework that integrates a Variational Autoencoder (VAE) deep learning model and LASSO regression to uncover key covariates from simulated tacrolimus pharmacokinetic (PK) profiles. The VAE compresses high-dimensional PK signals into a structured latent space, achieving accurate reconstruction with a mean absolute percentage error (MAPE) of 2.26%. LASSO regression is then applied to map patient-specific covariates to the latent space, enabling sparse feature selection through L1 regularization. This approach consistently identifies clinically relevant covariates for tacrolimus, including SNP, age, albumin, and hemoglobin, which are retained across the tested regularization strength levels, while effectively discarding non-informative features. The proposed VAE-LASSO methodology offers a scalable, interpretable, and fully data-driven solution for covariate selection, with promising applications in drug development and precision pharmacotherapy.
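Two quantities in this abstract have simple standard definitions, sketched below in plain Python. The first is MAPE as usually defined. The second is the soft-thresholding operator, which is how L1 regularization sets small coefficients exactly to zero in coordinate-descent LASSO solvers. Neither function is taken from the paper, and the example numbers are invented for illustration.

```python
def mape(actual, predicted):
    """Mean absolute percentage error in percent; assumes no actual value is zero."""
    errors = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)


def soft_threshold(value, penalty):
    """Shrink a coefficient toward zero; anything within the penalty becomes exactly 0."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


if __name__ == "__main__":
    # Invented reconstruction example: both points are off by 2 percent.
    print(mape([100.0, 200.0], [98.0, 204.0]))
    # Invented coefficients: the weak one is discarded, the strong ones shrink.
    print([soft_threshold(c, 0.2) for c in (0.5, 0.1, -0.5)])
```

Raising the penalty zeroes out more coefficients, which is why covariates that survive across several regularization strengths are read as the informative ones.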
