alphaXiv

Kennesaw State University

327

14 Aug 2024

computer-science cryptography-and-security physics

Bridging Quantum Computing and Differential Privacy: Insights into Quantum Computing Privacy

University of Houston Kennesaw State University University of Science & Technology of China

While quantum computing has strong potential in data-driven fields, the privacy issue of sensitive or valuable information involved in the quantum algorithm should be considered. Differential privacy (DP), which is a fundamental privacy tool widely used in the classical scenario, has been extended to the quantum domain, i.e., quantum differential privacy (QDP). QDP may become one of the most promising approaches toward privacy-preserving quantum computing since it is not only compatible with classical DP mechanisms but also achieves privacy protection by exploiting unavoidable quantum noise in noisy intermediate-scale quantum (NISQ) devices. This paper provides an overview of the various implementations of QDP and their performance in terms of privacy parameters under the DP setting. Specifically, we propose a taxonomy of QDP techniques, categorizing the literature on whether internal or external randomization is used as a source to achieve QDP and how these implementations are applied to each phase of the quantum algorithm. We also discuss challenges and future directions for QDP. By summarizing recent advancements, we hope to provide a comprehensive, up-to-date review for researchers venturing into this field.
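
As a point of reference for the classical DP mechanisms mentioned in this abstract, the following is a minimal sketch of the classical Laplace mechanism, not of any QDP implementation surveyed in the paper. The function name `laplace_mechanism` and its parameters are hypothetical, chosen only for illustration; the sketch assumes a numeric query with known sensitivity and a privacy budget `epsilon`.

```python
import random


def laplace_mechanism(true_value: float, sensitivity: float,
                      epsilon: float, rng: random.Random) -> float:
    """Return true_value plus Laplace noise of scale sensitivity / epsilon.

    Hypothetical helper for illustration: classical epsilon-DP for a
    numeric query whose output changes by at most `sensitivity` when one
    record changes.
    """
    if sensitivity <= 0 or epsilon <= 0:
        raise ValueError("sensitivity and epsilon must be positive")
    scale = sensitivity / epsilon
    # The difference of two i.i.d. exponential variables with rate 1/scale
    # follows a Laplace distribution centred at zero with the given scale.
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return true_value + noise


if __name__ == "__main__":
    rng = random.Random(42)
    # A counting query has sensitivity 1; a smaller epsilon means more noise.
    print(laplace_mechanism(100.0, sensitivity=1.0, epsilon=0.5, rng=rng))
```

A smaller `epsilon` gives a larger noise scale and therefore stronger privacy at the cost of accuracy, which is the trade-off that privacy parameters quantify.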

15 Sep 2025

ai-for-health attention-mechanisms computer-science

A Geometric Graph-Based Deep Learning Model for Drug-Target Affinity Prediction

Kennesaw State University University of Tennessee, Knoxville

In structure-based drug design, accurately estimating the binding affinity between a candidate ligand and its protein receptor is a central challenge. Recent advances in artificial intelligence, particularly deep learning, have demonstrated superior performance over traditional empirical and physics-based methods for this task, enabled by the growing availability of structural and experimental affinity data. In this work, we introduce DeepGGL, a deep convolutional neural network that integrates residual connections and an attention mechanism within a geometric graph learning framework. By leveraging multiscale weighted colored bipartite subgraphs, DeepGGL effectively captures fine-grained atom-level interactions in protein-ligand complexes across multiple scales. We benchmarked DeepGGL against established models on CASF-2013 and CASF-2016, where it achieved state-of-the-art performance with significant improvements across diverse evaluation metrics. To further assess robustness and generalization, we tested the model on the CSAR-NRC-HiQ dataset and the PDBbind v2019 holdout set. DeepGGL consistently maintained high predictive accuracy, highlighting its adaptability and reliability for binding affinity prediction in structure-based drug discovery.

14 Oct 2025

computer-science computation-and-language fine-tuning

COSTAR-A: A prompting framework for enhancing Large Language Model performance on Point-of-View questions

Kennesaw State University

Large Language Models (LLMs) are highly sensitive to prompt design, and developing optimized prompting techniques is crucial for generating consistent, high-quality outputs. In this study, we introduce COSTAR-A, a novel prompt engineering framework that enhances the existing COSTAR method, which stands for Context, Objective, Style, Tone, Audience, and Response, by adding the 'Answer' component at the end. We demonstrate that while the original COSTAR framework improves prompt clarity and aligns outputs for larger LLMs, its performance is less consistent with smaller, locally optimized models, particularly in tasks that require more directive or constrained outputs. Through a series of controlled prompt-output assessments with smaller (at most 8 billion parameters), fine-tuned models, we found that COSTAR-A can enhance the output structure and decisiveness of localized LLMs for certain tasks, although its effectiveness varies across models and use cases. Notably, the Llama 3.1-8B model exhibited performance improvements when prompted with COSTAR-A compared to COSTAR alone. These findings emphasize the adaptability and scalability of COSTAR-A as a prompting framework, particularly in computationally efficient AI deployments on resource-constrained hardware.
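
The component order described in this abstract can be sketched as a small prompt builder. This is only an illustration under stated assumptions: the `Label:` layout, the function name `build_costar_a_prompt`, and the example section texts are hypothetical, and the abstract does not specify how the paper formats each section or what the 'Answer' component should contain.

```python
from typing import Mapping

# Section order taken from the abstract: COSTAR plus a trailing Answer.
COSTAR_A_FIELDS = (
    "Context", "Objective", "Style", "Tone", "Audience", "Response", "Answer",
)


def build_costar_a_prompt(sections: Mapping[str, str]) -> str:
    """Assemble a prompt with one labelled block per COSTAR-A component.

    The plain "Label:" layout is an assumption made for this sketch.
    """
    missing = [name for name in COSTAR_A_FIELDS if name not in sections]
    if missing:
        raise ValueError("missing sections: " + ", ".join(missing))
    blocks = [f"{name}:\n{sections[name].strip()}" for name in COSTAR_A_FIELDS]
    return "\n\n".join(blocks)


if __name__ == "__main__":
    # All section texts below are hypothetical placeholders.
    prompt = build_costar_a_prompt({
        "Context": "A short story told by two narrators.",
        "Objective": "Identify whose point of view a passage is written from.",
        "Style": "Concise.",
        "Tone": "Neutral.",
        "Audience": "Literature students.",
        "Response": "A single sentence.",
        "Answer": "State the narrator's name first.",
    })
    print(prompt)
```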

24 Apr 2025

computer-science conversational-ai artificial-intelligence

Can Large Language Models Help Multimodal Language Analysis? MMLA: A Comprehensive Benchmark

Tsinghua University Tencent Inc Kennesaw State University

Multimodal language analysis is a rapidly evolving field that leverages multiple modalities to enhance the understanding of high-level semantics underlying human conversational utterances. Despite its significance, little research has investigated the capability of multimodal large language models (MLLMs) to comprehend cognitive-level semantics. In this paper, we introduce MMLA, a comprehensive benchmark specifically designed to address this gap. MMLA comprises over 61K multimodal utterances drawn from both staged and real-world scenarios, covering six core dimensions of multimodal semantics: intent, emotion, dialogue act, sentiment, speaking style, and communication behavior. We evaluate eight mainstream branches of LLMs and MLLMs using three methods: zero-shot inference, supervised fine-tuning, and instruction tuning. Extensive experiments reveal that even fine-tuned models achieve only about 60%~70% accuracy, underscoring the limitations of current MLLMs in understanding complex human language. We believe that MMLA will serve as a solid foundation for exploring the potential of large language models in multimodal language analysis and provide valuable resources to advance this field. The datasets and code are open-sourced at this https URL

22 Jul 2023

astrophysics-of-galaxies physics

M-$\sigma$ relations across space and time

Kennesaw State University California State University, Northridge

Feedback from active galactic nuclei (AGN) has long been invoked to explain the correlation between black hole mass and stellar velocity dispersion (M-$\sigma$) discovered in low redshift galaxies. We describe the time evolution of AGN in the M-$\sigma$ plane based on our gap model (Garofalo, Evans & Sambruna 2010) for black hole accretion and jet formation, illustrating a fundamental difference between jetted and non-jetted AGN. While the latter tend to evolve diagonally upward with black hole mass increasing along with stellar dispersion, we show that jetted AGN tend on average to move initially more upwards because their effect on velocity dispersion is weaker than for non-jetted AGN. But this initial phase is followed by a shift in the nature of the feedback, from positive to negative, a transition that is more dramatic on average in denser cluster environments. The feedback gets its kick from tilted jets which shut down star formation but increase velocity dispersion values. As this change in the nature of the feedback takes tens of millions to hundreds of millions of years, these cluster, merger-triggered jetted AGN will evolve more upwards for up to order $10^{8}$ years, followed by an extremely long phase in which low excitation progressively slows black hole growth but dramatically affects stellar dispersion. As a result, powerful jetted AGN evolve for most of their lives almost horizontally on the M-$\sigma$ plane. The prediction is that strongest AGN feedback on stellar dispersion is a late universe phenomenon with M87 a good example. We show how jetted and non-jetted AGN parallel the Sersic and core-Sersic galaxy paths in the M-$\sigma$ plane found by Sahu et al (2019) and how this leads to a prediction that jetted quasars are not core-Sersic galaxies as found for lower redshift jetted AGN.

12 Sep 2025

computer-science computer-vision-and-pattern-recognition

SFD-Mamba2Net: Structure-Guided Frequency-Enhanced Dual-Stream Mamba2 Network for Coronary Artery Segmentation

Kennesaw State University Michigan Technological University Sichuan Normal University The First Affiliated Hospital with Nanjing Medical University Education Big Data Collaborative Innovation Center of Sichuan 2011

Background: Coronary Artery Disease (CAD) is one of the leading causes of death worldwide. Invasive Coronary Angiography (ICA), regarded as the gold standard for CAD diagnosis, necessitates precise vessel segmentation and stenosis detection. However, ICA images are typically characterized by low contrast, high noise levels, and complex, fine-grained vascular structures, which pose significant challenges to the clinical adoption of existing segmentation and detection methods. Objective: This study aims to improve the accuracy of coronary artery segmentation and stenosis detection in ICA images by integrating multi-scale structural priors, state-space-based long-range dependency modeling, and frequency-domain detail enhancement strategies. Methods: We propose SFD-Mamba2Net, an end-to-end framework tailored for ICA-based vascular segmentation and stenosis detection. In the encoder, a Curvature-Aware Structural Enhancement (CASE) module is embedded to leverage multi-scale responses for highlighting slender tubular vascular structures, suppressing background interference, and directing attention toward vascular regions. In the decoder, we introduce a Progressive High-Frequency Perception (PHFP) module that employs multi-level wavelet decomposition to progressively refine high-frequency details while integrating low-frequency global structures. Results and Conclusions: SFD-Mamba2Net consistently outperformed state-of-the-art methods across eight segmentation metrics, and achieved the highest true positive rate and positive predictive value in stenosis detection.

09 Jun 2025

ai-for-health computer-science computation-and-language

Benchmarking Foundation Speech and Language Models for Alzheimer's Disease and Related Dementia Detection from Spontaneous Speech

Georgia Institute of Technology Kennesaw State University University of Texas Rio Grande Valley Shandong Mental Health Center

Background: Alzheimer's disease and related dementias (ADRD) are progressive neurodegenerative conditions where early detection is vital for timely intervention and care. Spontaneous speech contains rich acoustic and linguistic markers that may serve as non-invasive biomarkers for cognitive decline. Foundation models, pre-trained on large-scale audio or text data, produce high-dimensional embeddings encoding contextual and acoustic features. Methods: We used the PREPARE Challenge dataset, which includes audio recordings from over 1,600 participants with three cognitive statuses: healthy control (HC), mild cognitive impairment (MCI), and Alzheimer's Disease (AD). We excluded non-English, non-spontaneous, or poor-quality recordings. The final dataset included 703 (59.13%) HC, 81 (6.81%) MCI, and 405 (34.06%) AD cases. We benchmarked a range of open-source foundation speech and language models to classify cognitive status into the three categories. Results: The Whisper-medium model achieved the highest performance among speech models (accuracy = 0.731, AUC = 0.802). Among language models, BERT with pause annotation performed best (accuracy = 0.662, AUC = 0.744). ADRD detection using state-of-the-art automatic speech recognition (ASR) model-generated audio embeddings outperformed others. Including non-semantic features like pause patterns consistently improved text-based classification. Conclusion: This study introduces a benchmarking framework using foundation models and a clinically relevant dataset. Acoustic-based approaches -- particularly ASR-derived embeddings -- demonstrate strong potential for scalable, non-invasive, and cost-effective early detection of ADRD.

28 Oct 2025

mathematics optimization-and-control

Beyond Convexity: Proximal-Perturbed Lagrangian Methods for Efficient Functional Constrained Optimization

Purdue University Kennesaw State University Illinois State University

Non-convex functional constrained optimization problems have gained substantial attention in machine learning and data science, addressing broad requirements that typically go beyond the often performance-centric objectives. An influential class of algorithms for functional constrained problems is the class of primal-dual methods which has been extensively analyzed for convex problems. Nonetheless, the investigation of their efficacy for non-convex problems is under-explored. This paper develops a primal-dual algorithmic framework for solving such non-convex problems. This framework is built upon a novel form of the Lagrangian function, termed the Proximal-Perturbed Augmented Lagrangian, which enables the development of simple first-order algorithms that converge to a stationary solution under mild conditions. Notably, we study this framework under both non-smoothness and smoothness of the constraint function and provide three key contributions: (i) a simple algorithm that does not require the continuous adjustment of the penalty parameter; (ii) a non-asymptotic iteration complexity of $\widetilde{\mathcal{O}}(1/\epsilon^2)$; and (iii) extensive experimental results demonstrating the effectiveness of the proposed framework in terms of computational cost and performance, outperforming related approaches that use regularization (penalization) techniques and/or standard Lagrangian relaxation across diverse non-convex problems.

05 Feb 2023

computer-science artificial-intelligence human-computer-interaction

LiteVR: Interpretable and Lightweight Cybersickness Detection using Explainable AI

University of Missouri Kennesaw State University University of Texas at San Antonio

Cybersickness is a common ailment associated with virtual reality (VR) user experiences. Several automated methods exist based on machine learning (ML) and deep learning (DL) to detect cybersickness. However, most of these cybersickness detection methods are perceived as computationally intensive and black-box methods. Thus, those techniques are neither trustworthy nor practical for deploying on standalone energy-constrained VR head-mounted devices (HMDs). In this work, we present an explainable artificial intelligence (XAI)-based framework, LiteVR, for cybersickness detection, explaining the model's outcome and reducing the feature dimensions and overall computational costs. First, we develop three cybersickness DL models based on long short-term memory (LSTM), gated recurrent unit (GRU), and multilayer perceptron (MLP). Then, we employ a post-hoc explanation method, such as SHapley Additive Explanations (SHAP), to explain the results and extract the most dominant features of cybersickness. Finally, we retrain the DL models with the reduced number of features. Our results show that eye-tracking features are the most dominant for cybersickness detection. Furthermore, based on the XAI-based feature ranking and dimensionality reduction, we significantly reduce the model's size by up to 4.3x, training time by up to 5.6x, and its inference time by up to 3.8x, with higher cybersickness detection accuracy and low regression error (i.e., on Fast Motion Scale (FMS)). Our proposed lite LSTM model obtained an accuracy of 94% in classifying cybersickness and regressing (i.e., FMS 1-10) with a Root Mean Square Error (RMSE) of 0.30, which outperforms the state-of-the-art. Our proposed LiteVR framework can help researchers and practitioners analyze, detect, and deploy their DL-based cybersickness detection models in standalone VR HMDs.

02 Sep 2025

adversarial-attacks ai-for-cybersecurity computer-science

A Survey: Towards Privacy and Security in Mobile Large Language Models

Kennesaw State University Southwest Jiaotong University Georgia State University Nexa AI

Mobile Large Language Models (LLMs) are revolutionizing diverse fields such as healthcare, finance, and education with their ability to perform advanced natural language processing tasks on-the-go. However, the deployment of these models in mobile and edge environments introduces significant challenges related to privacy and security due to their resource-intensive nature and the sensitivity of the data they process. This survey provides a comprehensive overview of privacy and security issues associated with mobile LLMs, systematically categorizing existing solutions such as differential privacy, federated learning, and prompt encryption. Furthermore, we analyze vulnerabilities unique to mobile LLMs, including adversarial attacks, membership inference, and side-channel attacks, offering an in-depth comparison of their effectiveness and limitations. Despite recent advancements, mobile LLMs face unique hurdles in achieving robust security while maintaining efficiency in resource-constrained environments. To bridge this gap, we propose potential applications, discuss open challenges, and suggest future research directions, paving the way for the development of trustworthy, privacy-compliant, and scalable mobile LLM systems.

14 May 2025

computer-science machine-learning imitation-learning

Improving Network Threat Detection by Knowledge Graph, Large Language Model, and Imbalanced Learning

New York University Kennesaw State University Hewlett Packard Enterprise

Network threat detection has been challenging due to the complexities of attack activities and the limitation of historical threat data to learn from. To help enhance the existing practices of using analytics, machine learning, and artificial intelligence methods to detect the network threats, we propose an integrated modelling framework, where Knowledge Graph is used to analyze the users' activity patterns, Imbalanced Learning techniques are used to prune and weigh Knowledge Graph, and LLM is used to retrieve and interpret the users' activities from Knowledge Graph. The proposed framework is applied to Agile Threat Detection through Online Sequential Learning. The preliminary results show the improved threat capture rate by 3%-4% and the increased interpretabilities of risk predictions based on the users' activities.

08 Feb 2022

computer-science computer-vision-and-pattern-recognition machine-learning

The EMory BrEast imaging Dataset (EMBED): A Racially Diverse, Granular Dataset of 3.5M Screening and Diagnostic Mammograms

Georgia Institute of Technology Emory University Kennesaw State University

Developing and validating artificial intelligence models in medical imaging requires datasets that are large, granular, and diverse. To date, the majority of publicly available breast imaging datasets lack in one or more of these areas. Models trained on these data may therefore underperform on patient populations or pathologies that have not previously been encountered. The EMory BrEast imaging Dataset (EMBED) addresses these gaps by providing 3650,000 2D and DBT screening and diagnostic mammograms for 116,000 women divided equally between White and African American patients. The dataset also contains 40,000 annotated lesions linked to structured imaging descriptors and 61 ground truth pathologic outcomes grouped into six severity classes. Our goal is to share this dataset with research partners to aid in development and validation of breast AI models that will serve all patients fairly and help decrease bias in medical AI.

09 Jun 2025

computer-science continual-learning computation-and-language

ETT-CKGE: Efficient Task-driven Tokens for Continual Knowledge Graph Embedding

Michigan State University University of Michigan Mayo Clinic Kennesaw State University Bowling Green State University University of North Carolina, Charlotte University of Alabama – Birmingham

Continual Knowledge Graph Embedding (CKGE) seeks to integrate new knowledge while preserving past information. However, existing methods struggle with efficiency and scalability due to two key limitations: (1) suboptimal knowledge preservation between snapshots caused by manually designed node/relation importance scores that ignore graph dependencies relevant to the downstream task, and (2) computationally expensive graph traversal for node/relation importance calculation, leading to slow training and high memory overhead. To address these limitations, we introduce ETT-CKGE (Efficient, Task-driven, Tokens for Continual Knowledge Graph Embedding), a novel task-guided CKGE method that leverages efficient task-driven tokens for efficient and effective knowledge transfer between snapshots. Our method introduces a set of learnable tokens that directly capture task-relevant signals, eliminating the need for explicit node scoring or traversal. These tokens serve as consistent and reusable guidance across snapshots, enabling efficient token-masked embedding alignment between snapshots. Importantly, knowledge transfer is achieved through simple matrix operations, significantly reducing training time and memory usage. Extensive experiments across six benchmark datasets demonstrate that ETT-CKGE consistently achieves superior or competitive predictive performance, while substantially improving training efficiency and scalability compared to state-of-the-art CKGE methods. The code is available at: this https URL

27 Apr 2025

ai-for-health attention-mechanisms computer-science

Myocardial Region-guided Feature Aggregation Net for Automatic Coronary artery Segmentation and Stenosis Assessment using Coronary Computed Tomography Angiography

Shanghai University of Finance and Economics Kennesaw State University Michigan Technological University Zhengzhou University of Light Industry

Coronary artery disease (CAD) remains a leading cause of mortality worldwide, requiring accurate segmentation and stenosis detection using Coronary Computed Tomography Angiography (CCTA). Existing methods struggle with challenges such as low contrast, morphological variability and small vessel segmentation. To address these limitations, we propose the Myocardial Region-guided Feature Aggregation Net (MGFA-Net), a novel U-shaped dual-encoder architecture that integrates anatomical prior knowledge to enhance robustness in coronary artery segmentation. Our framework incorporates three key innovations: (1) a Myocardial Region-guided Module that directs attention to coronary regions via myocardial contour expansion and multi-scale feature fusion, (2) a Residual Feature Extraction Encoding Module that combines parallel spatial channel attention with residual blocks to enhance local-global feature discrimination, and (3) a Multi-scale Feature Fusion Module for adaptive aggregation of hierarchical vascular features. Additionally, Monte Carlo dropout quantifies prediction uncertainty, supporting clinical interpretability. For stenosis detection, a morphology-based centerline extraction algorithm separates the vascular tree into anatomical branches, enabling cross-sectional area quantification and stenosis grading. The superiority of MGFA-Net was demonstrated by achieving a Dice score of 85.04%, an accuracy of 84.24%, an HD95 of 6.1294 mm, and an improvement of 5.46% in true positive rate for stenosis detection compared to 3D U-Net. The integrated segmentation-to-stenosis pipeline provides automated, clinically interpretable CAD assessment, bridging deep learning with anatomical prior knowledge for precision medicine. Our code is publicly available at this http URL

15 Jul 2024

computer-science human-computer-interaction multimedia

An Avalanche of Images on Telegram Preceded Russia's Full-Scale Invasion of Ukraine

University of Notre Dame Kennesaw State University Colby College

Governments use propaganda, including through visual content -- or Politically Salient Image Patterns (PSIP) -- on social media, to influence and manipulate public opinion. In the present work, we collected the Telegram post-history of 989 Russian milbloggers to better understand the social and political narratives that circulated online in the months surrounding Russia's 2022 full-scale invasion of Ukraine. Overall, we found an 8,925% increase (p<0.001) in the number of posts and a 5,352% increase (p<0.001) in the number of images posted by these accounts in the two weeks prior to the invasion. We also observed a similar increase in the number and intensity of politically salient manipulated images that circulated on Telegram. Although this paper does not evaluate malice or coordination in these activities, we do conclude with a call for further research into the role that manipulated visual media has in the lead-up to instability events and armed conflict.

04 Oct 2024

high-energy-physics-phenomenology physics

A general mass variable flavor number scheme for $Z$ boson production in association with a heavy quark at hadron colliders

Michigan State University Southern Methodist University Florida State University Kennesaw State University University at Buffalo, the State University of New York

We present a methodology to streamline implementation of massive-quark radiative contributions in calculations with a variable number of active partons in proton-proton collisions. The methodology introduces subtraction and residual heavy-quark parton distribution functions (PDFs) to implement calculations in the Aivazis-Collins-Olness-Tung (ACOT) factorization scheme and its simplified realization in various processes up to the next-to-the-next-to-leading order in the QCD coupling strength. Interpolation tables for bottom-quark subtraction and residual distributions for CT18 NLO and NNLO PDF ensembles are provided in the common LHAPDF6 format. A numerical calculation of $Z$-boson production with at least one $b$ jet at the Large Hadron Collider beyond the lowest order in QCD is considered for illustration purposes.

09 Sep 2024

attention-mechanisms computer-science computer-vision-and-pattern-recognition

3D Lymphoma Segmentation on PET/CT Images via Multi-Scale Information Fusion with Cross-Attention

Peking University Kennesaw State University Michigan Technological University Zhengzhou University of Light Industry

Background: Accurate segmentation of diffuse large B-cell lymphoma (DLBCL) lesions is challenging due to their complex patterns in medical imaging. Objective: This study aims to develop a precise segmentation method for DLBCL using 18F-Fluorodeoxyglucose (FDG) positron emission tomography (PET) and computed tomography (CT) images. Methods: We propose a 3D dual-branch encoder segmentation method using shifted window transformers and a Multi-Scale Information Fusion (MSIF) module. To enhance feature integration, the MSIF module performs multi-scale feature fusion using cross-attention mechanisms with a shifted window framework. A gated neural network within the MSIF module dynamically balances the contributions from each modality. The model was optimized using the Dice Similarity Coefficient (DSC) loss function. Additionally, we computed the total metabolic tumor volume (TMTV) and performed statistical analyses. Results: The model was trained and validated on a dataset of 165 DLBCL patients using 5-fold cross-validation, achieving a DSC of 0.7512. Statistical analysis showed a significant improvement over comparative methods (p < 0.05). Additionally, a Pearson correlation coefficient of 0.91 and an R^2 of 0.89 were observed when comparing manual annotations to segmentation results for TMTV measurement. Conclusion: This study presents an effective automatic segmentation method for DLBCL that leverages the complementary strengths of PET and CT imaging. Our method has the potential to improve diagnostic interpretations and assist in treatment planning for DLBCL patients.
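
For readers unfamiliar with the metric named in this abstract, the Dice Similarity Coefficient is commonly defined as $2|A \cap B| / (|A| + |B|)$. The sketch below computes it, and the corresponding loss, on flattened masks. It is a generic illustration rather than the paper's implementation: the function names are hypothetical, the smoothing term `eps` is an assumption, and a real pipeline would operate on 3D volumes.

```python
from typing import Sequence


def dice_coefficient(pred: Sequence[float], target: Sequence[float],
                     eps: float = 1e-7) -> float:
    """Dice similarity between two flattened masks with values in [0, 1].

    `eps` (an assumption of this sketch) avoids division by zero when
    both masks are empty.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    return (2.0 * intersection + eps) / (total + eps)


def dice_loss(pred: Sequence[float], target: Sequence[float]) -> float:
    """Loss that decreases as the overlap between the masks increases."""
    return 1.0 - dice_coefficient(pred, target)


if __name__ == "__main__":
    predicted = [1, 1, 0, 0]
    reference = [1, 0, 1, 0]
    # One overlapping voxel out of two foreground voxels in each mask.
    print(dice_coefficient(predicted, reference))
    print(dice_loss(predicted, reference))
```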

27 Mar 2024

computer-science artificial-intelligence computer-vision-and-pattern-recognition

FTBC: Forward Temporal Bias Correction for Optimizing ANN-SNN Conversion

Mohamed bin Zayed University of Artificial Intelligence Kennesaw State University City University of Macau ProtagoLabs Inc.

Spiking Neural Networks (SNNs) offer a promising avenue for energy-efficient computing compared with Artificial Neural Networks (ANNs), closely mirroring biological neural processes. However, this potential comes with inherent challenges in directly training SNNs through spatio-temporal backpropagation -- stemming from the temporal dynamics of spiking neurons and their discrete signal processing -- which necessitates alternative ways of training, most notably through ANN-SNN conversion. In this work, we introduce a lightweight Forward Temporal Bias Correction (FTBC) technique, aimed at enhancing conversion accuracy without the computational overhead. We ground our method on provided theoretical findings that through proper temporal bias calibration the expected error of ANN-SNN conversion can be reduced to be zero after each time step. We further propose a heuristic algorithm for finding the temporal bias only in the forward pass, thus eliminating the computational burden of backpropagation and we evaluate our method on CIFAR-10/100 and ImageNet datasets, achieving a notable increase in accuracy on all datasets. Codes are released at a GitHub repository.

09 Sep 2022

computer-science computers-and-society

Impacts and Integration of Remote-First Working Environments

Kennesaw State University University of Guelph

Due to the Covid-19 pandemic in 2020 or other business decisions, remote work is becoming increasingly popular. "Remote first" working environments exist within companies where most employees work remotely. This paper takes a deep dive into the remote-first mentality. It investigates its effects on employees at varying stages in their careers, day-to-day productivity, and working relationships with team members. We found that the remote-first mentality most impacts seasoned employees and managers, potentially due to trouble adjusting to a new way of working compared to the rest of their careers and the "always on" mentality associated with working from home. Regarding productivity, we found that while software development productivity appears unimpacted, the effectiveness of communication and employee wellbeing saw declines which are generally associated with lowered productivity. Finally, we looked closer at the communication side of things and how remote work impacts relationship building. We found that the most significant impacts on relationship building centered around "trust" and "credibility" being harder to build due to a lack of non-verbal cues during social interactions.

31 Jan 2025

computer-science artificial-intelligence machine-learning

Activation Sparsity Opportunities for Compressing General Large Language Models

Kennesaw State University

Deploying local AI models, such as Large Language Models (LLMs), to edge devices can substantially enhance devices' independent capabilities, alleviate the server's burden, and lower the response time. Owing to this tremendous potential, many big tech companies have released several lightweight Small Language Models (SLMs) to bridge this gap. However, there is still strong motivation to deploy more powerful AI models (LLMs) on edge devices and enhance their smartness level. Unlike the conventional approaches for AI model compression, we investigate activation sparsity. The activation sparsity method is orthogonal to and combinable with existing techniques to maximize the compression rate while maintaining great accuracy. Because LLMs' Feed-Forward Network (FFN) components typically comprise a large proportion of parameters (around 2/3), our FFN optimizations have a better chance of achieving effective compression. Moreover, our findings are beneficial to general LLMs and are not restricted to ReLU-based models. This work systematically investigates the tradeoff between enforcing activation sparsity and perplexity (accuracy) on state-of-the-art LLMs. Our empirical analysis demonstrates that we can obtain around 50% of main memory and computing reductions for critical FFN components with negligible accuracy degradation. This extra 50% sparsity does not naturally exist in the current LLMs; obtaining it requires tuning LLMs' activation outputs by injecting zero-enforcing thresholds. To obtain the benefits of activation sparsity, we provide a guideline for the system architect for LLM prediction and prefetching. Successful prediction allows the system to prefetch the necessary weights while omitting the inactive ones and their successors, therefore lowering cache and memory pollution and reducing LLM execution time on resource-constrained edge devices.
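
The idea of a zero-enforcing threshold on activation outputs can be illustrated with a toy example on a plain list of activation values. This is a sketch under stated assumptions, not the paper's procedure: the function names are hypothetical, and choosing the threshold from a target sparsity level is an assumption made here for illustration, since the abstract does not say how thresholds are tuned.

```python
import math
from typing import List, Sequence


def enforce_sparsity(activations: Sequence[float], threshold: float) -> List[float]:
    """Zero every activation whose magnitude does not exceed the threshold."""
    if threshold < 0:
        raise ValueError("threshold must be non-negative")
    return [a if abs(a) > threshold else 0.0 for a in activations]


def sparsity_ratio(activations: Sequence[float]) -> float:
    """Fraction of activations that are exactly zero."""
    if not activations:
        return 0.0
    return sum(1 for a in activations if a == 0.0) / len(activations)


def threshold_for_target(activations: Sequence[float], target: float) -> float:
    """Smallest magnitude threshold that zeroes at least `target` of the values.

    Selecting the threshold this way is an assumption of this sketch.
    """
    if not 0.0 <= target <= 1.0:
        raise ValueError("target must be between 0 and 1")
    magnitudes = sorted(abs(a) for a in activations)
    k = math.ceil(target * len(magnitudes))
    return magnitudes[k - 1] if k > 0 else 0.0


if __name__ == "__main__":
    ffn_activations = [0.9, -0.05, 0.3, -0.7, 0.01, 0.2, -0.4, 0.02]
    t = threshold_for_target(ffn_activations, target=0.5)
    sparse = enforce_sparsity(ffn_activations, t)
    print(t, sparse, sparsity_ratio(sparse))
```

Weights that multiply a zeroed activation contribute nothing to the next layer, which is why a system that can predict the zeros in advance can skip loading those weights.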
