alphaXiv

History

Papers Benchmarks

Michigan State University

2,033

13 Jun 2024

computer-science emerging-technologies networking-and-internet-architecture

Harnessing Quantum Entanglement: Comprehensive Strategies for Enhanced Communication and Beyond in Quantum Networks

Michigan State University

Amit Kumar Bhuyan and Hrishikesh Dutta's review from Michigan State University offers a comprehensive, holistic overview of quantum communication, detailing its theoretical foundations, practical protocols, and network architectures. The work synthesizes advancements across the field, underscoring quantum entanglement as the driving force behind secure and efficient information exchange in quantum networks.

915

01 Feb 2024

computer-science artificial-intelligence machine-learning

Cumulative Distribution Function based General Temporal Point Processes

Michigan State University

City University of Hong Kong Jinan University HIT (Shenzhen)

Temporal Point Processes (TPPs) hold a pivotal role in modeling event sequences across diverse domains, including social networking and e-commerce, and have significantly contributed to the advancement of recommendation systems and information retrieval strategies. Through the analysis of events such as user interactions and transactions, TPPs offer valuable insights into behavioral patterns, facilitating the prediction of future trends. However, accurately forecasting future events remains a formidable challenge due to the intricate nature of these patterns. The integration of Neural Networks with TPPs has ushered in the development of advanced deep TPP models. While these models excel at processing complex and nonlinear temporal data, they encounter limitations in modeling intensity functions, grapple with computational complexities in integral computations, and struggle to capture long-range temporal dependencies effectively. In this study, we introduce the CuFun model, representing a novel approach to TPPs that revolves around the Cumulative Distribution Function (CDF). CuFun stands out by uniquely employing a monotonic neural network for CDF representation, utilizing past events as a scaling factor. This innovation significantly bolsters the model's adaptability and precision across a wide range of data scenarios. Our approach addresses several critical issues inherent in traditional TPP modeling: it simplifies log-likelihood calculations, extends applicability beyond predefined density function forms, and adeptly captures long-range temporal patterns. Our contributions encompass the introduction of a pioneering CDF-based TPP model, the development of a methodology for incorporating past event information into future event prediction, and empirical validation of CuFun's effectiveness through extensive experimentation on synthetic and real-world datasets.

8,349

08 Jan 2025

computer-science computation-and-language information-retrieval

Retrieval-Augmented Generation with Graphs (GraphRAG)

Michigan State University

Meta Pacific Northwest National Laboratory Amazon University of Oregon Snap Inc.Hippocratic AI The Home Depot

Adobe

Wang Yu

Kai Guo

A comprehensive survey details the field of Retrieval-Augmented Generation with Graphs (GraphRAG), proposing a unified framework for integrating graph-structured data into RAG systems and specializing its application across ten distinct domains, providing a structured understanding of current techniques and future research directions.

358

23 Jan 2025

computer-science machine-learning facial-recognition

Coseparable Nonnegative Tensor Factorization With T-CUR Decomposition

Michigan State University

Fudan University

Nonnegative Matrix Factorization (NMF) is an important unsupervised learning method to extract meaningful features from data. To address the NMF problem within a polynomial time framework, researchers have introduced a separability assumption, which has recently evolved into the concept of coseparability. This advancement offers a more efficient core representation for the original data. However, in the real world, the data is more natural to be represented as a multi-dimensional array, such as images or videos. The NMF's application to high-dimensional data involves vectorization, which risks losing essential multi-dimensional correlations. To retain these inherent correlations in the data, we turn to tensors (multidimensional arrays) and leverage the tensor t-product. This approach extends the coseparable NMF to the tensor setting, creating what we term coseparable Nonnegative Tensor Factorization (NTF). In this work, we provide an alternating index selection method to select the coseparable core. Furthermore, we validate the t-CUR sampling theory and integrate it with the tensor Discrete Empirical Interpolation Method (t-DEIM) to introduce an alternative, randomized index selection process. These methods have been tested on both synthetic and facial analysis datasets. The results demonstrate the efficiency of coseparable NTF when compared to coseparable NMF.

3,378

30 Sep 2024

computer-science computation-and-language

TrustLLM: Trustworthiness in Large Language Models

Michigan State University

University of Illinois at Urbana-Champaign

University of California, Santa Barbara

Harvard University

UCLA

Carnegie Mellon University

University of Notre Dame

University of Southern California

UC Berkeley

Georgia Institute of Technology

Stanford University Illinois Institute of Technology

Texas A&M University

Yale University

Northwestern University

University of Georgia

Microsoft

Columbia University Lehigh University University of Illinois Chicago

Johns Hopkins University

University of Maryland

University of Wisconsin-Madison Massachusetts General Hospital

Mohamed bin Zayed University of Artificial Intelligence Salesforce Research Institut Polytechnique de Paris

Duke University

Virginia Tech William & Mary Florida International University UNC-Chapel Hill CISPA Lawrence Livermore National Laboratory Samsung IBM Research AI Drexel University University of Tennessee, Knoxville

Meng Jiang

Quanxin Mei

The TRUSTLLM framework and benchmark offer a comprehensive system for evaluating the trustworthiness of large language models across six key dimensions. This work reveals that while proprietary models generally exhibit higher trustworthiness, open-source models can also achieve strong performance in specific areas, highlighting challenges like 'over-alignment' and data leakage.

515

9,890

01 Aug 2025

computer-science artificial-intelligence computation-and-language

A Survey on Post-training of Large Language Models

Michigan State University

University of Illinois at Urbana-Champaign

University of Georgia Lehigh University

The University of Hong Kong

Huazhong University of Science and Technology Salesforce Research University of Illinois at Chicago

Duke University Jilin University

Southern University of Science and Technology Worcester Polytechnic Institute LinkedIn Corporation Squirrel Ai Learning

qin chen

This survey offers the first comprehensive review of Post-training Language Models (PoLMs), systematically classifying methods, datasets, and applications within a novel intellectual framework. It traces the evolution of LLMs across five core paradigms—Fine-tuning, Alignment, Reasoning, Efficiency, and Integration & Adaptation—and identifies critical future research directions.

289

27 Oct 2025

computer-science artificial-intelligence computation-and-language

A Comprehensive Survey on Reinforcement Learning-based Agentic Search: Foundations, Roles, Optimizations, Evaluations, and Applications

Michigan State University

Microsoft

The Pennsylvania State University Amazon IBM T.J. Watson Research Center The University of Utah

Researchers from Penn State, in collaboration with industry partners, provide the first comprehensive survey of Reinforcement Learning-based agentic search, systematically organizing its foundational concepts, functional roles, optimization strategies, and applications. This work clarifies the interplay between RL and agentic LLMs, delineating current capabilities, evaluation methods, and critical future research directions.

1,103

16 Mar 2025

computer-science computation-and-language machine-learning

SVD-LLM: Truncation-aware Singular Value Decomposition for Large Language Model Compression

Michigan State University

The Ohio State University

SVD-LLM introduces a post-training compression method for Large Language Models that employs a truncation-aware Singular Value Decomposition and a sequential parameter update strategy to maintain accuracy at high compression ratios. The method achieves significant perplexity reduction and inference efficiency, outperforming existing SVD-based, pruning, and 1-bit quantization approaches while also enabling simultaneous model and KV cache compression.

115

209

03 Oct 2025

computer-science artificial-intelligence embedding-methods

Understanding Generative Recommendation with Semantic IDs from a Model-scaling View

Michigan State University Snap Inc.

Researchers from Snap Inc. and Michigan State University systematically investigated the model scaling behaviors of two generative recommendation paradigms: Semantic ID-based GR and LLM-as-RS. Their findings reveal that SID-based GR suffers from fundamental bottlenecks preventing effective scaling, while LLM-as-RS demonstrates superior and consistent performance improvements with increased model size, achieving up to 20% better Recall@5 than SID-based GR.

11,234

24 Oct 2025

adversarial-attacks agentic-frameworks agents

Memory Injection Attacks on LLM Agents via Query-Only Interaction

Michigan State University

University of Georgia Singapore Management University

Researchers developed MINJA, a method for memory injection attacks on Large Language Model (LLM) agents through query-only interaction, without requiring direct memory access or modification of victim queries. This approach successfully compromises agents' long-term memory, achieving an average injection success rate of 98.2% and an attack success rate of 76.8% across various agents, LLMs, and tasks, and demonstrated resilience against common defense strategies.

463

10 Oct 2025

agentic-frameworks agents chain-of-thought

How Memory Management Impacts LLM Agents: An Empirical Study of Experience-Following Behavior

Michigan State University

Harvard University

University of Georgia University of Minnesota Twin Cities

An empirical study clarifies the impact of memory addition and deletion on the long-term performance of Large Language Model (LLM) agents, identifying the "experience-following property" and demonstrating how reliable trajectory evaluators are essential for managing self-generated, noisy experiences to achieve consistent self-improvement.

4,461

23 May 2024

computer-science artificial-intelligence computation-and-language

Efficient Large Language Models: A Survey

Michigan State University

Imperial College London

University of Michigan

Google Research

The Ohio State University Boson AI Amazon AWS AI

This survey paper systematically reviews and categorizes a wide array of techniques designed to enhance the efficiency of Large Language Models (LLMs), addressing their substantial computational and memory demands. It establishes a three-tiered taxonomy that encompasses model-centric, data-centric, and framework-level approaches, providing a comprehensive resource for researchers and practitioners.

1,094

1,082

17 Oct 2025

computer-science information-retrieval

RAG vs. GraphRAG: A Systematic Evaluation and Key Insights

Michigan State University

Meta University of Oregon

Wang Yu

Kai Guo

This research systematically evaluates Retrieval-Augmented Generation (RAG) and various GraphRAG methods on text-based question answering and summarization tasks using established benchmarks. It provides empirical evidence differentiating their strengths, showing RAG excels at direct factual recall while specific GraphRAG variants perform better on multi-hop reasoning, and demonstrates improved performance with hybrid strategies.

398

04 Apr 2024

computer-science artificial-intelligence machine-learning

SalUn: Empowering Machine Unlearning via Gradient-based Weight Saliency in Both Image Classification and Generation

Michigan State University IBM Research

University of Pennsylvania

SA LUN (Saliency Unlearning) introduces a principled framework that uses gradient-based weight saliency to selectively update model parameters, achieving efficient and stable machine unlearning in both image classification and image generation tasks. It consistently reduces the performance gap with exact retraining and effectively prevents generative models from creating undesirable content.

112

1,023

06 Dec 2024

computer-science computation-and-language machine-learning

Rethinking Machine Unlearning for Large Language Models

Michigan State University

Stanford University IBM Research University of North Carolina at Chapel Hill

MIT

University of California, Santa Cruz

Stephen Casper

Ken Liu

We explore machine unlearning (MU) in the domain of large language models (LLMs), referred to as LLM unlearning. This initiative aims to eliminate undesirable data influence (e.g., sensitive or illegal information) and the associated model capabilities, while maintaining the integrity of essential knowledge generation and not affecting causally unrelated information. We envision LLM unlearning becoming a pivotal element in the life-cycle management of LLMs, potentially standing as an essential foundation for developing generative AI that is not only safe, secure, and trustworthy, but also resource-efficient without the need of full retraining. We navigate the unlearning landscape in LLMs from conceptual formulation, methodologies, metrics, and applications. In particular, we highlight the often-overlooked aspects of existing LLM unlearning research, e.g., unlearning scope, data-model interaction, and multifaceted efficacy assessment. We also draw connections between LLM unlearning and related areas such as model editing, influence functions, model explanation, adversarial training, and reinforcement learning. Furthermore, we outline an effective assessment framework for LLM unlearning and explore its applications in copyright and privacy safeguards and sociotechnical harm reduction.

889

05 Aug 2025

agentic-frameworks agents computer-science

A Survey of WebAgents: Towards Next-Generation AI Agents for Web Automation with Large Foundation Models

Michigan State University

City University of Hong Kong

The Hong Kong Polytechnic University University of Illinois at Chicago

A comprehensive survey categorizes existing research on LFM-empowered WebAgents by analyzing their architectures, training methodologies, and trustworthiness dimensions. It consolidates fragmented knowledge, highlighting advancements in web task automation and identifying critical gaps in areas like safety, privacy, and generalizability.

590

22 Oct 2025

computer-science contrastive-learning artificial-intelligence

Embedding in Recommender Systems: A Survey

Michigan State University

City University of Hong Kong Hong Kong Polytechnic University Baidu Inc

Recommender systems have become an essential component of many online platforms, providing personalized recommendations to users. A crucial aspect is embedding techniques that convert the high-dimensional discrete features, such as user and item IDs, into low-dimensional continuous vectors, which can enhance the recommendation performance. Embedding techniques have revolutionized the capture of complex entity relationships, generating significant research interest. This survey presents a comprehensive analysis of recent advances in recommender system embedding techniques. We examine centralized embedding approaches across matrix, sequential, and graph structures. In matrix-based scenarios, collaborative filtering generates embeddings that effectively model user-item preferences, particularly in sparse data environments. For sequential data, we explore various approaches including recurrent neural networks and self-supervised methods such as contrastive and generative learning. In graph-structured contexts, we analyze techniques like node2vec that leverage network relationships, along with applicable self-supervised methods. Our survey addresses critical scalability challenges in embedding methods and explores innovative directions in recommender systems. We introduce emerging approaches, including AutoML, hashing techniques, and quantization methods, to enhance performance while reducing computational complexity. Additionally, we examine the promising role of Large Language Models (LLMs) in embedding enhancement. Through detailed discussion of various architectures and methodologies, this survey aims to provide a thorough overview of state-of-the-art embedding techniques in recommender systems, while highlighting key challenges and future research directions.

513

17 Apr 2025

computer-science computation-and-language computer-vision-and-pattern-recognition

Do Vision-Language Models Represent Space and How? Evaluating Spatial Frame of Reference Under Ambiguities

Michigan State University

University of Waterloo

University of Michigan Vector Institute

Freda Shi

Ziqiao “Martin” Ma

A new evaluation protocol assesses Vision-Language Models' spatial understanding, revealing that current models predominantly rely on English-centric egocentric perspectives for spatial reasoning and lack robustness and flexibility in adopting alternative Frames of Reference.

360

27 May 2025

adversarial-robustness computer-science computation-and-language

Towards LLM Unlearning Resilient to Relearning Attacks: A Sharpness-Aware Minimization Perspective and Beyond

Michigan State University IBM Research

University of Minnesota Amazon

The LLM unlearning technique has recently been introduced to comply with data regulations and address the safety and ethical concerns of LLMs by removing the undesired data-model influence. However, state-of-the-art unlearning methods face a critical vulnerability: they are susceptible to ``relearning'' the removed information from a small number of forget data points, known as relearning attacks. In this paper, we systematically investigate how to make unlearned models robust against such attacks. For the first time, we establish a connection between robust unlearning and sharpness-aware minimization (SAM) through a unified robust optimization framework, in an analogy to adversarial training designed to defend against adversarial attacks. Our analysis for SAM reveals that smoothness optimization plays a pivotal role in mitigating relearning attacks. Thus, we further explore diverse smoothing strategies to enhance unlearning robustness. Extensive experiments on benchmark datasets, including WMDP and MUSE, demonstrate that SAM and other smoothness optimization approaches consistently improve the resistance of LLM unlearning to relearning attacks. Notably, smoothness-enhanced unlearning also helps defend against (input-level) jailbreaking attacks, broadening our proposal's impact in robustifying LLM unlearning. Codes are available at this https URL

1,660

29 Dec 2024

computer-science computation-and-language computer-vision-and-pattern-recognition

Vision-and-Language Navigation Today and Tomorrow: A Survey in the Era of Foundation Models

Michigan State University

University of Michigan University of North Carolina at Chapel Hill University of Adelaide

Parisa Kordjamshidi

A comprehensive survey reviews Vision-and-Language Navigation (VLN) research, detailing the integration of foundation models like LLMs and VLMs into VLN systems and outlining the resulting progress, opportunities, and challenges in the field. The paper highlights how these models enhance multimodal comprehension, reasoning, and generalization across various VLN tasks, particularly through improved world modeling, human instruction interpretation, and agent planning capabilities.

185

There are no more papers matching your filters at the moment.

Events

Personalize Your Feed

Install Browser Extension

We're hiring

alphaXiv

Explore

State of the Art

Sign In

Labs

Feedback

Dark mode

Harnessing Quantum Entanglement: Comprehensive Strategies for Enhanced Communication and Beyond in Quantum Networks

Cumulative Distribution Function based General Temporal Point Processes

Retrieval-Augmented Generation with Graphs (GraphRAG)

Coseparable Nonnegative Tensor Factorization With T-CUR Decomposition

TrustLLM: Trustworthiness in Large Language Models

A Survey on Post-training of Large Language Models

A Comprehensive Survey on Reinforcement Learning-based Agentic Search: Foundations, Roles, Optimizations, Evaluations, and Applications

SVD-LLM: Truncation-aware Singular Value Decomposition for Large Language Model Compression

Understanding Generative Recommendation with Semantic IDs from a Model-scaling View

Memory Injection Attacks on LLM Agents via Query-Only Interaction

How Memory Management Impacts LLM Agents: An Empirical Study of Experience-Following Behavior

Efficient Large Language Models: A Survey

RAG vs. GraphRAG: A Systematic Evaluation and Key Insights

SalUn: Empowering Machine Unlearning via Gradient-based Weight Saliency in Both Image Classification and Generation

Rethinking Machine Unlearning for Large Language Models

A Survey of WebAgents: Towards Next-Generation AI Agents for Web Automation with Large Foundation Models

Embedding in Recommender Systems: A Survey

Do Vision-Language Models Represent Space and How? Evaluating Spatial Frame of Reference Under Ambiguities

Towards LLM Unlearning Resilient to Relearning Attacks: A Sharpness-Aware Minimization Perspective and Beyond

Vision-and-Language Navigation Today and Tomorrow: A Survey in the Era of Foundation Models

Events

AI for Law

Personalize Your Feed