In this paper, we propose EDIT (Encoder-Decoder Image Transformer), a novel architecture designed to mitigate the attention sink phenomenon observed in Vision Transformer models. Attention sink occurs when an excessive amount of attention is allocated to the [CLS] token, distorting the model's ability to effectively process image patches. To address this, we introduce a layer-aligned encoder-decoder architecture, in which the encoder uses self-attention to process image patches while the decoder uses cross-attention to focus on the [CLS] token. Unlike traditional encoder-decoder frameworks, where the decoder depends solely on high-level encoder representations, EDIT allows the decoder to extract information starting from low-level features and to progressively refine the representation layer by layer. EDIT is naturally interpretable, as demonstrated through sequential attention maps that illustrate its refined, layer-by-layer focus on key image features. Experiments on ImageNet-1k and ImageNet-21k, along with transfer learning tasks, show that EDIT achieves consistent performance improvements over DeiT3 models. These results highlight the effectiveness of EDIT's design in addressing attention sink and improving visual feature extraction.
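The layer-aligned decoding described above can be illustrated with a short sketch. This is a minimal PyTorch illustration, not the authors' implementation: the module names, dimensions, the single-[CLS]-query design, and the use of nn.MultiheadAttention are assumptions; it only shows how a decoder layer can cross-attend to the encoder features of the same depth and expose a per-layer attention map for interpretation.

```python
import torch
import torch.nn as nn

class LayerAlignedDecoderLayer(nn.Module):
    """One decoder layer: the [CLS] query cross-attends to same-depth encoder features."""
    def __init__(self, dim=384, heads=6):
        super().__init__()
        self.cross_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)
        self.mlp = nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))

    def forward(self, cls_token, patch_tokens):
        # cls_token: (B, 1, D); patch_tokens: (B, N, D) from the encoder layer at the same depth
        attn_out, attn_map = self.cross_attn(cls_token, patch_tokens, patch_tokens)
        cls_token = cls_token + attn_out
        cls_token = cls_token + self.mlp(self.norm(cls_token))
        return cls_token, attn_map  # attn_map gives the per-layer attention view

# Minimal usage: refine the [CLS] representation layer by layer over stored encoder outputs
B, N, D, depth = 2, 196, 384, 12
encoder_features = [torch.randn(B, N, D) for _ in range(depth)]  # one tensor per encoder layer
cls = torch.zeros(B, 1, D)
decoder = nn.ModuleList([LayerAlignedDecoderLayer(D) for _ in range(depth)])
for layer, feats in zip(decoder, encoder_features):
    cls, attn = layer(cls, feats)
```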
This paper presents Fourier Neural Operators (FNOs) as a method for physical layer processing in next-generation MIMO systems, addressing challenges like near-field propagation and continuous apertures by learning function-to-function mappings. It demonstrates FNOs' ability to accurately model holographic MIMO channels within 2.04 ms and to achieve superior channel estimation with lower Normalized Mean Squared Error (NMSE) for flexible intelligent metasurfaces compared to existing methods.
The advent of Rydberg atomic quantum receivers (RAQRs) offers a new solution for the evolution of wireless transceiver architectures, promising unprecedented sensitivity and immunity to thermal noise. However, RAQRs introduce a unique non-linear signal model based on biased phase retrieval, which complicates fundamental channel estimation tasks. Traditional iterative algorithms often struggle in low signal-to-noise-ratio regimes and fail to capture complex, non-ideal system characteristics. To address this, we propose a novel model-driven deep learning framework for channel estimation in RAQRs. Specifically, we propose a Transformer-based unrolling architecture, termed URformer, derived by unrolling a stabilized variant of the expectation-maximization Gerchberg-Saxton (EM-GS) algorithm. Each layer of the proposed URformer incorporates three trainable modules: 1) a learnable filter implemented by a neural network that replaces the fixed Bessel-function ratio in the classic EM-GS algorithm; 2) a trainable gating mechanism that adaptively combines the classic and learned updates to ensure training stability; and 3) an efficient channel Transformer block that learns to correct residual errors by capturing non-local dependencies across the channel matrix. Numerical results demonstrate that the proposed URformer significantly outperforms classic iterative algorithms and conventional black-box neural networks while requiring less pilot overhead.
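One unrolled iteration might look roughly like the following. This is a speculative sketch of the three modules named above, not the paper's architecture: the signal model (a complex pilot matrix A with magnitude-only measurements), the filter/gate/Transformer designs, and all dimensions are assumptions made for illustration.

```python
import torch
import torch.nn as nn

class URformerLayer(nn.Module):
    """One unrolled EM-GS-style iteration with three trainable modules (illustrative only)."""
    def __init__(self, d_model=64, heads=4):
        super().__init__()
        # 1) learnable filter: small MLP standing in for the fixed Bessel-function ratio
        self.filter_net = nn.Sequential(nn.Linear(1, 32), nn.ReLU(), nn.Linear(32, 1))
        # 2) trainable gate mixing the classic update with the learned (filtered) update
        self.gate_logit = nn.Parameter(torch.zeros(1))
        # 3) lightweight Transformer block correcting residual errors across channel entries
        self.refine = nn.TransformerEncoderLayer(d_model, heads, dim_feedforward=128,
                                                 batch_first=True)
        self.embed = nn.Linear(2, d_model)    # (real, imag) -> feature
        self.project = nn.Linear(d_model, 2)  # feature -> (real, imag) correction

    def forward(self, h, y_mag, A, A_pinv):
        # h: (B, n) complex estimate, y_mag: (B, m) measured magnitudes,
        # A: (m, n) complex pilot matrix, A_pinv: (n, m) its pseudo-inverse
        z = h @ A.T                                     # forward model
        phase = torch.exp(1j * z.angle())
        z_classic = y_mag * phase                       # classic GS magnitude projection
        gain = self.filter_net(y_mag.unsqueeze(-1)).squeeze(-1)
        z_learned = gain * y_mag * phase                # learned magnitude re-weighting
        g = torch.sigmoid(self.gate_logit)
        h_new = (g * z_classic + (1 - g) * z_learned) @ A_pinv.T   # back-projection
        # residual correction by the Transformer block on stacked real/imag features
        feat = self.embed(torch.stack((h_new.real, h_new.imag), dim=-1))
        corr = self.project(self.refine(feat))
        return h_new + torch.complex(corr[..., 0], corr[..., 1])
```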
While initial applications of artificial intelligence (AI) in wireless communications over the past decade have demonstrated considerable potential using specialized models for targeted communication tasks, the revolutionary demands of sixth-generation (6G) networks for holographic communications, ubiquitous sensing, and native intelligence are propelling a necessary evolution towards AI-native wireless networks. The arrival of large AI models paves the way for the next phase of Wireless AI, driven by wireless foundation models (WFMs). In particular, pre-training on universal electromagnetic (EM) principles equips WFMs with the essential adaptability for a multitude of demanding 6G applications. However, existing large AI models face critical limitations, including pre-training strategies disconnected from EM-compliant constraints, which leads to physically inconsistent predictions; a lack of embedded understanding of wave propagation physics; and the inaccessibility of massive labeled datasets for comprehensive EM-aware training. To address these challenges, this article presents an electromagnetic information theory-guided self-supervised pre-training (EIT-SPT) framework designed to systematically inject EM physics into WFMs. The EIT-SPT framework aims to infuse WFMs with intrinsic EM knowledge, thereby enhancing their physical consistency, generalization capabilities across varied EM landscapes, and overall data efficiency. Building upon the proposed EIT-SPT framework, this article first elaborates on diverse potential applications of WFMs in 6G scenarios, then validates the efficacy of the proposed framework through illustrative case studies, and finally summarizes critical open research challenges and future directions for WFMs.
This paper performs a comprehensive and comparative evaluation of state-of-the-art local features for the task of image-based 3D reconstruction. The evaluated local features cover both recently developed features learned with powerful machine learning techniques and elaborately designed handcrafted features. To obtain a comprehensive evaluation, we include both float-type and binary features. Meanwhile, two kinds of datasets are used in this evaluation. One is a dataset of many different scene types with ground-truth 3D points, containing images of different scenes captured at fixed positions; it is used for quantitative performance evaluation of different local features under controlled image-capturing conditions. The other dataset contains Internet-scale image sets of several landmarks with many unrelated images, and is used for qualitative performance evaluation of different local features in the free image-collection setting. Our experimental results show that binary features are capable of reconstructing scenes from controlled image sequences in only a fraction of the processing time required by float-type features. However, for large-scale image sets with many distracting images, float-type features show a clear advantage over binary ones.
Despite remarkable advancements, current Text-to-Image (T2I) models struggle with complex, long-form textual instructions, frequently failing to accurately render intricate details, spatial relationships, or specific constraints. This limitation is highlighted by benchmarks such as LongBench-T2I, which reveal deficiencies in handling composition, specific text, and fine textures. To address this, we propose DeCoT (Decomposition-CoT), a novel framework that leverages Large Language Models (LLMs) to significantly enhance T2I models' understanding and execution of complex instructions. DeCoT operates in two core stages: first, Complex Instruction Decomposition and Semantic Enhancement, where an LLM breaks down raw instructions into structured, actionable semantic units and clarifies ambiguities; second, Multi-Stage Prompt Integration and Adaptive Generation, which transforms these units into a hierarchical or optimized single prompt tailored for existing T2I models. Extensive experiments on the LongBench-T2I dataset demonstrate that DeCoT consistently and substantially improves the performance of leading T2I models across all evaluated dimensions, particularly in challenging aspects like "Text" and "Composition". Quantitative results, validated by multiple MLLM evaluators (Gemini-2.0-Flash and InternVL3-78B), show that DeCoT, when integrated with Infinity-8B, achieves an average score of 3.52, outperforming the baseline Infinity-8B (3.44). Ablation studies confirm the critical contribution of each DeCoT component and the importance of sophisticated LLM prompting. Furthermore, human evaluations corroborate these findings, indicating superior perceptual quality and instruction fidelity. DeCoT effectively bridges the gap between high-level user intent and T2I model requirements, leading to more faithful and accurate image generation.
As real propagation environments become increasingly complex and dynamic, millimeter-wave beam prediction faces significant challenges. Traditional methods that rely on real-time channel state information (CSI) are computationally expensive and often fail to maintain accuracy in such environments. However, the powerful cross-modal representation capability of vision-language models (VLMs) provides a promising alternative. In this paper, we present a VLM-driven, contrastive-learning-based multimodal beam prediction framework that integrates multimodal data via modality-specific encoders. To enforce cross-modal consistency, we adopt a contrastive pretraining strategy to align image and LiDAR features in the latent space. We use location information as text prompts fed to the text encoder to introduce the language modality, which further improves cross-modal consistency. Experiments on the DeepSense-6G dataset show that our VLM backbone provides additional semantic grounding. Compared with existing methods, our framework achieves an overall distance-based accuracy score (DBA-Score) of 0.9016, corresponding to a 1.46% average improvement.
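The contrastive pretraining step described above corresponds to a standard symmetric InfoNCE objective between paired modality embeddings. Below is a minimal PyTorch sketch; the encoder outputs, batch pairing, and temperature value are assumed rather than taken from the paper. The same form can be applied between image (or LiDAR) embeddings and the text-prompt embeddings that encode location.

```python
import torch
import torch.nn.functional as F

def symmetric_contrastive_loss(img_emb, lidar_emb, temperature=0.07):
    """InfoNCE-style loss aligning paired image and LiDAR embeddings in a shared latent space.
    img_emb, lidar_emb: (B, D) outputs of the modality-specific encoders (illustrative)."""
    img_emb = F.normalize(img_emb, dim=-1)
    lidar_emb = F.normalize(lidar_emb, dim=-1)
    logits = img_emb @ lidar_emb.T / temperature        # (B, B) pairwise similarities
    targets = torch.arange(img_emb.size(0), device=img_emb.device)
    # matched pairs sit on the diagonal; symmetrize over both matching directions
    return 0.5 * (F.cross_entropy(logits, targets) + F.cross_entropy(logits.T, targets))
```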
Extremely large-scale multiple-input multiple-output (XL-MIMO) is a key technology for next-generation wireless communication systems. By deploying significantly more antennas than conventional massive MIMO systems, XL-MIMO promises substantial improvements in spectral efficiency. However, due to the drastically increased array size, the conventional planar wave channel model is no longer accurate, necessitating a transition to a near-field spherical wave model. This shift challenges traditional beam training and channel estimation methods, which were designed for planar wave propagation. In this article, we present a comprehensive review of state-of-the-art beam training and channel estimation techniques for XL-MIMO systems. We analyze the fundamental principles, key methodologies, and recent advancements in this area, highlighting their respective strengths and limitations in addressing the challenges posed by the near-field propagation environment. Furthermore, we explore open research challenges that remain unresolved to provide valuable insights for researchers and engineers working toward the development of next-generation XL-MIMO communication systems.
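For readers unfamiliar with the modeling shift mentioned above, the contrast between far-field (planar-wave) and near-field (spherical-wave) array responses for a uniform linear array is commonly written as follows; this is a standard textbook form with our own notation, not the article's.

```latex
% ULA with N antennas, spacing d, wavelength \lambda; \delta_n = (2n - N + 1)/2, n = 0,\dots,N-1.
a_{\mathrm{far}}(\theta) = \frac{1}{\sqrt{N}}
  \left[\, e^{-\jmath \frac{2\pi}{\lambda}\,\delta_n d \sin\theta} \,\right]_{n=0}^{N-1},
\qquad
a_{\mathrm{near}}(r,\theta) = \frac{1}{\sqrt{N}}
  \left[\, e^{-\jmath \frac{2\pi}{\lambda}\,(r_n - r)} \,\right]_{n=0}^{N-1},
\quad r_n = \sqrt{r^2 + \delta_n^2 d^2 - 2\,r\,\delta_n d \sin\theta},
```

where the near-field response depends on the user distance $r$ as well as the angle $\theta$, which is precisely why beam training and channel estimation designed for the planar-wave model break down at XL-MIMO array sizes.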
Query-focused summarization over multi-table data is a challenging yet critical task for extracting precise and relevant information from structured data. Existing methods often rely on complex preprocessing steps and struggle to generalize across domains or handle the logical reasoning required for multi-table queries. In this paper, we propose QueryTableSummarizer++, an end-to-end generative framework leveraging large language models (LLMs) enhanced with table-aware pre-training, query-aligned fine-tuning, and reinforcement learning with feedback. Our method eliminates the need for intermediate serialization steps and directly generates query-relevant summaries. Experiments on a benchmark dataset demonstrate that QueryTableSummarizer++ significantly outperforms state-of-the-art baselines in terms of BLEU, ROUGE, and F1-score. Additional analyses highlight its scalability, generalization across domains, and robust handling of complex queries. Human evaluation further validates the superior quality and practical applicability of the generated summaries, establishing QueryTableSummarizer++ as a highly effective solution for multi-table summarization tasks.
It is known that every (single-qudit) Clifford operator maps the full set of generalized Pauli matrices (GPMs) to itself under unitary conjugation, which is an important quantum operation that plays a crucial role in quantum computation and information. However, many quantum information processing tasks require that a specific set of GPMs be mapped to another such set under conjugation, rather than the entire set. We formalize this by introducing local Clifford operators, which map a given $n$-GPM set to another such set under unitary conjugation. We establish necessary and sufficient conditions for such an operator to transform a pair of GPMs, showing that these local Clifford operators admit a classical matrix representation, analogous to the classical (or symplectic) representation of standard (single-qudit) Clifford operators. Furthermore, we demonstrate that any local Clifford operator acting on an $n$-GPM set ($n\geq 2$) can be decomposed into a product of standard Clifford operators and a local Clifford operator acting on a pair of GPMs. This decomposition provides a complete classical characterization of unitary conjugation mappings between $n$-GPM sets. As a key application, we use this framework to address the local unitary equivalence (LU-equivalence) of sets of generalized Bell states (GBSs). We prove that the 31 equivalence classes of 4-GBS sets in the bipartite system $\mathbb{C}^{6}\otimes \mathbb{C}^{6}$ previously identified via Clifford operators are indeed distinct under LU-equivalence, confirming that this classification is complete.
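For context, a common choice of definitions behind the terminology above is sketched below; the notation is assumed (the paper's conventions may differ), and the matrix description of conjugation is the "classical representation" referred to in the abstract.

```latex
% Single-qudit generalized Pauli matrices in dimension d, with \omega = e^{2\pi i/d}:
Z\lvert j\rangle = \omega^{j}\lvert j\rangle, \qquad
X\lvert j\rangle = \lvert j+1 \bmod d\rangle, \qquad
\text{GPMs: } X^{a}Z^{b},\ \ a,b \in \mathbb{Z}_d .
% A Clifford operator U maps GPMs to GPMs (up to phase) under conjugation,
% acting linearly on the exponent vector (a,b) over \mathbb{Z}_d:
U\,(X^{a}Z^{b})\,U^{\dagger} \;\propto\; X^{a'}Z^{b'}, \qquad
\begin{pmatrix} a' \\ b' \end{pmatrix} = M \begin{pmatrix} a \\ b \end{pmatrix},
\quad M \in \mathbb{Z}_d^{2\times 2}\ \text{invertible}.
```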
In this article, we overview intelligent reflecting surface (IRS)-empowered wireless communication systems. We first present the fundamentals of IRS-assisted wireless transmission. On this basis, we explore the integration of IRS with various advanced transmission technologies, such as millimeter wave, non-orthogonal multiple access, and physical layer security. Following this, we discuss the effects of hardware impairments and imperfect channel state information on IRS system performance. Finally, we highlight several open issues to be addressed.
Flexible intelligent metasurfaces (FIMs) offer a new solution for wireless communications by introducing morphological degrees of freedom, dynamically morphing their three-dimensional shape to ensure multipath signals interfere constructively. However, realizing the desired performance gains in FIM systems critically depends on acquiring accurate channel state information across a continuous and high-dimensional deformation space. Therefore, this paper investigates this fundamental channel estimation problem for FIM-assisted millimeter-wave communication systems. First, we develop model-based frameworks that structure the problem either as function approximation using interpolation and kernel methods or as a sparse signal recovery problem that leverages the inherent angular sparsity of millimeter-wave channels. To further advance the estimation capability beyond the explicit assumptions in these model-based channel estimation frameworks, we propose a deep learning-based framework using a Fourier neural operator (FNO). By parameterizing a global convolution operator in the Fourier domain, we design an efficient FNO architecture to learn the continuous operator that maps FIM shapes to channel responses with mesh-independent properties. Furthermore, we exploit a hierarchical FNO (H-FNO) architecture to efficiently capture multi-scale features across a hierarchy of spatial resolutions. Numerical results demonstrate that the proposed H-FNO significantly outperforms the model-based benchmarks in estimation accuracy and pilot efficiency. In particular, the interpretability analysis shows that the proposed H-FNO learns an anisotropic spatial filter adapted to the physical geometry of the FIM and is capable of accurately reconstructing the non-linear channel response across the continuous deformation space.
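The core FNO operation referenced above, a global convolution parameterized in the Fourier domain, has a standard form. Below is a minimal 1D PyTorch sketch of such a layer; the channel counts, 1D discretization, and block structure are illustrative assumptions and do not reproduce the paper's H-FNO.

```python
import torch
import torch.nn as nn

class SpectralConv1d(nn.Module):
    """Core FNO building block: a global convolution parameterized in the Fourier domain."""
    def __init__(self, in_ch, out_ch, modes):
        super().__init__()
        self.modes = modes  # number of retained low-frequency Fourier modes
        scale = 1.0 / (in_ch * out_ch)
        self.weight = nn.Parameter(scale * torch.randn(in_ch, out_ch, modes, dtype=torch.cfloat))

    def forward(self, x):                       # x: (B, in_ch, L) samples of the input function
        x_ft = torch.fft.rfft(x, dim=-1)        # (B, in_ch, L//2 + 1)
        out_ft = torch.zeros(x.size(0), self.weight.size(1), x_ft.size(-1),
                             dtype=torch.cfloat, device=x.device)
        # multiply the retained modes by learned complex weights (mesh-independent operator)
        out_ft[..., :self.modes] = torch.einsum('bim,iom->bom',
                                                x_ft[..., :self.modes], self.weight)
        return torch.fft.irfft(out_ft, n=x.size(-1), dim=-1)

class FNOBlock(nn.Module):
    """Spectral path plus a pointwise linear path, as in standard FNO layers."""
    def __init__(self, ch, modes):
        super().__init__()
        self.spectral = SpectralConv1d(ch, ch, modes)
        self.pointwise = nn.Conv1d(ch, ch, kernel_size=1)

    def forward(self, x):
        return torch.relu(self.spectral(x) + self.pointwise(x))
```

Because the learned weights act on Fourier modes rather than on a fixed grid, the same trained operator can be queried at different spatial resolutions, which is the mesh-independence property mentioned above.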
The Internet of Things (IoT) is increasingly used in our everyday lives as well as in numerous industrial applications. However, due to limitations in computing and power capabilities, IoT devices need to send their respective tasks to cloud service stations that are usually located far away. Having to transmit data over long distances introduces challenges for services that require low latency, such as industrial control in factories and plants and artificial-intelligence-assisted autonomous driving. To solve this issue, mobile edge computing (MEC) is deployed at the network's edge to reduce transmission time. In this regard, this study proposes a new offloading scheme for MEC-assisted ultra-dense cellular networks using reinforcement learning (RL) techniques. The proposed scheme enables efficient resource allocation and dynamic offloading decisions based on varying network conditions and user demands. The RL algorithm learns from the network's historical data and adapts the offloading decisions to optimize the network's overall performance. Non-orthogonal multiple access is also adopted to improve resource utilization among the IoT devices. Simulation results demonstrate that the proposed scheme outperforms other state-of-the-art offloading algorithms in terms of energy efficiency, network throughput, and user satisfaction.
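As a concrete, heavily simplified picture of the RL component, the sketch below shows a tabular Q-learning loop for a binary local-vs-edge offload decision. The state space, reward shape, and toy environment are assumptions made for illustration; the paper's scheme additionally handles NOMA resource allocation and richer network state.

```python
import numpy as np

# Tabular Q-learning sketch for a binary offload decision (0 = execute locally, 1 = offload to edge).
# The state here is a coarse (channel-quality, server-load) index; the actual scheme uses a richer
# state, action, and reward design than this toy example.
n_states, n_actions = 16, 2
Q = np.zeros((n_states, n_actions))
alpha, gamma, eps = 0.1, 0.95, 0.1

def step(state, action, rng):
    """Toy environment: reward trades off latency against energy (illustrative only)."""
    latency = rng.uniform(1, 3) if action == 1 else rng.uniform(2, 5)
    energy = rng.uniform(0.5, 1.0) if action == 1 else rng.uniform(1.0, 2.0)
    reward = -(latency + 0.5 * energy)
    return rng.integers(n_states), reward

rng = np.random.default_rng(0)
state = rng.integers(n_states)
for _ in range(10_000):
    action = rng.integers(n_actions) if rng.random() < eps else int(Q[state].argmax())
    next_state, reward = step(state, action, rng)
    # standard Q-learning temporal-difference update
    Q[state, action] += alpha * (reward + gamma * Q[next_state].max() - Q[state, action])
    state = next_state
```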
Rapid bone scintigraphy is crucial for diagnosing skeletal disorders and detecting tumor metastases in children, as it shortens scan duration and reduces discomfort. However, accelerated acquisition often degrades image quality, impairing the visibility of fine anatomical details and potentially compromising diagnosis. To overcome this limitation, we introduce the first application of SAM-based semantic priors for medical image restoration, utilizing the Segment Anything Model (SAM) to enhance pediatric rapid bone scintigraphy. Our approach employs two cascaded networks, $f^{IR1}$ and $f^{IR2}$, supported by three specialized modules: a Semantic Prior Integration (SPI) module, a Semantic Knowledge Distillation (SKD) module, and a Semantic Consistency Module (SCM). The SPI and SKD modules inject domain-specific semantic cues from a fine-tuned SAM, while the SCM preserves coherent semantic feature representations across both cascaded stages. Moreover, we present RBS, a novel Rapid Bone Scintigraphy dataset comprising paired standard (20 cm/min) and rapid (40 cm/min) scans from 137 pediatric patients aged 0.5–16 years, making it the first dataset tailored for pediatric rapid bone scintigraphy restoration. Extensive experiments on both a public endoscopic dataset and our RBS dataset demonstrate that our method consistently surpasses existing techniques in PSNR, SSIM, FID, and LPIPS metrics.
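A rough picture of how semantic priors from a segmentation model can be injected into a restoration network is sketched below. This is not the paper's SPI module: the gated-fusion design, channel sizes, and interface to the SAM encoder are assumptions made purely for illustration.

```python
import torch
import torch.nn as nn

class SemanticPriorIntegration(nn.Module):
    """Illustrative SPI-style block: fuse restoration features with semantic features
    from a (fine-tuned) SAM image encoder via a learned gate. Design details are assumed."""
    def __init__(self, restore_ch, sam_ch):
        super().__init__()
        self.project = nn.Conv2d(sam_ch, restore_ch, kernel_size=1)
        self.gate = nn.Sequential(nn.Conv2d(2 * restore_ch, restore_ch, kernel_size=1),
                                  nn.Sigmoid())

    def forward(self, feat, sam_feat):
        # feat: (B, C, H, W) restoration features; sam_feat: (B, C_sam, h, w) SAM encoder features
        sam_feat = nn.functional.interpolate(self.project(sam_feat), size=feat.shape[-2:],
                                             mode='bilinear', align_corners=False)
        g = self.gate(torch.cat((feat, sam_feat), dim=1))
        return feat + g * sam_feat   # semantic cues injected where the gate deems them useful
```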
Visual storytelling is an emerging field that combines images and narratives to create engaging and contextually rich stories. Despite its potential, generating coherent and emotionally resonant visual stories remains challenging due to the complexity of aligning visual and textual information. This paper presents a novel approach leveraging large language models (LLMs) and large vision-language models (LVLMs) combined with instruction tuning to address these challenges. We introduce a new dataset comprising diverse visual stories, annotated with detailed captions and multimodal elements. Our method employs a combination of supervised and reinforcement learning to fine-tune the model, enhancing its narrative generation capabilities. Quantitative evaluations using GPT-4 and qualitative human assessments demonstrate that our approach significantly outperforms existing models, achieving higher scores in narrative coherence, relevance, emotional depth, and overall quality. The results underscore the effectiveness of instruction tuning and the potential of LLMs/LVLMs in advancing visual storytelling.
Motivated by the recent LHC Higgs data and null results in searches for new physics, we investigate the Higgs couplings and naturalness in the littlest Higgs model with T-parity. By performing a global fit of the latest Higgs data, electroweak precision observables, and $R_{b}$ measurements, we find that the scale $f$ can be excluded up to 600 GeV at the $2\sigma$ confidence level. The expected Higgs coupling measurements at the future collider TLEP will improve this lower limit to above 3 TeV. Besides, the top partner mass $m_{T_{+}}$ can be excluded up to 880 GeV at the $2\sigma$ confidence level. The future HL-LHC can constrain this mass in the region $m_{T_{+}} < 2.2$ TeV, corresponding to a fine-tuning larger than 1%.
This paper introduces a robust, learning-based method for diagnosing the state of distribution network switchgear, which is crucial for maintaining the power quality for end users. Traditional diagnostic models often rely heavily on expert knowledge and lack robustness. To address this, our method incorporates an expanded feature vector that includes environmental data, temperature readings, switch position, motor operation, insulation conditions, and local discharge information. We tackle the issue of high dimensionality through feature mapping. The method introduces a decision radius to categorize unlabeled samples and updates the model parameters using a combination of supervised and unsupervised loss, along with a consistency regularization function. This approach ensures robust learning even with a limited number of labeled samples. Comparative analysis demonstrates that this method significantly outperforms existing models in both accuracy and robustness.
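A hedged sketch of the kind of objective described above, a supervised loss plus a radius-gated consistency term on unlabeled samples, is given below. The feature hook, perturbation, radius rule, and weighting are assumptions rather than the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def semi_supervised_loss(model, x_lab, y_lab, x_unlab, class_centers, radius=1.0, lam=0.5):
    """Illustrative objective: supervised cross-entropy on labeled samples plus a consistency
    term on unlabeled samples that fall within a decision radius of a class center in feature
    space. `model.features` and the weighting `lam` are hypothetical names/values."""
    sup = F.cross_entropy(model(x_lab), y_lab)

    feats = model.features(x_unlab)                     # assumed feature-extractor hook
    dists = torch.cdist(feats, class_centers)           # (B_u, K) distances to class centers
    min_dist, pseudo = dists.min(dim=1)
    mask = (min_dist < radius).float()                  # keep only samples inside the radius

    # consistency regularization: predictions under a perturbed view should match pseudo-labels
    noisy = x_unlab + 0.05 * torch.randn_like(x_unlab)
    unsup = (F.cross_entropy(model(noisy), pseudo, reduction='none') * mask).mean()
    return sup + lam * unsup
```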
Both unmanned aerial vehicles (UAVs) and intelligent reflecting surfaces (IRSs) are gaining traction as transformative technologies for upcoming wireless networks. IRS-aided UAV communication, which introduces IRSs into UAV communications, has emerged in an effort to improve system performance while also overcoming UAV communication constraints and issues. The purpose of this paper is to provide a comprehensive overview of IRS-assisted UAV communications. First, we provide five examples of how IRSs and UAVs can be combined to achieve unrivaled potential in difficult situations. We then review the technological features of the most recent research on IRS-aided UAV communications from the perspective of the main performance criteria, i.e., energy efficiency, security, and spectral efficiency. Additionally, we survey prior studies on the adoption of enabling techniques such as machine learning algorithms. Lastly, some promising research directions and open challenges for IRS-aided UAV communication are presented.
Recent studies on the electrical switching of tetragonal antiferromagnets (AFMs) via Néel spin-orbit torque have paved the way for the economical use of antiferromagnetic materials. The most difficult obstacle that presently limits the application of antiferromagnetic materials in spintronics, especially in memory storage applications, could be the small and fragile magnetoresistance (MR) of AFM-based nanostructures. In this study, we investigated spin transport in Mn$_2$Au-based tunnel junctions using first-principles scattering theory. Giant MRs of more than 1000% are predicted in some Fe/MgO/Ag/Mn$_2$Au/Ta junctions, of about the same order as that in an MgO-based ferromagnetic tunnel junction with the same barrier thickness. The interplay of the spin-filtering effect, quantum-well resonant states, and interfacial resonant states could be responsible for the unusually giant and robust MRs observed in these Mn$_2$Au-based junctions.
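For reference, the magnetoresistance ratio quoted above is conventionally defined as follows (the notation is ours); an MR above 1000% then corresponds to a resistance ratio between the two Néel-vector configurations exceeding eleven.

```latex
% Conventional ("optimistic") magnetoresistance ratio, with R_high and R_low the junction
% resistances in the two N\'eel-vector configurations:
\mathrm{MR} \;=\; \frac{R_{\mathrm{high}} - R_{\mathrm{low}}}{R_{\mathrm{low}}} \times 100\% .
```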
Reconfigurable intelligent surfaces (RISs) dynamically control signal propagation to enhance wireless communications. This paper presents a novel framework for rotatable-RIS-assisted physical-layer multicast systems, aiming to maximize the sum of minimum multicast rates via joint optimization of base station beamforming, RIS phase shifts, and RIS orientation. Unlike unicast or non-rotatable setups, the rotatable RIS adapts its orientation to align signals with user groups, improving fairness and rates for weak users. An alternating optimization approach combines convex optimization for beamforming and phase shifts with exhaustive search and particle swarm optimization (PSO) for the orientation. Majorization-minimization-based algorithms solve the subproblems iteratively. Simulation results show that the framework achieves a 24.1% rate improvement via exhaustive search and 20.0% via PSO over the non-rotatable-RIS baseline, with PSO performance close to the exhaustive-search upper bound, highlighting the benefits of physical-layer multicast and orientation optimization.
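The PSO search over the RIS orientation can be pictured with the minimal sketch below; the objective rate_fn stands in for the inner beamforming/phase-shift optimization described above, and the swarm parameters are generic defaults rather than values from the paper.

```python
import numpy as np

def pso_orientation(rate_fn, n_particles=20, iters=50, bounds=(0.0, 2 * np.pi), seed=0):
    """Minimal particle swarm search over a single RIS orientation angle.
    rate_fn(phi) should return the min-multicast-rate objective achieved for orientation phi
    (a placeholder for the inner convex/MM optimization of beamforming and phase shifts)."""
    rng = np.random.default_rng(seed)
    pos = rng.uniform(*bounds, n_particles)
    vel = np.zeros(n_particles)
    p_best, p_val = pos.copy(), np.array([rate_fn(p) for p in pos])
    g_best = p_best[p_val.argmax()]
    w, c1, c2 = 0.7, 1.5, 1.5                       # inertia and acceleration coefficients
    for _ in range(iters):
        r1, r2 = rng.random(n_particles), rng.random(n_particles)
        vel = w * vel + c1 * r1 * (p_best - pos) + c2 * r2 * (g_best - pos)
        pos = np.clip(pos + vel, *bounds)
        val = np.array([rate_fn(p) for p in pos])
        improved = val > p_val
        p_best[improved], p_val[improved] = pos[improved], val[improved]
        g_best = p_best[p_val.argmax()]
    return g_best, p_val.max()

# Toy usage with a synthetic objective standing in for the inner optimization:
best_phi, best_rate = pso_orientation(lambda phi: np.sin(phi) + 0.3 * np.cos(3 * phi))
```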