information-theory
We study the following distribution clustering problem: Given a hidden partition of kk distributions into two groups, such that the distributions within each group are the same, and the two distributions associated with the two clusters are ε\varepsilon-far in total variation, the goal is to recover the partition. We establish upper and lower bounds on the sample complexity for two fundamental cases: (1) when one of the cluster's distributions is known, and (2) when both are unknown. Our upper and lower bounds characterize the sample complexity's dependence on the domain size nn, number of distributions kk, size rr of one of the clusters, and distance ε\varepsilon. In particular, we achieve tightness with respect to (n,k,r,ε)(n,k,r,\varepsilon) (up to an O(logk)O(\log k) factor) for all regimes.
Expectation Propagation (EP) is a widely used message-passing algorithm that decomposes a global inference problem into multiple local ones. It approximates marginal distributions (beliefs) using intermediate functions (messages). While beliefs must be proper probability distributions that integrate to one, messages may have infinite integral values. In Gaussian-projected EP, such messages take a Gaussian form and appear as if they have "negative" variances. Although allowed within the EP framework, these negative-variance messages can impede algorithmic progress. In this paper, we investigate EP in linear models and analyze the relationship between the corresponding beliefs. Based on the analysis, we propose both non-persistent and persistent approaches that prevent the algorithm from being blocked by messages with infinite integral values. Furthermore, by examining the relationship between the EP messages in linear models, we develop an additional approach that avoids the occurrence of messages with infinite integral values.
Semantic Soft Bootstrapping (SSB), an RL-free self-distillation framework developed at the University of Maryland, enhances large language model reasoning by having the model act as both teacher and student. It boosted pass@1 accuracy on the MATH500 benchmark by 10.6% and on AIME2024 by 10% over a GRPO baseline, while utilizing a smaller dataset and maintaining concise response lengths.
When should an autonomous agent commit resources to a task? We introduce the Agent Capability Problem (ACP), a framework for predicting whether an agent can solve a problem under resource constraints. Rather than relying on empirical heuristics, ACP frames problem-solving as information acquisition: an agent requires \Itotal\Itotal bits to identify a solution and gains \Istep\Istep bits per action at cost \Cstep\Cstep, yielding an effective cost \Ceff=(\Itotal/\Istep),\Cstep\Ceff = (\Itotal/\Istep), \Cstep that predicts resource requirements before search. We prove that \Ceff\Ceff lower-bounds expected cost and provide tight probabilistic upper bounds. Experimental validation shows that ACP predictions closely track actual agent performance, consistently bounding search effort while improving efficiency over greedy and random strategies. The framework generalizes across LLM-based and agentic workflows, linking principles from active learning, Bayesian optimization, and reinforcement learning through a unified information-theoretic lens. \
We propose a decoder for quantum low density parity check (LDPC) codes based on a beam search heuristic guided by belief propagation (BP). Our beam search decoder applies to all quantum LDPC codes and achieves different speed-accuracy tradeoffs by tuning its parameters such as the beam width. We perform numerical simulations under circuit level noise for the [[144,12,12]][[144, 12, 12]] bivariate bicycle (BB) code at noise rate p=103p=10^{-3} to estimate the logical error rate and the 99.9 percentile runtime and we compare with the BP-OSD decoder which has been the default quantum LDPC decoder for the past six years. A variant of our beam search decoder with a beam width of 64 achieves a 17×17\times reduction in logical error rate. With a beam width of 8, we reach the same logical error rate as BP-OSD with a 26.2×26.2\times reduction in the 99.9 percentile runtime. We identify the beam search decoder with beam width of 32 as a promising candidate for trapped ion architectures because it achieves a 5.6×5.6\times reduction in logical error rate with a 99.9 percentile runtime per syndrome extraction round below 1ms at p=5×104p=5 \times10^{-4}. Remarkably, this is achieved in software on a single core, without any parallelization or specialized hardware (FPGA, ASIC), suggesting one might only need three 32-core CPUs to decode a trapped ion quantum computer with 1000 logical qubits.
This paper provides a fundamental characterization of the discrete ambiguity functions (AFs) of random communication waveforms under arbitrary orthonormal modulation with random constellation symbols, which serve as a key metric for evaluating the delay-Doppler sensing performance in future ISAC applications. A unified analytical framework is developed for two types of AFs, namely the discrete periodic AF (DP-AF) and the fast-slow time AF (FST-AF), where the latter may be seen as a small-Doppler approximation of the DP-AF. By analyzing the expectation of squared AFs, we derive exact closed-form expressions for both the expected sidelobe level (ESL) and the expected integrated sidelobe level (EISL) under the DP-AF and FST-AF formulations. For the DP-AF, we prove that the normalized EISL is identical for all orthogonal waveforms. To gain structural insights, we introduce a matrix representation based on the finite Weyl-Heisenberg (WH) group, where each delay-Doppler shift corresponds to a WH operator acting on the ISAC signal. This WH-group viewpoint yields sharp geometric constraints on the lowest sidelobes: The minimum ESL can only occur along a one-dimensional cut or over a set of widely dispersed delay-Doppler bins. Consequently, no waveform can attain the minimum ESL over any compact two-dimensional region, leading to a no-optimality (no-go) result under the DP-AF framework. For the FST-AF, the closed-form ESL and EISL expressions reveal a constellation-dependent regime governed by its kurtosis: The OFDM modulation achieves the minimum ESL for sub-Gaussian constellations, whereas the OTFS waveform becomes optimal for super-Gaussian constellations. Finally, four representative waveforms, namely, SC, OFDM, OTFS, and AFDM, are examined under both frameworks, and all theoretical results are verified through numerical examples.
Generative models have shown immense potential for wireless communication by learning complex channel data distributions. However, the iterative denoising process associated with these models imposes a significant challenge in latency-sensitive wireless communication scenarios, particularly in channel estimation. To address this challenge, we propose a novel solution for one-step generative channel estimation. Our approach bypasses the time-consuming iterative steps of conventional models by directly learning the average velocity field. Through extensive simulations, we validate the effectiveness of our proposed method over existing state-of-the-art diffusion-based approach. Specifically, our scheme achieves a normalized mean squared error up to 2.65 dB lower than the diffusion method and reduces latency by around 90%, demonstrating the potential of our method to enhance channel estimation performance.
The growing presence of unauthorized drones poses significant threats to public safety, underscoring the need for aerial surveillance solutions. This work proposes a cell-free integrated sensing and communication (ISAC) framework enabling drone detection within the existing communication network infrastructure, while maintaining communication services. The system exploits the spatial diversity and coordination of distributed access points (APs) in a cell-free massive MIMO architecture to detect aerial passive targets. To evaluate sensing performance, we introduce two key metrics: age of sensing (AoS), capturing the freshness of sensing information, and sensing coverage. The proposed AoS metric includes not only the transmission delays as in the existing models, but also the processing for sensing and networking delay, which are critical in dynamic environments like drone detection. We introduce an ambiguity parameter quantifying the similarity between the target-to-receiver channels for two hotspots and develop a novel network configuration strategy, including hotspot grouping, AP clustering, and sensing pilot assignment, leveraging simultaneous multi-point sensing to minimize AoS. Our results show that the best trade-off between AoS and sensing coverage is achieved when the number of hotspots sharing the same time/frequency resource matches the number of sensing pilots, indicating ambiguity as the primary factor limiting the sensing performance.
We extend the existing skew polynomial representations of matrix algebras which are direct sum of matrix spaces over division rings. In this representation, the sum-rank distance between two tuples of matrices is captured by a weight function on their associated skew polynomials, defined through degrees and greatest common right divisors with the polynomial that defines the representation. We exploit this representation to construct new families of maximum sum-rank distance (MSRD) codes over finite and infinite fields, and over division rings. These constructions generalize many of the known existing constructions of MSRD codes as well as of optimal codes in the rank and in the Hamming metric. As a byproduct, in the case of finite fields we obtain new families of MDS codes which are linear over a subfield and whose length is close to the field size.
We study the problem of certifying local Hamiltonians from real-time access to their dynamics. Given oracle access to eitHe^{-itH} for an unknown kk-local Hamiltonian HH and a fully specified target Hamiltonian H0H_0, the goal is to decide whether HH is exactly equal to H0H_0 or differs from H0H_0 by at least ε\varepsilon in normalized Frobenius norm, while minimizing the total evolution time. We introduce the first intolerant Hamiltonian certification protocol that achieves optimal performance for all constant-locality Hamiltonians. For general nn-qubit, kk-local, traceless Hamiltonians, our procedure uses O(ck/ε)O(c^k/\varepsilon) total evolution time for a universal constant cc, and succeeds with high probability. In particular, for O(1)O(1)-local Hamiltonians, the total evolution time becomes Θ(1/ε)\Theta(1/\varepsilon), matching the known Ω(1/ε)\Omega(1/\varepsilon) lower bounds and achieving the gold-standard Heisenberg-limit scaling. Prior certification methods either relied on implementing inverse evolution of HH, required controlled access to eitHe^{-itH}, or achieved near-optimal guarantees only in restricted settings such as the Ising case (k=2k=2). In contrast, our algorithm requires neither inverse evolution nor controlled operations: it uses only forward real-time dynamics and achieves optimal intolerant certification for all constant-locality Hamiltonians.
Graph Neural Networks (GNNs) have become a powerful tool for modeling and analyzing data with graph structures. The wide adoption in numerous applications underscores the value of these models. However, the complexity of these methods often impedes understanding their decision-making processes. Current Explainable AI (XAI) methods struggle to untangle the intricate relationships and interactions within graphs. Several methods have tried to bridge this gap via a post-hoc approach or self-interpretable design. Most of them focus on graph structure analysis to determine essential patterns that correlate with prediction outcomes. While post-hoc explanation methods are adaptable, they require extra computational resources and may be less reliable due to limited access to the model's internal workings. Conversely, Interpretable models can provide immediate explanations, but their generalizability to different scenarios remains a major concern. To address these shortcomings, this thesis seeks to develop a novel XAI framework tailored for graph-based machine learning. The proposed framework aims to offer adaptable, computationally efficient explanations for GNNs, moving beyond individual feature analysis to capture how graph structure influences predictions.
We consider the fundamental problem of balanced kk-means clustering. In particular, we introduce an optimal transport approach to alternating minimization called BalLOT, and we show that it delivers a fast and effective solution to this problem. We establish this with a variety of numerical experiments before proving several theoretical guarantees. First, we prove that for generic data, BalLOT produces integral couplings at each step. Next, we perform a landscape analysis to provide theoretical guarantees for both exact and partial recoveries of planted clusters under the stochastic ball model. Finally, we propose initialization schemes that achieve one-step recovery of planted clusters.
Accurate channel state information (CSI) underpins reliable and efficient wireless communication. However, acquiring CSI via pilot estimation incurs substantial overhead, especially in massive multiple-input multiple-output (MIMO) systems operating in high-Doppler environments. By leveraging the growing availability of environmental sensing data, this treatise investigates pilot-free channel inference that estimates complete CSI directly from multimodal observations, including camera images, LiDAR point clouds, and GPS coordinates. In contrast to prior studies that rely on predefined channel models, we develop a data-driven framework that formulates the sensing-to-channel mapping as a cross-modal flow matching problem. The framework fuses multimodal features into a latent distribution within the channel domain, and learns a velocity field that continuously transforms the latent distribution toward the channel distribution. To make this formulation tractable and efficient, we reformulate the problem as an equivalent conditional flow matching objective and incorporate a modality alignment loss, while adopting low-latency inference mechanisms to enable real-time CSI estimation. In experiments, we build a procedural data generator based on Sionna and Blender to support realistic modeling of sensing scenes and wireless propagation. System-level evaluations demonstrate significant improvements over pilot- and sensing-based benchmarks in both channel estimation accuracy and spectral efficiency for the downstream beamforming task.
Evaluating faithfulness of Large Language Models (LLMs) to a given task is a complex challenge. We propose two new unsupervised metrics for faithfulness evaluation using insights from information theory and thermodynamics. Our approach treats an LLM as a bipartite information engine where hidden layers act as a Maxwell demon controlling transformations of context CC into answer AA via prompt QQ. We model Question-Context-Answer (QCA) triplets as probability distributions over shared topics. Topic transformations from CC to QQ and AA are modeled as transition matrices Q{\bf Q} and A{\bf A} encoding the query goal and actual result, respectively. Our semantic faithfulness (SF) metric quantifies faithfulness for any given QCA triplet by the Kullback-Leibler (KL) divergence between these matrices. Both matrices are inferred simultaneously via convex optimization of this KL divergence, and the final SF metric is obtained by mapping the minimal divergence onto the unit interval [0,1], where higher scores indicate greater faithfulness. Furthermore, we propose a thermodynamics-based semantic entropy production (SEP) metric in answer generation, and show that high faithfulness generally implies low entropy production. The SF and SEP metrics can be used jointly or separately for LLM evaluation and hallucination control. We demonstrate our framework on LLM summarization of corporate SEC 10-K filings.
Deep neural receivers (NeuralRxs) for Orthogonal Frequency Division Multiplexing (OFDM) signals are proposed for enhanced decoding performance compared to their signal-processing based counterparts. However, the existing architectures ignore the required number of epochs for training convergence and floating-point operations (FLOPs), which increase significantly with improving performance. To tackle these challenges, we propose a new residual network (ResNet) block design for OFDM NeuralRx. Specifically, we leverage small kernel sizes and dilation rates to lower the number of FLOPs (NFLOPs) and uniform channel sizes to reduce the memory access cost (MAC). The ResNet block is designed with novel channel split and shuffle blocks, element-wise additions are removed, with Gaussian error linear unit (GELU) activations. Extensive simulations show that our proposed NeuralRx reduces NFLOPs and improves training convergence while improving the decoding accuracy.
In this work, we propose a novel energy-efficient spiking neural network (SNN)-based receiver for 5G-NR OFDM system, called neuromorphic receiver (NeuromorphicRx), replacing the channel estimation, equalization and symbol demapping blocks. We leverage domain knowledge to design the input with spiking encoding and propose a deep convolutional SNN with spike-element-wise residual connections. We integrate an SNN with artificial neural network (ANN) hybrid architecture to obtain soft outputs and employ surrogate gradient descent for training. We focus on generalization across diverse scenarios and robustness through quantized aware training. We focus on interpretability of NeuromorphicRx for 5G-NR signals and perform detailed ablation study for 5G-NR signals. Our extensive numerical simulations show that NeuromorphicRx is capable of achieving significant block error rate performance gain compared to 5G-NR receivers and similar performance compared to its ANN-based counterparts with 7.6x less energy consumption.
Movable antenna (MA) technology exhibits great promise for enhancing the sensing capabilities of future sixth-generation (6G) networks due to its capability to alter antenna array geometry. With the growing prevalence of near-field propagation at ultra-high frequencies, this paper focuses on the application of one-dimensional (1D) and two-dimensional (2D) MA arrays for near-field sensing to jointly estimate the angle and distance information about a target. First, for the 1D MA array scenario, to gain insights into MA-enhanced near-field sensing, we investigate two simplified cases with only angle-of-arrival (AoA) or distance estimation, respectively, assuming that the other information is already known. The worst-case Cramer-Rao bounds (CRBs) on the mean square errors (MSEs) of the AoA estimation and the distance estimation are derived in these two cases. Then, we jointly optimize the positions of the MAs within the 1D array to minimize these CRBs and derive their closed-form solutions, which yield an identical array geometry to MA-enhanced far-field sensing. For the more challenging joint AoA and distance estimation, since the associated worst-case CRB is a highly complex and non-convex function with respect to the MA positions, a discrete sampling-based approach is proposed to sequentially update the MA positions and obtain an efficient suboptimal solution. Furthermore, we investigate the worst-case CRB minimization problems for a 2D MA array under various conditions and extend our proposed algorithms to solve them efficiently. Numerical results demonstrate that the proposed MA-enhanced near-field sensing scheme dramatically outperforms conventional fixed-position antennas (FPAs). Moreover, the joint angle and distance estimation results in a different array geometry from that in the individual estimation of angle/distance or far-field sensing.
This letter investigates channel estimation for ultra-massive multiple-input multiple-output (MIMO) communications. We propose a joint low-rank and sparse Bayesian estimation (LRSBE) algorithm for spatial non-stationary ultra-massive channels by exploiting the low-rankness and sparsity in the beam domain. Specifically, the channel estimation integrates sparse Bayesian learning and soft-threshold gradient descent within the expectation-maximization framework. Simulation results show that the proposed algorithm significantly outperforms the state-of-the-art alternatives under different signal-to-noise ratio conditions in terms of estimation accuracy and overall complexity.
In this paper, we propose a double-edge-assisted computation offloading and resource allocation scheme tailored for space-air-marine integrated networks (SAMINs). Specifically, we consider a scenario where both unmanned aerial vehicles (UAVs) and a low earth orbit (LEO) satellite are equipped with edge servers, providing computing services for maritime autonomous surface ships (MASSs). Partial computation workloads of MASSs can be offloaded to both UAVs and the LEO satellite, concurrently, for processing via a multi-access approach. To minimize the energy consumption of SAMINs under latency constraints, we formulate an optimization problem and propose energy efficient algorithms to jointly optimize offloading mode, offloading volume, and computing resource allocation of the LEO satellite and the UAVs, respectively. We further exploit an alternating optimization (AO) method and a layered approach to decompose the original problem to attain the optimal solutions. Finally, we conduct simulations to validate the effectiveness and efficiency of the proposed scheme in comparison with benchmark algorithms.
Researchers from Shanghai Jiao Tong University developed an "Information Physics of Intelligence" framework, unifying logical depth, Shannon entropy, and thermodynamic principles to establish fundamental limits and optimal strategies for balancing information storage and computation. Their work introduces a critical frequency threshold and an Energy-Time-Space Triality Bound, demonstrating optimal strategies can yield up to 54.7% latency reduction and 53% energy savings in hybrid knowledge systems.
There are no more papers matching your filters at the moment.