Indian Statistical Institute
Limit theorems for the magnetization in the pp-spin Curie-Weiss model, for p3p \geq 3, has been derived recently by Mukherjee et al. (2021). In this paper, we strengthen these results by proving Cramér-type moderate deviation theorems and Berry-Esseen bounds for the magnetization (suitably centered and scaled). In particular, we show that the rate of convergence is O(N12)O(N^{-\frac{1}{2}}) when the magnetization has asymptotically Gaussian fluctuations, and it is O(N14)O(N^{-\frac{1}{4}}) when the fluctuations are non-Gaussian. As an application, we derive a Berry-Esseen bound for the maximum pseudolikelihood estimate of the inverse temperature in pp-spin Curie-Weiss model with no external field, for all points in the parameter space where consistent estimation is possible.
A comprehensive survey and benchmark evaluated 18 state-of-the-art activation functions across diverse network architectures and data modalities, including image, text, and speech. The study reveals that the optimal activation function depends on the specific network and data type, demonstrating that newer adaptive functions can outperform traditional ones in certain scenarios while highlighting trade-offs between accuracy and computational efficiency.
53
Changepoint localization aims to provide confidence sets for a changepoint (if one exists). Existing methods either relying on strong parametric assumptions or providing only asymptotic guarantees or focusing on a particular kind of change(e.g., change in the mean) rather than the entire distributional change. A method (possibly the first) to achieve distribution-free changepoint localization with finite-sample validity was recently introduced by \cite{dandapanthula2025conformal}. However, while they proved finite sample coverage, there was no analysis of set size. In this work, we provide rigorous theoretical guarantees for their algorithm. We also show the consistency of a point estimator for change, and derive its convergence rate without distributional assumptions. Along that line, we also construct a distribution-free consistent test to assess whether a particular time point is a changepoint or not. Thus, our work provides unified distribution-free guarantees for changepoint detection, localization, and testing. In addition, we present various finite sample and asymptotic properties of the conformal pp-value in the distribution change setup, which provides a theoretical foundation for many applications of the conformal pp-value. As an application of these properties, we construct distribution-free consistent tests for exchangeability against distribution-change alternatives and a new, computationally tractable method of optimizing the powers of conformal tests. We run detailed simulation studies to corroborate the performance of our methods and theoretical results. Together, our contributions offer a comprehensive and theoretically principled approach to distribution-free changepoint inference, broadening both the scope and credibility of conformal methods in modern changepoint analysis.
Despite their impressive performance on multi-modal tasks, large vision-language models (LVLMs) tend to suffer from hallucinations. An important type is object hallucination, where LVLMs generate objects that are inconsistent with the images shown to the model. Existing works typically attempt to quantify object hallucinations by detecting and measuring the fraction of hallucinated objects in generated captions. Additionally, more recent work also measures object hallucinations by directly querying the LVLM with binary questions about the presence of likely hallucinated objects based on object statistics like top-k frequent objects and top-k co-occurring objects. In this paper, we present Context-Aware Object Similarities (CAOS), a novel approach for evaluating object hallucination in LVLMs using object statistics as well as the generated captions. CAOS uniquely integrates object statistics with semantic relationships between objects in captions and ground-truth data. Moreover, existing approaches usually only detect and measure hallucinations belonging to a predetermined set of in-domain objects (typically the set of all ground-truth objects for the training dataset) and ignore generated objects that are not part of this set, leading to under-evaluation. To address this, we further employ language model--based object recognition to detect potentially out-of-domain hallucinated objects and use an ensemble of LVLMs for verifying the presence of such objects in the query image. CAOS also examines the sequential dynamics of object generation, shedding light on how the order of object appearance influences hallucinations, and employs word embedding models to analyze the semantic reasons behind hallucinations. CAOS aims to offer a nuanced understanding of the hallucination tendencies of LVLMs by providing a systematic framework to identify and interpret object hallucinations.
In this paper we propose a two-field model of warm inflation motivated from a heterotic string construction. The model contains an axion and a dilaton-like field. We show that while warm inflation can take place in the axion-field direction, thermal corrections coming from the radiation gauge fields, which couples to both the axion and the dilaton, prevent warm inflation to happen in the dilaton-field direction. We explore the background dynamics for different parameters, and identify a diversity of dynamical behaviors allowed in this model, denoting different regimes of warm inflation.
Possible interaction between dark energy and dark matter has previously shown promise in alleviating the clustering tension, without exacerbating the Hubble tension, when Baryon Acoustic Oscillations (BAO) data from the Sloan Digital Sky Survey (SDSS) DR16 is combined with Cosmic Microwave Background (CMB) and Type-Ia Supernovae (SNIa) data sets. With the recent Dark Energy Spectroscopic Instrument (DESI) BAO DR2, there is now a compelling need to re-evaluate this scenario. We combine DESI DR2 with Planck 2018 and Pantheon+ SNIa data sets to constrain interacting dark matter dark energy models, accounting for interaction effects in both the background and perturbation sectors. Our results exhibit similar trends to those observed with SDSS, albeit with improved precision, reinforcing the consistency between the two BAO data sets. In addition to offering a resolution to the S8S_8 tension, in the phantom-limit, the dark energy equation of state exhibits an early-phantom behaviour, aligning with DESI DR2 findings, before transitioning to w1w\sim-1 at lower redshifts, regardless of the DE parametrization. However, the statistical significance of excluding w=1w=-1 is reduced compared to their non-interacting counterparts.
Crosswords are a form of word puzzle that require a solver to demonstrate a high degree of proficiency in natural language understanding, wordplay, reasoning, and world knowledge, along with adherence to character and length constraints. In this paper we tackle the challenge of solving crosswords with large language models (LLMs). We demonstrate that the current generation of language models shows significant competence at deciphering cryptic crossword clues and outperforms previously reported state-of-the-art (SoTA) results by a factor of 2-3 in relevant benchmarks. We also develop a search algorithm that builds off this performance to tackle the problem of solving full crossword grids with out-of-the-box LLMs for the very first time, achieving an accuracy of 93% on New York Times crossword puzzles. Additionally, we demonstrate that LLMs generalize well and are capable of supporting answers with sound rationale.
2
Debiasing group graphical lasso estimates enables statistical inference when multiple Gaussian graphical models share a common sparsity pattern. We analyze the estimation properties of group graphical lasso, establishing convergence rates and model selection consistency under irrepresentability conditions. Based on these results, we construct debiased estimators that are asymptotically Gaussian, allowing hypothesis testing for linear combinations of precision matrix entries across populations. We also investigate regimes where irrepresentibility conditions does not hold, showing that consistency can still be attained in moderately high-dimensional settings. Simulation studies confirm the theoretical results, and applications to real datasets demonstrate the practical utility of the method.
Text-to-story visualization is challenging due to the need for consistent interaction among multiple characters across frames. Existing methods struggle with character consistency, leading to artifact generation and inaccurate dialogue rendering, which results in disjointed storytelling. In response, we introduce TaleDiffusion, a novel framework for generating multi-character stories with an iterative process, maintaining character consistency, and accurate dialogue assignment via postprocessing. Given a story, we use a pre-trained LLM to generate per-frame descriptions, character details, and dialogues via in-context learning, followed by a bounded attention-based per-box mask technique to control character interactions and minimize artifacts. We then apply an identity-consistent self-attention mechanism to ensure character consistency across frames and region-aware cross-attention for precise object placement. Dialogues are also rendered as bubbles and assigned to characters via CLIPSeg. Experimental results demonstrate that TaleDiffusion outperforms existing methods in consistency, noise reduction, and dialogue rendering.
Researchers at the Indian Statistical Institute demonstrated that transformer encoders can exactly simulate arbitrary attention mechanisms, including fundamental matrix operations and activations, within the Restricted Access Sequence Processing (RASP) framework, thereby providing algorithmic guarantees for computations previously only known to be learnable.
Swarmalators, entities that combine the properties of swarming particles with synchronized oscillations, represent a novel and growing area of research in the study of collective behavior. This review provides a comprehensive overview of the current state of swarmalator research, focusing on the interplay between spatial organization and temporal synchronization. After a brief introduction to synchronization and swarming as separate phenomena, we discuss the various mathematical models that have been developed to describe swarmalator systems, highlighting the key parameters that govern their dynamics. The review also discusses the emergence of complex patterns, such as clustering, phase waves, and synchronized states, and how these patterns are influenced by factors such as interaction range, coupling strength, and frequency distribution. Recently, some minimal models were proposed that are solvable and mimic real-world phenomena. The effect of predators in the swarmalator dynamics is also discussed. Finally, we explore potential applications in fields ranging from robotics to biological systems, where understanding the dual nature of swarming and synchronization could lead to innovative solutions. By synthesizing recent advances and identifying open challenges, this review aims to provide a foundation for future research in this interdisciplinary field.
We do a careful investigation of the prospects of dark energy (DE) interacting with cold dark matter in alleviating the S8S_8 clustering tension. To this end, we consider various well-known parametrizations of the DE equation of state (EoS) and consider perturbations in both the dark sectors, along with an interaction term. Moreover, we perform a separate study for the phantom and non-phantom regimes. Using cosmic microwave background (CMB), baryon acoustic oscillations, and Type Ia supernovae data sets, constraints on the model parameters for each case have been obtained and a generic reduction in the H0σ8,0H_0-\sigma_{8,0} correlation has been observed, both for constant and dynamical DE EoS. This reduction, coupled with a significant negative correlation between the interaction term and σ8,0\sigma_{8,0}, contributes to easing the clustering tension by lowering σ8,0\sigma_{8,0} to somewhere in between the early CMB and late-time clustering measurements for the phantom regime, for almost all the models under consideration. Additionally, this is achieved without exacerbating the Hubble tension. In this regard, the interacting Chevallier-Polarski-Linder and Jassal-Bagla-Padmanabhan models perform the best in relaxing the S8S_8 tension to <1\sigma. However, for the non-phantom regime the σ8,0\sigma_{8,0} tension tends to have worsened, which reassures the merits of phantom DE from latest data. We further investigate the role of redshift space distortion data sets and find an overall reduction in tension, with a σ8,0\sigma_{8,0} value relatively closer to the CMB value. We finally check whether further extensions of this scenario, such as the inclusion of the sound speed of DE and warm dark matter interacting with DE, can have some effects.
This review comprehensively surveys the Landauer Principle and its role in the thermodynamics of computation, synthesizing theoretical advancements, experimental validations, and applications to classical and quantum computing models. It details how the principle generalizes to finite-time, non-equilibrium, and quantum regimes, addressing the physical limits of energy dissipation in information processing.
Assessing whether a sample survey credibly represents the population is a critical question for ensuring the validity of downstream research. Generally, this problem reduces to estimating the distance between two high-dimensional distributions, which typically requires a number of samples that grows exponentially with the dimension. However, depending on the model used for data analysis, the conclusions drawn from the data may remain consistent across different underlying distributions. In this context, we propose a task-based approach to assess the credibility of sampled surveys. Specifically, we introduce a model-specific distance metric to quantify this notion of credibility. We also design an algorithm to verify the credibility of survey data in the context of regression models. Notably, the sample complexity of our algorithm is independent of the data dimension. This efficiency stems from the fact that the algorithm focuses on verifying the credibility of the survey data rather than reconstructing the underlying regression model. Furthermore, we show that if one attempts to verify credibility by reconstructing the regression model, the sample complexity scales linearly with the dimensionality of the data. We prove the theoretical correctness of our algorithm and numerically demonstrate our algorithm's performance.
The Temporal Fusion Transformer (TFT), proposed by Lim et al. [\textit{International Journal of Forecasting}, 2021], is a state-of-the-art attention-based deep neural network architecture specifically designed for multi-horizon time series forecasting. It has demonstrated significant performance improvements over existing benchmarks. In this work, we propose a Quantum Temporal Fusion Transformer (QTFT), a quantum-enhanced hybrid quantum-classical architecture that extends the capabilities of the classical TFT framework. Our results demonstrate that QTFT is successfully trained on the forecasting datasets and is capable of accurately predicting future values. In particular, our experimental results display that in certain test cases, the model outperforms its classical counterpart in terms of both training and test loss, while in the remaining cases, it achieves comparable performance. A key advantage of our approach lies in its foundation on a variational quantum algorithm, enabling implementation on current noisy intermediate-scale quantum (NISQ) devices without strict requirements on the number of qubits or circuit depth.
The idea of \textit{Virtual Try-ON} (VTON) benefits e-retailing by giving an user the convenience of trying a clothing at the comfort of their home. In general, most of the existing VTON methods produce inconsistent results when a person posing with his arms folded i.e., bent or crossed, wants to try an outfit. The problem becomes severe in the case of long-sleeved outfits. As then, for crossed arm postures, overlap among different clothing parts might happen. The existing approaches, especially the warping-based methods employing \textit{Thin Plate Spline (TPS)} transform can not tackle such cases. To this end, we attempt a solution approach where the clothing from the source person is segmented into semantically meaningful parts and each part is warped independently to the shape of the person. To address the bending issue, we employ hand-crafted geometric features consistent with human body geometry for warping the source outfit. In addition, we propose two learning-based modules: a synthesizer network and a mask prediction network. All these together attempt to produce a photo-realistic, pose-robust VTON solution without requiring any paired training data. Comparison with some of the benchmark methods clearly establishes the effectiveness of the approach.
This paper examines the characterizations of equilibrium in economies with public projects. Public goods, as discussed by Mas-Colell (1980), are modeled as elements of an abstract set lacking a unified ordering structure. We introduce the concepts of cost share equilibrium for such economies, where the private commodity space is a (possibly nonseparable) Banach lattice. Within this framework, we present two distinct characterizations of cost share equilibria via the veto power of the grand coalition in economies featuring finitely many agents. The first characterization involves allocations that are Aubin non-dominated, while the second establishes that an allocation is a cost share equilibrium if and only if it cannot be dominated by the grand coalition, where domination is considered under specific perturbations of initial endowments.
We develop a framework for quantum differential privacy (QDP) based on quantum hypothesis testing and Blackwell's ordering. This approach characterizes (\eps,δ)(\eps,\delta)-QDP via hypothesis testing divergences and identifies the most informative quantum state pairs under privacy constraints. We apply this to analyze the stability of quantum learning algorithms, generalizing classical results to the case δ>0\delta>0. Additionally, we study privatized quantum parameter estimation, deriving tight bounds on the quantum Fisher information under QDP. Finally, we establish near-optimal contraction bounds for differentially private quantum channels with respect to the hockey-stick divergence.
We propose an e-value based framework for testing arbitrary composite nulls against composite alternatives, when an ϵ\epsilon fraction of the data can be arbitrarily corrupted. Our tests are inherently sequential, being valid at arbitrary data-dependent stopping times, but they are new even for fixed sample sizes, giving type-I error control without any regularity conditions. We first prove that least favourable distribution (LFD) pairs, when they exist, yield optimal e-values for testing arbitrary composite nulls against composite alternatives. Then we show that if an LFD pair exists for some composite null and alternative, then the LFDs of Huber's ϵ\epsilon-contamination or total variation (TV) neighborhoods around that specific pair form the optimal LFD pair for the corresponding robustified composite hypotheses. Furthermore, where LFDs do not exist, we develop new robust composite tests for general settings. Our test statistics are a nonnegative supermartingale under the (robust) null, even under a sequentially adaptive (non-i.i.d.) contamination model where the conditional distribution of each observation given the past data lies within an ϵ\epsilon TV ball of some distribution in the original composite null. When LFDs exist, our supermartingale grows to infinity exponentially fast under any distribution in the (ϵ\epsilon TV-corruption of the) alternative at the optimal rate. When LFDs do not exist, we provide an asymptotic growth rate analysis, showing that as ϵ0\epsilon \to 0, the exponent converges to the corresponding Kullback-Leibler divergence, recovering the classical optimal non-robust rate. Simulations validate the theory and demonstrate reasonable practical performance.
Researchers from the Institute for Advancing Intelligence (IAI), TCG CREST, introduced SC-pNA, a self-tuning spectral clustering method for speaker diarization that adaptively prunes affinity matrices. This approach improved computational efficiency and achieved a Diarization Error Rate (DER) of 10.27% on the challenging DIHARD-III evaluation set, outperforming other unsupervised methods and demonstrating performance competitive with semi-supervised baselines.
There are no more papers matching your filters at the moment.