Nanjing Normal University
Researchers at Nanjing Normal University developed PA-Diff, a physics-aware Transformer-based diffusion model for underwater image enhancement that integrates physical imaging models into its generative process. The framework achieves superior perceptual quality, evidenced by a Fréchet Inception Distance (FID) of 22.13 on the LSUI dataset, outperforming previous state-of-the-art methods.
The Circular Electron-Positron Collider (CEPC), a proposed next-generation Higgs factory, provides new opportunities to explore physics beyond the Standard Model (SM). With its clean electron-positron collision environment and the ability to collect large samples of Higgs, W, and Z bosons, the CEPC enables precision measurements and searches for new physics. This white paper outlines the CEPC's discovery potential, including studies of exotic decays of the Higgs, Z, and top quarks, dark matter and dark sector phenomena, long-lived particles, supersymmetry, and neutrino-related signatures. Advanced detector technologies and reconstruction techniques, such as one-to-one correspondence reconstruction and jet origin identification, significantly improve sensitivity to rare and weakly interacting processes. The CEPC is particularly well suited to probe the electroweak phase transition and test models of electroweak baryogenesis and dark sector interactions. In addition, global fit analyses highlight the CEPC's complementary role in constraining a wide range of new physics scenarios. These features position the CEPC as a powerful tool for exploring the next frontier in fundamental particle physics in the post-Higgs discovery era.
Nanjing University researchers developed MHier-RAG, a multi-modal RAG framework designed for visually-rich document question-answering, which addresses challenges in connecting different modalities and reasoning across fragmented, distant document sections. The system improved generalized accuracy by 19.9% over GPT-4V and by 27.2% over a prior RAG SOTA on the MMLongBench-Doc dataset.
This study investigates the use of generative AI and multi-agent systems to provide automatic feedback in educational contexts, particularly for student constructed responses in science assessments. The research addresses a key gap in the field by exploring how multi-agent systems, called AutoFeedback, can improve the quality of GenAI-generated feedback, overcoming known issues such as over-praise and over-inference that are common in single-agent large language models (LLMs). The study developed a multi-agent system consisting of two AI agents: one for generating feedback and another for validating and refining it. The system was tested on a dataset of 240 student responses, and its performance was compared to that of a single-agent LLM. Results showed that AutoFeedback significantly reduced the occurrence of over-praise and over-inference errors, providing more accurate and pedagogically sound feedback. The findings suggest that multi-agent systems can offer a more reliable solution for generating automated feedback in educational settings, highlighting their potential for scalable and personalized learning support. These results have important implications for educators and researchers seeking to leverage AI in formative assessments, offering a pathway to more effective feedback mechanisms that enhance student learning outcomes.
FreeZAD is introduced as the first training-free method for zero-shot temporal action detection, directly leveraging Vision-Language models. It demonstrates superior performance over state-of-the-art unsupervised approaches and offers high computational efficiency.
Out-of-Distribution (OOD) detection is a cornerstone for the safe deployment of AI systems in the open world. However, existing methods treat OOD detection as a binary classification problem, a cognitive flattening that fails to distinguish between semantically close (Near-OOD) and distant (Far-OOD) unknown risks. This limitation poses a significant safety bottleneck in applications requiring fine-grained risk stratification. To address this, we propose a paradigm shift from a conventional probabilistic view to a principled information-theoretic framework. We formalize the core task as quantifying the Semantic Surprise of a new sample and introduce a novel ternary classification challenge: In-Distribution (ID) vs. Near-OOD vs. Far-OOD. The theoretical foundation of our work is the concept of Low-Entropy Semantic Manifolds, which are explicitly structured to reflect the data's intrinsic semantic hierarchy. To construct these manifolds, we design a Hierarchical Prototypical Network. We then introduce the Semantic Surprise Vector (SSV), a universal probe that decomposes a sample's total surprise into three complementary and interpretable dimensions: conformity, novelty, and ambiguity. To evaluate performance on this new task, we propose the Normalized Semantic Risk (nSR), a cost-sensitive metric. Experiments demonstrate that our framework not only establishes a new state of the art (SOTA) on the challenging ternary task, but also yields robust representations that achieve top results on conventional binary benchmarks, reducing the False Positive Rate by over 60% on datasets like LSUN.
We report a state-of-the-art lattice QCD calculation of the isovector quark transversity distribution of the proton in the continuum and physical mass limit using large-momentum effective theory. The calculation is done at four lattice spacings $a=\{0.098,0.085,0.064,0.049\}$ fm and various pion masses ranging between 220 and 350 MeV, with proton momenta up to 2.8 GeV. The result is non-perturbatively renormalized in the hybrid scheme with self renormalization, which treats the infrared physics at large correlation distance properly, and extrapolated to the continuum, physical mass and infinite momentum limit. We also compare with recent global analyses for the nucleon isovector quark transversity distribution.
Randerath et al. [Discrete Math. 251 (2002) 137-153] proved that every $(P_6,C_3)$-free graph $G$ satisfies $\chi(G)\leq4$. Pyatkin [Discrete Math. 313 (2013) 715-720] proved that every $(2P_3,C_3)$-free graph $G$ satisfies $\chi(G)\leq4$. In this paper, we prove that for a connected $(P_2\cup P_4, C_3)$-free graph $G$, either $G$ has two nonadjacent vertices $u,v$ such that $N(u)\subseteq N(v)$, or $G$ is 3-colorable, or $G$ contains the Grötzsch graph as an induced subgraph and is an induced subgraph of the Clebsch graph. Consequently, we determine that the maximum chromatic number of $(P_2\cup P_4, C_3)$-free graphs is 4. A graph $G$ is perfectly divisible if, for each induced subgraph $H$ of $G$, $V(H)$ can be partitioned into $A$ and $B$ such that $H[A]$ is perfect and $\omega(H[B])<\omega(H)$. A bull is a graph consisting of a triangle with two disjoint pendant edges. Deng and Chang [Graphs Combin. (2025) 41: 63] proved that every ($P_2\cup P_3$, bull)-free graph $G$ with $\omega(G)\geq3$ has a partition $(X,Y)$ such that $G[X]$ is perfect and $G[Y]$ has clique number less than $\omega(G)$ if $G$ admits no homogeneous set; Chen and Wang [arXiv:2507.18506v2] proved that this property also holds for ($P_2\cup P_4$, bull)-free graphs. In this paper, we prove that a ($P_2\cup P_4$, bull)-free graph is perfectly divisible if and only if it contains no Grötzsch graph.
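The Grötzsch graph named in the abstract above is the standard example of a triangle-free graph that needs four colors. As a minimal sketch (not taken from the paper), the following builds it as the Mycielskian of the 5-cycle and checks both properties by brute force. The function names and the vertex numbering are illustrative assumptions.

```python
def grotzsch_graph():
    """Adjacency sets of the Grötzsch graph, built as the Mycielskian of C5.

    Assumed numbering: 0-4 are the cycle vertices v_i, 5-9 are the
    shadow vertices u_i, and 10 is the apex w.
    """
    adj = {v: set() for v in range(11)}

    def add_edge(a, b):
        adj[a].add(b)
        adj[b].add(a)

    for i in range(5):
        add_edge(i, (i + 1) % 5)      # cycle edge v_i - v_{i+1}
        add_edge(5 + i, (i + 1) % 5)  # u_i is joined to both neighbours of v_i
        add_edge(5 + i, (i - 1) % 5)
        add_edge(5 + i, 10)           # apex w is joined to every u_i
    return adj


def is_triangle_free(adj):
    """A triangle exists exactly when some edge has a common neighbour."""
    return all(not (adj[a] & adj[b]) for a in adj for b in adj[a])


def is_k_colorable(adj, k):
    """Backtracking search for a proper colouring with k colours."""
    vertices = sorted(adj)
    color = {}

    def assign(idx):
        if idx == len(vertices):
            return True
        v = vertices[idx]
        for c in range(k):
            if all(color.get(n) != c for n in adj[v]):
                color[v] = c
                if assign(idx + 1):
                    return True
                del color[v]
        return False

    return assign(0)


def chromatic_number(adj):
    """Smallest k for which a proper k-colouring exists."""
    k = 1
    while not is_k_colorable(adj, k):
        k += 1
    return k


if __name__ == "__main__":
    g = grotzsch_graph()
    print("triangle-free:", is_triangle_free(g))
    print("chromatic number:", chromatic_number(g))
```

The exhaustive search is only practical because the graph has 11 vertices; it is meant as a sanity check, not as a general coloring algorithm.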
Circularly polarized or axial phonons possessing nonzero angular momentum have attracted considerable interest. These phonons have finite magnetic moment and can couple to internal magnetic order. The rich magnetic structures enable phonon angular momentum (PAM) to acquire momentum-space textures analogous to electronic spin structures. However, a systematic framework for classifying these textures, especially their potential higher-order multipolar patterns, has remained elusive. Here, by employing magnetic point group analysis, we develop a complete classification of long-wavelength phonons in collinear magnets. Our theory distinguishes four fundamental types, including three families of magneto-axial phonons differentiated by symmetry and the parity (odd or even) of the PAM wave pattern. Strikingly, we reveal a full sequence of axial phonons exhibiting higher-order-wave (from p- to j-wave) PAM patterns covering both odd and even parities, which we term magneto-alteraxial phonons. Our high-throughput calculations predict hundreds of magnetic candidates hosting such magneto-alteraxial phonons. We have also performed ab initio calculations on representative materials to validate the proposed magneto-alteraxial phonon spectra and PAM patterns. Our work establishes a symmetry-guided design principle for axial phonons and related phenomena in magnetic materials.
We present the state-of-the-art lattice QCD calculation of the pion and kaon light-cone distribution amplitudes (DAs) using large-momentum effective theory. The calculation is done at three lattice spacings $a\approx\{0.06,0.09,0.12\}$ fm and physical pion and kaon masses, with the meson momenta $P_z = \{1.29,1.72,2.15\}$ GeV. The result is non-perturbatively renormalized in a recently proposed hybrid scheme with self renormalization, and extrapolated to the continuum as well as the infinite momentum limit. We find a significant deviation of the pion and kaon DAs from the asymptotic form, and a large $SU(3)$ flavor breaking effect in the kaon DA.
Wide-angle videos in few-shot action recognition (FSAR) effectively express actions within specific scenarios. However, without a global understanding of both subjects and background, recognizing actions in such samples remains challenging because of background distractions. Receptance Weighted Key Value (RWKV), which learns interactions between various dimensions, shows promise for global modeling. Yet directly applying RWKV to wide-angle FSAR may fail to highlight subjects due to excessive background information. Additionally, temporal relations degraded by frames with similar backgrounds are difficult to reconstruct, further impacting performance. Therefore, we design the CompOund SegmenTation and Temporal REconstructing RWKV (Otter). Specifically, the Compound Segmentation Module (CSM) is devised to segment and emphasize key patches in each frame, effectively highlighting subjects against background information. The Temporal Reconstruction Module (TRM) is incorporated into the temporal-enhanced prototype construction to enable bidirectional scanning, allowing better reconstruction of temporal relations. Furthermore, a regular prototype is combined with the temporal-enhanced prototype to simultaneously enhance subject emphasis and temporal modeling, improving wide-angle FSAR performance. Extensive experiments on benchmarks such as SSv2, Kinetics, UCF101, and HMDB51 demonstrate that Otter achieves state-of-the-art performance. Extra evaluation on the VideoBadminton dataset further validates the superiority of Otter in wide-angle FSAR.
Representational Similarity Analysis (RSA) is a popular method for analyzing neuroimaging and behavioral data. Here we evaluate the accuracy and reliability of RSA in the context of model selection, and compare it to that of regression. Although RSA offers flexibility in handling high-dimensional, cross-modal, and cross-species data, its reliance on a transformation of raw data into similarity structures may result in the loss of critical stimulus-response information. Across extensive simulation studies and empirical analyses, we show that RSA leads to lower model selection accuracy, regardless of sample size, noise level, feature dimensionality, or multicollinearity, relative to regression. While principal component analysis and feature reweighting mitigate RSA's deficits driven by multicollinearity, regression remains superior in accurately distinguishing between models. Empirical data and a follow-up fMRI simulation further support these conclusions. Our findings suggest that researchers should carefully consider which approach to use: RSA is less effective than linear regression for model selection and fitting when direct stimulus-response mappings are available.
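To make the two quantities compared in the abstract above concrete, here is a minimal sketch, not the authors' pipeline. It scores a candidate model by RSA (correlating the upper triangles of two dissimilarity matrices) and by a one-feature least-squares fit. The Euclidean dissimilarity, the Pearson comparison, the toy data, and all function names are assumptions chosen for brevity; published RSA analyses often use other distance and correlation measures.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def rdm_upper(patterns):
    """Upper triangle of a Euclidean representational dissimilarity matrix.

    `patterns` holds one vector per stimulus.
    """
    n = len(patterns)
    return [math.dist(patterns[i], patterns[j])
            for i in range(n) for j in range(i + 1, n)]


def rsa_score(model_patterns, data_patterns):
    """RSA: correlate the model RDM with the data RDM."""
    return pearson(rdm_upper(model_patterns), rdm_upper(data_patterns))


def regression_r2(feature, response):
    """R^2 of a simple linear regression with intercept (one feature)."""
    return pearson(feature, response) ** 2


# Hypothetical toy data: five stimuli, two candidate one-feature models.
model_a = [0.0, 1.0, 2.0, 3.0, 4.0]
model_b = [0.0, 1.0, 4.0, 9.0, 16.0]
response = [0.2, 0.9, 2.1, 3.2, 3.9]

for name, feature in (("model_a", model_a), ("model_b", model_b)):
    rsa = rsa_score([[f] for f in feature], [[r] for r in response])
    r2 = regression_r2(feature, response)
    print(f"{name}: RSA={rsa:.3f}  R^2={r2:.3f}")
```

Note that the RSA score sees only pairwise distances between stimuli, whereas the regression fit uses the stimulus-response pairs directly. That is the transformation the abstract identifies as a possible source of information loss.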
AT2018cqh is a unique optical tidal disruption event (TDE) discovered in a dwarf galaxy exhibiting delayed X-ray and radio flares. We present the results from high-resolution VLBA and e-MERLIN radio observations of AT2018cqh extending to $\delta t \sim 2250$ days post discovery, which reveal compact radio emission, unresolved at a scale of $\lesssim 0.13$ pc at 7.6 GHz, with a high brightness temperature of $T_b \gtrsim 4.03 \times 10^{9}$ K. The radio spectral energy distribution (SED) is found to gradually shift towards a higher peak flux density and frequency over a period of $\sim 1000$ days. An equipartition analysis suggests that there is little change in the radio emitting region over this period, while the electron density increases by a factor of 3. The radio light curve at 0.89 GHz continues to rise, with a bump feature lasting for 240 days. These properties are in contrast to the predictions of the standard shockwave model from a diffuse circumnuclear medium, but could be explained if dense clouds exist in the circumnuclear environment. The latter scenario is supported by our hydrodynamic simulations of the interaction of a TDE outflow with a cloud, which can reproduce the temporal evolution in the radio SED. This work highlights the importance of the outflow-cloud interaction in explaining the delayed, fast-rising radio emission observed in some TDEs, especially those occurring in galaxies with pre-existing AGN activity.
In this paper, we study a class of dissipative stochastic differential equations driven by nonlinear multiplicative fractional Brownian noise with Hurst index $H \in \left(\frac{1}{3},\frac{1}{2}\right)\cup\left(\frac{1}{2}, 1\right)$. We establish the well-posedness of the associated coupled stochastic differential equations and prove synchronization in the sense of trajectories. Our approach relies on the Doss-Sussmann transformation, which enables us to extend existing results for additive and linear noise to the case of nonlinear multiplicative fractional Brownian noise. The findings provide new insights into the synchronization of dissipative systems under fractional noise perturbations.
Nuclear power plants are not only vital sources of clean energy but also powerful facilities for probing new physics beyond the Standard Model. Due to the intense gamma-ray flux and appropriate energy conditions, they are particularly well-suited for searches of light hypothetical particles such as sub-MeV axions and axion-like particles (ALPs). In this work, we propose to search for ALPs in the REactor Neutrino COherent scattering Detection Experiment (RECODE), where two low-threshold, high-purity germanium detectors are placed at 11 m (near point) and 22 m (far point) from a 3.4 GW nuclear reactor at the Sanmen nuclear power plant. With a 10 kg$\cdot$year exposure, we demonstrate that the expected sensitivities to the ALP couplings to electrons and photons are competitive with or surpass the available results from beam-dump experiments. A planned upgrade to 100 kg$\cdot$year will fully cover the so-called cosmological triangle region, probing unexplored parameter space relevant to axions.
In recent years, the prospect of detecting gravitational waves sourced from a strongly first-order cosmological phase transition has emerged as one of the most exciting frontiers of gravitational wave astronomy. Cosmological phase transitions are an essential ingredient in the Standard Model of particle cosmology, and help explain the mechanism for creation of matter in the early Universe, provide insights into fundamental theories of physics, and shed light on the nature of dark matter. This underscores the significance of developing robust end-to-end tools for determining the resulting gravitational waves from these phase transitions. In this article we present PhaseTracer2, an improved version of the C++ software package PhaseTracer, designed for mapping cosmological phases and transitions in Standard Model extensions of multiple scalar fields. Building on the robust framework of its predecessor, PhaseTracer2 extends its capabilities by including new features crucial for a more comprehensive analysis of cosmological phase transitions. It can calculate more complex properties, such as the bounce action through the path deformation method or an interface with BubbleProfiler, thermodynamic parameters, and gravitational wave spectra. Its applicability has also been broadened via incorporating the dimensionally reduced effective potential for models obtained from DRalgo, as well as calculations in the MSbar and OS-like renormalisation schemes. This modular, flexible, and practical upgrade retains the speed and stability of the original PhaseTracer, while significantly expanding its utility.
We investigate the extent to which perturbative calculations of the electroweak phase transition are arbitrary and uncertain, owing to their gauge, renormalisation scale and scheme dependence, as well as treatments of the Goldstone catastrophe and daisy diagrams. Using the complete parameter space of the Standard Model extended by a real scalar singlet with a $\mathbb{Z}_2$ symmetry as a test, we explore the properties of the electroweak phase transition in general $R_\xi$ and covariant gauges, OS and $\overline{\text{MS}}$ renormalisation schemes, and for common treatments of the Goldstone catastrophe and daisy diagrams. Reassuringly, we find that different renormalisation schemes and different treatments of the Goldstone catastrophe and daisy diagrams typically lead to only modest changes in predictions for the critical temperature and strength of the phase transition. On the other hand, the gauge and renormalisation scale dependence may be significant, and often impacts the existence of the phase transition altogether.
The diffusion of high-energy cosmic rays (CRs) through the dark matter (DM) spikes of active galactic nuclei entails significant energy loss via interactions with DM. While previous studies of sub-GeV DM have focused on elastic scattering, this process becomes insufficient at higher proton energies and DM masses. In this work, we investigate CR-DM deep inelastic scattering (DIS) as mediated by a vector portal. We calculate the DIS contribution to the CR energy loss rate and derive stringent exclusion limits on the CR-DM scattering cross-section for DM masses between $10^{-6}$ GeV and $1$ GeV. For higher CR energies and mediator masses, the resulting CR cooling timescales are reduced by orders of magnitude once the DIS contribution is included, producing stringent constraints that surpass most current experimental limits.
Several Pulsar Timing Array (PTA) collaborations have recently reported evidence for a stochastic gravitational-wave background (SGWB), which can unveil the formation of primordial seeds of inhomogeneities in the early universe. With the SGWB parameters inferred from PTA data, we can predict the seeds for early galaxy formation from the domain walls in the axion-like particle (ALP) field distribution. This also naturally provides an explanation for the high-redshift observations by the James Webb Space Telescope. The predicted photon coupling of the ALP is within the reach of future experimental searches.
In few-shot action recognition (FSAR), long sub-sequences of video naturally express entire actions more effectively. However, the high computational complexity of mainstream Transformer-based methods limits their application. The recently proposed Mamba demonstrates efficiency in modeling long sequences, but directly applying Mamba to FSAR overlooks the importance of local feature modeling and alignment. Moreover, long sub-sequences within the same class accumulate intra-class variance, which adversely impacts FSAR performance. To solve these challenges, we propose a Matryoshka MAmba and CoNtrasTive LeArning framework (Manta). Firstly, the Matryoshka Mamba introduces multiple Inner Modules to enhance local feature representation, rather than directly modeling global features. An Outer Module captures temporal dependencies between these local features for implicit temporal alignment. Secondly, a hybrid contrastive learning paradigm, combining both supervised and unsupervised methods, is designed to mitigate the negative effects of intra-class variance accumulation. The Matryoshka Mamba and the hybrid contrastive learning paradigm operate in two parallel branches within Manta, enhancing Mamba for FSAR on long sub-sequences. Manta achieves new state-of-the-art performance on prominent benchmarks, including SSv2, Kinetics, UCF101, and HMDB51. Extensive empirical studies show that Manta significantly improves FSAR on long sub-sequences from multiple perspectives.