Beijing Institute of Mathematical Sciences and Applications
In this study, we unveil a new AI model, termed PhyE2E, to discover physical formulas through symbolic regression. PhyE2E simplifies symbolic regression by decomposing it into sub-problems using the second-order derivatives of an oracle neural network, and employs a transformer model to translate data into symbolic formulas in an end-to-end manner. The resulting formulas are refined through Monte-Carlo Tree Search and Genetic Programming. We leverage a large language model to synthesize extensive symbolic expressions resembling real physics, and train the model to recover these formulas directly from data. A comprehensive evaluation reveals that PhyE2E outperforms existing state-of-the-art approaches, delivering superior symbolic accuracy, precision in data fitting, and consistency in physical units. We deployed PhyE2E in five applications in space physics: the prediction of sunspot numbers, solar rotational angular velocity, emission line contribution functions, near-Earth plasma pressure, and lunar-tide plasma signals. The physical formulas generated by the model fit the experimental data from satellites and astronomical telescopes with a high degree of accuracy. We have successfully upgraded the formula proposed by NASA in 1993 regarding solar activity and, for the first time, provided an explicit-form explanation for the long cycle of solar activity. We also found that the decay of near-Earth plasma pressure is proportional to $r^2$, where $r$ is the distance to Earth; subsequent mathematical derivations are consistent with satellite data from another independent study. Moreover, we found physical formulas that describe the relationships between emission lines in the extreme ultraviolet spectrum of the Sun, temperatures, electron densities, and magnetic fields. The formula obtained is consistent with the properties that physicists had previously hypothesized it should possess.
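A minimal sketch (our illustration, not the authors' pipeline) of the decomposition idea: if all mixed second derivatives of the oracle across two variable groups vanish, the target function splits additively and each group can be regressed separately. Here the oracle is a toy analytic function, the derivatives are finite differences, and the function names and tolerance are assumptions.

```python
import numpy as np

def mixed_partial(f, x, i, j, h=1e-3):
    """Central finite-difference estimate of d^2 f / (dx_i dx_j) at x."""
    e_i, e_j = np.zeros_like(x), np.zeros_like(x)
    e_i[i], e_j[j] = h, h
    return (f(x + e_i + e_j) - f(x + e_i - e_j)
            - f(x - e_i + e_j) + f(x - e_i - e_j)) / (4 * h * h)

def additively_separable(f, groups, dim=4, n_samples=50, tol=1e-4):
    """Heuristic test: f splits additively across the two variable groups if
    every cross-group mixed partial vanishes at randomly sampled points."""
    rng = np.random.default_rng(0)
    for _ in range(n_samples):
        x = rng.uniform(-1.0, 1.0, size=dim)
        for a in groups[0]:
            for b in groups[1]:
                if abs(mixed_partial(f, x, a, b)) > tol:
                    return False
    return True

# Toy oracle: f(x) = sin(x0*x1) + x2^2*x3 separates into the groups {0,1} and {2,3}.
f = lambda x: np.sin(x[0] * x[1]) + x[2] ** 2 * x[3]
print(additively_separable(f, ([0, 1], [2, 3])))   # True: decompose into two sub-problems
print(additively_separable(f, ([0, 2], [1, 3])))   # False: this split does not decouple
```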
LMNet introduces a system where language models communicate directly using dense vector representations, bypassing natural language. This approach enables smaller, collaborating models to achieve performance comparable to larger monolithic systems while significantly reducing training computational costs.
Topological data analysis (TDA) is a rapidly evolving field in applied mathematics and data science that leverages tools from topology to uncover robust, shape-driven insights in complex datasets. The main workhorse is persistent homology, a technique rooted in algebraic topology. Paired with topological deep learning (TDL) or topological machine learning, persistent homology has achieved tremendous success in a wide variety of applications in science, engineering, medicine, and industry. However, persistent homology has many limitations due to its high-level abstraction, insensitivity to non-topological changes, and reliance on point cloud data. This paper presents a comprehensive review of TDA and TDL beyond persistent homology. It analyzes how persistent topological Laplacians and Dirac operators provide spectral representations to capture both topological invariants and homotopic evolution. Other formulations are presented in terms of sheaf theory, Mayer topology, and interaction topology. For data on differentiable manifolds, techniques rooted in differential topology, such as persistent de Rham cohomology, persistent Hodge Laplacian, and Hodge decomposition, are reviewed. For one-dimensional (1D) curves embedded in 3-space, approaches from geometric topology are discussed, including multiscale Gauss-link integrals, persistent Jones polynomials, and persistent Khovanov homology. This paper further discusses the appropriate selection of topological tools for different input data, such as point clouds, sequential data, data on manifolds, curves embedded in 3-space, and data with additional non-geometric information. A review is also given of various topological representations, software packages, and machine learning vectorizations. Finally, this review ends with concluding remarks.
Meta-learning enables learning systems to adapt quickly to new tasks, similar to humans. Despite their differences, meta-learning approaches all operate within the mini-batch episodic training framework. This framework naturally provides task-identity information, which can serve as additional supervision during meta-training to improve generalizability. Inspired by the alignment and discrimination abilities that are intrinsic to humans' fast learning, we propose to exploit task identity as additional supervision in meta-training. This is achieved by contrasting what meta-learners learn, i.e., model representations. The proposed ConML evaluates and optimizes this contrastive meta-objective within a problem- and learner-agnostic meta-training framework. We demonstrate that ConML integrates seamlessly with existing meta-learners, as well as with in-context learning models, and brings a significant performance boost at small implementation cost.
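A hedged sketch of what a contrastive meta-objective over model representations could look like: representations of the meta-learner adapted on two disjoint splits of the same task are treated as positives and representations from other tasks as negatives (InfoNCE-style); the paper's exact objective and representation extraction may differ.

```python
import torch
import torch.nn.functional as F

def conml_style_loss(z_a, z_b, temperature=0.1):
    """Contrastive meta-objective sketch: z_a[i] and z_b[i] are representations
    of the meta-learner adapted on two disjoint splits of task i
    (shape: n_tasks x dim). Same-task pairs are positives, cross-task pairs
    are negatives."""
    z_a = F.normalize(z_a, dim=-1)
    z_b = F.normalize(z_b, dim=-1)
    logits = z_a @ z_b.T / temperature        # n_tasks x n_tasks similarities
    labels = torch.arange(z_a.size(0))        # positives sit on the diagonal
    return F.cross_entropy(logits, labels)

# Toy usage with random "adapted-model" representations for 8 tasks.
loss = conml_style_loss(torch.randn(8, 64), torch.randn(8, 64))
```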
Based on the spectral decomposition technique, we introduce a simple and universal numerical method to analyze the stability of solitons. Adopting this method, the linear dynamical properties of $Q$-balls are systematically revealed, from the fundamental to the excited states. For the fundamental $Q$-ball, the well-known stability criterion holds. However, for the excited $Q$-balls, the situation becomes extremely complicated, and the stability criterion is violated. The system exhibits dynamical instability to both spherically symmetric and non-spherically symmetric perturbations, manifested in the appearance of complex and imaginary modes. In addition, we observe two interesting phenomena. One is that the oscillation mode and the complex or imaginary mode can transform into each other, marking the transition of the dynamical properties of the system. The other is the existence of excited $Q$-balls capable of resisting perturbations with low-order spherical harmonics. These results indicate that the excited $Q$-balls exhibit rich dynamical behaviors.
Finding an $\epsilon$-stationary point of a nonconvex function with a Lipschitz continuous Hessian is a central problem in optimization. Regularized Newton methods are a classical tool and have been studied extensively, yet they still face a trade-off between global and local convergence. Whether a parameter-free algorithm of this type can simultaneously achieve optimal global complexity and quadratic local convergence remains an open question. To bridge this long-standing gap, we propose a new class of regularizers constructed from the current and previous gradients, and leverage the conjugate gradient approach with a negative curvature monitor to solve the regularized Newton equation. The proposed algorithm is adaptive, requiring no prior knowledge of the Hessian Lipschitz constant, and achieves a global complexity of $O(\epsilon^{-3/2})$ in terms of second-order oracle calls and $\tilde{O}(\epsilon^{-7/4})$ in terms of Hessian-vector products. When the iterates converge to a point where the Hessian is positive definite, the method exhibits quadratic local convergence. Preliminary numerical results, including training physics-informed neural networks, illustrate the competitiveness of our algorithm.
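The sketch below illustrates the two ingredients named above in simplified form: a regularized Newton system solved by conjugate gradients with a negative-curvature monitor. The regularizer $\lambda = \sigma\sqrt{\|g\|}$ is a common choice in gradient-regularized Newton methods and stands in for the paper's construction from current and previous gradients; it is not the authors' algorithm.

```python
import numpy as np

def cg_with_curvature_monitor(matvec, b, tol=1e-8, max_iter=200):
    """Conjugate gradients for matvec(d) = b; if a direction of nonpositive
    curvature is encountered, return it instead (flagged by the second output)."""
    x = np.zeros_like(b)
    r = b.copy()                 # residual b - A x with x = 0
    p = r.copy()
    rs = r @ r
    for _ in range(max_iter):
        Ap = matvec(p)
        curv = p @ Ap
        if curv <= 0:            # negative (or zero) curvature detected
            return p, False
        alpha = rs / curv
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x, True

def regularized_newton_step(x, grad, hess, sigma=1.0):
    """One simplified step: solve (H + sigma*sqrt(||g||) I) d = -g with monitored CG,
    falling back to the negative-curvature direction if one is found."""
    g, H = grad(x), hess(x)
    lam = sigma * np.sqrt(np.linalg.norm(g))
    d, ok = cg_with_curvature_monitor(lambda v: H @ v + lam * v, -g)
    if not ok:
        d = d if d @ g < 0 else -d          # make it a descent direction
    return x + d

# Toy usage on the nonconvex function f(x) = x0^4/4 - x0^2/2 + x1^2.
grad = lambda x: np.array([x[0] ** 3 - x[0], 2 * x[1]])
hess = lambda x: np.array([[3 * x[0] ** 2 - 1, 0.0], [0.0, 2.0]])
x = np.array([0.1, 1.0])
for _ in range(20):
    x = regularized_newton_step(x, grad, hess)
print(x)   # approaches the local minimum near (1, 0)
```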
We analyze the double Wick rotated BTZ black hole with Euclidean signature, which is a Riemannian manifold. We calculate the thermodynamics, the total energy of the spacetime, and holographic two-point functions in the double Wick rotated background. The results agree with those of a rotating BTZ black hole with the same periodicity.
The detection of gravitational waves from extreme-mass-ratio inspirals (EMRIs) by space-borne antennas like Taiji and LISA promises deep insights into strong-field gravity and black hole physics. However, the complex, highly degenerate, and non-convex likelihood landscapes characteristic of EMRI parameter spaces pose severe challenges for conventional Markov chain Monte Carlo (MCMC) methods. Under realistic instrumental noise and broad priors, these methods demand impractical computational costs and are prone to becoming trapped in local maxima, leading to biased and unreliable parameter estimates. To address this, we introduce Flow-Matching Markov Chain Monte Carlo (FM-MCMC), a novel Bayesian framework that integrates continuous normalizing flows (CNFs) with parallel tempering MCMC (PTMCMC). By using CNFs to locate high-likelihood regions and PTMCMC to refine the samples within them, FM-MCMC enables robust exploration of the nontrivial parameter spaces, while achieving orders-of-magnitude improvement in computational efficiency and, more importantly, ensuring statistically reliable and unbiased inference. By enabling real-time, unbiased parameter inference, FM-MCMC could unlock the full scientific potential of EMRI observations and serve as a scalable pipeline for precision gravitational-wave astronomy.
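A schematic of the two-stage pipeline in plain Python, not the FM-MCMC implementation: flow samples seed parallel-tempering chains, which then refine the posterior. The trained CNF is replaced by a stand-in Gaussian proposal and the likelihood by a toy bimodal surrogate; the temperatures, step size, and swap rule are generic choices, not Taiji/LISA settings.

```python
import numpy as np

rng = np.random.default_rng(1)

def log_like(theta):
    """Stand-in multimodal log-likelihood (real EMRI likelihoods are far harder)."""
    return np.logaddexp(-0.5 * np.sum((theta - 2.0) ** 2),
                        -0.5 * np.sum((theta + 2.0) ** 2))

def flow_sample(n, dim):
    """Placeholder for draws from a trained continuous normalizing flow:
    here just a broad Gaussian covering the high-likelihood regions."""
    return rng.normal(0.0, 3.0, size=(n, dim))

def pt_mcmc(n_steps=5000, dim=2, betas=(1.0, 0.5, 0.25), step=0.3):
    # Stage 1: seed each tempered chain from a flow draw.
    chains = flow_sample(len(betas), dim)
    samples = []
    for t in range(n_steps):
        # Stage 2: within-temperature Metropolis refinement.
        for k, beta in enumerate(betas):
            prop = chains[k] + step * rng.normal(size=dim)
            if np.log(rng.uniform()) < beta * (log_like(prop) - log_like(chains[k])):
                chains[k] = prop
        # Occasional swaps between adjacent temperatures.
        if t % 10 == 0:
            for k in range(len(betas) - 1):
                d = (betas[k] - betas[k + 1]) * (log_like(chains[k + 1]) - log_like(chains[k]))
                if np.log(rng.uniform()) < d:
                    chains[[k, k + 1]] = chains[[k + 1, k]]
        samples.append(chains[0].copy())
    return np.array(samples)

posterior = pt_mcmc()   # samples from the cold chain
```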
Large language models (LLMs) exhibit unprecedentedly rich scaling behaviors. In physics, scaling behavior is closely related to phase transitions, critical phenomena, and field theory. To investigate the phase transition phenomena in LLMs, we reformulated the Transformer architecture as an $O(N)$ model. Our study reveals two distinct phase transitions corresponding to the temperature used in text generation and the model's parameter size, respectively. The first phase transition enables us to estimate the internal dimension of the model, while the second phase transition is of higher-depth and signals the emergence of new capabilities. As an application, the energy of the $O(N)$ model can be used to evaluate whether an LLM's parameters are sufficient to learn the training data.
Simulating real-time quantum dynamics in interacting spin systems is a fundamental challenge, where exact diagonalization suffers from exponential Hilbert-space growth and tensor-network methods face entanglement barriers. In this work, we introduce a scalable Pauli propagation approach that evolves local observables directly in the Heisenberg picture. Theoretically, we derive a priori error bounds governed by the Operator Stabilizer Rényi entropy (OSE) $\mathcal{S}^\alpha(O)$, which explicitly links the truncation accuracy to operator complexity and prescribes a suitable Top-$K$ truncation strategy. For the 1D Heisenberg model with $J_z = 0$, we prove the number of non-zero Pauli coefficients scales quadratically in Trotter steps, establishing the compressibility of Heisenberg-evolved operators. Numerically, we validate the framework on XXZ Heisenberg chain benchmarks, showing high accuracy with small $K$ in free regimes ($J_z = 0$) and competitive performance against tensor-network methods (e.g., TDVP) in interacting cases ($J_z = 0.5$). These results establish an observable-centric simulator whose cost is governed by operator complexity rather than entanglement, offering a practical alternative for studying non-equilibrium dynamics in quantum many-body systems.
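A generic Pauli-propagation sketch (not the authors' code, and without the OSE-based error bounds): the observable is stored as a dictionary of Pauli strings, each Pauli rotation in a Trotter layer splits anticommuting strings into two, and a Top-$K$ truncation keeps only the largest coefficients.

```python
import numpy as np

# Single-qubit Pauli products: (a, b) -> (phase, a*b).
_MUL = {('I','I'):(1,'I'),('I','X'):(1,'X'),('I','Y'):(1,'Y'),('I','Z'):(1,'Z'),
        ('X','I'):(1,'X'),('X','X'):(1,'I'),('X','Y'):(1j,'Z'),('X','Z'):(-1j,'Y'),
        ('Y','I'):(1,'Y'),('Y','X'):(-1j,'Z'),('Y','Y'):(1,'I'),('Y','Z'):(1j,'X'),
        ('Z','I'):(1,'Z'),('Z','X'):(1j,'Y'),('Z','Y'):(-1j,'X'),('Z','Z'):(1,'I')}

def pauli_mul(p, q):
    """Product of two Pauli strings (tuples over 'I','X','Y','Z'); returns (phase, string)."""
    phase, out = 1, []
    for a, b in zip(p, q):
        ph, c = _MUL[(a, b)]
        phase *= ph
        out.append(c)
    return phase, tuple(out)

def anticommutes(p, q):
    clashes = sum(1 for a, b in zip(p, q) if a != 'I' and b != 'I' and a != b)
    return clashes % 2 == 1

def apply_rotation(op, gate, theta):
    """Heisenberg evolution of op (dict: Pauli string -> real coefficient) under
    U = exp(-i*theta/2 * gate): commuting strings are untouched, anticommuting
    strings split as O -> cos(theta) O + i sin(theta) (gate*O)."""
    c, s = np.cos(theta), np.sin(theta)
    new = {}
    for string, coeff in op.items():
        if not anticommutes(string, gate):
            new[string] = new.get(string, 0.0) + coeff
        else:
            ph, prod = pauli_mul(gate, string)
            new[string] = new.get(string, 0.0) + c * coeff
            new[prod] = new.get(prod, 0.0) + (1j * ph * s * coeff).real
    return new

def truncate_top_k(op, k):
    """Top-K truncation: keep only the k Pauli strings with largest |coefficient|."""
    return dict(sorted(op.items(), key=lambda kv: -abs(kv[1]))[:k])

# Example: evolve Z_0 on a 4-site chain under XX+YY Trotter layers (J_z = 0).
n, theta, K = 4, 0.1, 64
op = {tuple('Z' if i == 0 else 'I' for i in range(n)): 1.0}
for _ in range(20):                    # Trotter steps
    for i in range(n - 1):
        for ax in ('X', 'Y'):
            gate = tuple(ax if j in (i, i + 1) else 'I' for j in range(n))
            op = apply_rotation(op, gate, theta)
    op = truncate_top_k(op, K)
print(max(op.items(), key=lambda kv: abs(kv[1])))
```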
We study the spectral problem in deformed supersymmetric quantum mechanics with a polynomial superpotential by using the exact WKB method and the TBA equations. We apply the ODE/IM correspondence to the Schrödinger equation with an effective potential deformed by integrating out the fermions, which admits a continuous deformation parameter. We find that the TBA equations are described by the $\mathbb{Z}_4$-extended ones. For the cubic superpotential corresponding to the symmetric double-well potential, the TBA system splits into two $D_3$-type TBA equations. We investigate this example in detail based on the TBA equations and their analytic continuation, as well as the massless limit. We find that the energy spectrum obtained from the exact quantization condition is in good agreement with that obtained by diagonalization of the Hamiltonian.
We introduce a novel machine learning based framework for discovering integrable models. Our approach first employs a synchronized ensemble of neural networks to find high-precision numerical solutions to the Yang-Baxter equation within a specified class. Then, using an auxiliary system of algebraic equations, $[Q_2, Q_3] = 0$, and the numerical value of the Hamiltonian obtained via deep learning as a seed, we reconstruct the entire Hamiltonian family, forming an algebraic variety. We illustrate our approach with three- and four-dimensional spin chains of difference form with local interactions. Remarkably, all discovered Hamiltonian families form rational varieties.
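A toy version of the numerical core, with the synchronized neural-network ensemble and the $[Q_2, Q_3] = 0$ reconstruction omitted: parametrize a constant $R$-matrix and minimize the Yang-Baxter residual by gradient descent. The dimension, optimizer, and norm pinning are illustrative choices.

```python
import torch

d = 3                                  # local dimension (three-dimensional spin chain)
I = torch.eye(d)
# Swap operator P on C^d x C^d, used to build R13 from R12 by conjugation.
P = torch.zeros(d * d, d * d)
for a in range(d):
    for b in range(d):
        P[a * d + b, b * d + a] = 1.0

def ybe_residual(R):
    """Frobenius norm of R12 R13 R23 - R23 R13 R12 on C^d x C^d x C^d."""
    R12 = torch.kron(R, I)
    R23 = torch.kron(I, R)
    S23 = torch.kron(I, P)             # swaps tensor factors 2 and 3
    R13 = S23 @ R12 @ S23
    return torch.linalg.norm(R12 @ R13 @ R23 - R23 @ R13 @ R12)

# Crude gradient-based search for an R-matrix with small YBE residual.
R = torch.randn(d * d, d * d, requires_grad=True)
opt = torch.optim.Adam([R], lr=1e-2)
for step in range(5000):
    opt.zero_grad()
    # Norm pinning keeps the search away from the trivial solution R = 0.
    loss = ybe_residual(R) + (torch.linalg.norm(R) - d) ** 2
    loss.backward()
    opt.step()
print(ybe_residual(R).item())
```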
Locally repairable codes (LRCs) play a crucial role in mitigating data loss in large-scale distributed and cloud storage systems. This paper establishes a unified decomposition theorem for general optimal $(r,\delta)$-LRCs. Based on this, we obtain that the local protection codes of general optimal $(r,\delta)$-LRCs are MDS codes with the same minimum Hamming distance $\delta$. We prove that for general optimal $(r,\delta)$-LRCs, their minimum Hamming distance $d$ always satisfies $d \geq \delta$. We fully characterize the optimal quantum $(r,\delta)$-LRCs induced by classical optimal $(r,\delta)$-LRCs that admit a minimal decomposition. We construct three infinite families of optimal quantum $(r,\delta)$-LRCs with flexible parameters.
Loss explosions in training deep neural networks can nullify multi-million dollar training runs. Conventional monitoring metrics like weight and gradient norms are often lagging and ambiguous predictors, as their values vary dramatically across different models and even between layers of the same model, making it difficult to establish a unified standard for detecting impending failure. We introduce Spectral Alignment (SA), a novel, theoretically-grounded metric that monitors the distributional alignment between layer inputs and the principal singular vectors of weight matrices. We show that a collapse in the sign diversity of this alignment is a powerful early predictor of representational collapse and training divergence. Empirical results on language models demonstrate that monitoring the SA distribution provides a significantly earlier and clearer warning of loss explosions than traditional scalar metrics. SA's low computational overhead makes it a practical tool for safeguarding model training.
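A hedged concretization of the monitoring idea above (the paper's precise definition may differ): project a batch of layer inputs onto the top right singular vector of the weight matrix and track how balanced the signs of the projections are; a collapse of this sign diversity is the warning signal.

```python
import torch

def spectral_alignment_stats(W, X):
    """Spectral Alignment-style probe (sketch): project layer inputs X
    (batch x d_in) onto the top right singular vector of the weight matrix W
    (d_out x d_in) and measure the balance of the projection signs
    (0 = collapsed, 0.5 = balanced)."""
    _, _, Vh = torch.linalg.svd(W, full_matrices=False)
    v = Vh[0]                                  # top right singular vector, shape (d_in,)
    align = X @ v                              # per-example alignment scores
    pos = (align > 0).float().mean()
    return align, torch.minimum(pos, 1 - pos)  # sign diversity

# Toy usage: a batch whose projections all share one sign signals impending collapse.
W = torch.randn(256, 128)
v_top = torch.linalg.svd(W, full_matrices=False)[2][0]
X_healthy = torch.randn(512, 128)
X_collapsed = torch.abs(torch.randn(512, 1)) * v_top
for name, X in (("healthy", X_healthy), ("collapsed", X_collapsed)):
    _, diversity = spectral_alignment_stats(W, X)
    print(name, "sign diversity =", round(diversity.item(), 3))
```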
We develop a unified mathematical theory of defect condensations for topological orders in all dimensions based on higher categories, higher algebras, and higher representations. A $k$-codimensional topological defect $A$ in an n+1D (potentially anomalous) topological order $C^{n+1}$ is condensable if it is equipped with the structure of a condensable $E_k$-algebra. Condensing such a defect $A$ amounts to a $k$-step process. In the first step, we condense the defect $A$ along one of its transversal directions, thus obtaining a $(k-1)$-codimensional defect $\Sigma A$, which is naturally equipped with the structure of a condensable $E_{k-1}$-algebra. In the second step, we condense the defect $\Sigma A$ in one of the remaining transversal directions, thus obtaining a $(k-2)$-codimensional defect $\Sigma^2 A$, and so on. In the $k$-th step, we condense the 1-codimensional defect $\Sigma^{k-1} A$ along the only transversal direction, thus defining a phase transition from $C^{n+1}$ to a new n+1D topological order $D^{n+1}$. We give precise mathematical descriptions of each step in the above process, including the precise mathematical characterization of the condensed phase $D^{n+1}$. When $C^{n+1}$ is anomaly-free, the same phase transition can alternatively be defined by replacing the last two steps with a single step of condensing the $E_2$-algebra $\Sigma^{k-2} A$ directly along the remaining two transversal directions. When $n=2$, this modified last step is precisely a usual anyon condensation in a 2+1D topological order. We derive many new mathematical results physically along the way. We also establish the connections among various notions of "gauging" symmetries. We also briefly discuss questions, generalizations, and applications that naturally arise from our theory, including higher Morita theory, a theory of integrals, and the condensations of liquid-like gapless defects in topological orders.
We construct dimer graphs for the relativistic Toda chains associated with the classical untwisted Lie algebras of types A, B, $C_0$, $C_\pi$, D and the twisted types A, D. We show that the Seiberg-Witten curve of the 5d $\mathcal{N}=1$ pure supersymmetric gauge theory with gauge group $G$ is a spectral curve of the relativistic Toda chain of the dual group $G^\vee$.
We develop an effective superpotential formalism for the SU(2)×U(1) invariant sector of $\mathcal{N}=2$ gauged supergravity in five dimensions with a U(1)^3 Fayet-Iliopoulos gauging, and determine the exact superpotential that describes all 1/4 BPS solutions in this sector. This includes the Gutowski-Reall black holes, but also a much broader class of solutions with a squashed $S^3$, magnetic flux and vector multiplet sources, as well as complex Euclidean BPS saddles. Some of these solutions are known only numerically, but the exact superpotential allows us to analytically evaluate the on-shell action, holographic one-point functions and conserved charges of all BPS solutions and to study their thermodynamics. In particular, by examining the supersymmetry Ward identities we show that solutions with supersymmetric vector multiplet sources break supersymmetry spontaneously. We also demonstrate the first law for black holes in the SU(2)×U(1) invariant sector and show that the conserved charges of supersymmetric solutions satisfy the generalized BPS relation derived in arXiv:1703.04299, which includes the supersymmetric Casimir energy as a consequence of the anomalous supersymmetry transformation of the $\mathcal{N}=1$ supercurrent at the boundary. Finally, we show that the effective superpotential provides a unifying entropy extremization principle, reproducing Sen's entropy function for near extremal black holes and the Hosseini-Hristov-Zaffaroni functional for complex Euclidean BPS saddles.
PERSCEN, developed by researchers from Northwestern Polytechnical University, BIMSA, Tsinghua University, and Meituan, introduces a personalized multi-scenario matching framework. It enhances recommendation by explicitly modeling user-specific interaction patterns and scenario-aware preferences, achieving superior performance on public datasets, especially in data-sparse scenarios, while maintaining efficiency for industrial deployment.
We consider the Glauber-Kawasaki dynamics on a $d$-dimensional periodic lattice of size $N$, that is, a stochastic time evolution of particles performing random walks with interaction subject to the exclusion rule (Kawasaki part), in general of non-gradient type, together with the effect of the creation and annihilation of particles (Glauber part), whose rates are set to favor two levels of particle density, called sparse and dense. We then study the limit of our dynamics under the hydrodynamic space-time scaling, that is, $1/N$ in space, a diffusive scaling $N^2$ in time for the Kawasaki part, and another scaling $K=K(N)$, which diverges more slowly, for the Glauber part. In the limit as $N\to\infty$, we show that the particles autonomously undergo phase separation into sparse or dense phases at the microscopic level, and an interface separating the two regions forms at the macroscopic level and evolves under an anisotropic curvature flow. In the present article, we show that the particle density at the macroscopic level is well approximated by a solution of a reaction-diffusion equation with a nonlinear diffusion term of divergence form and a large reaction term. Furthermore, by applying the results of Funaki, Gu and Wang [arXiv:2404.12234] on the convergence rate of the diffusion matrix approximated by local functions, we obtain a quantitative hydrodynamic limit as well as an upper bound on the allowed diverging speed of $K=K(N)$. The above result for the derivation of the interface motion is proved by combining our result with that of a companion paper by Funaki and Park [arXiv:2403.01732], in which we analyzed the asymptotic behavior of the solution of the reaction-diffusion equation obtained in the present article and derived an anisotropic curvature flow in the situation where the macroscopic reaction term determined from the Glauber part is bistable and balanced.
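Schematically (our notation, not the paper's), the limiting equation described above is a divergence-form reaction-diffusion equation
$$\partial_t \rho = \nabla\cdot\big(D(\rho)\,\nabla\rho\big) + K(N)\,f(\rho),$$
where $D(\rho)$ is the nonlinear diffusion coefficient arising from the non-gradient Kawasaki dynamics and $f$ is the bistable, balanced reaction term determined by the Glauber rates, whose stable zeros correspond to the sparse and dense densities.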
$T\bar{T}$-deformed CFTs with positive deformation parameter have been proposed to be holographically dual to Einstein gravity in a glue-on $\mathrm{AdS}_3$ spacetime. The latter is constructed from $\mathrm{AdS}_3$ by gluing a patch of an auxiliary $\mathrm{AdS}_3^*$ spacetime to its asymptotic boundary. In this work, we propose a glue-on version of the Ryu-Takayanagi formula, which is given by the signed area of an extremal surface. The extremal surface is anchored at the endpoints of an interval on a cutoff surface in the glue-on geometry. It consists of an RT surface lying in the $\mathrm{AdS}_3$ part of the spacetime and its extension to the $\mathrm{AdS}_3^*$ region. The signed area is the length of the RT surface minus the length of the segments in $\mathrm{AdS}_3^*$. We find that the Ryu-Takayanagi formula with the signed area reproduces the entanglement entropy of a half interval for $T\bar{T}$-deformed CFTs on the sphere. We then study the properties of extremal surfaces on various glue-on geometries, including Poincaré $\mathrm{AdS}_3$, global $\mathrm{AdS}_3$, and the BTZ black hole. When anchored on multiple intervals at the boundary, the signed area of the minimal surfaces undergoes phase transitions with novel properties. In all of these examples, we find that the glue-on extremal surfaces exhibit a minimum length related to the deformation parameter of $T\bar{T}$-deformed CFTs.
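In formula form (our schematic rendering of the proposal summarized above, using the standard Ryu-Takayanagi normalization), the entanglement entropy of a boundary interval is computed from the signed area of the glue-on extremal surface,
$$S = \frac{\mathcal{A}_{\rm signed}}{4G_N} = \frac{L_{\mathrm{AdS}_3} - L_{\mathrm{AdS}_3^*}}{4G_N},$$
where $L_{\mathrm{AdS}_3}$ is the length of the RT segments in the $\mathrm{AdS}_3$ region and $L_{\mathrm{AdS}_3^*}$ is the length of the extension into the glued-on $\mathrm{AdS}_3^*$ region.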