The NSF AI Institute for Artificial Intelligence and Fundamental Interactions
This research investigates methods for creating specialized AI systems from larger, general-purpose models, demonstrating that pruning combined with regularization can efficiently extract narrow skills and robustly unlearn unrelated capabilities. The work reveals that broad training data acts as a crucial curriculum for learning hierarchical skills, and that skills are often represented nonlocally and in an entangled fashion across the network.
Gamma-ray bursts are the most luminous electromagnetic events in the universe. Their prompt gamma-ray emission typically lasts between a fraction of a second and several minutes. A rare subset of these events has durations in excess of a thousand seconds, referred to as ultra-long gamma-ray bursts. Here, we report the discovery of GRB 250702B, the longest gamma-ray burst ever seen, with a gamma-ray duration of ~25,000 s, and characterize this event using data from four instruments in the InterPlanetary Network and the Monitor of All-sky X-ray Image (MAXI). We find a hard spectrum, subsecond variability, and a high total energy, which are only known to arise from ultrarelativistic jets powered by a rapidly spinning stellar-mass central engine. These properties, together with the extreme duration, are incompatible with all confirmed gamma-ray burst progenitors and nearly all models in the literature. This burst is naturally explained by the helium merger model, in which a field binary ends when a black hole falls into a stripped star and proceeds to consume and explode it from within. Under this paradigm, GRB 250702B adds to the growing evidence that helium stars expand and that some ultra-long GRBs share evolutionary pathways with collapsars, stellar-mass gravitational-wave sources, and potentially rare types of supernovae.
This work establishes a unified theoretical framework for quantum information processing by extending the autonomous Hamiltonian approach to quantum systems, integrating quantum thermodynamics, Landauer's principle, and quantum speed limits. It derives conditions for catalytic work sources, formulates a generalized quantum Landauer bound that explicitly accounts for initial correlations, and introduces a Quantum Thermodynamic Speed Limit (QTSL) to quantify ultimate processing rates.
We study the impact of the warm dark matter (WDM) particle mass on galaxy properties using 1,024 state-of-the-art cosmological hydrodynamical simulations from the DREAMS project. We begin by using a Multilayer Perceptron (MLP) coupled with a normalizing flow to explore global statistical descriptors of galaxy populations, such as the mean, standard deviation, and histograms of 14 galaxy properties. We find that subhalo gas mass is the most informative feature for constraining the WDM mass, achieving a coefficient of determination of $R^2 = 0.9$. Next, we employ symbolic regression to extract simple, interpretable relations between these descriptors and the WDM particle mass. Finally, we adopt a more localized approach by selecting individual dark matter halos and using a Graph Neural Network (GNN) with a normalizing flow to infer the WDM mass, incorporating subhalo properties as node features and global simulation statistics as graph-level features. The GNN approach yields only a modest improvement over MLP models based solely on global features, indicating that most of the predictive power resides in the global descriptors, with only marginal gains from halo-level information.
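To make the global-descriptor approach concrete, here is a minimal sketch, assuming PyTorch and a single affine transform, of an MLP conditioning a normalizing flow on population-level statistics; the actual DREAMS pipeline uses more expressive flow architectures, and all names below are illustrative.

```python
# Hedged sketch (not the DREAMS code): an MLP maps global galaxy-population
# statistics to the shift and log-scale of a conditional affine flow over the
# WDM particle mass. A single affine layer is a heteroscedastic Gaussian;
# stacking more expressive transforms yields a full normalizing flow.
import math
import torch
import torch.nn as nn

class ConditionalAffineFlow(nn.Module):
    def __init__(self, n_features: int, hidden: int = 64):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(n_features, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 2),  # shift and log-scale of the affine map
        )

    def log_prob(self, mass: torch.Tensor, features: torch.Tensor) -> torch.Tensor:
        shift, log_scale = self.mlp(features).chunk(2, dim=-1)
        z = (mass - shift) * torch.exp(-log_scale)     # invert the affine map
        log_base = -0.5 * z**2 - 0.5 * math.log(2 * math.pi)
        return (log_base - log_scale).squeeze(-1)      # change-of-variables term

# Training would minimize the negative log-likelihood over simulated catalogs:
# loss = -flow.log_prob(wdm_mass, global_stats).mean()
```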
In this work, we present and investigate the novel blind inverse problem of position-blind ptychography, i.e., ptychographic phase retrieval without any knowledge of the scan positions, which must then be recovered jointly with the image. The motivation for this problem comes from single-particle diffractive X-ray imaging, where particles in random orientations are illuminated and a set of diffraction patterns is collected. If one uses a highly focused X-ray beam, the measurements also become sensitive to the beam positions relative to each particle and are therefore ptychographic, but these positions are likewise unknown. We investigate the viability of image reconstruction in a simulated, simplified 2-D variant of this difficult problem, using variational inference with modern data-driven image priors in the form of score-based diffusion models. We find that, with the right illumination structure and a strong prior, one can achieve reliable image reconstructions even under measurement noise in all but the most difficult imaging scenario evaluated.
This research introduces a causal foundation model designed to disentangle true physical phenomena from instrumental distortions within structured time series data. The approach demonstrates that explicitly separating these factors leads to more robust, interpretable, and data-efficient representations, particularly for few-shot learning in scientific applications.
The Vera C. Rubin Observatory will soon survey the southern sky, delivering a depth and sky coverage that are unprecedented in time-domain astronomy. As part of commissioning, Data Preview 1 (DP1) has been released. It comprises an LSSTComCam observing campaign between November and December 2024 with multi-band imaging of seven fields, each covering roughly 0.4 square degrees, providing a first glimpse into the data products that will become available once the Legacy Survey of Space and Time begins. In this work, we search three fields for extragalactic transients. We identify eight new likely supernovae and three known ones from a sample of 369,644 difference image analysis objects. Photometric classification using Superphot+ assigns sub-classes with >95% confidence to only one SN Ia and one SN II in this sample. Our findings are in agreement with the prediction of $15\pm4$ supernovae from simulations using simsurvey. The supernova detection rate in the data is possibly affected by the lack of suitable templates. Nevertheless, this work demonstrates the quality of the data products delivered in DP1 and indicates that the Rubin Observatory's Legacy Survey of Space and Time (LSST) is well placed to fulfill its discovery potential in time-domain astronomy.
We present a formula for the universal anomalous scaling of the multipole moments of a generic gravitating source in classical general relativity. We derive this formula in two independent ways using effective field theory methods. First, we use the absorption of low frequency gravitational waves by a black hole to identify the total multipole scaling dimension as the renormalized angular momentum of black hole perturbation theory. More generally, we show that the anomalous dimension is determined by phase shifts of gravitational waves elastically scattering off generic source multipole moments, which reproduces the renormalized angular momentum in the particular case of black holes. The effective field theory approach thus clarifies the role of the renormalized angular momentum in the multipole expansion. The universality of the point-particle effective description of compact gravitating systems further allows us to extract the universal part of the anomalous dimension, which is the same for any object, including black holes, neutron stars, and binary systems. As an application, we propose a novel resummation of the universal short-distance logarithms ("tails") in the gravitational waveform of binary systems, which may improve the modeling of signals from current and future gravitational wave experiments.
We present the first independent re-analysis of the galaxy clustering data from DESI Data Release 1, utilizing an effective field theory (EFT)-based full-shape model. We analyze the power spectra and bispectra of the public catalogs using a custom-built pipeline based on window-deconvolved quasi-optimal estimators, carefully accounting for all relevant systematic effects. Compared to the official collaboration analysis, we add the galaxy power spectrum hexadecapole and the bispectrum monopole, and also introduce a novel stochastic estimator for fiber collisions, which facilitates robust bispectrum analyses. As a first application, we perform an EFT-based full-shape analysis of the DESI power spectra and bispectra in the context of the standard cosmological model, $\Lambda$CDM. Using external priors on the physical baryon density and the primordial power spectrum tilt, we constrain the matter density fraction to $\Omega_m = 0.284 \pm 0.011$, the Hubble constant to $h = 0.707 \pm 0.011$, and the mass fluctuation amplitude to $\sigma_8 = 0.811 \pm 0.030$. The bispectrum has a noticeable effect on parameter estimation: it sharpens the constraints on $\sigma_8$ and $\Omega_m$ by $\approx 10\%$ and shifts $\Omega_m$ by $\approx 1\sigma$ towards the Planck $\Lambda$CDM value. Combining our full-shape likelihood with the official DESI DR2 BAO measurements, cosmological parameters shift further towards the Planck values, with $\Omega_m = 0.296 \pm 0.007$, $h = 0.688 \pm 0.006$, $\sigma_8 = 0.818 \pm 0.029$ (with tighter constraints obtained in joint analyses). Finally, the galaxy bispectrum data dramatically improves measurements of quadratic bias parameters, which are consistent with predictions from halo occupation distribution models. Our work highlights the importance of higher-order statistics and sets the stage for upcoming full-shape analyses of non-minimal cosmological models.
Direct detection experiments require information about the local dark matter (DM) speed distribution to produce constraints on dark matter candidates, or to infer their properties in the event of a discovery. In this paper, we analyze how the uncertainty in the DM speed distribution near the Sun is affected by baryonic feedback, halo-to-halo variance, and halo mass. To do so, we harness the statistical power of the new DREAMS Cold Dark Matter simulation suite, which comprises 1024 zoom-in Milky Way-mass halos with varied initial conditions as well as cosmological and astrophysical parameters. Applying a normalizing flows emulator to these simulations, we find that the uncertainty in the local DM speed distribution is dominated by halo-to-halo variance and, to a lesser extent, uncertainty in the host halo mass. Uncertainties in supernova and black hole feedback (from the IllustrisTNG model in this case) are negligible in comparison. Using the DREAMS suite, we present a state-of-the-art prediction for the DM speed distribution in the Milky Way. Although the Standard Halo Model is contained within the uncertainty of this prediction, individual galaxies may have distributions that differ from it. Lastly, we apply our DREAMS results to the XENON1T experiment and demonstrate that the astrophysical uncertainties are comparable to the experimental ones, solidifying previous results in the literature obtained with a smaller sample of simulated Milky Way-mass halos.
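For reference, the Standard Halo Model mentioned above is a truncated Maxwell-Boltzmann speed distribution; the sketch below (with commonly quoted, not fitted, parameter values) shows the functional form against which the DREAMS prediction is compared.

```python
# Standard Halo Model speed distribution: f(v) ~ v^2 exp(-v^2/v0^2) for
# v < v_esc, zero beyond the escape speed. Defaults are the commonly quoted
# v0 ~ 220 km/s and v_esc ~ 544 km/s, used here for illustration only.
import numpy as np

def shm_speed_pdf(v, v0=220.0, v_esc=544.0):
    f = v**2 * np.exp(-((v / v0) ** 2))
    f = np.where(v < v_esc, f, 0.0)
    return f / (np.sum(f) * (v[1] - v[0]))  # normalize on a uniform grid

v = np.linspace(0.0, 800.0, 2000)            # speeds in km/s
pdf = shm_speed_pdf(v)                       # integrates to ~1 on the grid
```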
We use the embedding formalism to construct conformal fields in D dimensions by restricting Lorentz-invariant ensembles of homogeneous neural networks in (D+2) dimensions to the projective null cone. Conformal correlators may be computed using the parameter space description of the neural network. Exact four-point correlators are computed in a number of examples, and we perform a 4D conformal block decomposition that elucidates the spectrum. In some examples the analysis is facilitated by recent approaches to Feynman integrals. Generalized free CFTs are constructed using the infinite-width Gaussian process limit of the neural network, enabling a realization of the free boson. The extension to deep networks constructs conformal fields at each subsequent layer, with recursion relations relating their conformal dimensions and four-point functions. Numerical approaches are discussed.
We develop a pairing-based graph neural network for simulating quantum many-body systems. Our architecture augments a BCS-type geminal wavefunction with a generalized pair amplitude parameterized by a graph neural network. Variational Monte Carlo with our neural network simultaneously provides an accurate, flexible, and scalable method for simulating many-electron systems. We apply this method to two-dimensional semiconductor electron-hole bilayers and obtain accurate results on a variety of interaction-induced phases, including the exciton Bose-Einstein condensate, electron-hole superconductor, and bilayer Wigner crystal. Our study demonstrates the potential of physically-motivated neural network wavefunctions for quantum materials simulations.
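A minimal sketch of the pairing ansatz described above, assuming PyTorch and a plain MLP as a stand-in for the paper's graph-neural-network pair amplitude: the geminal matrix of electron-hole pair amplitudes is antisymmetrized via a determinant, and its log-magnitude is what a variational Monte Carlo sampler would use.

```python
# Hedged sketch of a BCS-type geminal wavefunction: Psi = det(Phi), with
# Phi[i, j] = phi(r_e[i], r_h[j]) a learned pair amplitude. The paper
# parameterizes phi with a graph neural network; a small MLP stands in here.
import torch
import torch.nn as nn

class Geminal(nn.Module):
    def __init__(self, hidden: int = 32):
        super().__init__()
        self.phi = nn.Sequential(
            nn.Linear(4, hidden), nn.Tanh(), nn.Linear(hidden, 1),
        )

    def log_psi(self, r_e: torch.Tensor, r_h: torch.Tensor) -> torch.Tensor:
        # r_e, r_h: (n, 2) coordinates of n electrons and n holes in 2-D
        n = r_e.shape[0]
        pairs = torch.cat(
            [r_e[:, None, :].expand(n, n, 2), r_h[None, :, :].expand(n, n, 2)],
            dim=-1,
        )
        Phi = self.phi(pairs).squeeze(-1)         # (n, n) pair amplitudes
        sign, logdet = torch.linalg.slogdet(Phi)  # determinant antisymmetrizes
        return logdet                             # log|Psi| for VMC sampling
```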
We present JWST/NIRSpec and NIRCam observations of the first optically selected off-nuclear tidal disruption event (TDE), AT 2024tvd, along with Keck/KCWI integral field unit spectroscopy. The spectra show broad H and He emission lines that are characteristic of a TDE. Stellar kinematics show smooth host-galaxy morphology and ordered bulge rotation, with no evidence of disturbances in velocity, dispersion, age, or metallicity space. We construct the first quasi-simultaneous spectral energy distribution (SED) from X-rays to infrared for a TDE and decompose it into three components: the TDE accretion flow, an unresolved nuclear star cluster (NSC), and heated dust emission. The accretion component implies a black hole mass of $\log(M_\bullet/M_\odot) = 5.50 \pm 0.04$, an instantaneous super-Eddington accretion rate of $\log(\dot{M}/M_\odot\,{\rm yr}^{-1}) = -1.22 \pm 0.04$, and an outer disk photosphere radius of $\log(r_{\rm out}/r_g) = 3.8 \pm 0.1$. The dust emission is well described by a blackbody with $T_{\rm dust} = 873 \pm 15$ K and peak luminosity $\log(L_{\rm dust}/{\rm erg\,s^{-1}}) = 40.80 \pm 0.01$, consistent with a dust echo near the sublimation radius. The SED is best fit when including additional stellar emission above the galaxy background at the TDE location, corresponding to $\log(M_\star/M_\odot) = 7.97^{+0.16}_{-0.26}$, which we interpret as a massive NSC or an ultra-compact dwarf galaxy. These results support a minor-merger origin for the massive black hole (MBH) responsible for the TDE over scenarios involving gravitational recoil or dynamical ejection from the nucleus.
A team of researchers from MIT, LBNL, KEK, and IAS developed an analytical, perturbative forward model based on Effective Field Theory (EFT) to predict the Lyman-alpha (Ly-alpha) forest at the field level. This model accurately reproduces the Ly-alpha forest's flux distribution, power spectrum, and cross-correlations with dark matter halos with percent-level agreement down to scales of a few megaparsecs when compared against hydrodynamic simulations.
Recent results from DESI BAO analyses suggest that dark energy may not be a cosmological constant but may instead be dynamical. Furthermore, the data suggest that the equation of state may have been in the phantom regime in the distant past, recently undergoing a phantom crossing. In this work, we investigate whether this preference can be realized within a kinetically mixed axion-dilaton (KMIX) quintessence model, a string-motivated system in which an axion-like field couples exponentially to a dilaton-like (moduli) field. Crucially, KMIX can appear phantom in a standard Chevallier-Polarski-Linder (CPL) based analysis. To confront the model with data, we develop a fast pipeline based on normalizing flows that (i) learns a theory-informed prior on $(w_0, w_a)$ from KMIX realizations and (ii) provides an inverse mapping from CPL parameters back to the physical KMIX parameters. By importance-sampling pre-computed CPL chains using this framework, we effectively transform generic phenomenological constraints into direct, computationally efficient constraints on the underlying KMIX theory, avoiding the prohibitive cost of full parameter-space exploration. Applied to Planck+DESI DR2 BAO measurements, our framework finds support for KMIX at $2.5\sigma$, compared to the base CPL fit at $3.1\sigma$, demonstrating that KMIX may account for the DESI preference without invoking true phantom behavior. When additionally including Type Ia supernovae data, we find that the preference remains above $3\sigma$ for Union3 and DES Y5, but drops to $2.1\sigma$ with Pantheon+. The latter, combined with the DESI full-shape power spectrum and bispectrum data, further reduces the preference to $1.7\sigma$. Ultimately, should the DESI deviation persist with future data, KMIX may offer a theoretically well-motivated explanation for the phantom-like signatures inferred from phenomenological fits.
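The importance-sampling step admits a compact sketch. Assuming pre-computed CPL chains and a flow that evaluates the learned KMIX prior density, the reweighting is just a ratio of log-densities (all names below are placeholders, not the paper's code):

```python
# Hedged sketch: reweight CPL posterior samples by the ratio of the
# flow-learned, theory-informed KMIX prior to the prior used in the original
# chains, turning phenomenological (w0, wa) constraints into KMIX constraints.
import numpy as np

def importance_weights(w0wa, log_prior_kmix, log_prior_cpl):
    """w0wa: (n, 2) samples; the log_prior_* arguments are callables
    returning (n,) arrays of log prior densities."""
    log_w = log_prior_kmix(w0wa) - log_prior_cpl(w0wa)
    w = np.exp(log_w - log_w.max())   # subtract max for numerical stability
    return w / w.sum()                # normalized weights for the chain
```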
In this paper, we present a method of embedding physics data manifolds with metric structure into lower dimensional spaces with simpler metrics, such as Euclidean and Hyperbolic spaces. We then demonstrate that it can be a powerful step in the data analysis pipeline for many applications. Using progressively more realistic simulated collisions at the Large Hadron Collider, we show that this embedding approach learns the underlying latent structure. With the notion of volume in Euclidean spaces, we provide for the first time a viable solution to quantifying the true search capability of model agnostic search algorithms in collider physics (i.e. anomaly detection). Finally, we discuss how the ideas presented in this paper can be employed to solve many practical challenges that require the extraction of physically meaningful representations from information in complex high dimensional datasets.
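As an illustration of the embedding step, minimizing the stress between the data-manifold metric and the embedded Euclidean metric can be written in a few lines; this is a generic metric-embedding sketch under stated assumptions, not the paper's exact objective.

```python
# Hedged sketch: embed n points with known pairwise distances D into a
# low-dimensional Euclidean space by gradient descent on the metric stress.
import torch

def embed(D: torch.Tensor, dim: int = 2, steps: int = 2000, lr: float = 0.05):
    X = torch.randn(D.shape[0], dim, requires_grad=True)
    opt = torch.optim.Adam([X], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
        loss = ((torch.sqrt(d2 + 1e-9) - D) ** 2).mean()  # stress objective
        loss.backward()
        opt.step()
    return X.detach()  # embedded coordinates; hyperbolic targets are analogous
```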
In mathematics or theoretical physics one is often interested in obtaining an exact analytic description of some data which can be produced, in principle, to arbitrary accuracy. For example, one might like to know the exact analytical form of a definite integral. Such problems are not well-suited to numerical symbolic regression, since typical numerical methods lead only to approximations. However, if one has some sense of the function space in which the analytic result should lie, it is possible to deduce the exact answer by judiciously sampling the data at a sufficient number of points with sufficient precision. We demonstrate how this can be done for the computation of Feynman integrals. We show that by combining high-precision numerical integration with analytic knowledge of the function space one can often deduce the exact answer using lattice reduction. A number of examples are given as well as an exploration of the trade-offs between number of datapoints, number of functional predicates, precision of the data, and compute. This method provides a bottom-up approach that neatly complements the top-down Landau-bootstrap approach of trying to constrain the exact answer using the analytic structure alone. Although we focus on the application to Feynman integrals, the techniques presented here are more general and could apply to a wide range of problems where an exact answer is needed and the function space is sufficiently well understood.
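The lattice-reduction step can be illustrated with a toy example: given a high-precision numerical value and a guess for the function space, an integer-relation algorithm such as PSLQ recovers the exact rational combination. The sketch below uses mpmath to rediscover zeta(2) = pi^2/6; it illustrates the technique, not the paper's Feynman-integral pipeline.

```python
# Integer-relation finding with PSLQ: given the "data" zeta(2) to 50 digits
# and the candidate basis {zeta(2), pi^2}, PSLQ returns integer coefficients
# c with c[0]*zeta(2) + c[1]*pi^2 = 0, i.e. the exact relation zeta(2) = pi^2/6.
from mpmath import mp, pslq, zeta, pi

mp.dps = 50                   # working precision in decimal digits
basis = [zeta(2), pi**2]      # numerical value plus assumed function space
relation = pslq(basis)        # -> [-6, 1] (up to overall sign)
print(relation)
```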
The Laser Interferometer Gravitational-Wave Observatory (LIGO) and Virgo Interferometer collaborations have now detected all three classes of compact binary mergers: binary black hole (BBH), binary neutron star (BNS), and neutron star-black hole (NSBH). For coalescences involving neutron stars, the simultaneous observation of the gravitational and electromagnetic radiation produced by an event has broad potential to enhance our understanding of these events and to probe the equation of state (EOS) of dense matter. However, electromagnetic follow-up to gravitational wave (GW) events requires rapid real-time detection and classification of GW signals, and conventional detection approaches are computationally prohibitive for the anticipated detection rates of next-generation GW detectors. In this work, we present the first deep-learning-based results of classification of GW signals from NSBH mergers in real LIGO data. We show for the first time that a deep neural network can successfully distinguish all three classes of compact binary mergers and separate them from detector noise. Specifically, we train a convolutional neural network (CNN) on $\sim$500,000 data samples of real LIGO noise with injected BBH, BNS, and NSBH GW signals, and we show that our network has high sensitivity and accuracy. Most importantly, we successfully recover the two confirmed NSBH events to date (GW200105 and GW200115) and the two confirmed BNS mergers to date (GW170817 and GW190425), together with $\approx 90\%$ of all BBH candidate events from the third Gravitational Wave Transient Catalog, GWTC-3. These results are an important step towards low-latency real-time GW detection, enabling multi-messenger astronomy.
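A minimal sketch of the classification setup, assuming PyTorch; the layer sizes are illustrative, not the paper's architecture. A 1-D CNN maps a whitened strain segment to one of four classes: BBH, BNS, NSBH, or detector noise.

```python
# Hedged sketch: a small 1-D convolutional classifier for strain time series.
import torch
import torch.nn as nn

class GWClassifier(nn.Module):
    def __init__(self, n_classes: int = 4):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=16, stride=2), nn.ReLU(),
            nn.Conv1d(16, 32, kernel_size=8, stride=2), nn.ReLU(),
            nn.Conv1d(32, 64, kernel_size=8, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),
        )
        self.head = nn.Linear(64, n_classes)

    def forward(self, strain: torch.Tensor) -> torch.Tensor:
        # strain: (batch, 1, n_samples), e.g. one second sampled at 4096 Hz
        return self.head(self.features(strain).squeeze(-1))

logits = GWClassifier()(torch.randn(2, 1, 4096))  # (2, 4) class scores
```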
The monotonic dependence of the outputs of a neural network on some of its inputs is a crucial inductive bias in many scenarios where domain knowledge dictates such behavior. This is especially important for interpretability and fairness considerations. In a broader context, scenarios in which monotonicity is important can be found in finance, medicine, physics, and other disciplines. It is thus desirable to build neural network architectures that provably implement this inductive bias. In this work, we propose a weight-constrained architecture with a single residual connection to achieve exact monotonic dependence in any subset of the inputs. The weight constraint scheme directly controls the Lipschitz constant of the neural network and thus provides the additional benefit of robustness. Compared to existing techniques for enforcing monotonicity, our method is simpler to implement, rests on simpler theoretical foundations, has negligible computational overhead, is guaranteed to produce monotonic dependence, and is highly expressive. We show how the algorithm is used to train powerful, robust, and interpretable discriminators that achieve competitive performance compared to current state-of-the-art methods across various benchmarks, from social applications to the classification of the decays of subatomic particles produced at the CERN Large Hadron Collider.
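The construction admits a short sketch. Assuming PyTorch, the idea is to bound each layer's operator norm so the network g is 1-Lipschitz, then add a residual term lam * sum(x_i) over the monotonic inputs with lam at least the Lipschitz bound, making the output provably non-decreasing in those inputs. The norm choice and activation below are assumptions, not the paper's exact implementation.

```python
# Hedged sketch of a weight-constrained monotonic network.
import torch
import torch.nn as nn

class LipschitzLinear(nn.Linear):
    def forward(self, x):
        # Rescale so the L-infinity operator norm (max absolute row sum) <= 1.
        scale = self.weight.abs().sum(dim=1).max().clamp(min=1.0)
        return nn.functional.linear(x, self.weight / scale, self.bias)

class MonotonicNet(nn.Module):
    def __init__(self, n_in, monotone_idx, lam=1.0, hidden=64):
        super().__init__()
        self.g = nn.Sequential(
            LipschitzLinear(n_in, hidden), nn.Tanh(),  # tanh is 1-Lipschitz
            LipschitzLinear(hidden, 1),
        )
        self.monotone_idx = monotone_idx
        self.lam = lam  # must be >= the Lipschitz bound of g (here, 1)

    def forward(self, x):
        # |dg/dx_i| <= 1, so d(output)/dx_i >= lam - 1 >= 0 on monotone inputs.
        residual = self.lam * x[:, self.monotone_idx].sum(dim=1, keepdim=True)
        return self.g(x) + residual
```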
We present an efficient and accurate pipeline for the analysis of the redshift-space galaxy bispectrum multipoles at one-loop order in effective field theory (EFT). We provide a systematic theory derivation based on power counting, which features the first comprehensive treatment of stochastic EFT contributions -- these are found to significantly improve the match to data. Our computational pipeline utilizes the COBRA technique that expands the linear matter power spectrum over a basis of principal components based on a singular value decomposition, allowing the cosmology dependence to be captured to sub-permille accuracy with just eight templates. This transforms the problem of computing the one-loop EFT bispectrum to a simple tensor multiplication, reducing the computation time to around a second per cosmology with negligible loss of accuracy. Using these tools, we study the cosmological information in the bispectrum by analyzing PTChallenge simulations, whose gigantic volume provides the most powerful test of the one-loop EFT bispectrum so far. We find that the one-loop prediction provides an excellent match to the bispectrum data up to $k_{\rm max} = 0.15\,h\,{\rm Mpc}^{-1}$, as evidenced by the precise recovery of the dark matter density $\omega_{\rm cdm}$, the Hubble constant $H_0$, the mass fluctuation amplitude $\sigma_8$, and the amplitude of equilateral primordial non-Gaussianity (PNG) $f_{\rm NL}^{\rm equil}$. Combined with the power spectrum, the COBRA-based one-loop bispectrum multipoles yield tighter constraints than the tree-level bispectrum monopole, with the posteriors on $\omega_{\rm cdm}$, $H_0$, and $\sigma_8$ shrinking by 41%, 25%, and 19%, respectively. This suggests that the COBRA-based bispectrum analysis will be an important tool in the interpretation of data from ongoing redshift surveys such as DESI and Euclid.
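The template decomposition at the heart of COBRA can be sketched schematically: precompute linear power spectra over a grid of cosmologies, take the SVD of the resulting matrix, and keep the leading principal components, so that any new cosmology is represented by a handful of coefficients. The snippet below uses random stand-in spectra and is illustrative, not the released pipeline.

```python
# Hedged sketch of an SVD template basis for linear power spectra.
import numpy as np

rng = np.random.default_rng(0)
P_lin = np.exp(rng.normal(size=(200, 128)))  # stand-in (n_cosmo, n_k) spectra
mean = P_lin.mean(axis=0)

U, S, Vt = np.linalg.svd(P_lin - mean, full_matrices=False)
templates = Vt[:8]                           # eight principal components

def compress(P_new):
    """Project a new linear spectrum onto the fixed template basis."""
    return (P_new - mean) @ templates.T      # just eight coefficients

def decompress(coeffs):
    return coeffs @ templates + mean         # sub-permille in the real setup
```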