University of Kansas
Human communication involves a complex interplay of verbal and nonverbal signals, essential for conveying meaning and achieving interpersonal goals. To develop socially intelligent AI technologies, it is crucial to develop models that can both comprehend and generate dyadic behavioral dynamics. To this end, we introduce the Seamless Interaction Dataset, a large-scale collection of over 4,000 hours of face-to-face interaction footage from over 4,000 participants in diverse contexts. This dataset enables the development of AI technologies that understand dyadic embodied dynamics, unlocking breakthroughs in virtual agents, telepresence experiences, and multimodal content analysis tools. We also develop a suite of models that utilize the dataset to generate dyadic motion gestures and facial expressions aligned with human speech. These models can take as input both the speech and visual behavior of their interlocutors. We present a variant with speech from an LLM and integrations with 2D and 3D rendering methods, bringing us closer to interactive virtual agents. Additionally, we describe controllable variants of our motion models that can adapt emotional responses and expressivity levels and generate more semantically relevant gestures. Finally, we discuss methods for assessing the quality of these dyadic motion models, which demonstrate the potential for more intuitive and responsive human-AI interactions.
An open-source framework, `exojax`, introduces an auto-differentiable spectral model for high-dispersion characterization of exoplanets and brown dwarfs, enabling efficient fully Bayesian inference of atmospheric parameters. This framework accurately modeled and provided robust parameter constraints for the brown dwarf Luhman 16 A, yielding a refined temperature at 1 bar of 1295 (+35, -32) K and a C/O ratio of 0.62 (+0.03, -0.04).
We report JWST NIRCAM and MIRI observations of Sgr B2, the most active site of star formation in the Galaxy. These observations, using 14 filters spanning 1.5 to 25 microns, have revealed a multilayered and highly structured cloud that contains both a revealed, low-extinction and hidden, high-extinction population of massive stars. JWST has detected new candidate HII regions around massive stars previously missed by radio telescopes. MIRI has detected radiation escaping from the forming massive cluster Sgr B2 N along its outflow cavities, demonstrating that infrared radiation finds geometric escape routes even in the densest, most heavily embedded regions in the universe. JWST further highlights the gas asymmetry in the cloud, showing a sharp, straight cutoff along the eastern cloud edge. Despite the great sensitivity of these observations, no extended population of YSOs has been detected, placing a limit on their minimum extinction; this result hints that star formation has only just begun in the cloud. Together, these results suggest that, despite already holding the crown for most actively star-forming cloud, we have underestimated the total star formation in Sgr B2. JWST unveils previously hidden massive stars and ionized structures, offering a transformative view of how stars form under some of the most extreme Galactic conditions.
A comprehensive survey of Graph Contrastive Learning (GCL) offers a structured taxonomy covering augmentation strategies, contrastive modes, and optimization objectives. It consolidates research on GCL's role in data-efficient learning and its diverse real-world applications, while outlining future challenges.
Young planets with mass measurements are particularly valuable in studying atmospheric mass-loss processes, but these planets are rare and their masses difficult to measure due to stellar activity. We report the discovery of a planetary system around TOI-6109, a young, 75 Myr-old Sun-like star in the Alpha Persei cluster. It hosts at least two transiting Neptune-like planets. Using three TESS sectors, 30 CHEOPS orbits, and photometric follow-up observations from the ground, we confirm the signals of the two planets. TOI-6109 b has an orbital period of $P=5.6904^{+0.0004}_{-0.0004}$ days and a radius of $R=4.87^{+0.16}_{-0.12}$ $R_\oplus$. The outer planet, TOI-6109 c, has an orbital period of $P=8.5388^{+0.0006}_{-0.0005}$ days and a radius of $R=4.83^{+0.07}_{-0.06}$ $R_\oplus$. These planets orbit just outside a 3:2 mean motion resonance. The near-resonant configuration presents the opportunity to measure the planets' masses via TTV measurements and to bypass difficult RV measurements. Measuring the masses of the planets in this system will allow us to test theoretical models of atmospheric mass loss.
Root cause analysis (RCA) is crucial for enhancing the reliability and performance of complex systems. However, progress in this field has been hindered by the lack of large-scale, open-source datasets tailored for RCA. To bridge this gap, we introduce LEMMA-RCA, a large dataset designed for diverse RCA tasks across multiple domains and modalities. LEMMA-RCA features various real-world fault scenarios from IT and OT operation systems, encompassing microservices, water distribution, and water treatment systems, with hundreds of system entities involved. We evaluate the quality of LEMMA-RCA by testing the performance of eight baseline methods on this dataset under various settings, including offline and online modes as well as single and multiple modalities. Our experimental results demonstrate the high quality of LEMMA-RCA. The dataset is publicly available at this https URL
Researchers from Cornell University, University of Massachusetts Amherst, University of Kansas, and Samsung Electronics developed the Intelligent Knowledge Store (IKS), a CXL-based hardware accelerator designed to overcome the retrieval bottleneck in Retrieval-Augmented Generation (RAG) systems. IKS accelerates exact nearest neighbor search (ENNS), achieving 13.4–27.9 times faster search over a 512GB vector database and improving end-to-end RAG inference time by 1.7–26.3 times.
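As a point of reference for what exact nearest neighbor search computes, here is a minimal software sketch in Python. It is not the IKS design or its interface: the function name `exact_top_k`, the cosine-similarity scoring, and the random toy database are all assumptions made for illustration. The sketch scores every database vector against every query (no approximation, no index), which is the exhaustive scan that makes ENNS memory-bandwidth-bound on large vector stores.

```python
import numpy as np

def exact_top_k(queries, database, k):
    """Brute-force (exact) top-k search by inner product.

    Hypothetical helper for illustration; every database vector is scored.
    """
    scores = queries @ database.T  # shape: (n_queries, n_vectors)
    # argpartition selects the k best-scoring columns per row without a full sort
    top = np.argpartition(-scores, k - 1, axis=1)[:, :k]
    # order those k candidates from best to worst
    order = np.argsort(-np.take_along_axis(scores, top, axis=1), axis=1)
    return np.take_along_axis(top, order, axis=1)

# Toy data (assumption): 1000 random unit vectors of dimension 64
rng = np.random.default_rng(0)
database = rng.standard_normal((1000, 64)).astype(np.float32)
database /= np.linalg.norm(database, axis=1, keepdims=True)

# Query with the first three database vectors; each should retrieve itself first
queries = database[:3]
top = exact_top_k(queries, database, k=5)
print(top[:, 0])
```

Because the vectors are unit-normalized, the inner product equals cosine similarity, so each query's best match is the vector it was copied from.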
The IceCube Collaboration presents evidence for neutrino emission from a population of X-ray bright Active Galactic Nuclei, identifying a collective excess from 11 such sources with a 3.3σ global significance. The study also strengthens the detection of NGC 1068 as a persistent neutrino emitter with a refined, softer spectrum.
JWST has revealed an abundance of low-luminosity active galactic nuclei (AGN) at high redshifts ($z > 3$), pushing the limits of black hole (BH) science in the early Universe. Results have claimed that these BHs are significantly more massive than expected from the BH mass-host galaxy stellar mass relation derived from the local Universe. We present a comprehensive census of the BH populations in the early Universe through a detailed stacking analysis of galaxy populations, binned by luminosity and redshift, using JWST spectroscopy from the CEERS, JADES, RUBIES, and GLASS extragalactic deep field surveys. Broad H$\alpha$ detections in 31% of the stacked spectra (5/16 bins) imply median BH masses of $10^{5.21} - 10^{6.13}~\rm{M_{\odot}}$ and the stacked SEDs of these bins indicate median stellar masses of $10^{7.84} - 10^{8.56}~\rm{M_{\odot}}$. This suggests that the median galaxy hosts a BH that is at most a factor of 10 times over-massive compared to its host galaxy and lies closer to the locally derived $M_{BH}-M_*$ relation. We investigate the seeding properties of the inferred BHs and find that they can be well-explained by a light stellar remnant seed undergoing moderate Eddington accretion. Our results indicate that individual detections of AGN are more likely to sample the upper envelope of the $M_{BH}-M_*$ distribution, while stacking on "normal" galaxies and searching for AGN signatures can overcome the selection bias of individual detections.
Large Language Models (LLMs) gain substantial reasoning and decision-making capabilities from thought structures. However, existing methods such as Tree of Thought and Retrieval Augmented Thoughts often fall short in complex tasks because of insufficient local retrieval of factual knowledge and inadequate global selection of strategies. These limitations make it challenging for these methods to balance factual accuracy and comprehensive logical optimization effectively. To address these limitations, we introduce the Retrieval Augmented Thought Tree (RATT), a novel thought structure that considers both overall logical soundness and factual correctness at each step of the thinking process. Specifically, at every point of a thought branch, RATT performs planning and lookahead to explore and evaluate multiple potential reasoning steps, and integrates the fact-checking ability of Retrieval-Augmented Generation (RAG) with the LLM's ability to assess overall strategy. Through this combination of factual knowledge and strategic feasibility, RATT adjusts and integrates the thought tree structure to search for the most promising branches within the search space. This thought structure significantly enhances the model's coherence in logical inference and efficiency in decision-making, and thus extends the capacity of LLMs to generate reliable inferences and decisions based on thought structures. A broad range of experiments on different types of tasks shows that the RATT structure significantly outperforms existing methods in factual correctness and logical coherence.
In this paper, we define the rarefaction and compression characters for the supersonic expanding wave of the compressible Euler equations with radial symmetry. Under this new definition, we show that solutions with rarefaction initial data will not form shocks in finite time, i.e., they exist globally in time as classical solutions. On the other hand, a singularity forms in finite time when the initial data include strong compression somewhere. Several useful invariant domains are also given.
We present the Cosmic Evolution Early Release Science Survey (CEERS) catalog, including space-based photometry, photometric redshifts, and physical parameters for more than 80,000 galaxies. The imaging used for this catalog comes from the CEERS survey, which has NIRCam coverage over ~100 sq. arcmin of the Extended Groth Strip (EGS) in seven filters from 1.15 $\mu$m to 4.44 $\mu$m. Alongside these data, we also include ancillary HST imaging in seven filters from 0.435 $\mu$m to 1.6 $\mu$m. We used Source Extractor with hot and cold detection settings to extract photometry. We derive photometric redshifts using the spectral energy distribution (SED) modeling code, LePHARE, and estimate their accuracy using spectroscopically confirmed galaxies out to $z\sim10$, with $\sigma_{NMAD}$ ranging from 0.035-0.073, depending strongly on galaxy magnitude and redshift. We compute stellar masses, star formation rates, and E(B-V) using three different SED fitting codes with different templates and assumptions about the galaxy star formation histories. All of these measurements, as well as the full mosaics in all filters, and redshift probability distribution functions, are made available via the CEERS DR1.0 data release.
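The abstract does not spell out how $\sigma_{NMAD}$ is computed, so the following Python sketch assumes one widely used convention for photometric-redshift scatter: 1.48 times the median of $|\Delta z - \mathrm{median}(\Delta z)| / (1 + z_{spec})$, with $\Delta z = z_{phot} - z_{spec}$. The function name `sigma_nmad` and the input values are hypothetical toy numbers, not catalog data, and the catalog's exact definition may differ.

```python
import numpy as np

def sigma_nmad(z_phot, z_spec):
    """Normalized median absolute deviation of photometric redshift errors.

    Assumed convention: 1.48 * median(|dz - median(dz)| / (1 + z_spec)).
    """
    z_phot = np.asarray(z_phot, dtype=float)
    z_spec = np.asarray(z_spec, dtype=float)
    dz = z_phot - z_spec
    return 1.48 * np.median(np.abs(dz - np.median(dz)) / (1.0 + z_spec))

# Toy values (assumption, for illustration only)
z_spec = [1.0, 1.0, 1.0]
z_phot = [1.0, 1.2, 0.8]
print(sigma_nmad(z_phot, z_spec))
```

The median-based statistic is preferred over a standard deviation here because it is insensitive to a small fraction of catastrophic outliers.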
JWST has revealed sulfur chemistry in giant exoplanet atmospheres, where molecules such as SO2 trace photochemistry, metallicity, and formation and migration. To ascertain the conditions that determine whether (or how much) SO2, H2S, and other sulfur-bearing species are present in exoplanet atmospheres, we present a grid of planetary atmospheres covering metallicities from 0.3-1000x Solar and temperatures from 250-2050 K. These models map out the 'SO2 shoreline,' the region of metallicity and irradiation for which SO2 may be sufficiently abundant to be detectable. SO2 is a sensitive indicator of metallicity; expected SO2 abundances also depend strongly on overall temperature and C/O ratio; the SO2 abundance depends surprisingly weakly on XUV irradiation, also weakly on Kzz (for Teq > 600 K), and is essentially independent of internal temperature. Despite its detection in a growing number of giant planets, SO2 is never the dominant sulfur-bearing molecule: depending on temperature and metallicity, H2S, S2, NS, SO, SH, and even S8 or atomic S are frequently as common as SO2 (or more so). Nonetheless, SO2 remains the most easily detectable sulfur-bearing species, followed by H2S, though perhaps SO and SH could be detectable in some gas giants. Aside from a pressing need for additional observational constraints on sulfur, we also identify the need for future work to account for the effects of clouds and hazes, fully self-consistent atmospheric models, 2D and 3D models, a wider range of planetary masses and radii, and studies to measure and refine reaction rates and molecular opacities of sulfur-bearing species.
Auto-regressive partial differential equation (PDE) foundation models have shown great potential in handling time-dependent data. However, these models suffer from the shortcut problem deeply rooted in auto-regressive prediction, causing error accumulation. The challenge becomes particularly evident for out-of-distribution data, as the pretraining performance may approach random model initialization for downstream tasks with long-term dynamics. To deal with this problem, we propose physics-informed temporal alignment (PITA), a self-supervised learning framework inspired by inverse problem solving. Specifically, PITA aligns the physical dynamics discovered at different time steps on each given PDE trajectory by integrating physics-informed constraints into the self-supervision signal. The alignment is derived from observation data without relying on known physics priors, indicating strong generalization ability to the out-of-distribution data. Extensive experiments show that PITA significantly enhances the accuracy and robustness of existing foundation models on diverse time-dependent PDE data. The code is available at this https URL
The IceCube Neutrino Observatory is a cubic-kilometer-scale high-energy neutrino detector built into the ice at the South Pole. Construction of IceCube, the largest neutrino detector built to date, was completed in 2011 and enabled the discovery of high-energy astrophysical neutrinos. We describe here the design, production, and calibration of the IceCube digital optical module (DOM), the cable systems, computing hardware, and our methodology for drilling and deployment. We also describe the online triggering and data filtering systems that select candidate neutrino and cosmic ray events for analysis. Due to a rigorous pre-deployment protocol, 98.4% of the DOMs in the deep ice are operating and collecting data. IceCube routinely achieves a detector uptime of 99% by emphasizing software stability and monitoring. Detector operations have been stable since construction was completed, and the detector is expected to operate at least until the end of the next decade.
The Fermi-Hubbard model is a fundamental model in condensed matter physics that describes strongly correlated electrons. On the other hand, quantum computers are emerging as powerful tools for exploring the complex dynamics of these quantum many-body systems. In this work, we demonstrate the quantum simulation of the one-dimensional Fermi-Hubbard model using IBM's superconducting quantum computers, employing over 100 qubits. We introduce a first-order Trotterization scheme and extend it to an optimized second-order Trotterization for the time evolution in the Fermi-Hubbard model, specifically tailored for the limited qubit connectivity of quantum architectures, such as IBM's platforms. Notably, both Trotterization approaches are scalable and maintain a constant circuit depth at each Trotter step, regardless of the qubit count, enabling us to precisely investigate the relaxation dynamics in the Fermi-Hubbard model by measuring the expectation value of the Néel observable (staggered magnetization) for time-evolved quantum states. Finally, our successful measurement of expectation values in such large-scale quantum many-body systems, especially at longer time scales with larger entanglement, highlights the quantum utility of superconducting quantum platforms over conventional classical approximation methods.
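To illustrate the difference between first- and second-order Trotterization in general terms, the Python sketch below compares both product formulas against exact evolution for a toy Hamiltonian $H = A + B$. This is not the paper's circuit or its Fermi-Hubbard decomposition: the single-qubit Pauli matrices standing in for the two non-commuting terms, and the helper names `expm_hermitian`, `trotter1`, `trotter2`, and `error`, are assumptions chosen to keep the example small.

```python
import numpy as np

def expm_hermitian(H, t):
    """Return exp(-i H t) for Hermitian H via eigendecomposition."""
    w, v = np.linalg.eigh(H)
    return (v * np.exp(-1j * w * t)) @ v.conj().T

# Toy non-commuting terms (assumption): Pauli X and Pauli Z
A = np.array([[0, 1], [1, 0]], dtype=complex)
B = np.array([[1, 0], [0, -1]], dtype=complex)

def trotter1(t, n):
    """First-order formula: (e^{-iA t/n} e^{-iB t/n})^n."""
    step = expm_hermitian(A, t / n) @ expm_hermitian(B, t / n)
    return np.linalg.matrix_power(step, n)

def trotter2(t, n):
    """Second-order (symmetric) formula: (e^{-iA t/2n} e^{-iB t/n} e^{-iA t/2n})^n."""
    half = expm_hermitian(A, t / (2 * n))
    step = half @ expm_hermitian(B, t / n) @ half
    return np.linalg.matrix_power(step, n)

def error(approx, t):
    """Spectral-norm distance to the exact propagator exp(-i (A + B) t)."""
    return np.linalg.norm(approx - expm_hermitian(A + B, t), 2)

t = 1.0
for n in (10, 20):
    print(n, error(trotter1(t, n), t), error(trotter2(t, n), t))
```

For a fixed total time, the symmetric formula gives a smaller error at the same number of steps, which is why a second-order scheme is attractive when circuit depth per step can be kept constant.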
A new paradigm for data-driven, model-agnostic new physics searches at colliders is emerging, and aims to leverage recent breakthroughs in anomaly detection and machine learning. In order to develop and benchmark new anomaly detection methods within this framework, it is essential to have standard datasets. To this end, we have created the LHC Olympics 2020, a community challenge accompanied by a set of simulated collider events. Participants in these Olympics have developed their methods using an R&D dataset and then tested them on black boxes: datasets with an unknown anomaly (or not). This paper will review the LHC Olympics 2020 challenge, including an overview of the competition, a description of methods deployed in the competition, lessons learned from the experience, and implications for data analyses with future datasets as well as future colliders.
We present a large spectroscopic survey with JWST's Mid-Infrared Instrument (MIRI) Low Resolution Spectrometer (LRS) targeting 37 infrared-bright galaxies between $z=0.65-2.46$ with infrared luminosities $\log L_{\rm IR}/L_\odot>11.5$ and $\log M_*/M_\odot=10-11.5$. Targets were taken from a Spitzer $24\,\mu$m-selected sample with archival spectroscopy from the Infrared Spectrograph (IRS) and include a mix of star-forming galaxies and dust-obscured AGN. By combining IRS with the increased sensitivity of LRS, we expand the range of spectral features observed between $5-30\,\mu$m for every galaxy in our sample. In this paper, we outline the sample selection, JWST data reduction, 1D spectral extraction, and polycyclic aromatic hydrocarbon (PAH) feature measurements from $\lambda_{rest}=3.3-11.2\,\mu$m. In the JWST spectra, we detect PAH emission features at $3.3-5.3\,\mu$m, as well as Paschen and Brackett lines. The $3.3\,\mu$m feature can be as bright as 1% of the $8-1000\,\mu$m infrared luminosity and exhibits a tight correlation with the dust-obscured star-formation rate. We detect absorption features from CO gas, CO$_2$ ice, H$_2$O ice, and aliphatic dust. From the joint JWST and Spitzer analysis we find that the $11.3/3.3\,\mu$m PAH ratios are on average three times higher than those of local luminous infrared galaxies. This is interpreted as evidence that the PAH grains are larger at $z\sim1-2$. The size distribution may be affected by coagulation of grains due to high gas densities and low temperatures. These conditions are supported by the observation of strong water ice absorption at $3.05\,\mu$m, and can lower stellar radiative feedback as large PAHs transmit less energy per photon into the interstellar medium.
Brain function emerges from coordinated activity across anatomically connected regions, where structural connectivity (SC), the network of white matter pathways, provides the physical substrate for functional connectivity (FC), the correlated neural activity between brain areas. While these structural and functional networks exhibit substantial overlap, their relationship involves complex, indirect mechanisms, including the dynamic interplay of direct and indirect pathways, recurrent network interactions, and neuromodulatory influences. To systematically untangle how structural architecture shapes functional patterns, this work aims to establish a set of rules that decode how direct and indirect structural connections and motifs give rise to FC between brain regions. Specifically, using a generative linear model, we derive explicit rules that predict an individual's resting-state fMRI FC from diffusion-weighted imaging (DWI)-derived SC, validated against topological null models. Examining the rules reveals distinct classes of brain regions, with integrator hubs acting as structural linchpins promoting synchronization and mediator hubs serving as structural fulcrums orchestrating competing dynamics. Through virtual lesion experiments, we demonstrate how different cortical and subcortical systems distinctively contribute to global functional organization. Together, this framework disentangles the mechanisms by which structural architecture drives functional dynamics, enabling the prediction of how pathological or surgical disruptions to brain connectivity cascade through functional networks, potentially leading to cognitive and behavioral impairments.
Reasoning is a key component of language understanding in Large Language Models. While Chain-of-Thought prompting enhances performance via explicit intermediate steps, it suffers from significant token overhead and a fixed reasoning trajectory, preventing step-wise refinement. Recent advances in latent reasoning address these limitations by refining internal reasoning processes directly in the model's latent space, without producing explicit outputs. However, a key challenge remains: how to effectively update reasoning embeddings during post-training to guide the model toward more accurate solutions. To overcome this challenge, we propose a lightweight post-training framework that refines latent reasoning trajectories using two novel strategies: 1) Contrastive reasoning feedback, which compares reasoning embeddings against strong and weak baselines to infer effective update directions via embedding enhancement; 2) Residual embedding refinement, which stabilizes updates by progressively integrating current and historical gradients, enabling fast yet controlled convergence. Extensive experiments and case studies are conducted on five reasoning benchmarks to demonstrate the effectiveness of the proposed framework. Notably, the framework achieves a 5% accuracy gain on MathQA without additional training.