Starting from a stochastic individual-based description of an SIS epidemic spreading on a random network, we study the dynamics when the size n of the network tends to infinity. We recover in the limit an infinite-dimensional integro-differential equation studied by Delmas, Dronnier and Zitt (2022) for an SIS epidemic propagating on a graphon. Our work covers the case of dense and sparse graphs, provided that the number of edges grows faster than n, but not the case of very sparse graphs with O(n) edges. In order to establish our limit theorem, we have to deal with both the convergence of the random graphs to the graphon and the convergence of the stochastic process spreading on top of these random structures: in particular, we propose a coupling between the process of interest and an epidemic that spreads on the complete graph but with a modified infection rate.
Keywords: Random graph, mathematical models of epidemics, measure-valued process, large network limit, limit theorem, graphon.
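The individual-based dynamics described above can be illustrated with a toy simulation. This is a discrete-time approximation of an SIS epidemic on an Erdős–Rényi random graph, not the paper's construction; the rates `beta` (per-neighbour infection), `gamma` (recovery), and all helper names are our own assumptions.

```python
# Illustrative sketch: discrete-time stochastic SIS dynamics on G(n, p).
import random

def erdos_renyi(n, p, rng):
    """Adjacency lists of an Erdős–Rényi graph G(n, p)."""
    adj = [[] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                adj[i].append(j)
                adj[j].append(i)
    return adj

def sis_step(adj, infected, beta, gamma, rng):
    """One synchronous update: recover w.p. gamma, infect via neighbours."""
    new = set()
    for v in range(len(adj)):
        if v in infected:
            if rng.random() >= gamma:                 # stays infected
                new.add(v)
        else:
            k = sum(1 for u in adj[v] if u in infected)
            if rng.random() < 1 - (1 - beta) ** k:    # infected by a neighbour
                new.add(v)
    return new

rng = random.Random(0)
adj = erdos_renyi(200, 0.05, rng)
infected = set(range(10))                             # initial infectives
fractions = []
for _ in range(50):
    infected = sis_step(adj, infected, beta=0.08, gamma=0.05, rng=rng)
    fractions.append(len(infected) / 200)
```

As n grows, the trajectory of the infected fraction concentrates around a deterministic limit, which is the regime the abstract's limit theorem makes precise.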
We present OneFlow, the first non-autoregressive multimodal model that enables variable-length and concurrent mixed-modal generation. Unlike autoregressive models that enforce rigid causal ordering between text and image generation, OneFlow combines an insertion-based Edit Flow for discrete text tokens with Flow Matching for image latents. OneFlow enables concurrent text-image synthesis with hierarchical sampling that prioritizes content over grammar. Through controlled experiments across model sizes from 1B to 8B, we demonstrate that OneFlow outperforms autoregressive baselines on both generation and understanding tasks while using up to 50% fewer training FLOPs. OneFlow surpasses both autoregressive and diffusion-based approaches while unlocking new capabilities for concurrent generation, iterative refinement, and natural reasoning-like generation.
Masked Image Modeling (MIM) offers a promising approach to self-supervised representation learning; however, existing MIM models still lag behind the state-of-the-art. In this paper, we systematically analyze target representations, loss functions, and architectures to introduce CAPI, a novel pure-MIM framework that relies on the prediction of latent clusterings. Our approach leverages a clustering-based loss, which is stable to train, and exhibits promising scaling properties. Our ViT-L backbone, CAPI, achieves 83.8% accuracy on ImageNet and 32.1% mIoU on ADE20K with simple linear probes, substantially outperforming previous MIM methods and approaching the performance of the current state-of-the-art, DINOv2. We release all our code and models.
A comprehensive guide outlines a systematic workflow for applying neural-network-based Simulation-Based Inference (SBI) to perform robust parameter inference for complex scientific models lacking explicit likelihood functions. The guide demonstrates its utility across astrophysics, psychophysics, and neuroscience, emphasizing diagnostic checks for reliable posterior distributions.
Speech Continuation (SC) is the task of generating a coherent extension of a spoken prompt while preserving both semantic context and speaker identity. Because SC is constrained to a single audio stream, it offers a more direct setting for probing biases in speech foundation models than dialogue does. In this work we present the first systematic evaluation of bias in SC, investigating how gender and phonation type (breathy, creaky, end-creak) affect continuation behaviour. We evaluate three recent models, SpiritLM (base and expressive), VAE-GSLM, and SpeechGPT, across speaker similarity, voice quality preservation, and text-based bias metrics. Results show that while both speaker similarity and coherence remain a challenge, textual evaluations reveal significant model and gender interactions: once coherence is sufficiently high (for VAE-GSLM), gender effects emerge on text metrics such as agency and sentence polarity. In addition, continuations revert toward modal phonation more strongly for female prompts than for male ones, revealing a systematic voice-quality bias. These findings highlight SC as a controlled probe of socially relevant representational biases in speech foundation models, and suggest that it will become an increasingly informative diagnostic as continuation quality improves.
SpectralGPT introduces the first foundation model tailored for spectral remote sensing data, employing a 3D generative pretrained transformer with a specialized masking strategy. Trained on over one million spectral images, it achieves state-of-the-art performance across multiple Earth Observation tasks and exhibits robust spectral image reconstruction.
Latent Perceptual Loss (LPL) enhances the perceptual quality of images generated by latent diffusion models by introducing a training objective that leverages intermediate features from the autoencoder's decoder. This method, developed by researchers at FAIR at Meta and academic institutions, consistently improves metrics like FID and CLIPScore while producing sharper, more detailed images.
The emergence of in-context learning (ICL) in large language models (LLMs) remains poorly understood despite its consistent effectiveness, enabling models to adapt to new tasks from only a handful of examples. To clarify and improve these capabilities, we characterize how the statistical properties of the pretraining distribution (e.g., tail behavior, coverage) shape ICL on numerical tasks. We develop a theoretical framework that unifies task selection and generalization, extending and sharpening earlier results, and show how distributional properties govern sample efficiency, task retrieval, and robustness. To this end, we generalize Bayesian posterior consistency and concentration results to heavy-tailed priors and dependent sequences, better reflecting the structure of LLM pretraining data. We then empirically study how ICL performance varies with the pretraining distribution on challenging tasks such as stochastic differential equations and stochastic processes with memory. Together, these findings suggest that controlling key statistical properties of the pretraining distribution is essential for building ICL-capable and reliable LLMs.
LUDVIG introduces a learning-free "inverse rendering" approach that directly aggregates 2D visual features from models like DINOv2 onto 3D Gaussian Splatting scenes, complemented by a 3D graph diffusion mechanism for refinement. This method from Univ. Grenoble Alpes and NAVER LABS Europe achieves performance comparable to or better than optimization-based techniques across various 3D semantic tasks, while accelerating feature generation and uplifting by 5 to 10 times.
In this paper, we describe a graph-based algorithm that uses the features obtained by a self-supervised transformer to detect and segment salient objects in images and videos. With this approach, the image patches that compose an image or video are organised into a fully connected graph, where the edge between each pair of patches is labeled with a similarity score between patches using features learned by the transformer. Detection and segmentation of salient objects is then formulated as a graph-cut problem and solved using the classical Normalized Cut algorithm. Despite the simplicity of this approach, it achieves state-of-the-art results on several common image and video detection and segmentation tasks. For unsupervised object discovery, this approach outperforms the competing approaches by margins of 6.1%, 5.7%, and 2.6% when tested with the VOC07, VOC12, and COCO20K datasets, respectively. For the unsupervised saliency detection task in images, this method improves the Intersection over Union (IoU) score by 4.4%, 5.6%, and 5.2% when tested with the ECSSD, DUTS, and DUT-OMRON datasets, respectively, compared to current state-of-the-art techniques. This method also achieves competitive results for unsupervised video object segmentation tasks with the DAVIS, SegTrack-v2, and FBMS datasets.
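The graph-cut step above can be sketched as a minimal spectral Normalized Cut bipartition (Shi and Malik's classical formulation). The synthetic stand-in features, the threshold `tau`, and the function names are our own assumptions, not the paper's implementation.

```python
# Minimal sketch: build a patch-similarity graph and split it with a
# spectral Normalized Cut (sign of the Fiedler vector of the
# normalized Laplacian).
import numpy as np

def ncut_bipartition(features, tau=0.2):
    """Split patches into two groups via a spectral Normalized Cut."""
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    w = f @ f.T                       # cosine similarity between patches
    w = np.where(w > tau, 1.0, 1e-5)  # binarised affinities, kept positive
    d = w.sum(axis=1)
    d_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    lap = np.eye(len(w)) - d_inv_sqrt @ w @ d_inv_sqrt  # normalized Laplacian
    vals, vecs = np.linalg.eigh(lap)  # eigenvalues in ascending order
    fiedler = vecs[:, 1]              # second-smallest eigenvector
    return fiedler > fiedler.mean()   # boolean mask: one side of the cut

rng = np.random.default_rng(0)
# two synthetic patch clusters standing in for "object" and "background"
feats = np.vstack([rng.normal(0, 0.1, (20, 16)) + 1.0,
                   rng.normal(0, 0.1, (20, 16)) - 1.0])
mask = ncut_bipartition(feats)
```

With transformer patch features in place of the synthetic ones, the returned mask plays the role of the foreground/background segmentation described in the abstract.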
Foundation Models are designed to serve as versatile embedding machines, with strong zero-shot capabilities and superior generalization performance when fine-tuned on diverse downstream tasks. While this is largely true for language and vision foundation models, we argue that the inherent diversity of time series data makes them less suited for building effective foundation models. We demonstrate this using forecasting as our downstream task. We show that the zero-shot capabilities of a time series foundation model are significantly influenced by, and tied to, the specific domains it has been pretrained on. Furthermore, when applied to unseen real-world time series data, fine-tuned foundation models do not consistently yield substantially better results, relative to their increased parameter count and memory footprint, than smaller, dedicated models tailored to the specific forecasting task at hand.
The search for atmospheres on rocky exoplanets is a crucial step in understanding the processes driving atmosphere formation, retention, and loss. Past studies have revealed the existence of planets interior to the radius valley with densities lower than would be expected for pure-rock compositions, indicative of the presence of large volatile inventories which could facilitate atmosphere retention. Here we present an analysis of the JWST NIRSpec/G395H transmission spectrum of the warm ($T_\mathrm{eq,A_B=0}$ = 569 K) super-Earth TOI-270 b ($R_\mathrm{p}$ = 1.306 $R_\oplus$), captured alongside the transit of TOI-270 d. The JWST white light-curve transit depth updates TOI-270 b's density to $\rho_\mathrm{p}$ = 3.7 $\pm$ 0.5 g/cm$^3$, inconsistent at 4.4$\sigma$ with an Earth-like composition. Instead, the planet is best explained by a non-zero, percent-level water mass fraction, possibly residing on the surface or stored within the interior. The JWST transmission spectrum shows possible spectroscopic evidence for the presence of this water as part of an atmosphere on TOI-270 b, favoring a H$_2$O-rich steam atmosphere model over a flat spectrum ($\ln\mathcal{B}$ = 0.3-3.2, inconclusive to moderate), with the exact significance depending on whether an offset parameter between the NIRSpec detectors is included. We leverage the transit of the twice-larger TOI-270 d crossing the stellar disk almost simultaneously to rule out the alternative hypothesis that the transit-light-source effect could have caused the water feature in TOI-270 b's observed transmission spectrum. Planetary evolution modeling furthermore shows that TOI-270 b could sustain a significant atmosphere on Gyr timescales, despite its high stellar irradiation, if it formed with a large initial volatile inventory.
Recent foundational models for tabular data, such as TabPFN, excel at adapting to new tasks via in-context learning, but remain constrained to a fixed, pre-defined number of target dimensions, often necessitating costly ensembling strategies. We trace this constraint to a deeper architectural shortcoming: these models lack target equivariance, so that permuting target dimension orderings alters their predictions. This deficiency gives rise to an irreducible "equivariance gap", an error term that introduces instability in predictions. We eliminate this gap by designing a fully target-equivariant architecture, ensuring permutation invariance via equivariant encoders, decoders, and a bi-attention mechanism. Empirical evaluation on standard classification benchmarks shows that, on datasets with more classes than those seen during pre-training, our model matches or surpasses existing methods while incurring lower computational overhead.
We present the first results from a dark matter search using six Skipper-CCDs in the SENSEI detector operating at SNOLAB. We employ a bias-mitigation technique of hiding approximately 46% of our total data and aggressively mask images to remove backgrounds. Given a total exposure after masking of 100.72 gram-days from well-performing sensors, we observe 55 two-electron events, 4 three-electron events, and no events containing 4 to 10 electrons. The two-electron events are consistent with pileup from one-electron events. Among the 4 three-electron events, 2 appear in pixels that are likely impacted by detector defects, although not strongly enough to trigger our "hot-pixel" mask. We use these data to set world-leading constraints on sub-GeV dark matter interacting with electrons and nuclei.
Variational autoencoders (VAEs) are powerful deep generative models widely used to represent high-dimensional complex data through a low-dimensional latent space learned in an unsupervised manner. In the original VAE model, the input data vectors are processed independently. Recently, a series of papers have presented different extensions of the VAE to process sequential data, which model not only the latent space but also the temporal dependencies within a sequence of data vectors and corresponding latent vectors, relying on recurrent neural networks or state-space models. In this paper, we perform a literature review of these models. We introduce and discuss a general class of models, called dynamical variational autoencoders (DVAEs), which encompasses a large subset of these temporal VAE extensions. Then, we present in detail seven recently proposed DVAE models, with an aim to homogenize the notations and presentation lines, as well as to relate these models with existing classical temporal models. We have reimplemented those seven DVAE models and present the results of an experimental benchmark conducted on the speech analysis-resynthesis task (the PyTorch code is made publicly available). The paper concludes with a discussion on important issues concerning the DVAE class of models and future research guidelines.
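The temporal dependency structure shared by the DVAE models surveyed above, a latent transition p(z_t | z_{t-1}) followed by an emission p(x_t | z_t), can be illustrated with the linear-Gaussian state-space special case that such models generalize. This toy ancestral sampler and its parameter names are our own illustration, not taken from the review.

```python
# Toy illustration of the DVAE-style generative factorization:
#   p(z_{1:T}, x_{1:T}) = prod_t p(z_t | z_{t-1}) p(x_t | z_t),
# here with scalar linear-Gaussian transition and emission.
import random

def sample_dvae_prior(T, a=0.9, sigma_z=0.3, sigma_x=0.1, rng=None):
    """Ancestral sampling of latents z_{1:T} and observations x_{1:T}."""
    rng = rng or random.Random(0)
    z, zs, xs = 0.0, [], []
    for _ in range(T):
        z = a * z + rng.gauss(0.0, sigma_z)   # transition p(z_t | z_{t-1})
        x = z + rng.gauss(0.0, sigma_x)       # emission   p(x_t | z_t)
        zs.append(z)
        xs.append(x)
    return zs, xs

zs, xs = sample_dvae_prior(T=100)
```

In a full DVAE, the linear maps above are replaced by neural networks (often recurrent), and the transition and emission may additionally condition on past observations, which is precisely the design space the review organizes.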
We present JWST NIRSpec/PRISM IFU time-resolved observations of 2M1207 A and b (TWA 27), a $\sim$10 Myr binary system consisting of a $\sim$2500 K sub-stellar primary hosting a $\sim$1300 K companion. Our data provide 20 time-resolved spectra over an observation spanning 12.56 hours. We provide an empirical characterization for the spectra of both objects across time. For 2M1207 A, non-linear trend models are statistically favored within the ranges 0.6-2.3 $\mu$m and 3.8-5.3 $\mu$m. However, most of the periods constrained from sinusoidal models exceed the observing window, setting a lower limit of 12.56 hours. We find the data at H$\alpha$ and beyond 4.35 $\mu$m show a moderate time correlation, as well as a pair of light curves at 0.73-0.80 $\mu$m and 3.36-3.38 $\mu$m. For 2M1207 b, light curves integrated across 0.86-1.77 $\mu$m and 3.29-4.34 $\mu$m support linear trend models. Following the interpretation of Zhang et al. (2025), we model the 2M1207 b data with two 1D atmospheric components, both with silicate and iron condensates. The model of time variability as changes to the cloud filling factor shows broad consistency with the variability amplitudes derived from our data. Our amplitudes, however, disagree with the models at $\approx$0.86-1 $\mu$m. While an additional model component such as rainout chemistry may be considered here, our analysis is limited by a low signal-to-noise ratio. Our results demonstrate the capability of JWST to simultaneously monitor the spectral variability of a planetary-mass companion and host at low contrast.
Researchers from Univ. Grenoble Alpes developed efficient linear algebra-based computational tools for simulating Graph-Rewriting Automata (GRAs) over extended time scales, enabling systematic investigation of their dynamic behavior. This approach uncovered a spectrum of growth patterns, including chaotic linear and quasi-quadratic, and demonstrated the emergence of intricate, organic-looking graph structures derived from simple local rules.
We report on the observation and measurement of astrometry, photometry, morphology, and activity of the interstellar object 3I/ATLAS, also designated C/2025 N1 (ATLAS), with the NSF-DOE Vera C. Rubin Observatory. The third interstellar object, comet 3I/ATLAS, was first discovered on UT 2025 July 1. Serendipitously, the Rubin Observatory collected imaging in the area of the sky inhabited by the object during regular commissioning activities. We successfully recovered object detections from Rubin visits spanning UT 2025 June 21 (10 days before discovery) to UT 2025 July 7. Facilitated by Rubin's high resolution and large aperture, we report on the detection of cometary activity as early as June 21st, and observe it throughout. We measure the location and magnitude of the object on 37 Rubin images in r, i, and z bands, with typical precision of about 20 mas (100 mas, systematic) and about 10 mmag, respectively. We use these to derive improved orbit solutions, and to show there is no detectable photometric variability on hourly timescales. We derive a V-band absolute magnitude of H_V = (13.7 +/- 0.2) mag, and an equivalent effective nucleus radius of around (5.6 +/- 0.7) km. These data represent the earliest observations of this object by a large (8-meter class) telescope reported to date, and illustrate the type of measurements (and discoveries) Rubin's Legacy Survey of Space and Time (LSST) will begin to provide once operational later this year.
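For context, the quoted effective radius is of the order implied by the standard absolute-magnitude/diameter relation for small bodies, D [km] = 1329 / sqrt(p_V) * 10^(-H/5). The sketch below assumes a geometric albedo p_V = 0.05, which is our own illustrative choice, not a value stated in the abstract.

```python
# Hedged sketch: effective radius from the reported absolute magnitude
# H_V via the standard relation D = 1329 / sqrt(p_V) * 10**(-H/5),
# with an *assumed* geometric albedo p_V.
H_V = 13.7                # reported V-band absolute magnitude
p_V = 0.05                # assumed albedo, for illustration only
diameter_km = 1329.0 / p_V ** 0.5 * 10 ** (-H_V / 5.0)
radius_km = diameter_km / 2.0
```

With this assumed albedo the relation gives a radius of roughly 5-6 km, consistent in order of magnitude with the reported (5.6 +/- 0.7) km; a different albedo choice shifts the result as 1/sqrt(p_V).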
This research from Université Grenoble Alpes and Facebook AI Research introduces Depth-Adaptive Transformers, enabling models to dynamically adjust their computational depth during inference. The approach significantly reduces the number of decoder layers used (up to 76% on IWSLT'14) while maintaining competitive translation quality on machine translation benchmarks.