Institute of Physics, Jagiellonian University
Researchers at Microsoft Research AI for Science, in collaboration with several universities, developed SYNTHESEUS, an open-source framework for standardized evaluation of retrosynthesis algorithms. Their re-evaluation of state-of-the-art methods using SYNTHESEUS revealed that previous inconsistent benchmarking practices had distorted reported performance and model rankings, while also highlighting significant challenges in out-of-distribution generalization.
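One benchmarking pitfall of the kind SYNTHESEUS standardizes is top-k accuracy computed over duplicated predictions. A minimal sketch of a de-duplicated top-k metric (the function name and data layout are illustrative, not SYNTHESEUS's actual API):

```python
def top_k_accuracy(ranked_predictions, ground_truth, ks=(1, 3, 5, 10)):
    """For each target molecule, check whether the reference precursor set
    appears among the first k *distinct* predictions.

    ranked_predictions: list of lists of predicted precursor sets (strings),
                        in model rank order, possibly containing duplicates.
    ground_truth: list of reference precursor sets (strings), one per target.
    """
    hits = {k: 0 for k in ks}
    for preds, truth in zip(ranked_predictions, ground_truth):
        seen, deduped = set(), []
        for p in preds:              # de-duplicate while preserving rank order
            if p not in seen:
                seen.add(p)
                deduped.append(p)
        for k in ks:
            if truth in deduped[:k]:
                hits[k] += 1
    n = len(ground_truth)
    return {k: hits[k] / n for k in ks}
```

Without the de-duplication step, a model emitting the same route several times would be credited at smaller k than it deserves, which is one way inconsistent pipelines can distort rankings.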
Researchers at Google Brain developed the Sparsely-Gated Mixture-of-Experts (MoE) layer, an architectural innovation that enables training neural networks with over a hundred billion parameters while maintaining computational efficiency. This method achieves better perplexity scores on language modeling tasks and higher performance in machine translation by selectively activating only a few 'expert' subnetworks per input, rather than the entire model.
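The routing idea can be sketched in a few lines: score all experts, evaluate only the top-k, and renormalize the gate weights over the selected experts (a simplified single-vector sketch; the original layer adds noise to the gate and load-balancing losses, which are omitted here):

```python
import numpy as np

def moe_forward(x, gate_W, experts, k=2):
    """Sparsely-gated MoE layer (simplified): route the input to its
    top-k experts and combine their outputs with renormalized gate weights.

    x:       (d,) input vector
    gate_W:  (d, n_experts) gating weights
    experts: list of callables, each mapping (d,) -> (d_out,)
    """
    logits = x @ gate_W                      # one score per expert
    top = np.argsort(logits)[-k:]            # indices of the k best experts
    w = np.exp(logits[top] - logits[top].max())
    w /= w.sum()                             # softmax over selected experts only
    # Only k experts are evaluated, which is where the compute savings come from.
    return sum(wi * experts[i](x) for wi, i in zip(w, top))
```

Because the per-token compute depends on k rather than on the total expert count, parameter count can grow enormously while FLOPs per input stay nearly constant.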
California Institute of Technology, University of Oslo, University of Cambridge, University of Victoria, Chinese Academy of Sciences, University of Zurich, Tel Aviv University, University of Oxford, University of Science and Technology of China, Scuola Normale Superiore, University of Copenhagen, University of Edinburgh, The University of Texas at Austin, INFN, ETH Zürich, Yonsei University, University of Crete, Kavli Institute for the Physics and Mathematics of the Universe, Universität Heidelberg, University of Maryland, Universidad Autónoma de Madrid, Université Paris-Saclay, Stockholm University, University of Helsinki, University of Arizona, University of Western Australia, University of Sheffield, Princeton University, University of Geneva, University of Portsmouth, University of Iceland, Università di Genova, Universidade do Porto, University of Sussex, INAF, Aix Marseille University, Niels Bohr Institute, University of Jyväskylä, University of Padova, Jet Propulsion Laboratory, Jagiellonian University, Instituto de Astrofísica de Canarias, University of the Witwatersrand, University of Nottingham, European Space Agency, University of Cape Town, SISSA, Nicolaus Copernicus Astronomical Center, Observatoire de la Côte d’Azur, University of Hawai’i, University of KwaZulu-Natal, Ludwig-Maximilians-Universität, Laboratoire d’Astrophysique de Marseille, INAF - Istituto di Radioastronomia, INAF - Osservatorio Astronomico di Roma, Institut de Física d’Altes Energies (IFAE), Laboratoire de Physique des 2 Infinis Irène Joliot-Curie, Osservatorio Astronomico della Regione Autonoma Valle d’Aosta, INAF - Osservatorio Astrofisico di Catania, INAF - Osservatorio Astronomico di Arcetri, Institut d’Astrophysique Spatiale, NASA, DTU Space, The Queen’s University of Belfast, Instituto de Astrofísica e Ciências do Espaço, Universidade de Lisboa, IRAP, Université de Toulouse, CNRS, CNES, ETH Institute for Astronomy, INAF-IASF Bologna, Cosmic Dawn Center (DAWN), Università degli Studi di Ferrara, Université de Paris, Université Claude Bernard Lyon 1, Excellence Cluster ‘Origins’, Université de Lyon, Università di Pisa, IFCA-CSIC-UC, INAF - Osservatorio Astronomico di Padova, Università degli Studi di Firenze, Université de Montpellier, Università degli Studi di Napoli Federico II, Università di Roma Tor Vergata, INAF - Osservatorio di Astrofisica e Scienza dello Spazio di Bologna, Università di Bologna, INAF - Osservatorio Astronomico di Trieste, Università degli Studi di Trieste
Verifying the fully kinematic nature of the cosmic microwave background (CMB) dipole is of fundamental importance in cosmology. In the standard cosmological model, with the Friedmann-Lemaître-Robertson-Walker (FLRW) metric arising from the inflationary expansion, the CMB dipole should be entirely kinematic. Any non-kinematic CMB dipole component would thus reflect the preinflationary structure of spacetime, probing the extent of FLRW applicability. Cosmic backgrounds produced by galaxies after matter-radiation decoupling should have a kinematic dipole component identical in velocity to the CMB kinematic dipole; comparing the two can isolate the CMB non-kinematic dipole. It was recently proposed that such a measurement can be done using the near-IR cosmic infrared background (CIB) measured with the currently operating Euclid telescope, and later with Roman. The proposed method reconstructs the resolved CIB, the Integrated Galaxy Light (IGL), from Euclid's Wide Survey and probes its dipole, whose kinematic component is amplified over that of the CMB by the Compton-Getting effect. The amplification, coupled with the extensive galaxy samples forming the IGL, would determine the CIB dipole with an overwhelming signal-to-noise ratio, isolating its direction to sub-degree accuracy. We develop the details of the method for Euclid's Wide Survey in 4 bands spanning 0.6 to 2 μm. We isolate the systematic and other uncertainties and present methodologies to minimize them, after confining the sample to the magnitude range where the IGL/CIB dipole from galaxy clustering is negligible. These include the required star-galaxy separation, accounting for the extinction-correction dipole using a method newly developed here that achieves total separation, and accounting for the Earth's orbital motion and other systematic effects. (Abridged)
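The Compton-Getting amplification invoked above can be illustrated numerically. For a background with specific intensity I_ν ∝ ν^(−α), an observer moving at v/c = β sees a dipole of fractional amplitude (3 + α)β, compared with β for the CMB temperature dipole. This is a textbook-level sketch; the effective spectral index of the near-IR IGL and the exact convention used in the paper are assumptions here, not values taken from it.

```python
def cg_dipole_amplitude(beta, alpha):
    """Kinematic (Compton-Getting) dipole amplitude delta I / I for a
    background with specific intensity I_nu ~ nu**(-alpha), seen by an
    observer moving at v/c = beta. Derived from the Lorentz invariance
    of I_nu / nu**3 to first order in beta (convention is an assumption)."""
    return (3.0 + alpha) * beta

# Solar-system velocity inferred from the CMB dipole, ~370 km/s:
beta_sun = 370.0e3 / 299792458.0
amp_cmb = beta_sun                              # CMB temperature dipole
amp_cib = cg_dipole_amplitude(beta_sun, 1.0)    # illustrative alpha = 1
```

Any α > −2 makes the background's kinematic dipole larger than the CMB's fractional dipole, which is the amplification the proposed IGL measurement exploits.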
The growth of large language models underscores the need for parameter-efficient fine-tuning. Despite its popularity, LoRA encounters storage and computational challenges when deploying multiple task- or user-specific modules. To address this, we introduce LoRA-XS, a novel fine-tuning method backed by a theoretical derivation. LoRA-XS drastically reduces trainable parameters by incorporating a small, trainable weight matrix between frozen low-rank matrices derived from the Singular Value Decomposition of pre-trained weights. This design enables LoRA-XS to reduce storage requirements by over 100x in 7B models compared to LoRA. Additionally, unlike other methods, LoRA-XS imposes no lower bound on trainable parameters - it can scale from a single parameter per module to arbitrarily large values, adapting to any storage or computational constraint. Evaluations on GLUE, GSM8K, MATH, and commonsense reasoning benchmarks across different model scales reveal that LoRA-XS consistently outperforms or matches LoRA and VeRA in accuracy, offering unmatched parameter efficiency. Our ablation studies highlight the significance of singular vectors in transformer weights, establishing LoRA-XS as a powerful, storage-efficient solution for scaling and personalizing large language models.
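The parameter accounting behind LoRA-XS is easy to make concrete: both frozen projections come from the truncated SVD of the pre-trained weight, and only an r×r matrix between them is trained. A minimal numpy sketch (the exact placement of the singular values in the frozen factors is an assumption; consult the paper for the precise parameterization):

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((64, 64))        # stand-in for a pre-trained weight
r = 4

# Frozen factors from the truncated SVD of W.
U, S, Vt = np.linalg.svd(W, full_matrices=False)
A = U[:, :r] * S[:r]                      # (64, r), frozen
B = Vt[:r, :]                             # (r, 64), frozen
R = np.zeros((r, r))                      # the ONLY trainable matrix

def forward(x):
    # Adapted layer: W x + A R B x. The update has rank <= r,
    # but only r*r parameters are trained.
    return W @ x + A @ (R @ (B @ x))

lora_params    = r * (W.shape[0] + W.shape[1])   # classic LoRA: r(m+n)
lora_xs_params = r * r                           # LoRA-XS: r^2
```

For a 4096-dimensional layer at r = 4, that is 32,768 trainable parameters for LoRA versus 16 for LoRA-XS, which is the source of the storage savings the abstract describes.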
This paper from Google Research proposes adapter modules for parameter-efficient transfer learning in Natural Language Processing, demonstrating that they enable large pre-trained models like BERT to achieve performance comparable to full fine-tuning while drastically reducing the number of task-specific parameters required (e.g., 3.6% of original model parameters per task on GLUE).
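The adapter module itself is a small bottleneck with a residual connection, initialized near the identity so that inserting it does not disturb the pre-trained network. A sketch under simplified assumptions (ReLU instead of the paper's nonlinearity, biases and LayerNorm omitted):

```python
import numpy as np

class Adapter:
    """Bottleneck adapter: project down, nonlinearity, project up, plus a
    residual connection. Near-identity at initialization, so training can
    start from the pre-trained model's behavior."""
    def __init__(self, d_model, d_bottleneck, rng):
        self.W_down = rng.standard_normal((d_model, d_bottleneck)) * 0.01
        self.W_up = np.zeros((d_bottleneck, d_model))   # zero init -> identity

    def __call__(self, h):
        z = np.maximum(h @ self.W_down, 0.0)   # ReLU bottleneck (assumption)
        return h + z @ self.W_up               # residual connection

d, b = 768, 64
params_per_adapter = d * b * 2   # vs. fine-tuning all of BERT's weights
```

Only the adapter weights are trained per task; the shared backbone stays frozen, which is where the few-percent-per-task parameter figure comes from.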
PanTS is a large-scale, multi-institutional dataset curated to advance research in pancreatic CT analysis. It contains 36,390 CT scans from 145 medical centers, with expert-validated, voxel-wise annotations of over 993,000 anatomical structures, covering pancreatic tumors, pancreas head, body, and tail, and 24 surrounding anatomical structures such as vascular/skeletal structures and abdominal/thoracic organs. Each scan includes metadata such as patient age, sex, diagnosis, contrast phase, in-plane spacing, slice thickness, etc. AI models trained on PanTS achieve significantly better performance in pancreatic tumor detection, localization, and segmentation compared to those trained on existing public datasets. Our analysis indicates that these gains are directly attributable to the 16x larger-scale tumor annotations and indirectly supported by the 24 additional surrounding anatomical structures. As the largest and most comprehensive resource of its kind, PanTS offers a new benchmark for developing and evaluating AI models in pancreatic CT analysis.
Classifier-free guidance (CFG) is an essential mechanism in contemporary text-driven diffusion models. In practice, controlling the strength of guidance exposes a trade-off between the quality of the generated images and their correspondence to the prompt. With strong guidance, generated images fit the conditioning text closely but at the cost of quality; conversely, with weak guidance we obtain high-quality results that do not match the prompt. In this paper, we present β-CFG (β-adaptive scaling in Classifier-Free Guidance), which controls the impact of guidance during generation to address this trade-off. First, β-CFG stabilizes the effects of guiding through gradient-based adaptive normalization. Second, β-CFG uses a family of unimodal (β-distribution), time-dependent curves to dynamically adapt the trade-off between prompt matching and sample quality during the diffusion denoising process. Our model obtained better FID scores while maintaining text-to-image CLIP similarity at a level similar to that of the reference CFG.
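The time-dependent-scale idea can be sketched as standard CFG with a guidance weight that follows a Beta-shaped curve over the denoising schedule (a schematic illustration only; the paper's exact normalization, including its gradient-based adaptive term, differs):

```python
import math

def beta_pdf(t, a, b):
    """Density of the Beta(a, b) distribution at t in (0, 1)."""
    B = math.gamma(a) * math.gamma(b) / math.gamma(a + b)
    return t**(a - 1) * (1.0 - t)**(b - 1) / B

def guided_eps(eps_uncond, eps_cond, t, w_max=7.5, a=2.0, b=2.0):
    """Classifier-free guidance with a unimodal, time-dependent scale:
    w(t) peaks mid-schedule and vanishes at the endpoints, instead of the
    constant weight used in reference CFG."""
    peak = beta_pdf((a - 1) / (a + b - 2), a, b)   # value at the Beta mode
    w = w_max * beta_pdf(t, a, b) / peak           # rescaled into [0, w_max]
    return [eu + w * (ec - eu) for eu, ec in zip(eps_uncond, eps_cond)]
```

With a = b the curve is symmetric around the middle of the schedule; skewing a and b moves the strongest guidance earlier or later in denoising.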
Multifractality in time series analysis characterizes the presence of multiple scaling exponents, indicating heterogeneous temporal structures and complex dynamical behaviors beyond simple monofractal models. In digital currency markets, multifractal properties arise from the interplay of long-range temporal correlations and heavy-tailed return distributions, reflecting intricate market microstructure and trader interactions. Incorporating multifractal analysis into the modeling of cryptocurrency price dynamics enhances the understanding of market inefficiencies, may improve volatility forecasting, and can facilitate the detection of critical transitions or regime shifts. Building on the multifractal cross-correlation analysis (MFCCA), whose special case, the multifractal detrended fluctuation analysis (MFDFA), is among the most commonly used practical tools for quantifying multifractality, the present contribution applies a recently proposed method of disentangling sources of multifractality to the most representative instruments of the digital market: Bitcoin (BTC), Ethereum (ETH), decentralized exchanges (DEX), and non-fungible tokens (NFT). The results indicate the significant role of heavy tails in generating a broad multifractal spectrum. However, they also clearly demonstrate that the primary source of multifractality is temporal correlations in the series: without them, multifractality fades out. Characteristically, these temporal correlations largely do not depend on the thickness of the tails of the fluctuation distribution. These observations, made here in the context of the digital currency market, provide a further strong argument for the validity of the proposed methodology of disentangling sources of multifractality in time series.
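The MFDFA machinery referenced above reduces to computing q-order fluctuation functions F_q(s) over a range of scales; the log-log slope of F_q(s) versus s gives the generalized Hurst exponent h(q), and a q-dependent h(q) signals multifractality. A compact sketch of the standard procedure (non-overlapping segments, polynomial detrending; real analyses also scan the series from both ends and use finer scale grids):

```python
import numpy as np

def mfdfa_fq(x, scales, qs, order=1):
    """Multifractal DFA fluctuation functions F_q(s).
    Returns an array of shape (len(qs), len(scales))."""
    profile = np.cumsum(x - np.mean(x))          # integrated, mean-removed series
    out = np.empty((len(qs), len(scales)))
    for j, s in enumerate(scales):
        n_seg = len(profile) // s
        f2 = np.empty(n_seg)
        t = np.arange(s)
        for v in range(n_seg):
            seg = profile[v * s:(v + 1) * s]
            fit = np.polyval(np.polyfit(t, seg, order), t)   # local detrending
            f2[v] = np.mean((seg - fit) ** 2)    # segment variance after detrend
        for i, q in enumerate(qs):
            if q == 0:                           # q = 0 uses a logarithmic average
                out[i, j] = np.exp(0.5 * np.mean(np.log(f2)))
            else:
                out[i, j] = np.mean(f2 ** (q / 2.0)) ** (1.0 / q)
    return out
```

Negative q emphasizes small fluctuations and positive q large ones, which is how the method separates the scaling of quiet and volatile market periods.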
We study the problem of training diffusion models to sample from a distribution with a given unnormalized density or energy function. We benchmark several diffusion-structured inference methods, including simulation-based variational approaches and off-policy methods (continuous generative flow networks). Our results shed light on the relative advantages of existing algorithms while bringing into question some claims from past work. We also propose a novel exploration strategy for off-policy methods, based on local search in the target space with the use of a replay buffer, and show that it improves the quality of samples on a variety of target distributions. Our code for the sampling methods and benchmarks studied is made public at this https URL as a base for future work on diffusion models for amortized inference.
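The proposed exploration strategy pairs a replay buffer with local search in the target space. A deliberately simplified sketch of that loop (greedy hill-climbing on the energy; the paper's version is integrated with off-policy GFlowNet training rather than standing alone):

```python
import numpy as np

def local_search_buffer(energy, init, n_rounds=200, step=0.1,
                        buffer_size=512, rng=None):
    """Grow a replay buffer by locally perturbing buffered samples and
    keeping moves that lower the energy. `energy` maps a point to a float;
    `init` seeds the buffer."""
    rng = rng if rng is not None else np.random.default_rng(0)
    buf = [np.array(z, dtype=float) for z in init]
    for _ in range(n_rounds):
        i = rng.integers(len(buf))
        cand = buf[i] + step * rng.standard_normal(buf[i].shape)
        if energy(cand) < energy(buf[i]):   # greedy accept (Metropolis is an option)
            buf.append(cand)
            if len(buf) > buffer_size:
                buf.pop(0)                  # discard oldest sample
    return buf
```

The buffer then serves low-energy (high-density) states back to the off-policy learner, improving coverage of modes the forward sampler has not yet found.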
Early tumor detection saves lives. Each year, more than 300 million computed tomography (CT) scans are performed worldwide, offering a vast opportunity for effective cancer screening. However, detecting small or early-stage tumors on these CT scans remains challenging, even for experts. Artificial intelligence (AI) models can assist by highlighting suspicious regions, but training such models typically requires extensive tumor masks--detailed, voxel-wise outlines of tumors manually drawn by radiologists. Drawing these masks is costly, requiring years of effort and millions of dollars. In contrast, nearly every CT scan in clinical practice is already accompanied by medical reports describing the tumor's size, number, appearance, and sometimes, pathology results--information that is rich, abundant, and often underutilized for AI training. We introduce R-Super, which trains AI to segment tumors that match their descriptions in medical reports. This approach scales AI training with large collections of readily available medical reports, substantially reducing the need for manually drawn tumor masks. When trained on 101,654 reports, AI models achieved performance comparable to those trained on 723 masks. Combining reports and masks further improved sensitivity by +13% and specificity by +8%, surpassing radiologists in detecting five of the seven tumor types. Notably, R-Super enabled segmentation of tumors in the spleen, gallbladder, prostate, bladder, uterus, and esophagus, for which no public masks or AI models previously existed. This study challenges the long-held belief that large-scale, labor-intensive tumor mask creation is indispensable, establishing a scalable and accessible path toward early detection across diverse tumor types. We plan to release our trained models, code, and dataset at this https URL
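One way report text can supervise a segmentation network is through consistency between the predicted tumor volume and the size stated in the report. The loss below is purely illustrative of that idea (it is not R-Super's actual objective, and the spherical-lesion assumption is ours):

```python
import numpy as np

def report_volume_loss(pred_mask_probs, report_diameter_mm, voxel_volume_mm3):
    """Weak supervision from a report: compare the soft predicted tumor
    volume with the volume implied by the reported diameter, assuming a
    roughly spherical lesion. Returns a squared relative-error penalty."""
    pred_volume = pred_mask_probs.sum() * voxel_volume_mm3   # expected volume
    r = report_diameter_mm / 2.0
    report_volume = 4.0 / 3.0 * np.pi * r ** 3               # sphere of that diameter
    return (pred_volume - report_volume) ** 2 / report_volume ** 2
```

Because the penalty is differentiable in the voxel probabilities, it can be added to a standard segmentation loss whenever a report, but no mask, is available for a scan.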
Gaussian Splatting (GS) has recently emerged as an efficient representation for rendering 3D scenes from 2D images and has been extended to images, videos, and dynamic 4D content. However, applying style transfer to GS-based representations, especially beyond simple color changes, remains challenging. In this work, we introduce CLIPGaussian, the first unified style transfer framework that supports text- and image-guided stylization across multiple modalities: 2D images, videos, 3D objects, and 4D scenes. Our method operates directly on Gaussian primitives and integrates into existing GS pipelines as a plug-in module, without requiring large generative models or retraining from scratch. The CLIPGaussian approach enables joint optimization of color and geometry in 3D and 4D settings, and achieves temporal coherence in videos, while preserving the model size. We demonstrate superior style fidelity and consistency across all tasks, validating CLIPGaussian as a universal and efficient solution for multimodal style transfer.
Iterated Denoising Energy Matching (iDEM) introduces a scalable, simulation-free neural sampler for unnormalized Boltzmann distributions, enabling efficient sampling from complex energy landscapes purely from the energy function and its gradient. The method achieves state-of-the-art sample quality and significantly faster training, successfully tackling the challenging 165-dimensional LJ-55 system for the first time.
Self-supervision has the potential to transform reinforcement learning (RL), paralleling the breakthroughs it has enabled in other areas of machine learning. While self-supervised learning in other domains aims to find patterns in a fixed dataset, self-supervised goal-conditioned reinforcement learning (GCRL) agents discover new behaviors by learning from the goals achieved during unstructured interaction with the environment. However, these methods have failed to see similar success, both due to a lack of data from slow environment simulations as well as a lack of stable algorithms. We take a step toward addressing both of these issues by releasing a high-performance codebase and benchmark (JaxGCRL) for self-supervised GCRL, enabling researchers to train agents for millions of environment steps in minutes on a single GPU. By utilizing GPU-accelerated replay buffers, environments, and a stable contrastive RL algorithm, we reduce training time by up to 22×. Additionally, we assess key design choices in contrastive RL, identifying those that most effectively stabilize and enhance training performance. With this approach, we provide a foundation for future research in self-supervised GCRL, enabling researchers to quickly iterate on new ideas and evaluate them in diverse and challenging environments. Website + Code: this https URL
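At the core of contrastive RL is an InfoNCE-style objective: embeddings of states should score high against the goals actually reached from them and low against the other goals in the batch. A minimal numpy sketch of that loss (schematic; JaxGCRL's implementation is in JAX and includes further design choices the paper ablates):

```python
import numpy as np

def contrastive_rl_loss(state_emb, goal_emb):
    """InfoNCE-style contrastive objective for GCRL: row i of `state_emb`
    is a positive pair with row i of `goal_emb` (a goal reached from that
    state) and a negative pair with every other row in the batch."""
    logits = state_emb @ goal_emb.T                   # (B, B) similarity matrix
    logits = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))               # positives on the diagonal
```

Training the critic this way requires no reward signal at all: positives come for free from hindsight relabeling of the agent's own trajectories.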
Parton distribution functions (PDFs) at large $x$ are challenging to extract from experimental data, yet they are essential for understanding hadron structure and searching for new physics beyond the Standard Model. Within the framework of the large momentum $P^z$ expansion of lattice quasi-PDFs, we investigate large-$x$ PDFs, where the matching coefficient is factorized into the hard kernel, related to the active quark momentum $xP^z$, and the threshold soft function, associated with the spectator momentum $(1-x)P^z$. The renormalization group equation of the soft function enables the resummation of the threshold double logarithms $\alpha^{k} \ln^{2k}(1-x)$, which is crucial for a reliable and controllable calculation of large-$x$ PDFs. Our analysis with pion valence PDFs indicates that perturbative matching breaks down when the spectator momentum $(1-x)P^z$ approaches $\Lambda_{\rm QCD}$, but remains valid when both $xP^z$ and $(1-x)P^z$ are much larger than $\Lambda_{\rm QCD}$. Additionally, we incorporate leading renormalon resummation within the threshold framework, demonstrating good perturbative convergence in the region where both spectator and active quark momenta are perturbative scales.
Clustering tabular data remains a significant open challenge in data analysis and machine learning. Unlike for image data, similarity between tabular records often varies across datasets, making the definition of clusters highly dataset-dependent. Furthermore, the absence of supervised signals complicates hyperparameter tuning in deep learning clustering methods, frequently resulting in unstable performance. To address these issues and reduce the need for per-dataset tuning, we adopt an emerging approach in deep learning: zero-shot learning. We propose ZEUS, a self-contained model capable of clustering new datasets without any additional training or fine-tuning. It operates by decomposing complex datasets into meaningful components that can then be clustered effectively. Thanks to pre-training on synthetic datasets generated from a latent-variable prior, it generalizes across various datasets without requiring user intervention. To the best of our knowledge, ZEUS is the first zero-shot method capable of generating embeddings for tabular data in a fully unsupervised manner. Experimental results demonstrate that it performs on par with or better than traditional clustering algorithms and recent deep learning-based methods, while being significantly faster and more user-friendly.
While transfer learning is an advantageous strategy, it overlooks the opportunity to leverage knowledge from numerous available models online. Addressing this multi-source transfer learning problem is a promising path to boost adaptability and cut re-training costs. However, existing approaches are inherently coarse-grained, lacking the necessary precision for granular knowledge extraction and the aggregation efficiency required to fuse knowledge from either a large number of source models or those with high parameter counts. We address these limitations by leveraging Singular Value Decomposition (SVD) to first decompose each source model into its elementary, rank-one components. A subsequent aggregation stage then selects only the most salient components from all sources, thereby overcoming the previous efficiency and precision limitations. To best preserve and leverage the synthesized knowledge base, our method adapts to the target task by fine-tuning only the principal singular values of the merged matrix. In essence, this process only recalibrates the importance of top SVD components. The proposed framework allows for efficient transfer learning, is robust to perturbations both at the input level and in the parameter space (e.g., noisy or pruned sources), and scales well computationally.
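The decompose-select-recalibrate pipeline described above can be sketched directly with numpy: pool the rank-one SVD components of all source matrices, keep the most salient ones, and expose only the singular values for subsequent fine-tuning (a schematic version; the selection criterion and fine-tuning loop in the actual method may differ):

```python
import numpy as np

def merge_by_svd(sources, keep):
    """Merge source weight matrices by pooling their rank-one SVD components
    and keeping the `keep` components with the largest singular values.
    Returns the merged matrix and its (U, s, Vt) factors; fine-tuning would
    then update only `s`, i.e. the component importances."""
    comps = []
    for W in sources:
        U, S, Vt = np.linalg.svd(W, full_matrices=False)
        comps += [(S[i], U[:, i], Vt[i, :]) for i in range(len(S))]
    comps.sort(key=lambda c: -c[0])                  # most salient first
    top = comps[:keep]
    s  = np.array([c[0] for c in top])               # trainable importances
    U  = np.stack([c[1] for c in top], axis=1)       # frozen left directions
    Vt = np.stack([c[2] for c in top], axis=0)       # frozen right directions
    merged = (U * s) @ Vt
    return merged, (U, s, Vt)
```

Because only the `keep` scalars in `s` are updated on the target task, the adaptation cost is independent of both the number of source models and their parameter counts.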
Researchers at Mila and collaborators introduce Relative Trajectory Balance (RTB), a novel objective for training diffusion models to perform amortized sampling of intractable posterior distributions. This method offers an asymptotically unbiased approach for posterior inference, demonstrating its effectiveness across tasks in vision, language, and continuous control.
In this work, we investigate the effects of logarithms on the asymptotic behavior of the power expansion/OPE in super-renormalizable QFTs. We perform a careful investigation of the large-$p^2$ expansion of a scalar-scalar two-point function at next-to-leading order in the large-$N$ expansion, in a large-$N$ $O(N)$ quartic model that is populated by logarithms. We show that because the large-$p^2$ logarithms of the individual bubbles can be amplified by bubble chains, there are factorial enhancements to the power expansion. We show how the factorial enhancements appear separately in the coefficient functions and operator condensates, and demonstrate how they cancel off-diagonally across different powers. Restricted to any given power, the factorial enhancements are no longer canceled. The large-$p^2$ power expansion is divergent.
Physics simulation is paramount for modeling and utilizing 3D scenes in various real-world applications. However, integrating with state-of-the-art 3D scene rendering techniques such as Gaussian Splatting (GS) remains challenging. Existing models use additional meshing mechanisms, including triangle or tetrahedron meshing, marching cubes, or cage meshes. Alternatively, we can modify the physics-grounded Newtonian dynamics to align with 3D Gaussian components. Current models take the first-order approximation of a deformation map, which locally approximates the dynamics by linear transformations. In contrast, our GS for Physics-Based Simulations (GASP) pipeline uses parametrized flat Gaussian distributions. Consequently, the problem of modeling Gaussian components using the physics engine is reduced to working with 3D points. In our work, we present additional rules for manipulating Gaussians, demonstrating how to adapt the pipeline to incorporate meshes, control Gaussian sizes during simulations, and enhance simulation efficiency. This is achieved through the Gaussian grouping strategy, which implements hierarchical structuring and enables simulations to be performed exclusively on selected Gaussians. The resulting solution can be integrated into any physics engine that can be treated as a black box. As demonstrated in our studies, the proposed pipeline exhibits superior performance on a diverse range of benchmark datasets designed for 3D object rendering. The project webpage, which includes additional visualizations, can be found at this https URL.
In this work we provide a massless perturbative framework for the two-dimensional non-linear sigma model (NLSM) that allows the computation of the perturbative series attached to the operator condensates in the operator product expansion (OPE). It is based on a limit of the quartic linear sigma model (LSM) and is manifestly $O(N)$ symmetric. We show, at next-to-leading order in the $1/N$ expansion, how this framework reproduces the perturbative contribution to the two-point function, as well as its first exponentially small correction due to the condensate of the Lagrangian operator, in full agreement with the exact non-perturbative large-$N$ solution. We also show that, in the full LSM, the physics at the natural UV cutoff indeed decouples from the NLSM in the IR in the weak-coupling limit. In particular, we show that the perturbative framework for the LSM at the cutoff scale is connected to the one in the NLSM. The structure of power divergences in the LSM regularization also reveals that the first renormalon on the positive Borel axis of the NLSM perturbative self-energy is a UV renormalon, which cancels against the ambiguity in the condensate.