Moscow Center for Fundamental and Applied Mathematics (MSU)
The recently proposed Large Concept Model (LCM) generates text by predicting a sequence of sentence-level embeddings and training with either mean-squared error or diffusion objectives. We present SONAR-LLM, a decoder-only transformer that "thinks" in the same continuous SONAR embedding space, yet is supervised through token-level cross-entropy propagated via the frozen SONAR decoder. This hybrid objective retains the semantic abstraction of LCM while eliminating its diffusion sampler and restoring a likelihood-based training signal. Across model sizes from 39M to 1.3B parameters, SONAR-LLM attains competitive generation quality. We report scaling trends, ablations, benchmark results, and release the complete training code and all pretrained checkpoints to foster reproducibility and future research.
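One way to write down the objective described above, as a sketch under assumed notation (none of these symbols appear in the abstract): let $e_1,\dots,e_K$ be the SONAR embeddings of a document's sentences, $f_\theta$ the decoder-only transformer, $x_{k,1},\dots,x_{k,T_k}$ the tokens of sentence $k$, and $p_{\mathrm{dec}}$ the next-token distribution of the frozen SONAR decoder. The model predicts the next sentence embedding, and the loss is the token-level cross-entropy of the true sentence decoded from that prediction:

```latex
\hat e_k = f_\theta(e_1,\dots,e_{k-1}),
\qquad
\mathcal{L}(\theta) = -\sum_{k=2}^{K}\sum_{t=1}^{T_k}
  \log p_{\mathrm{dec}}\!\left(x_{k,t}\,\middle|\,x_{k,<t},\,\hat e_k\right)
```

Only $\theta$ receives gradients in this sketch; the decoder parameters stay fixed, which is what "frozen" is taken to mean here.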
New methods, Belief-FB (BFB) and Rotation-FB (RFB), allow Behavioral Foundation Models (BFMs) to adapt to different environmental dynamics without additional training. These approaches, which explicitly model dynamics belief and disentangle policy representations, consistently outperform existing baselines on tasks like Randomized FourRooms, PointMass, and AntWind, demonstrating improved zero-shot generalization.
Yandex researchers develop Alchemist, a supervised fine-tuning dataset created through a novel multi-stage filtering pipeline that uses a pre-trained diffusion model's cross-attention activations to assess image quality from large-scale web data, achieving up to 20% win rate improvements in aesthetic quality and image complexity when fine-tuning five publicly available text-to-image models (SD1.5, SD2.1, SDXL1.0, SD3.5 Medium, and SD3.5 Large) while demonstrating that exceptional sample quality matters more than dataset volume for effective supervised fine-tuning.
As the field of Multimodal Large Language Models (MLLMs) continues to evolve, their potential to revolutionize artificial intelligence is particularly promising, especially in addressing mathematical reasoning tasks. Current mathematical benchmarks predominantly focus on evaluating MLLMs' problem-solving ability, yet there is a crucial gap in addressing more complex scenarios such as error detection, which is needed to enhance reasoning capability in complicated settings. To fill this gap, we formally formulate a new task, multimodal error detection, and introduce ErrorRadar, the first benchmark designed to assess MLLMs' capabilities in this task. ErrorRadar evaluates two sub-tasks, error step identification and error categorization, providing a comprehensive framework for evaluating MLLMs' complex mathematical reasoning ability. It consists of 2,500 high-quality multimodal K-12 mathematical problems, collected from real-world student interactions in an educational organization, with rigorous annotation and rich metadata such as problem type and error category. Through extensive experiments, we evaluated representative open-source and closed-source MLLMs, benchmarking their performance against educational expert evaluators. Results indicate that significant challenges remain: the best-performing model, GPT-4o, still trails human evaluation by around 10%. The dataset will be available upon acceptance.
We compute the potential-graviton contributions to the conservative scattering angle of two non-spinning bodies in maximal supergravity at fifth order in Newton's constant, including second-order self-force effects. Our goal is to tackle the challenging integrals arising at this order in Einstein gravity, but within the technically simpler framework of supergravity. The calculation employs the scattering-amplitude framework, effective field theory, and multi-loop integration techniques based on integration by parts and differential equations. The final result is expressed as a series expansion around the static limit, thereby avoiding the explicit evaluation of intricate special functions. This series solution for the master integrals applies, as well, to the corresponding computation in general relativity. Remarkably, we observe nontrivial cancellations among contributions associated with Calabi-Yau integrals, alongside a distinct contribution governed by a Heun differential equation.
This article gives a concise overview of the development and current status of studies on healthy models of the early Universe without an initial singularity, namely the cosmological bounce and Genesis scenarios, constructed within a broad class of scalar-tensor theories, specifically Horndeski theories and their generalizations. The review focuses on the topics related to linear stability at the perturbation level over the non-singular background solutions: 1) the no-go theorem, valid for non-singular cosmologies within Horndeski theory, 2) the updates on possible approaches to evade the no-go theorem, 3) the role of disformal transformations relating the Horndeski subclasses with the generalized theories like DHOST, 4) the effects on stability caused by additional matter coupling and potential emergence of superluminal perturbation modes in the multi-component setting.
FIRE is a program which performs integration-by-parts (IBP) reduction of Feynman integrals. Originally, the C++ version of FIRE relied on the computer algebra system Fermat by Robert Lewis to simplify rational functions. We present an upgrade of FIRE which incorporates a new library, FUEL, initially described in a separate publication. FUEL enables a flexible choice of third-party computer algebra systems as simplifiers, as well as efficient communication with some of the simplifiers as C++ libraries rather than through Unix pipes. We achieve significant speedups for IBP reduction of Feynman integrals involving many kinematic variables when using an open source backend based on FLINT, newly added in this work, or the Symbolica backend developed by Ben Ruijl as a potential successor of FORM.
We prove a recent conjecture of Dragovic et al. (arXiv:2504.20515) stating that the magnetic geodesic flow on the standard sphere $S^n\subset \mathbb{R}^{n+1}$ whose magnetic 2-form is the restriction of a constant 2-form from $\mathbb{R}^{n+1}$ is Liouville integrable. The integrals are quadratic and linear in momenta.
Recent advances in video generation have outpaced progress in video editing, which remains constrained by several limiting factors, namely: (a) the task's dependency on supervision, which severely limits generality, (b) an unnecessary artificial separation between the generation and editing tasks, and (c) the high computational cost of training a video model. In this work, we propose UES (Unlocking Universal Editing via Self-Supervision), a lightweight self-supervised fine-tuning strategy that transforms generation models into unified generation-editing systems through self-supervised semantic alignment. Our approach establishes a dual-conditioning mechanism where original video-text pairs jointly provide visual and textual semantics, enabling structured learning of intrinsic spatiotemporal correspondences. Key advantages include: (i) universality through supervision-free adaptation to diverse editing tasks, (ii) unification of generation and editing, applicable to most text(+image)-to-video models, and (iii) efficiency via lightweight fine-tuning that reduces tunable parameters by 92.67%. To enable systematic evaluation, we introduce OmniBench-99, a comprehensive benchmark spanning 99 videos across humans/animals, environments, and objects, comprising 4 editing types and 8 scenarios. Extensive experiments show UES enables models without inherent editing capability to perform powerful and universal editing while preserving or even enhancing their original generation performance.
We address the problem of evaluating multiloop Feynman integrals by means of their Mellin-Barnes representation. After a brief overview of the capabilities available through open source toolkits and their application in various circumstances, we introduce a new code, MBcreate, which allows one to automatically deduce a concise Mellin-Barnes representation for a given parametric integral. A thorough discussion of its implementation and use is provided.
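For orientation, the standard identity underlying any Mellin-Barnes representation is reproduced below. It is the textbook formula, not taken from the abstract, and says nothing about how MBcreate itself applies it. A sum raised to a power is traded for a contour integral over a product of powers:

```latex
\frac{1}{(A+B)^{\lambda}}
  = \frac{1}{\Gamma(\lambda)}\,\frac{1}{2\pi i}
    \int_{-i\infty}^{+i\infty} \mathrm{d}z\;
    \Gamma(\lambda+z)\,\Gamma(-z)\,\frac{B^{z}}{A^{\lambda+z}}
```

The contour is chosen so that the poles of $\Gamma(\lambda+z)$ lie to its left and the poles of $\Gamma(-z)$ to its right. Applying the identity repeatedly to the polynomials in a parametric integral factorizes them, after which the parameter integrations can be done in terms of Gamma functions.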
Computer simulations have long presented the exciting possibility of scientific insight into complex real-world processes. Despite the power of modern computing, however, it remains challenging to systematically perform inference under simulation models. This has led to the rise of simulation-based inference (SBI), a class of machine learning-enabled techniques for approaching inverse problems with stochastic simulators. Many such methods, however, require large numbers of simulation samples and face difficulty scaling to high-dimensional settings, often making inference prohibitive under resource-intensive simulators. To mitigate these drawbacks, we introduce active sequential neural posterior estimation (ASNPE). ASNPE brings an active learning scheme into the inference loop to estimate the utility of simulation parameter candidates to the underlying probabilistic model. The proposed acquisition scheme is easily integrated into existing posterior estimation pipelines, allowing for improved sample efficiency with low computational overhead. We further demonstrate the effectiveness of the proposed method in the travel demand calibration setting, a high-dimensional inverse problem commonly requiring computationally expensive traffic simulators. Our method outperforms well-tuned benchmarks and state-of-the-art posterior estimation methods on a large-scale real-world traffic network, and it demonstrates a performance advantage over non-active counterparts on a suite of SBI benchmark environments.
FIRE7 is a major update to the FIRE program for integration-by-parts (IBP) reduction of Feynman integrals. A large part of improvements is related to the automatic reduction and reconstruction with the modular arithmetic approach, while the performance of the classical rational polynomial approach is also significantly increased. An improved presolve algorithm performs Gaussian elimination to simplify IBP identities before substituting numerical indices as in the Laporta algorithm. Various new command line tools are included to facilitate tasks such as applying an IBP reduction table to reduce a loop integrand as a linear combination of individual integrals.
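To make the two ingredients named above concrete, here is a toy sketch of Gaussian elimination carried out in modular arithmetic, i.e. over the integers modulo a prime. It is not FIRE's code or algorithm: the function name `rref_mod_p`, the tiny prime, and the example system are all assumptions chosen for illustration. Each row stands for one linear relation among unknowns, with the coefficients listed in column order.

```python
def rref_mod_p(rows, p):
    """Reduced row echelon form of `rows` over the integers modulo a prime p.

    Hypothetical helper for illustration only; requires Python 3.8+ for
    the modular inverse pow(x, -1, p).
    """
    m = [[x % p for x in row] for row in rows]
    n_rows = len(m)
    n_cols = len(m[0]) if m else 0
    pivot_row = 0
    for col in range(n_cols):
        if pivot_row == n_rows:
            break
        # Find a row at or below pivot_row with a nonzero entry in this column.
        pr = next((r for r in range(pivot_row, n_rows) if m[r][col] != 0), None)
        if pr is None:
            continue  # no pivot in this column
        m[pivot_row], m[pr] = m[pr], m[pivot_row]
        # Scale the pivot row so that the pivot entry becomes 1.
        inv = pow(m[pivot_row][col], -1, p)
        m[pivot_row] = [(x * inv) % p for x in m[pivot_row]]
        # Eliminate this column from every other row.
        for r in range(n_rows):
            if r != pivot_row and m[r][col] != 0:
                f = m[r][col]
                m[r] = [(a - f * b) % p for a, b in zip(m[r], m[pivot_row])]
        pivot_row += 1
    return m


# Two relations among I1, I2 and M, read modulo 7:
#   2*I1 +   I2 +   M = 0
#     I1 + 3*I2 + 2*M = 0
reduced = rref_mod_p([[2, 1, 1], [1, 3, 2]], 7)
print(reduced)
```

In the reduced form each of the first two unknowns is expressed through the last one alone. Production codes work with large machine-size primes and reconstruct rational coefficients from several such runs; that step is outside the scope of this sketch.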
In this paper we present a new release of the FIESTA program (Feynman Integral Evaluation by a Sector decomposiTion Approach). FIESTA5 is performance-oriented - we implemented improvements of various kinds in order to make Feynman integral evaluation faster. We plugged in two new integrators, the Quasi Monte Carlo and Tensor Train. At the same time the old code of FIESTA4 was upgraded to the C++17 standard and mostly rewritten without self-made structures such as hash tables. There are also several essential improvements which are most relevant for complex integrations - the new release is capable of producing results where previously impossible.
In this work, we solve the fundamental problem of describing the coordinate transformations that preserve the upper triangular Toeplitz form of the given operator field. Surprisingly, this problem is closely related to the description of all Nijenhuis operators in the same form. This description, as well as the formulas for the aforementioned coordinate transformations, are given by the implicit formulas involving matrix-valued functions.
This paper addresses the challenge of automatically extracting attributes from news article web pages across multiple languages. Recent neural network models have shown high efficacy in extracting information from semi-structured web pages. However, these models are predominantly applied to domains like e-commerce and are pre-trained using English data, complicating their application to web pages in other languages. We prepared a multilingual dataset comprising 3,172 marked-up news web pages across six languages (English, German, Russian, Chinese, Korean, and Arabic) from 161 websites. The dataset is publicly available on GitHub. We fine-tuned the pre-trained state-of-the-art model, MarkupLM, to extract news attributes from these pages and evaluated the impact of translating pages into English on extraction quality. Additionally, we pre-trained another state-of-the-art model, DOM-LM, on multilingual data and fine-tuned it on our dataset. We compared both fine-tuned models to existing open-source news data extraction tools, achieving superior extraction metrics.
We present analytical results for all master integrals for massless three-point functions, with one off-shell leg, at four loops. Our solutions were obtained using differential equations and direct integration techniques. We review the methods and provide additional details.
We compute the photon-quark and Higgs-gluon form factors to four-loop order within massless perturbative Quantum Chromodynamics. Our results constitute ready-to-use building blocks for N$^4$LO cross sections for Drell-Yan processes and gluon-fusion Higgs boson production at the LHC. We present complete analytic expressions for both form factors and show several of the most complicated master integrals.
Deep fake technology has become a hot field of research in the last few years. Researchers investigate sophisticated Generative Adversarial Networks (GANs), autoencoders, and other approaches to establish precise and robust algorithms for face swapping. Results achieved so far show that the unsupervised deep fake synthesis task has problems in terms of the visual quality of generated data. These problems usually lead to high fake detection accuracy when an expert analyzes the output. The first problem is that existing image-to-image approaches do not consider video domain specificity, so frame-by-frame processing leads to face jittering and other clearly visible distortions. Another problem is the resolution of the generated data, which is low for many existing methods due to high computational complexity. The third problem appears when the source face has larger proportions (such as bigger cheeks), which become visible on the face border after replacement. Our main goal was to develop an approach that solves these problems and outperforms existing solutions on a number of key metrics. We introduce a new face swap pipeline that is based on the FaceShifter architecture and fixes the problems stated above. A new eye loss function, a super-resolution block, and Gaussian-based face mask generation lead to improvements in quality, which are confirmed during evaluation.
We study domain walls (DWs) arising in field theories where $Z_2$-symmetry is spontaneously broken by a scalar expectation value decreasing proportionally to the Universe temperature. The energy density of such melting DWs redshifts sufficiently fast not to overclose the Universe. For the first time, the evolution of melting DWs and the resulting gravitational waves (GWs) is investigated numerically using lattice simulations. We show that formation of closed melting DWs during radiation domination is much more efficient compared to the scenario with constant tension DWs. This suggests that it can be the main mechanism responsible for reaching the scaling regime, similarly to the case of cosmic strings. However, the scaling behaviour of melting DWs is observed only if the initial scalar field fluctuations are not very large. Otherwise, simulations reveal violation of the scaling law, potentially of non-physical origin. The spectrum of GWs emitted by melting DWs is also significantly different from that of constant tension DWs. Whether the system has reached scaling or not, the numerical study reveals a GW spectrum described in the infrared by the spectral index $n \approx 1.6$, followed by the causality tail. We attribute the difference from the value $n=2$ predicted in our previous studies to a finite lifetime of the DW network. Notably, the updated index is still in excellent agreement with the recent findings by pulsar timing arrays, which confirms that melting DWs can be responsible for the observed GW signal. We also point out that results for the evolution of melting DWs in the radiation-dominated Universe are applicable to constant tension DW evolution in flat spacetime.