Ivannikov Institute for System Programming of the Russian Academy of Sciences
A machine learning method is proposed using two agents that simulate the biological behavior of a predator and a prey. In this method, the predator and the prey interact with each other - the predator chases the prey while the prey runs away from the predator - to perform an optimization on the landscape. This method allows, for the case of a ravine landscape (i.e., a landscape with narrow ravines and with gentle slopes along the ravines) to avoid getting optimization stuck in the ravine. For this, in the optimization over a ravine landscape the predator drives the prey along the ravine. Thus we also call this approach, for the case of ravine landscapes, the driven hunt method. For some examples of grokking (i.e., delayed generalization) problems we show that this method allows for achieving up to a hundred times faster learning compared to the standard learning procedure.
In this paper, we propose some accelerated methods for solving optimization problems under the condition of relatively smooth and relatively Lipschitz continuous functions with an inexact oracle. We consider the problem of minimizing the convex differentiable and relatively smooth function concerning a reference convex function. The first proposed method is based on a similar triangles method with an inexact oracle, which uses a special triangular scaling property for the used Bregman divergence. The other proposed methods are non-adaptive and adaptive (tuning to the relative smoothness parameter) accelerated Bregman proximal gradient methods with an inexact oracle. These methods are universal in the sense that they are applicable not only to relatively smooth but also to relatively Lipschitz continuous optimization problems. We also introduced an adaptive intermediate Bregman method which interpolates between slower but more robust algorithms non-accelerated and faster, but less robust accelerated algorithms. We conclude the paper with the results of numerical experiments demonstrating the advantages of the proposed algorithms for the Poisson inverse problem.
Program analysis and automated testing have recently become an essential part of SSDLC. Directed greybox fuzzing is one of the most popular automated testing methods that focuses on error detection in predefined code regions. However, it still lacks ability to overcome difficult program constraints. This problem can be well addressed by symbolic execution, but at the cost of lower performance. Thus, combining directed fuzzing and symbolic execution techniques can lead to more efficient error detection. In this paper, we propose a hybrid approach to directed fuzzing with novel seed scheduling algorithm, based on target-related interestingness and coverage. The approach also performs minimization and sorting of objective seeds according to a target-related information. We implement our approach in Sydr-Fuzz tool using LibAFL-DiFuzz as directed fuzzer and Sydr as dynamic symbolic executor. We evaluate our approach with Time to Exposure metric and compare it with pure LibAFL-DiFuzz, AFLGo, BEACON, WAFLGo, WindRanger, FishFuzz, and Prospector. The results show an improvement for 3 out of 7 examples with speedup up to 1.86 times over the second best result, as well as a significant improvement for 3 out of 7 examples over the pure LibAFL-DiFuzz fuzzer. Sydr-Fuzz hybrid approach to directed fuzzing shows high performance and helps to improve directed fuzzing efficiency.
A modular LLM-based system for ontology learning, developed by Aleksandra Beliaeva and Temurbek Rahmatullaev, addresses term extraction, typing, and taxonomy discovery without extensive model finetuning. The system achieved multiple top-tier placements across all tasks in the LLMs4OL 2025 challenge, demonstrating competitive performance with efficient prompting and lightweight adaptive modules.
The growing need for Trusted AI (TAI) highlights the importance of interpretability and robustness in machine learning models. However, many existing tools overlook graph data and rarely combine these two aspects into a single solution. Graph Neural Networks (GNNs) have become a popular approach, achieving top results across various tasks. We introduce GNN-AID (Graph Neural Network Analysis, Interpretation, and Defense), an open-source framework designed for graph data to address this gap. Built as a Python library, GNN-AID supports advanced trust methods and architectural layers, allowing users to analyze graph datasets and GNN behavior using attacks, defenses, and interpretability methods. GNN-AID is built on PyTorch-Geometric, offering preloaded datasets, models, and support for any GNNs through customizable interfaces. It also includes a web interface with tools for graph visualization and no-code features like an interactive model builder, simplifying the exploration and analysis of GNNs. The framework also supports MLOps techniques, ensuring reproducibility and result versioning to track and revisit analyses efficiently. GNN-AID is a flexible tool for developers and researchers. It helps developers create, analyze, and customize graph models, while also providing access to prebuilt datasets and models for quick experimentation. Researchers can use the framework to explore advanced topics on the relationship between interpretability and robustness, test defense strategies, and combine methods to protect against different types of attacks. We also show how defenses against evasion and poisoning attacks can conflict when applied to graph data, highlighting the complex connections between defense strategies. GNN-AID is available at \href{this https URL}{github.com/ispras/GNN-AID}
In recent years, as data and problem sizes have increased, distributed learning has become an essential tool for training high-performance models. However, the communication bottleneck, especially for high-dimensional data, is a challenge. Several techniques have been developed to overcome this problem. These include communication compression and implementation of local steps, which work particularly well when there is similarity of local data samples. In this paper, we study the synergy of these approaches for efficient distributed optimization. We propose the first theoretically grounded accelerated algorithms utilizing unbiased and biased compression under data similarity, leveraging variance reduction and error feedback frameworks. Our results are of record and confirmed by experiments on different average losses and datasets.
Graph Neural Networks (GNNs) have become a cornerstone in graph-based data analysis, with applications in diverse domains such as bioinformatics, social networks, and recommendation systems. However, the interplay between model interpretability and robustness remains poorly understood, especially under adversarial scenarios like poisoning and evasion attacks. This paper presents a comprehensive benchmark to systematically analyze the impact of various factors on the interpretability of GNNs, including the influence of robustness-enhancing defense mechanisms. We evaluate six GNN architectures based on GCN, SAGE, GIN, and GAT across five datasets from two distinct domains, employing four interpretability metrics: Fidelity, Stability, Consistency, and Sparsity. Our study examines how defenses against poisoning and evasion attacks, applied before and during model training, affect interpretability and highlights critical trade-offs between robustness and interpretability. The framework will be published as open source. The results reveal significant variations in interpretability depending on the chosen defense methods and model architecture characteristics. By establishing a standardized benchmark, this work provides a foundation for developing GNNs that are both robust to adversarial threats and interpretable, facilitating trust in their deployment in sensitive applications.
241
This paper describes the development of an auto-active verification technique in the Frama-C framework. We outline the lemma functions method and present the corresponding ACSL extension, its implementation in Frama-C, and evaluation on a set of string-manipulating functions from the Linux kernel. We illustrate the benefits our approach can bring concerning the effort required to prove lemmas, compared to the approach based on interactive provers such as Coq. Current limitations of the method and its implementation are discussed.
Random graph (RG) models play a central role in the complex networks analysis. They help to understand, control, and predict phenomena occurring, for instance, in social networks, biological networks, the Internet, etc. Despite a large number of RG models presented in the literature, there are few concepts underlying them. Instead of trying to classify a wide variety of very dispersed models, we capture and describe concepts they exploit considering preferential attachment, copying principle, hyperbolic geometry, recursively defined structure, edge switching, Monte Carlo sampling, etc. We analyze RG models, extract their basic principles, and build a taxonomy of concepts they are based on. We also discuss how these concepts are combined in RG models and how they work in typical applications like benchmarks, null models, and data anonymization.
The vulnerability of artificial neural networks to adversarial perturbations in the black-box setting is widely studied in the literature. The majority of attack methods to construct these perturbations suffer from an impractically large number of queries required to find an adversarial example. In this work, we focus on knowledge distillation as an approach to conduct transfer-based black-box adversarial attacks and propose an iterative training of the surrogate model on an expanding dataset. This work is the first, to our knowledge, to provide provable guarantees on the success of knowledge distillation-based attack on classification neural networks: we prove that if the student model has enough learning capabilities, the attack on the teacher model is guaranteed to be found within the finite number of distillation iterations.
While there are many works on the applications of machine learning, not so many of them are trying to understand the theoretical justifications to explain their efficiency. In this work, overfitting control (or generalization property) in machine learning is explained using analogies from physics and biology. For stochastic gradient Langevin dynamics, we show that the Eyring formula of kinetic theory allows to control overfitting in the algorithmic stability approach - when wide minima of the risk function with low free energy correspond to low overfitting. For the generative adversarial network (GAN) model, we establish an analogy between GAN and the predator-prey model in biology. An application of this analogy allows us to explain the selection of wide likelihood maxima and overfitting reduction for GANs.
Numerous studies are aimed at diagnosing heart diseases based on 12-lead electrocardiographic (ECG) records using deep learning methods. These studies usually use specific datasets that differ in size and parameters, such as patient metadata, number of doctors annotating ECGs, types of devices for ECG recording, data preprocessing techniques, etc. It is well-known that high-quality deep neural networks trained on one ECG dataset do not necessarily perform well on another dataset or clinical settings. In this paper, we propose a methodology to improve the quality of heart disease prediction regardless of the dataset by training neural networks on a variety of datasets with further fine-tuning for the specific dataset. To show its applicability, we train different neural networks on a large private dataset TIS containing various ECG records from multiple hospitals and on a relatively small public dataset PTB-XL. We demonstrate that training the networks on a large dataset and fine-tuning it on a small dataset from another source outperforms the networks trained only on one small dataset. We also show how the ability of a deep neural networks to generalize allows to improve classification quality of more diseases.
Social networks crawling is in the focus of active research the last years. One of the challenging task is to collect target nodes in an initially unknown graph given a budget of crawling steps. Predicting a node property based on its partially known neighbourhood is at the heart of a successful crawler. In this paper we adopt graph neural networks for this purpose and show they are competitive to traditional classifiers and are better for individual cases. Additionally we suggest a training sample boosting technique, which helps to diversify the training set at early stages of crawling and thus improves the predictor quality. The experimental study on three types of target set topology indicates GNN based approach has a potential in crawling task, especially in the case of distributed target nodes.
Most general dynamics of an open quantum system is commonly represented by a quantum channel, which is a completely positive trace-preserving map (CPTP or Kraus map). Well-known are the representations of quantum channels by Choi matrices and by Kraus operator-sum representation (OSR). As was shown before, one can use Kraus OSR to parameterize quantum channels by points of a suitable quotient under the action of the unitary group of some complex Stiefel manifold. In this work, we establish a continuity relation (homeomorphism) between the topological space of quantum channels and the quotient of the complex Stiefel manifold. Then the metric on the set of quantum channels induced by the Riemannian metric on the Stiefel manifold is defined. The established relation can be applied to various quantum optimization problems. As an example, we apply it to the analysis of extrema points for a wide variety of quantum control objective functionals defined on the complex Stiefel manifolds, including mean value, generation of quantum gates, thermodynamic quantities involving entropy, etc.
In this work, we tackle the problem of Armenian named entity recognition, providing silver- and gold-standard datasets as well as establishing baseline results on popular models. We present a 163000-token named entity corpus automatically generated and annotated from Wikipedia, and another 53400-token corpus of news sentences with manual annotation of people, organization and location named entities. The corpora were used to train and evaluate several popular named entity recognition models. Alongside the datasets, we release 50-, 100-, 200-, 300-dimensional GloVe word embeddings trained on a collection of Armenian texts from Wikipedia, news, blogs, and encyclopedia.
This work is devoted to a rigorous analysis of the weak coupling limit (WCL) for the reduced dynamics of an open infinite-dimensional quantum system interacting with electromagnetic field or a reservoir formed by Fermi or Bose particles in the dipole approximation. The free system Hamiltonian and the system part of the Hamiltonian describing interaction with the reservoir are considered as unbounded operators with continuous spectrum which are commuting in a weak sense. We derive in the weak coupling limit the reservoir statistics, which is determined by whose terms in the multi-point correlation functions of the reservoir which are non-zero in the WCL. Then we prove that the resulting reduced system dynamics converges to unitary dynamics (such behavior sometimes called as Quantum Cheshire Cat effect) with a modified Hamiltonian which can be interpreted as a Lamb shift to the original Hamiltonian. We obtain exact form of the modified Hamiltonian and estimate the rate of convergence to the limiting dynamics. For Fermi reservoir, we prove the convergence of the full Dyson series. For Bose case the convergence is understood term by term.
10 Feb 2025
Researchers at the Steklov Mathematical Institute introduce novel cyclic proof systems for first-order arithmetic, precisely characterizing standard fragments like IΣn, IΠRn, and IΣRn. The work develops a simpler variable discipline through annotated sequents, directly linking structural properties of cyclic proofs to the deductive strength of these arithmetic theories, which supports automated inductive proof search.
In this paper, we focused on the problem of extracting information from web pages containing many records, a task of growing importance in the era of massive web data. Recently, the development of neural network methods has improved the quality of information extraction from web pages. Nevertheless, most of the research and datasets are aimed at studying detailed pages. This has left multi-record "list pages" relatively understudied, despite their widespread presence and practical significance. To address this gap, we created a large-scale, open-access dataset specifically designed for list pages. This is the first dataset for this task in the Russian language. Our dataset contains 13,120 web pages with news lists, significantly exceeding existing datasets in both scale and complexity. Our dataset contains attributes of various types, including optional and multi-valued, providing a realistic representation of real-world list pages. These features make our dataset a valuable resource for studying information extraction from pages containing many records. Furthermore, we proposed our own multi-stage information extraction methods. In this work, we explore and demonstrate several strategies for applying MarkupLM to the specific challenges of multi-record web pages. Our experiments validate the advantages of our methods. By releasing our dataset to the public, we aim to advance the field of information extraction from multi-record pages.
The Unified Extensible Firmware Interface (UEFI) is a standardised interface between the firmware and the operating system used in all x86-based platforms over the past ten years, which continues to spread to other architectures such as ARM and RISC-V. The UEFI incorporates a modular design based on images containing a driver or an application in a Common Object File Format (COFF) either as a Portable Executable (PE) or as a Terse Executable (TE). The de-facto standard generic UEFI services implementation, including the image loading functionality, is TianoCore EDK II. Its track of security issues shows numerous design and implementation flaws some of which are yet to be addressed. In this paper we outline both the requirements for a secure UEFI Image Loader and the issues of the existing implementation. As an alternative we propose a formally verified Image Loader supporting both PE and TE images with fine-grained hardening enabling a seamless integration with EDK II and subsequently with the other firmwares.
Online network crawling tasks require a lot of efforts for the researchers to collect the data. One of them is identification of important nodes, which has many applications starting from viral marketing to the prevention of disease spread. Various crawling algorithms has been suggested but their efficiency is not studied well. In this paper we compared six known crawlers on the task of collecting the fraction of the most influential nodes of graph. We analyzed crawlers behavior for four measures of node influence: node degree, k-coreness, betweenness centrality, and eccentricity. The experiments confirmed that greedy methods perform the best in many settings, but the cases exist when they are very inefficient.
There are no more papers matching your filters at the moment.