V.I. Vernadsky Crimean Federal University
A framework for Artificial General Intelligence (AGI) proposes a computational model of the psyche, integrating psychological, systems theory, and economic principles to formalize an agent's internal needs, motivation, and emotions. The model provides a multi-objective optimization approach for intelligent decision-making, with a minimal experiment showing balanced punishment can inhibit learning and exploration.
The article focuses on researching a system for processing and analyzing tracking data based on RFID technology to study the customer journey in retail. It examines the evolution of RFID technology, its key operating principles, and modern applications in retail that extend beyond logistics to include precise inventory management, loss prevention, and customer experience improvement. Particular attention is paid to the architecture for data collection, processing, and integration, specifically the ETL (extract, transform, load) methodology for transforming raw RFID and POS data into a structured analytical data warehouse. A detailed logical database model is proposed, designed for comprehensive analysis that combines financial sales metrics with behavioral patterns of customer movement. The article also analyzes the expected business benefits of RFID implementation through the lens of the Balanced Scorecard (BSC), which evaluates financial performance, customer satisfaction, and internal process optimization. It is concluded that the integration of tracking and transactional data creates a foundation for transforming retail into a precise, data-driven science, providing unprecedented visibility into physical product flows and consumer behavior.
The article is devoted to some adaptive methods for variational inequalities with relatively smooth and relatively strongly monotone operators. Starting from the recently proposed proximal variant of the extragradient method for this class of problems, we investigate in detail the method with adaptively selected parameter values. An estimate of the convergence rate of this method is proved. The result is generalized to a class of variational inequalities with relatively strongly monotone generalized smooth variational inequality operators. Numerical experiments have been performed for the problem of ridge regression and variational inequality associated with box-simplex games.
This textbook is based on lectures given by the authors at MIPT (Moscow), HSE (Moscow), FEFU (Vladivostok), V.I. Vernadsky KFU (Simferopol), ASU (Republic of Adygea), and the University of Grenoble-Alpes (Grenoble, France). The authors focused primarily on the program of a two-semester course of lectures on convex optimization given to students of MIPT. The first chapter of this book contains the material of the first semester ("Fundamentals of convex analysis and optimization"); the second and third chapters contain the material of the second semester ("Numerical methods of convex optimization"). The textbook has a number of distinctive features. First, in contrast to the classic manuals, this book does not provide proofs of all the theorems mentioned. On the one hand, this made it possible to cover more topics; on the other hand, it made the presentation less self-contained. Second, part of the material is advanced and is published in the Russian educational literature apparently for the first time. Third, the emphasis does not always coincide with the generally accepted emphasis in the textbooks that are now popular. Above all, this concerns a fairly advanced presentation of conic optimization, including robust optimization, as a vivid demonstration of the capabilities of modern convex analysis.
The monograph "Strong artificial intelligence. On the Approaches to Superintelligence" contains an overview of general artificial intelligence (AGI). As an anthropomorphic research area, it includes Brain Principles Programming (BPP), the formalization of universal mechanisms (principles) by which the brain works with information, which are implemented at all levels of the organization of nervous tissue. This monograph contains a formalization of these principles in terms of category theory. However, this formalization is not enough to develop algorithms for working with information. In this paper, for the description and modeling of BPP, it is proposed to apply previously developed mathematical models and algorithms that model cognitive functions and are based on well-known physiological, psychological, and other natural science theories. The paper uses mathematical models and algorithms of the following theories: the Theory of Functional Brain Systems, Eleanor Rosch's prototypical categorization theory, and Bob Rehder's theory of causal models and "natural" classification. As a result, a formalization of BPP is obtained, and computer experiments demonstrating the operation of the algorithms are presented.
CatBoost is a popular machine learning library. CatBoost models are based on oblivious decision trees, making training and evaluation rapid. CatBoost has many applications, and some require low latency and high throughput evaluation. This paper investigates the possibilities for improving CatBoost's performance in single-core CPU computations. We explore the new features provided by the AVX instruction sets to optimize evaluation. We increase performance by 20-40% using AVX2 instructions without quality impact. We also introduce a new trade-off between speed and quality. Using float16 for leaf values and AVX-512 instructions, we achieve 50-70% speed-up.
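As a rough illustration of why oblivious trees evaluate quickly: every level of an oblivious tree applies the same (feature, threshold) test, so the leaf index is simply a bit mask of comparison results, with no branching on the path taken. The sketch below is plain Python with hypothetical names (`ObliviousTree`, `predict_ensemble`) and an assumed data layout; it is not CatBoost's implementation and does not use AVX.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class ObliviousTree:
    # One (feature index, threshold) pair per level; hypothetical layout.
    features: List[int]
    thresholds: List[float]
    # 2 ** depth leaf values, indexed by the bit mask of comparison results.
    leaf_values: List[float]

    def leaf_index(self, x: Sequence[float]) -> int:
        index = 0
        for level, (f, t) in enumerate(zip(self.features, self.thresholds)):
            # Bit `level` is set when this level's test passes.
            index |= int(x[f] > t) << level
        return index

    def predict(self, x: Sequence[float]) -> float:
        return self.leaf_values[self.leaf_index(x)]


def predict_ensemble(trees: List[ObliviousTree], x: Sequence[float]) -> float:
    # The model output is assumed here to be the plain sum of tree outputs.
    return sum(tree.predict(x) for tree in trees)
```

For example, with `ObliviousTree([0, 1], [0.5, 2.0], [10.0, 20.0, 30.0, 40.0])` the input `[1.0, 3.0]` passes both tests, giving leaf index 3 and the value `40.0`. Because the comparisons at each level are independent of one another, they are natural candidates for vectorized evaluation.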
A variant of the Frank-Wolfe method for convex optimization problems is considered, with adaptive selection of the step parameter corresponding to information about the smoothness of the objective function (the Lipschitz constant of the gradient). Theoretical estimates of the quality of the solution provided by the method are obtained in terms of the adaptively selected parameters L_k. An important feature of the obtained result is the analysis of a situation in which it is possible to guarantee, after the completion of an iteration, a reduction of the residual in the function value by at least a factor of 2. At the same time, the use of adaptively selected parameters in the theoretical estimates makes it possible to apply the method to both smooth and nonsmooth problems, provided that the exit criterion of the iteration is met. For smooth problems, this can be proved, and the theoretical estimates of the method are guaranteed to be optimal up to a constant factor. Computational experiments were performed, including a comparison with two other algorithms, which demonstrated the efficiency of the algorithm on a number of both smooth and nonsmooth problems.
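A minimal sketch of one common way to select the parameters L_k adaptively in a Frank-Wolfe iteration is given below. The exit criterion used here (a quadratic upper-bound test with doubling of L_k), the feasible set (the unit simplex), and all function names are assumptions made for illustration; the abstract does not specify them, so this should not be read as the authors' exact method.

```python
from typing import Callable, List

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    return sum(ai * bi for ai, bi in zip(a, b))


def adaptive_frank_wolfe(f: Callable[[Vector], float],
                         grad: Callable[[Vector], Vector],
                         x0: Vector, iters: int = 200, L0: float = 1.0) -> Vector:
    """Frank-Wolfe on the unit simplex with a backtracked smoothness estimate L."""
    x, L = list(x0), L0
    for _ in range(iters):
        g = grad(x)
        # Linear minimization over the simplex: the vertex with the smallest gradient entry.
        i = min(range(len(x)), key=lambda j: g[j])
        d = [-xj for xj in x]
        d[i] += 1.0
        gap = -dot(g, d)          # Frank-Wolfe gap, nonnegative for convex f
        dd = dot(d, d)
        if gap <= 1e-12 or dd == 0.0:
            break
        L = max(L / 2.0, 1e-12)   # optimistically try a smaller L_k first
        fx = f(x)
        accepted = False
        for _ in range(60):       # guard against endless doubling
            alpha = min(1.0, gap / (L * dd))
            x_new = [xj + alpha * dj for xj, dj in zip(x, d)]
            # Assumed exit criterion: the quadratic model with the current L majorizes f.
            if f(x_new) <= fx - alpha * gap + 0.5 * L * alpha ** 2 * dd + 1e-15:
                accepted = True
                break
            L *= 2.0
        if not accepted:
            break
        x = x_new
    return x
```

The halving at the start of each iteration lets the estimate track local smoothness instead of a global worst-case constant, which is the usual motivation for this kind of adaptivity.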
Using data samples of $(10087\pm 44)\times10^{6}$ $J/\psi$ events and $(2712.4\pm 14.3)\times10^{6}$ $\psi(3686)$ events collected with the BESIII detector at the BEPCII collider, we search for the CP violating decays $J/\psi\rightarrow K^{0}_{S}K^{0}_{S}$ and $\psi(3686)\rightarrow K^{0}_{S}K^{0}_{S}$. No significant signals are observed over the expected background yields. The upper limits on their branching fractions are set as $\mathcal{B}(J/\psi\rightarrow K^{0}_{S}K^{0}_{S}) <4.7\times 10^{-9}$ and $\mathcal{B}(\psi(3686)\rightarrow K^{0}_{S}K^{0}_{S}) <1.1\times 10^{-8}$ at the 90% confidence level. These results improve the previous limits by a factor of three for $J/\psi\rightarrow K^{0}_{S} K^{0}_{S}$ and two orders of magnitude for $\psi(3686)\rightarrow K^{0}_{S} K^{0}_{S}$.
The study aimed at detecting cartel collusion involved analyzing decisions of the Russian Federal Antimonopoly Service and data on auctions. As a result, a machine learning model was developed that predicts with 91% accuracy the signs of collusion between bidders based on their history after dividing 40 auctions into test and training samples in a 30/70 ratio. Decomposition of the model using the Shapley vector allowed the interpretation of the decision-making process. The behavior of honest companies in auctions was also studied, confirmed by independent simulation validation.
06 Mar 2023
In this thesis we study the preconditioning of square, non-symmetric, real Toeplitz systems. We prove theoretical results that give sufficient conditions for the efficiency of the proposed preconditioners and for fast convergence to the solution of the system, both by the Preconditioned Generalized Minimal Residual method (PGMRES) and by the Preconditioned Conjugate Gradient method applied to the system of Normal Equations (PCGN). As an introduction, the first chapter gives the basic definitions, theorems, and lemmas that we use to prove the theoretical results of the thesis. These concern the clustering of the eigenvalues, as well as of the singular values, which is a criterion for the efficiency of a preconditioner. In the second chapter we construct a band Toeplitz preconditioner for well-conditioned as well as ill-conditioned systems. The preconditioning technique is based on eliminating the roots of the generating function (if any exist) by a trigonometric polynomial, followed by a further approximation. We prove the clustering of the eigenvalues and the singular values of the preconditioned system. In the third chapter we construct a circulant preconditioner for well-conditioned Toeplitz systems and a band-times-circulant preconditioner for ill-conditioned ones. We prove analogous theoretical results and, in the numerical results of the last section, compare it with the previously proposed preconditioner. In the fourth and last chapter of the thesis we study Toeplitz systems with an unknown generating function. We adapt the preconditioners constructed in the previous chapters: after estimating the generating function, its roots, and their multiplicities, we construct the corresponding preconditioners.
This book is intended for beginners who have no familiarity with deep learning. Our only expectation of readers is that they already have basic programming skills in Python.
In this expository note we present simple proofs of a lower bound for Ramsey numbers (Erdős theorem) and of an estimate of discrepancy. Neither the statements nor the proofs require any knowledge beyond the high-school curriculum (except for a minor detail). Thus they are accessible to non-specialists, in particular to students. Our exposition is simpler than the standard one because no probabilistic language is used. In order to prove the existence of a 'good' object, we prove that the number of 'bad' objects is smaller than the number of all objects.
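The counting idea can be checked numerically. The sketch below assumes the standard formulation: a 'bad' object is a 2-coloring of the edges of the complete graph on n vertices that contains a monochromatic complete subgraph on k vertices. If an upper bound on the number of bad colorings is smaller than the number of all colorings, a good coloring exists, so R(k, k) > n. The function names are hypothetical, and the note's own proof may organize the count differently.

```python
from math import comb


def counting_bound_holds(n: int, k: int) -> bool:
    """True if (upper bound on bad colorings) < (number of all colorings); needs n >= k."""
    total = 2 ** comb(n, 2)
    # For each k-subset: 2 ways to color it monochromatically, the other edges are free.
    # Colorings with several monochromatic subsets are counted more than once,
    # so this is only an upper bound on the number of bad colorings.
    bad_upper = comb(n, k) * 2 * 2 ** (comb(n, 2) - comb(k, 2))
    return bad_upper < total


def ramsey_lower_bound(k: int) -> int:
    """Largest n reachable by this count, so that R(k, k) > n; assumes k >= 3."""
    n = k
    while counting_bound_holds(n + 1, k):
        n += 1
    return n
```

For k = 4 the condition reduces to comb(n, 4) < 32, which holds for n = 6 (15) and fails for n = 7 (35), so the count certifies R(4, 4) > 6. The bound is far from tight for small k, but it grows exponentially in k.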
A new adaptive approach is proposed for variational inequalities with a Lipschitz-continuous field. Estimates of the number of iterations necessary to achieve a given quality of the variational inequality solution are obtained. A generalization of the method under consideration to the case of a Hölder-continuous field is considered.
Considering the high-performance and low-power requirements of edge AI, this study designs a specialized instruction set processor for edge AI based on the RISC-V instruction set architecture, addressing practical issues in digital signal processing for edge devices. This design enhances the execution efficiency of edge AI and reduces its energy consumption with limited hardware overhead, meeting the demands for efficient large language model (LLM) inference computation in edge AI applications. The main contributions of this paper are as follows. To suit the characteristics of large language models, custom instructions were added to the RISC-V instruction set to perform vector dot product calculations, accelerating the computation of large language models on dedicated vector dot product acceleration hardware. Based on the open-source high-performance RISC-V processor core XiangShan Nanhu architecture, the vector dot product specialized instruction set processor Nanhu-vdot was implemented, which adds vector dot product calculation units and pipeline processing logic on top of XiangShan. Nanhu-vdot underwent FPGA hardware testing, achieving over four times the speed of scalar methods in vector dot product computation. Using a hardware-software co-design approach for second-generation Generative Pre-Trained Transformer (GPT-2) model inference, the speed improved by approximately 30% compared to a pure software implementation, with almost no additional hardware resources or power consumption.
The integrable Kowalevski-Yehia case in the dynamics of a gyrostat is considered. We present a new way to classify the bifurcation diagrams of the reduced systems. We find efficiently checkable existence conditions for the critical motions on the sections, by constant values of the area integral, of the surfaces bearing the 3-diagram of the complete system. The cases in which these conditions change qualitatively yield analytical expressions for the dependencies between the area constant and the gyrostatic momentum, which form the classifying set for the two-parameter family of diagrams of the reduced systems. Finally, we present a computer system that satisfies the given definition of an electronic atlas.
Variational inequalities, as an effective tool for solving applied problems, including machine learning tasks, have been attracting more and more attention from researchers in recent years. The use of variational inequalities covers a wide range of areas, from reinforcement learning and generative models to traditional applications in economics and game theory. At the same time, it is impossible to imagine the modern world of machine learning without distributed optimization approaches that can significantly speed up the training process on large amounts of data. However, faced with the high costs of communication between devices in a computing network, the scientific community is striving to develop approaches that make computations cheap and stable. In this paper, we investigate the compression technique of transmitted information and its application to the distributed variational inequalities problem. In particular, we present a method based on advanced techniques originally developed for minimization problems. For the new method, we provide an exhaustive theoretical convergence analysis for cocoercive strongly monotone variational inequalities. We conduct experiments that emphasize the high performance of the presented technique and confirm its practical applicability.
The possibility of optimizing high-voltage hybrid SIT-MOS transistors (HSMT) by locally reducing the lifetime near the anode emitter and/or reducing the injection ability of the anode emitter in three different ways has been investigated using two-dimensional numerical simulation. It has been shown that all of these methods, proposed previously for optimization of the insulated-gate bipolar transistor (IGBT), are physically equivalent and make it possible to reduce the turn-off energy losses $E_{off}$ in HSMT by 30-40%. Importantly, the energy $E_{off}$ in an optimized HSMT appears to be 15-35% lower than in an equivalent trench IGBT, other conditions being equal.
Large Language Models (LLMs) have revolutionized various aspects of engineering and science. Their utility is often bottlenecked by the lack of interaction with the external digital environment. To overcome this limitation and achieve integration of LLMs and Artificial Intelligence (AI) into real-world applications, customized AI agents are being constructed. Based on the technological trends and techniques, we extract a high-level approach for constructing these AI agents, focusing on their underlying architecture. This thesis serves as a comprehensive guide that elucidates a multi-faceted approach for empowering LLMs with the capability to leverage Application Programming Interfaces (APIs). We present a 7-step methodology that begins with the selection of suitable LLMs and the task decomposition that is necessary for complex problem-solving. This methodology includes techniques for generating training data for API interactions and heuristics for selecting the appropriate API among a plethora of options. These steps eventually lead to the generation of API calls that are both syntactically and semantically aligned with the LLM's understanding of a given task. Moreover, we review existing frameworks and tools that facilitate these processes and highlight the gaps in current attempts. In this direction, we propose an on-device architecture that aims to exploit the functionality of carry-on devices by using small models from the Hugging Face community. We examine the effectiveness of these approaches on real-world applications of various domains, including the generation of a piano sheet. Through an extensive analysis of the literature and available technologies, this thesis aims to set a compass for researchers and practitioners to harness the full potential of LLMs augmented with external tool capabilities, thus paving the way for more autonomous, robust, and context-aware AI agents.
This paper studies non-smooth problems of convex stochastic optimization. Using a smoothing technique that replaces the function value at a given point by the function value averaged over a ball (in the $l_1$-norm or $l_2$-norm) of small radius centered at this point, the original problem is reduced to a smooth problem (whose gradient Lipschitz constant is inversely proportional to the radius of the ball). An important property of this smoothing is the possibility of calculating an unbiased estimate of the gradient of the smoothed function based only on realizations of the original function. The resulting smooth stochastic optimization problem is proposed to be solved in a distributed federated learning architecture (the problem is solved in parallel: nodes make local steps, e.g. of stochastic gradient descent, then they communicate, all with all, and then this is repeated). The goal of this paper is to build, based on current advances in gradient-free non-smooth optimization and in the field of federated learning, gradient-free methods for solving non-smooth stochastic optimization problems in a federated learning architecture.
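To make the smoothing idea concrete, here is a minimal sketch of a gradient estimate that uses only function values. It assumes the $l_2$-ball variant and a standard symmetric two-point form, in which the gradient of the smoothed function equals the expectation of d/(2*tau) * (f(x + tau*e) - f(x - tau*e)) * e over directions e uniform on the unit sphere. The function names and the deterministic `f` are illustrative assumptions; the paper's setting uses stochastic realizations of the function and also covers the $l_1$ case.

```python
import math
import random
from typing import Callable, List

Vector = List[float]


def sample_unit_sphere(d: int, rng: random.Random) -> Vector:
    # A normalized Gaussian vector is uniformly distributed on the unit l2 sphere.
    v = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]


def smoothed_grad_estimate(f: Callable[[Vector], float], x: Vector,
                           tau: float, rng: random.Random) -> Vector:
    """One two-point estimate of the gradient of f smoothed over an l2 ball of radius tau."""
    d = len(x)
    e = sample_unit_sphere(d, rng)
    x_plus = [xi + tau * ei for xi, ei in zip(x, e)]
    x_minus = [xi - tau * ei for xi, ei in zip(x, e)]
    # Only two function values are needed; no gradient of f is ever computed.
    scale = d * (f(x_plus) - f(x_minus)) / (2.0 * tau)
    return [scale * ei for ei in e]
```

In a federated setting, each node could plug such an estimate into its local stochastic gradient steps in place of a true gradient; a single estimate is noisy, so averaging over steps or samples does the work.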
Conditions sufficient for the transience of the process have been established for the Markov diffusion model with switching and two modes, transient and ergodic, with intensities bounded away from zero. This paper shows limitations on the conditions for exponential ergodicity with a given switching system.