alphaXiv

Peter the Great St.Petersburg Polytechnic University

09 Feb 2025

ai-for-health computer-science artificial-intelligence

Survival Concept-Based Learning Models

Peter the Great St.Petersburg Polytechnic University

Concept-based learning enhances prediction accuracy and interpretability by leveraging high-level, human-understandable concepts. However, existing CBL frameworks do not address survival analysis tasks, which involve predicting event times in the presence of censored data -- a common scenario in fields like medicine and reliability analysis. To bridge this gap, we propose two novel models: SurvCBM (Survival Concept-based Bottleneck Model) and SurvRCM (Survival Regularized Concept-based Model), which integrate concept-based learning with survival analysis to handle censored event time data. The models employ the Cox proportional hazards model and the Beran estimator. SurvCBM is based on the architecture of the well-known concept bottleneck model, offering interpretable predictions through concept-based explanations. SurvRCM uses concepts as regularization to enhance accuracy. Both models are trained end-to-end and provide interpretable predictions in terms of concepts. Two interpretability approaches are proposed: one leveraging the linear relationship in the Cox model and another using an instance-based explanation framework with the Beran estimator. Numerical experiments demonstrate that SurvCBM outperforms SurvRCM and traditional survival models, underscoring the importance and advantages of incorporating concept information. The code for the proposed algorithms is publicly available.
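The "linear relationship in the Cox model" mentioned above can be illustrated with a minimal sketch. It assumes the standard Cox form, in which the hazard is a baseline hazard multiplied by exp(beta . c) for a concept vector c; the concept names and coefficient values below are hypothetical and are not taken from the paper.

```python
import math

def relative_risk(concepts, beta):
    """Cox relative risk exp(beta . c) for one concept vector c."""
    return math.exp(sum(b * c for b, c in zip(beta, concepts)))

def concept_hazard_ratios(beta, names):
    """Hazard ratio per unit increase of each concept: exp(beta_k)."""
    return {name: math.exp(b) for name, b in zip(names, beta)}

# Hypothetical concepts and coefficients, for illustration only.
names = ["concept_a", "concept_b"]
beta = [0.7, -0.2]
ratios = concept_hazard_ratios(beta, names)
```

Because the log-hazard is linear in the concepts, switching one concept from 0 to 1 multiplies the hazard by exp(beta_k) regardless of the other concepts, which is what makes the coefficients readable as explanations.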

27 Apr 2017

computer-science machine-learning ensemble-methods

A Siamese Deep Forest

Peter the Great St.Petersburg Polytechnic University

A Siamese Deep Forest (SDF) is proposed in the paper. It is based on the Deep Forest, or gcForest, proposed by Zhou and Feng and can be viewed as a gcForest modification. It can also be regarded as an alternative to the well-known Siamese neural networks. The SDF uses a modified training set consisting of concatenated pairs of vectors. Moreover, it defines the class distributions in the deep forest as the weighted sum of the tree class probabilities, where the weights are chosen to reduce distances between similar pairs and to increase them between dissimilar points. We show that the weights can be obtained by solving a quadratic optimization problem. The SDF aims to prevent the overfitting that occurs in neural networks when only limited training data are available. The numerical experiments illustrate the proposed distance metric method.

19 Jul 2023

computer-science artificial-intelligence machine-learning

A New Computationally Simple Approach for Implementing Neural Networks with Output Hard Constraints

Peter the Great St.Petersburg Polytechnic University

A new computationally simple method of imposing hard convex constraints on the neural network output values is proposed. The key idea behind the method is to map a vector of hidden parameters of the network to a point that is guaranteed to be inside the feasible set defined by a set of constraints. The mapping is implemented by an additional neural network layer with constraints on its output. The proposed method extends directly to the case where the constraints are imposed not only on the output vectors but also jointly on outputs and inputs. The projection approach to imposing constraints on outputs can also be implemented simply within the framework of the proposed method. It is shown how to incorporate different types of constraints into the proposed method, including linear and quadratic constraints, equality constraints, dynamic constraints, and constraints in the form of boundaries. An important feature of the method is its computational simplicity. The complexities of the forward pass of the proposed neural network layer for linear and quadratic constraints are O(n*m) and O(n^2*m), respectively, where n is the number of variables and m is the number of constraints. Numerical experiments illustrate the method by solving optimization and classification problems. The code implementing the method is publicly available.
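One plausible reading of the mapping idea for linear constraints A x <= b is sketched below: the hidden vector supplies a ray direction and a scale, and the output is placed on that ray no further than the boundary. This is an illustrative assumption, not the paper's layer; the function name `map_to_feasible`, the use of a fixed strictly interior point, and the sigmoid scaling are all hypothetical choices.

```python
import math

def map_to_feasible(hidden, interior_point, A, b):
    """Map hidden = (direction..., scale_logit) to a point x with A x <= b.

    Assumes interior_point strictly satisfies every constraint.
    Cost is O(n * m): two dot products per constraint row.
    """
    *direction, scale_logit = hidden
    assert len(direction) == len(interior_point)
    # Largest step along `direction` before any constraint becomes active.
    max_step = math.inf
    for row, bound in zip(A, b):
        slope = sum(a * r for a, r in zip(row, direction))
        slack = bound - sum(a * p for a, p in zip(row, interior_point))
        if slope > 1e-12:  # only constraints the ray moves towards can bind
            max_step = min(max_step, slack / slope)
    if math.isinf(max_step):
        max_step = 1.0  # ray never leaves the set; unit step is an arbitrary choice
    # Squash the scale into (0, 1) so the point stays inside the feasible set.
    fraction = 1.0 / (1.0 + math.exp(-scale_logit))
    return [p + fraction * max_step * r for p, r in zip(interior_point, direction)]

# Unit box 0 <= x, y <= 1 written as A x <= b.
A = [[1, 0], [-1, 0], [0, 1], [0, -1]]
b = [1, 0, 1, 0]
point = map_to_feasible([10.0, -3.0, 2.0], [0.5, 0.5], A, b)
```

Whatever values the network produces for the hidden vector, the returned point satisfies every constraint, so feasibility holds by construction rather than through a penalty term.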

30 Mar 2023

high-energy-physics-experiment high-energy-physics-phenomenology nuclear-experiment

Hot QCD White Paper

Charles University

UCLA Vanderbilt University Korea University

McGill University

Yale University

University of Texas at Austin University of Illinois Chicago

University of Minnesota Florida State University

Rice University

Stony Brook University

Brookhaven National Laboratory City University of New York

Lawrence Berkeley National Laboratory Los Alamos National Laboratory

Duke University Oak Ridge National Laboratory University of Houston

The Ohio State University Wayne State University University of Florence Czech Technical University in Prague Georgia State University Kent State University University of California Berkeley Eötvös Loránd University Frankfurt Institute for Advanced Studies Benemérita Universidad Autónoma de Puebla National Centre for Nuclear Research GSI Helmholtzzentrum für Schwerionenforschung Ohio University Augustana University Institute for Theoretical Physics - Goethe University Peter the Great St.Petersburg Polytechnic University Nara Women’s University NRC Kurchatov Institute−PNPI

Hot QCD physics studies the nuclear strong force under extreme temperatures and densities. Experimentally these conditions are achieved via high-energy collisions of heavy ions at the Relativistic Heavy Ion Collider (RHIC) and the Large Hadron Collider (LHC). In the past decade, a unique and substantial suite of data was collected at RHIC and the LHC, probing hydrodynamics at the nucleon scale, the temperature dependence of the transport properties of quark-gluon plasma, the phase diagram of nuclear matter, the interaction of quarks and gluons at different scales and much more. This document, as part of the 2023 nuclear science long range planning process, was written to review the progress in hot QCD since the 2015 Long Range Plan for Nuclear Science, as well as highlight the realization of previous recommendations, and present opportunities for the next decade, building on the accomplishments and investments made in theoretical developments and the construction of new detectors. Furthermore, this document provides additional context to support the recommendations voted on at the Joint Hot and Cold QCD Town Hall Meeting, which are reported in a separate document.

18 Nov 2019

computer-science machine-learning embedding-methods

An explanation method for Siamese neural networks

Peter the Great St.Petersburg Polytechnic University

A new method for explaining the Siamese neural network is proposed. It uses the following main ideas. First, the explained feature vector is compared with the prototype of the corresponding class computed at the embedding level (the Siamese neural network output). The important features at this level are determined as features which are close to the same features of the prototype. Second, an autoencoder is trained in a special way in order to take into account the embedding level of the Siamese network, and its decoder part is used for reconstructing input data with the corresponding changes. Numerical experiments with the well-known dataset MNIST illustrate the proposed method.

10 Apr 2025

ai-for-health computer-science machine-learning

Automated Video-EEG Analysis in Epilepsy Studies: Advances and Challenges

Peter the Great St.Petersburg Polytechnic University Medical Center "XXI Century"

Epilepsy is typically diagnosed through electroencephalography (EEG) and long-term video-EEG (vEEG) monitoring. The manual analysis of vEEG recordings is time-consuming, necessitating automated tools for seizure detection. Recent advancements in machine learning have shown promise in real-time seizure detection and prediction using EEG and video data. However, diversity of seizure symptoms, markup ambiguities, and limited availability of multimodal datasets hinder progress. This paper reviews the latest developments in automated video-EEG analysis and discusses the integration of multimodal data. We also propose a novel pipeline for treatment effect estimation from vEEG data using concept-based learning, offering a pathway for future research in this domain.

13 Feb 2023

attention-mechanisms computer-science machine-learning

Multiple Instance Learning with Trainable Decision Tree Ensembles

Peter the Great St.Petersburg Polytechnic University

A new random forest based model for solving the Multiple Instance Learning (MIL) problem under small tabular data, called Soft Tree Ensemble MIL (STE-MIL), is proposed. A new type of soft decision trees is considered, which is similar to the well-known soft oblique trees, but with a smaller number of trainable parameters. In order to train the trees, it is proposed to convert them into neural networks of a specific form, which approximate the tree functions. It is also proposed to aggregate the instance and bag embeddings (output vectors) by using the attention mechanism. The whole STE-MIL model, including soft decision trees, neural networks, the attention mechanism and a classifier, is trained in an end-to-end manner. Numerical experiments with tabular datasets illustrate STE-MIL. The corresponding code implementing the model is publicly available.

11 Oct 2022

attention-mechanisms computer-science artificial-intelligence

LARF: Two-level Attention-based Random Forests with a Mixture of Contamination Models

Peter the Great St.Petersburg Polytechnic University

New models of the attention-based random forests called LARF (Leaf Attention-based Random Forest) are proposed. The first idea behind the models is to introduce a two-level attention, where one of the levels is the "leaf" attention and the attention mechanism is applied to every leaf of trees. The second level is the tree attention depending on the "leaf" attention. The second idea is to replace the softmax operation in the attention with the weighted sum of the softmax operations with different parameters. It is implemented by applying a mixture of Huber's contamination models and can be regarded as an analog of the multi-head attention with "heads" defined by selecting a value of the softmax parameter. Attention parameters are simply trained by solving the quadratic optimization problem. To simplify the tuning process of the models, it is proposed to make the contamination tuning parameters trainable and to compute them by solving the quadratic optimization problem. Many numerical experiments with real datasets are performed for studying LARFs. The code of the proposed algorithms is publicly available.
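The two attention ideas above can be sketched as follows, assuming the usual form of Huber's epsilon-contamination model, (1 - eps) * softmax + eps * w with w in the unit simplex, and a plain weighted sum of softmaxes with different temperature parameters. Function names, the temperature parameterization, and all numeric values are illustrative assumptions rather than the paper's notation.

```python
import math

def softmax(scores, temperature):
    """Numerically stable softmax of scores / temperature."""
    top = max(scores)
    exps = [math.exp((s - top) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def contaminated_attention(scores, temperature, eps, w):
    """Huber eps-contamination: (1 - eps) * softmax + eps * w, w in the simplex."""
    base = softmax(scores, temperature)
    return [(1 - eps) * p + eps * wi for p, wi in zip(base, w)]

def mixture_attention(scores, temperatures, mix):
    """Weighted sum of softmaxes with different temperatures ("heads")."""
    heads = [softmax(scores, t) for t in temperatures]
    return [sum(m * head[i] for m, head in zip(mix, heads))
            for i in range(len(scores))]

scores = [-0.1, -1.3, -2.0]          # e.g. negative distances to three trees
single = contaminated_attention(scores, 1.0, 0.3, [0.2, 0.5, 0.3])
mixed = mixture_attention(scores, [0.1, 1.0, 10.0], [0.5, 0.3, 0.2])
```

In both functions the result is linear in the trainable vector (`w` or `mix`) once the softmax terms are fixed, which is why a squared-error training objective over those parameters becomes a quadratic program with simplex constraints.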

10 Dec 2024

attention-mechanisms computer-science machine-learning

SurvBETA: Ensemble-Based Survival Models Using Beran Estimators and Several Attention Mechanisms

Peter the Great St.Petersburg Polytechnic University Higher School of Artificial Intelligence Technologies

Many ensemble-based models have been proposed to solve machine learning problems in the survival analysis framework, including random survival forests, the gradient boosting machine with weak survival models, and ensembles of Cox models. To extend the set of models, a new ensemble-based model called SurvBETA (the Survival Beran estimator Ensemble using Three Attention mechanisms) is proposed, where the Beran estimator is used as a weak learner in the ensemble. The Beran estimator can be regarded as a kernel regression model taking into account the relationship between instances. Outputs of weak learners in the form of conditional survival functions are aggregated with attention weights taking into account the distance between the analyzed instance and prototypes of all bootstrap samples. The attention mechanism is used three times: for implementing the Beran estimators, for determining specific prototypes of bootstrap samples, and for aggregating the weak model predictions. The proposed model is presented in two forms: a general form, which requires solving a complex optimization problem for training, and a simplified form, which uses a special representation of the attention weights by means of the imprecise Huber's contamination model and leads to a simple optimization problem. Numerical experiments illustrate properties of the model on synthetic data and compare the model with other survival models on real data. Code implementing the proposed model is publicly available.
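For readers unfamiliar with the weak learner, here is a minimal sketch of a single Beran estimator with Gaussian kernel weights. It assumes distinct observation times and a hypothetical bandwidth parameter `tau`; it shows one estimator only and does not reproduce the ensemble or the attention-based aggregation described above.

```python
import math

def kernel_weights(x, X, tau):
    """Normalized Gaussian-kernel weights of the training points relative to x."""
    logits = [-sum((a - c) ** 2 for a, c in zip(x, xi)) / tau for xi in X]
    top = max(logits)
    exps = [math.exp(v - top) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def beran_survival(x, t, X, times, events, tau=1.0):
    """Beran estimate of S(t | x); events[i] is 1 if observed, 0 if censored."""
    order = sorted(range(len(times)), key=lambda i: times[i])
    w = kernel_weights(x, X, tau)
    surv, used = 1.0, 0.0
    for i in order:
        if times[i] > t:
            break
        remaining = 1.0 - used  # total weight of instances still at risk
        if events[i] and remaining > 1e-12:
            surv *= max(0.0, 1.0 - w[i] / remaining)
        used += w[i]
    return surv

# Identical features give equal weights, so the estimate matches Kaplan-Meier.
X = [[0.0], [0.0], [0.0], [0.0]]
times = [1.0, 2.0, 3.0, 4.0]
events = [1, 0, 1, 1]
s3 = beran_survival([0.0], 3.0, X, times, events)
```

With equal weights the product reduces to the Kaplan-Meier estimator; unequal kernel weights make the survival curve depend on the query point, which is the sense in which the estimator acts as a kernel regression model.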

05 Dec 2024

astrophysics-of-galaxies physics

Kinematic distances of galaxies in the Local Volume

Special Astrophysical Observatory Peter the Great St.Petersburg Polytechnic University

We consider the kinematic distances to nearby galaxies obtained by the Numerical Action Method (NAM) based on the Cosmic-flow-3 survey data. NAM distances are compared with 418 high-precision distances measured by the Tip of the Red Giant Branch (TRGB) method using the Hubble Space Telescope. We estimate an average difference of -0.30 ± 0.08 Mpc and a standard deviation of 1.57 Mpc. Approximately the same difference in the distance scale is obtained in comparison with less accurate distance estimates through the membership of galaxies in known groups or from the Tully-Fisher relation. We conclude that the NAM method provides distance estimates with an accuracy of 20% within the Local Volume, which is valid for ~90% of the sky, except for the regions of the Virgo cluster and the Coma-I group.

22 Mar 2025

computer-science machine-learning optimization-methods

A novel gradient-based method for decision trees optimizing arbitrary differentiable loss functions

Peter the Great St.Petersburg Polytechnic University

There are many approaches for training decision trees. This work introduces a novel gradient-based method for constructing decision trees that optimize arbitrary differentiable loss functions, overcoming the limitations of heuristic splitting rules. Unlike traditional approaches that rely on heuristic splitting rules, the proposed method refines predictions using the first and second derivatives of the loss function, enabling the optimization of complex tasks such as classification, regression, and survival analysis. We demonstrate the method's applicability to classification, regression, and survival analysis tasks, including those with censored data. Numerical experiments on both real and synthetic datasets compare the proposed method with traditional decision tree algorithms, such as CART, Extremely Randomized Trees, and SurvTree. The implementation of the method is publicly available, providing a practical tool for researchers and practitioners. This work advances the field of decision tree-based modeling, offering a more flexible and accurate approach for handling structured data and complex tasks. By leveraging gradient-based optimization, the proposed method bridges the gap between traditional decision trees and modern machine learning techniques, paving the way for further innovations in interpretable and high-performing models.

14 Oct 2020

mathematics optimization-and-control

Fastener Installation Pattern Optimization in Airplane Assembly Process

Beihang University Technische Universität Berlin Universidade de Coimbra Peter the Great St.Petersburg Polytechnic University Università degli Studi di Verona

Within the framework of an airplane assembly process, in particular the process of fastener installation at drilled holes, the following two problems are studied in this work: quadratic programming optimization of final fastener positioning, and optimization of the fastener installation order. Several algorithms were proposed, implemented in MATLAB, tested and compared.

12 Oct 2020

computer-science machine-learning ensemble-methods

A Generalized Stacking for Implementing Ensembles of Gradient Boosting Machines

Peter the Great St.Petersburg Polytechnic University

The gradient boosting machine is one of the most powerful tools for solving regression problems. In order to cope with its shortcomings, an approach for constructing ensembles of gradient boosting models is proposed. The main idea behind the approach is to use the stacking algorithm in order to learn a second-level meta-model which can be regarded as a model for implementing various ensembles of gradient boosting models. First, the linear regression of the gradient boosting models is considered as the simplest realization of the meta-model, under the condition that the linear model is differentiable with respect to its coefficients (weights). Then it is shown that the proposed approach can be simply extended to arbitrary differentiable combination models, for example, to neural networks, which are differentiable and can implement arbitrary functions of gradient boosting models. Various numerical examples illustrate the proposed approach.

11 Dec 2021

attention-mechanisms computer-science machine-learning

Multi-Attention Multiple Instance Learning

Peter the Great St.Petersburg Polytechnic University

A new multi-attention based method for solving the MIL problem (MAMIL), which takes into account the neighboring patches or instances of each analyzed patch in a bag, is proposed. In the method, one of the attention modules takes into account adjacent patches or instances, several attention modules are used to get a diverse feature representation of patches, and one attention module is used to unite different feature representations to provide an accurate classification of each patch (instance) and the whole bag. Due to MAMIL, a combined representation of patches and their neighbors in the form of embeddings of a small dimensionality for simple classification is realized. Moreover, different types of patches are efficiently processed, and a diverse feature representation of patches in a bag by using several attention modules is implemented. A simple approach for explaining the classification predictions of patches is proposed. Numerical experiments with various datasets illustrate the proposed method.

31 Jul 2021

clustering-algorithms computer-science computer-vision-and-pattern-recognition

Representative elementary volume via averaged scalar Minkowski functionals

Novosibirsk State University Peter the Great St.Petersburg Polytechnic University A.P. Ershov Institute of Informatics Systems of SB RAS Sobolev Institute of Mathematics of SB RAS OOO “Gazpromneft NTC”

The Representative Elementary Volume (REV), at which the material properties do not vary with a change in volume, is an important quantity for making measurements or simulations which represent the whole. We discuss a geometrical method for evaluating the REV based on the quantities that appear in the Steiner formula from convex geometry. For bodies in three-dimensional space this formula gives us four scalar functionals known as scalar Minkowski functionals. We demonstrate on certain samples that the values of such averaged functionals almost stabilize for cells whose edge lengths are greater than a certain threshold value R. Therefore, from this point of view, it is reasonable to consider cubes of volume R^3 as representative elementary volumes.

03 May 2019

disordered-systems-and-neural-networks physics

Application of the random matrix theory to the boson peak in glasses

Ioffe Institute Peter the Great St.Petersburg Polytechnic University St.Petersburg Academic University

The density of vibrational states $g(\omega)$ of an amorphous system is studied by using the random-matrix theory. Taking into account the most important correlations between elements of the random matrix of the system, equations for the density of vibrational states $g(\omega)$ are obtained. The analysis of these equations shows that in the low-frequency region the vibrational density of states has the Debye behavior $g(\omega) \sim \omega^2$. In the higher frequency region, there is the boson peak as an additional contribution to the density of states. The obtained equations are in good agreement with the numerical results and allow us to find an exact shape of the boson peak.

01 Jan 2019

computer-science machine-learning statistics

A weighted random survival forest

Peter the Great St.Petersburg Polytechnic University St.Petersburg Clinical Research Center for Special Types of Medical Care (Oncology-oriented)

A weighted random survival forest is presented in the paper. It can be regarded as a modification of the random survival forest that improves its performance. The main idea underlying the proposed model is to replace the standard averaging procedure used to estimate the random survival forest hazard function with weighted averaging, where the weights are assigned to every tree and can be viewed as training parameters which are computed in an optimal way by solving a standard quadratic optimization problem maximizing Harrell's C-index. Numerical examples with real data illustrate that the proposed model outperforms the original random survival forest.
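The effect of replacing plain averaging with weighted averaging can be sketched as below. The code only combines per-tree risk scores with given simplex weights and evaluates Harrell's C-index; the quadratic optimization that the paper uses to choose the weights is not shown, and the toy risk values are made up for illustration.

```python
def weighted_risk(tree_risks, weights):
    """Combine per-tree risk scores for each instance with simplex weights."""
    assert abs(sum(weights) - 1.0) < 1e-9 and all(w >= 0 for w in weights)
    n = len(tree_risks[0])
    return [sum(w * risks[i] for w, risks in zip(weights, tree_risks))
            for i in range(n)]

def harrell_c_index(times, events, risk):
    """Share of comparable pairs in which the earlier failure has higher risk."""
    concordant, comparable = 0.0, 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when i has an observed event before time j.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5
    return concordant / comparable

times = [1.0, 2.0, 3.0]
events = [1, 1, 0]
tree_risks = [[0.9, 0.2, 0.4], [0.5, 0.8, 0.1]]   # two hypothetical trees
c_first_tree = harrell_c_index(times, events, tree_risks[0])
c_combined = harrell_c_index(times, events, weighted_risk(tree_risks, [0.5, 0.5]))
```

In this toy case the first tree alone misorders one of the three comparable pairs, while the equally weighted combination orders all of them correctly; learning the weights generalizes this by searching the simplex for the combination with the best concordance.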

12 Jul 2022

attention-mechanisms computer-science machine-learning

AGBoost: Attention-based Modification of Gradient Boosting Machine

Peter the Great St.Petersburg Polytechnic University

A new attention-based model for the gradient boosting machine (GBM) called AGBoost (the attention-based gradient boosting) is proposed for solving regression problems. The main idea behind the proposed AGBoost model is to assign attention weights with trainable parameters to iterations of the GBM, under the condition that decision trees are the base learners. Attention weights are determined by applying properties of decision trees and by using Huber's contamination model, which provides a linear dependence between the trainable parameters of the attention and the attention weights. This property allows us to train the attention weights by solving a standard quadratic optimization problem with linear constraints. The attention weights also depend on the discount factor as a tuning parameter, which determines how much the impact of the weight is decreased with the number of iterations. Numerical experiments performed for two types of base learners, original decision trees and extremely randomized trees, with various regression datasets illustrate the proposed model.

29 Jan 2024

computer-science artificial-intelligence machine-learning

Dual feature-based and example-based explanation methods

Peter the Great St.Petersburg Polytechnic University

A new approach to the local and global explanation is proposed. It is based on selecting a convex hull constructed for the finite number of points around an explained instance. The convex hull allows us to consider a dual representation of instances in the form of convex combinations of extreme points of a produced polytope. Instead of perturbing new instances in the Euclidean feature space, vectors of convex combination coefficients are uniformly generated from the unit simplex, and they form a new dual dataset. A dual linear surrogate model is trained on the dual dataset. The explanation feature importance values are computed by means of simple matrix calculations. The approach can be regarded as a modification of the well-known model LIME. The dual representation inherently allows us to get the example-based explanation. The neural additive model is also considered as a tool for implementing the example-based explanation approach. Many numerical experiments with real datasets are performed for studying the approach. The code of proposed algorithms is available.
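The dual sampling step can be sketched as follows, assuming the standard construction of a uniform draw from the unit simplex as normalized independent exponential variables (equivalently, a Dirichlet distribution with all parameters equal to 1). The extreme points and function names are hypothetical; the surrogate model training is not shown.

```python
import random

def sample_simplex(k, rng):
    """Uniform draw from the unit simplex via normalized exponential variables."""
    draws = [rng.expovariate(1.0) for _ in range(k)]
    total = sum(draws)
    return [d / total for d in draws]

def dual_to_primal(coeffs, extreme_points):
    """Convex combination of the polytope's extreme points."""
    dim = len(extreme_points[0])
    return [sum(c * p[j] for c, p in zip(coeffs, extreme_points))
            for j in range(dim)]

rng = random.Random(0)
extreme_points = [[0.0, 0.0], [2.0, 0.0], [0.0, 1.0]]   # a triangle around an instance
dual_dataset = [sample_simplex(len(extreme_points), rng) for _ in range(100)]
primal_points = [dual_to_primal(c, extreme_points) for c in dual_dataset]
```

Every generated point lies inside the convex hull by construction, and each coefficient vector records how strongly the point depends on each extreme point, which is what links the feature-based and example-based views.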

17 Oct 2024

computer-science sound audio-and-speech-processing

STCON System for the CHiME-8 Challenge

ITMO University Peter the Great St.Petersburg Polytechnic University STCON LLC

This paper describes the STCON system for the CHiME-8 Challenge Task 1 (DASR) aimed at distant automatic speech transcription and diarization with multiple recording devices. Our main attention was paid to a carefully trained and tuned diarization pipeline and speaker counting. This allowed us to significantly reduce the diarization error rate (DER) and obtain more reliable segments for speech separation and recognition. To improve source separation, we designed a Guided Target speaker Extraction (G-TSE) model and used it in conjunction with the traditional Guided Source Separation (GSS) method. To train various parts of our pipeline, we investigated several data augmentation and generation techniques, which helped us to improve the overall system quality.
