IIM Indore
23 Nov 2025
Models whose dimension exceeds the available sample size are now commonly used in various applications. Sensible inference is possible when the model has a lower-dimensional structure. In regression problems with a large number of predictors, the model is often assumed to be sparse, with only a few predictors active. Interdependence between a large number of variables is succinctly described by a graphical model, where variables are represented by nodes on a graph and an edge between two nodes indicates their conditional dependence given the other variables. Many procedures for making inferences in the high-dimensional setting have been developed, typically using penalty functions to induce sparsity in the solution obtained by minimizing a loss function. Bayesian methods have been proposed for such problems more recently, where the prior takes care of the sparsity structure. These methods also have the natural ability to quantify the uncertainty of the inference automatically through the posterior distribution. Theoretical studies of Bayesian procedures in high dimensions have been carried out recently. The questions that arise are whether the posterior distribution contracts near the true value of the parameter at the minimax optimal rate, whether the correct lower-dimensional structure is discovered with high posterior probability, and whether a credible region has adequate frequentist coverage. In this paper, we review these properties of Bayesian and related methods for several high-dimensional models, such as the many normal means problem, linear regression, generalized linear models, and Gaussian and non-Gaussian graphical models. Effective computational approaches are also discussed.
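As a small illustration of how a penalty induces sparsity, the sketch below treats the many normal means problem with an L1 penalty. Minimizing 0.5 * (y - theta)^2 + lam * |theta| coordinate by coordinate has the closed-form soft-thresholding solution. The function names, the data, and the value of `lam` are assumptions chosen for illustration; they are not taken from the paper, which reviews Bayesian alternatives to this kind of penalized estimate.

```python
def soft_threshold(y, lam):
    """Minimizer of 0.5 * (y - theta)**2 + lam * abs(theta) over theta."""
    if y > lam:
        return y - lam
    if y < -lam:
        return y + lam
    return 0.0  # observations within [-lam, lam] are set exactly to zero


def sparse_means(observations, lam):
    """Apply soft-thresholding to each observed mean independently."""
    return [soft_threshold(y, lam) for y in observations]


# Hypothetical data: two clear signals among small noisy observations.
observations = [3.0, -0.5, 0.25, -3.0, 0.75]
estimate = sparse_means(observations, lam=1.0)
active = [i for i, theta in enumerate(estimate) if theta != 0.0]
```

Here `active` holds the indices of the coordinates that survive the threshold, which is the kind of lower-dimensional structure the abstract refers to.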
Endoscopy contributes substantially to the diagnosis of diseases of the gastrointestinal tract (GIT). Because colon endoscopy has certain limitations, wireless capsule endoscopy (WCE) is gradually replacing it on grounds of ease and efficiency. WCE is performed with a miniature optical endoscope that the patient swallows and that transmits colour images wirelessly during its journey through the GIT. These images are used to implement an effective and computationally efficient approach that automatically detects abnormal and normal tissue in the GIT, reducing the manual work of reviewers. The algorithm further aims to classify the diseased tissue into the GIT diseases commonly known to affect the tract. In this manuscript, interest points are detected and described with Speeded Up Robust Features (SURF), using the colour information in the images after conversion to the CIELAB colour space for better identification. The features extracted at the interest points are then used to train and test a Support Vector Machine (SVM), so that it automatically classifies the images as normal or abnormal and further detects the specific abnormalities. The SVM, along with a few parameters, achieves an accuracy of 94.58% when classifying normal versus abnormal images and 82.91% in the multi-class setting. The present work improves on previously reported analyses, which were limited to two-class classification using this approach.
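The colour conversion step can be sketched without any imaging library. The code below applies the standard sRGB to CIELAB formulas, assuming 8-bit sRGB input and the D65 reference white. The abstract does not state which white point or implementation the authors used, so both are assumptions, and the function name is hypothetical. The SURF and SVM stages are not shown.

```python
def srgb_to_lab(r, g, b):
    """Convert one 8-bit sRGB pixel to CIELAB (L*, a*, b*), D65 white assumed."""

    def linearize(c):
        # Undo the sRGB gamma encoding.
        c = c / 255.0
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

    rl, gl, bl = linearize(r), linearize(g), linearize(b)

    # Linear RGB to CIE XYZ (sRGB primaries, D65).
    x = 0.4124564 * rl + 0.3575761 * gl + 0.1804375 * bl
    y = 0.2126729 * rl + 0.7151522 * gl + 0.0721750 * bl
    z = 0.0193339 * rl + 0.1191920 * gl + 0.9503041 * bl

    def f(t):
        # Cube root with the linear segment CIELAB uses near black.
        delta = 6.0 / 29.0
        return t ** (1.0 / 3.0) if t > delta ** 3 else t / (3 * delta ** 2) + 4.0 / 29.0

    # Normalize by the D65 reference white.
    fx, fy, fz = f(x / 0.95047), f(y / 1.0), f(z / 1.08883)
    return 116.0 * fy - 16.0, 500.0 * (fx - fy), 200.0 * (fy - fz)


white = srgb_to_lab(255, 255, 255)  # lightness near 100, chroma near 0
red = srgb_to_lab(255, 0, 0)        # positive a* (towards red)
```

CIELAB separates lightness from the two chromatic axes, which is one plausible reason for extracting colour features there rather than in RGB.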
Factorial designs are often used in various industrial and sociological experiments to identify significant factors and factor combinations that may affect the process response. In the statistics literature, several studies have investigated the analysis, construction, and isomorphism of factorial and fractional factorial designs. When there are multiple choices for a design, it is helpful to have an easy-to-use tool for identifying which are distinct, and which of those can be efficiently analyzed or have good theoretical properties. For this task, we present an R library called IsoCheck that checks the isomorphism of multi-stage 2^n factorial experiments with randomization restrictions. By representing the factors and their combinations as a finite projective geometry, IsoCheck recasts the problem of searching over all possible relabelings as a search over collineations, then exploits projective geometric properties of the space to make the search much more efficient. Furthermore, a bitstring representation of the factorial effects is used to characterize all possible rearrangements of designs, thus facilitating quick comparisons after relabeling. We present several examples with R code to illustrate the usage of the main functions in IsoCheck. Besides checking equivalence and isomorphism of 2^n multi-stage factorial designs, we demonstrate how the functions of the package can be used to create a catalog of all non-isomorphic designs, and subsequently rank these designs based on a suitably defined ranking criterion. IsoCheck is free software distributed under the General Public License and is available from the Comprehensive R Archive Network.
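The bitstring idea can be illustrated with a short sketch. It is written in Python and does not use or reproduce the IsoCheck API; every function name here is hypothetical. Each factorial effect of a 2^n design is stored as an n-bit mask (bit i set when factor i appears in the effect), so the generalized interaction of two effects is a bitwise XOR. The brute-force check considers only relabelings that permute the factors, which is a simplification: IsoCheck searches the larger class of collineations and does so far more efficiently.

```python
from itertools import permutations

FACTORS = "ABCDEFGH"  # illustrative factor labels, one bit per factor


def encode(effect):
    """Encode an effect such as "AB" as a bitmask (A -> bit 0, B -> bit 1, ...)."""
    mask = 0
    for letter in effect:
        mask |= 1 << FACTORS.index(letter)
    return mask


def interaction(e1, e2):
    """Generalized interaction of two effects: shared letters cancel (XOR)."""
    return e1 ^ e2


def relabel(mask, perm):
    """Send factor i to factor perm[i] inside a single effect."""
    out = 0
    for i, j in enumerate(perm):
        if (mask >> i) & 1:
            out |= 1 << j
    return out


def same_up_to_factor_relabeling(effects1, effects2, n):
    """Brute force: does some permutation of the n factors map set 1 onto set 2?"""
    target = frozenset(effects2)
    for perm in permutations(range(n)):
        if frozenset(relabel(m, perm) for m in effects1) == target:
            return True
    return False


set1 = {encode("AB"), encode("C")}
set2 = {encode("AC"), encode("B")}
set3 = {encode("ABC"), encode("A")}
```

Swapping B and C maps `set1` onto `set2`, while no factor permutation maps `set1` onto `set3`, because relabeling factors preserves the number of letters in each effect. Comparing sets of integers is what makes the check after each relabeling cheap.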
In a variety of application areas, there is interest in assessing evidence of differences in the intensity of event realizations between groups. For example, in cancer genomic studies collecting data on rare variants, the focus is on assessing whether and how the variant profile changes with the disease subtype. Motivated by this application, we develop multiresolution nonparametric Bayes tests for differential mutation rates across groups. The multiresolution approach yields fast and accurate detection of spatial clusters of rare variants, and our nonparametric Bayes framework provides great flexibility for modeling the intensities of rare variants. Some theoretical properties are also assessed, including weak consistency of our Dirichlet Process-Poisson-Gamma mixture over multiple resolutions. Simulation studies illustrate excellent small sample properties relative to competitors, and we apply the method to detect rare variants related to common variable immunodeficiency from whole exome sequencing data on 215 patients and over 60,027 control subjects.
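To make the notion of testing for differential rates concrete, the sketch below computes a Bayes factor for two groups at a single resolution under a conjugate Poisson-Gamma model. This is a deliberately simplified building block and not the authors' multiresolution Dirichlet process procedure. The Gamma(a, b) prior (shape a, rate b), the function names, and the counts are all assumptions made for illustration.

```python
from math import lgamma, log


def log_marginal(y, n, a, b):
    """Log marginal likelihood kernel of a total count y with total exposure n
    under rate ~ Gamma(a, b). Terms depending only on the data (n**y / y!)
    are omitted because they cancel in the Bayes factor."""
    return lgamma(a + y) - lgamma(a) + a * log(b) - (a + y) * log(b + n)


def log_bayes_factor(y1, n1, y2, n2, a=1.0, b=1.0):
    """Log Bayes factor of H1 (separate group rates) against H0 (one shared rate)."""
    log_h1 = log_marginal(y1, n1, a, b) + log_marginal(y2, n2, a, b)
    log_h0 = log_marginal(y1 + y2, n1 + n2, a, b)  # pooled counts and exposure
    return log_h1 - log_h0


# Hypothetical counts: equal exposure, very different event counts.
different = log_bayes_factor(0, 10, 20, 10)
# Hypothetical counts: identical event counts in both groups.
similar = log_bayes_factor(10, 10, 10, 10)
```

A positive value favours different rates across groups and a negative value favours a common rate. A multiresolution approach would repeat comparisons of this kind over nested partitions of the genomic region.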