Claremont Graduate University
This research investigates the use of Long Short-Term Memory (LSTM) networks to predict Major League Baseball player home run totals. It demonstrates that LSTMs can achieve higher predictive precision and lower error rates compared to traditional machine learning models and the established ZiPS projection system, exhibiting greater robustness for future predictions while also highlighting challenges in forecasting outlier performances.
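As a rough sketch of the kind of model the study describes, the following PyTorch snippet regresses a next-season home run total from a sequence of past-season stat vectors; the architecture, feature count, and hyperparameters are our illustrative assumptions, not the paper's configuration.

```python
# Illustrative LSTM regressor (assumed architecture, not the paper's exact model).
import torch
import torch.nn as nn

class HomeRunLSTM(nn.Module):
    def __init__(self, n_features: int = 8, hidden: int = 32):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)   # regress next-season HR total

    def forward(self, x):                  # x: (batch, seasons, n_features)
        out, _ = self.lstm(x)
        return self.head(out[:, -1, :])    # use the last season's hidden state

model = HomeRunLSTM()
seasons = torch.randn(16, 5, 8)            # 16 players, 5 seasons, 8 stats each
pred_hr = model(seasons)                   # (16, 1) predicted home run totals
```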
While the universal approximation property holds for both hierarchical and shallow networks, we prove that deep (hierarchical) networks can approximate the class of compositional functions with the same accuracy as shallow networks but with an exponentially lower number of training parameters as well as VC-dimension. This theorem settles an old conjecture by Bengio on the role of depth in networks. We then define a general class of scalable, shift-invariant algorithms to show a simple and natural set of requirements that justify deep convolutional networks.
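For concreteness, a standard example of the compositional class in question (our illustration, not taken from the paper) is a function of eight variables assembled from bivariate constituents along a binary tree:

```latex
f(x_1,\dots,x_8) = h_3\Bigl(h_{21}\bigl(h_{11}(x_1,x_2),\,h_{12}(x_3,x_4)\bigr),\;
                            h_{22}\bigl(h_{13}(x_5,x_6),\,h_{14}(x_7,x_8)\bigr)\Bigr).
```

A deep network can mirror this tree with one cluster of units per constituent function, whereas a shallow network must treat f as a generic function of all eight variables.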
This dissertation presents two signal processing methods using specially designed localized kernels for parameter recovery under noisy conditions. The first method addresses the estimation of frequencies and amplitudes in multidimensional exponential models. It utilizes localized trigonometric polynomial kernels to detect the multivariate frequencies, followed by a more detailed parameter estimation. We compare our method with MUSIC and ESPRIT, which are classical subspace-based algorithms widely used for estimating the parameters of exponential signals. In the univariate case, the method outperforms MUSIC and ESPRIT under low signal-to-noise ratios. For the multivariate case, we develop a coordinate-wise projection and registration approach that achieves high recovery accuracy using significantly fewer samples than other methods. The second method focuses on separating linear chirp components from time-localized signal segments. A variant of the Signal Separation Operator (SSO) is constructed using a localized kernel. Instantaneous frequency estimates are obtained via FFT-based filtering, then clustered and fitted with piecewise linear regression. The method operates without prior knowledge of the number of components and is shown to recover intersecting and discontinuous chirps at SNR levels as low as -30 dB. Both methods share an idea based on localized kernels and efficient FFT-based implementation, and neither requires subspace decomposition or sparsity regularization. Experimental results confirm the robustness and tractability of the proposed approaches across a range of simulated data conditions. Potential extensions include application to nonlinear chirps, adaptive kernel design, and signal classification using extracted features.
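A hedged sketch of the FFT-based instantaneous-frequency step follows; the localized-kernel SSO variant, clustering, and piecewise linear fitting are omitted, and all parameter values below are our own choices.

```python
# Two crossing linear chirps in noise; estimate a frequency ridge per frame.
import numpy as np
from scipy.signal import stft

fs = 1000                                          # sampling rate (Hz)
t = np.arange(0, 2, 1 / fs)
x = np.cos(2 * np.pi * (50 * t + 40 * t**2))       # chirp 1: IF = 50 + 80t Hz
x += np.cos(2 * np.pi * (250 * t - 30 * t**2))     # chirp 2: IF = 250 - 60t Hz
x += 0.5 * np.random.randn(t.size)                 # additive noise

f, tau, Z = stft(x, fs=fs, nperseg=256)
if_est = f[np.argmax(np.abs(Z), axis=0)]           # peak frequency per time frame
# if_est would next be clustered and fitted with piecewise linear regression
```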
The hypothalamic-pituitary-adrenal (HPA) axis responds to physical and mental challenges to maintain homeostasis, in part by controlling the body's cortisol level. Dysregulation of the HPA axis is implicated in numerous stress-related diseases. For a structured model of the HPA axis that includes the glucocorticoid receptor but does not take into account the system response delay, we analyze linear and non-linear stability of stationary solutions. For a second mathematical model that describes the mechanism of the HPA axis self-regulatory activities and takes into account a delay of system response, we prove existence of periodic solutions under certain assumptions on ranges of parameter values and analyze stability of these solutions with respect to the time delay value.
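A generic delayed negative-feedback equation of the kind that underlies such models (purely illustrative; the dissertation's systems track several interacting hormone and receptor concentrations) is

```latex
\frac{dC(t)}{dt} \;=\; \frac{k}{1 + \bigl(C(t-\tau)/K\bigr)^{n}} \;-\; \mu\,C(t),
```

where $C$ plays the role of a cortisol-like output, $\tau$ is the system response delay, and the Hill-type term models delayed negative feedback; periodic solutions can emerge as $\tau$ crosses a critical value.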
Sparse representation of a single measurement vector (SMV) has been explored in a variety of compressive sensing applications. Recently, SMV models have been extended to solve multiple measurement vectors (MMV) problems, where the underlying signal is assumed to have joint sparse structures. To circumvent the NP-hardness of the $\ell_0$ minimization problem, many deterministic MMV algorithms solve the convex relaxed models with limited efficiency. In this paper, we develop stochastic greedy algorithms for solving the joint sparse MMV reconstruction problem. In particular, we propose the MMV Stochastic Iterative Hard Thresholding (MStoIHT) and MMV Stochastic Gradient Matching Pursuit (MStoGradMP) algorithms, and we also utilize the mini-batching technique to further improve their performance. Convergence analysis indicates that the proposed algorithms are able to converge faster than their SMV counterparts, i.e., concatenated StoIHT and StoGradMP, under certain conditions. Numerical experiments have illustrated the superior effectiveness of the proposed algorithms over their SMV counterparts.
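A minimal sketch of a stochastic iterative-hard-thresholding step for the MMV problem is below; it uses uniform mini-batch sampling and a fixed step size, and should be read as an illustration of the idea rather than the paper's exact MStoIHT algorithm.

```python
# Row-sparse recovery from Y = A X via stochastic gradient steps plus
# row-wise hard thresholding (simplified illustration).
import numpy as np

def mmv_sto_iht(A, Y, s, n_iter=200, batch=32, step=0.1, rng=None):
    rng = np.random.default_rng(rng)
    m, n = A.shape
    X = np.zeros((n, Y.shape[1]))
    for _ in range(n_iter):
        idx = rng.choice(m, size=batch, replace=False)          # mini-batch of measurements
        G = (m / batch) * A[idx].T @ (A[idx] @ X - Y[idx])      # unbiased gradient estimate
        X = X - step * G
        keep = np.argsort(np.linalg.norm(X, axis=1))[-s:]       # s rows of largest norm
        mask = np.zeros(n, dtype=bool)
        mask[keep] = True
        X[~mask] = 0.0                                          # row-wise hard threshold
    return X

A = np.random.randn(100, 256) / 10
X_true = np.zeros((256, 4))
X_true[:8] = np.random.randn(8, 4)                              # 8 active rows
Y = A @ X_true
X_hat = mmv_sto_iht(A, Y, s=8)
```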
Researchers at USC Information Sciences Institute and collaborators developed a data-driven framework to quantify "in-group love" and "out-group hate" from contentious online discussions. The proposed discrete choice model, validated on Twitter data regarding COVID-19 masking and lockdowns, accurately reproduced observed partisan opinion divergence and identified the relative influence of in-group favoritism and out-group animosity.
A central problem in machine learning is often formulated as follows: Given a dataset $\{(x_j, y_j)\}_{j=1}^M$, which is a sample drawn from an unknown probability distribution, the goal is to construct a functional model $f$ such that $f(x) \approx y$ for any $(x, y)$ drawn from the same distribution. Neural networks and kernel-based methods are commonly employed for this task due to their capacity for fast and parallel computation. The approximation capabilities, or expressive power, of these methods have been extensively studied over the past 35 years. In this paper, we will present examples of key ideas in this area found in the literature. We will discuss emerging trends in machine learning including the role of shallow/deep networks, approximation on manifolds, physics-informed neural surrogates, neural operators, and transformer architectures. Despite function approximation being a fundamental problem in machine learning, approximation theory does not play a central role in the theoretical foundations of the field. One unfortunate consequence of this disconnect is that it is often unclear how well trained models will generalize to unseen or unlabeled data. In this review, we examine some of the shortcomings of the current machine learning framework and explore the reasons for the gap between approximation theory and machine learning practice. We will then introduce our novel research to achieve function approximation on unknown manifolds without the need to learn specific manifold features, such as the eigen-decomposition of the Laplace-Beltrami operator or atlas construction. In many machine learning problems, particularly classification tasks, the labels $y_j$ are drawn from a finite set of values.
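As a concrete instance of this formulation, the following sketch fits a model $f$ from samples $\{(x_j, y_j)\}$ using Gaussian kernel ridge regression, one of the kernel-based methods mentioned above; the bandwidth and regularization values are arbitrary illustrative choices.

```python
# Kernel ridge regression on 1-D synthetic data.
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(200, 1))                        # samples x_j
y = np.sin(3 * X[:, 0]) + 0.1 * rng.standard_normal(200)    # noisy labels y_j

gamma, lam = 10.0, 1e-3
K = np.exp(-gamma * (X - X.T) ** 2)                          # Gram matrix k(x_i, x_j)
alpha = np.linalg.solve(K + lam * np.eye(200), y)

def f(x):                                                    # the fitted model
    return np.exp(-gamma * (x - X[:, 0]) ** 2) @ alpha

print(f(0.5), np.sin(1.5))                                   # prediction vs. ground truth
```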
The problem of classification in machine learning has often been approached in terms of function approximation. In this paper, we propose an alternative approach for classification in arbitrary compact metric spaces which, in theory, yields both the number of classes, and a perfect classification using a minimal number of queried labels. Our approach uses localized trigonometric polynomial kernels initially developed for the point source signal separation problem in signal processing. Rather than point sources, we argue that the various classes come from different probability distributions. The localized kernel technique developed for separating point sources is then shown to separate the supports of these distributions. This is done in a hierarchical manner in our MASC algorithm to accommodate touching/overlapping class boundaries. We illustrate our theory on several simulated and real-life datasets, including the Salinas and Indian Pines hyperspectral datasets and a document dataset.
The problem of extending a function $f$ defined on training data $\mathcal{C}$ on an unknown manifold $\mathbb{X}$ to the entire manifold and a tubular neighborhood of this manifold is considered in this paper. For $\mathbb{X}$ embedded in a high dimensional ambient Euclidean space $\mathbb{R}^D$, a deep learning algorithm is developed for finding a local coordinate system for the manifold without eigen-decomposition, which reduces the problem to the classical problem of function approximation on a low dimensional cube. Deep nets (or multilayered neural networks) are proposed to accomplish this approximation scheme by using the training data. Our methods do not involve such optimization techniques as back-propagation, while assuring optimal (a priori) error bounds on the output in terms of the number of derivatives of the target function. In addition, these methods are universal, in that they do not require prior knowledge of the smoothness of the target function, but adjust the accuracy of approximation locally and automatically, depending only upon the local smoothness of the target function. Our ideas are easily extended to solve both the pre-image problem and the out-of-sample extension problem, with a priori bounds on the growth of the function thus extended.
The problem of real time prediction of blood glucose (BG) levels based on the readings from a continuous glucose monitoring (CGM) device is a problem of great importance in diabetes care, and therefore, has attracted a lot of research in recent years, especially based on machine learning. An accurate prediction with a 30, 60, or 90 minute prediction horizon has the potential of saving millions of dollars in emergency care costs. In this paper, we treat the problem as one of function approximation, where the value of the BG level at time $t+h$ (where $h$ is the prediction horizon) is considered to be an unknown function of $d$ readings prior to time $t$. This unknown function may be supported in particular on some unknown submanifold of the $d$-dimensional Euclidean space. While manifold learning is classically done in a semi-supervised setting, where the entire data has to be known in advance, we use recent ideas to achieve an accurate function approximation in a supervised setting; i.e., construct a model for the target function. We use the state-of-the-art clinically relevant PRED-EGA grid to evaluate our results, and demonstrate that for a real-life dataset, our method performs better than a standard deep network, especially in hypoglycemic and hyperglycemic regimes. One noteworthy aspect of this work is that the training data and test data may come from different distributions.
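The setup translates directly into code: each training example pairs $d$ consecutive CGM readings with the reading $h$ steps ahead. The sketch below uses a plain linear baseline and synthetic data; the paper's actual approximation scheme is more sophisticated.

```python
# Build (d past readings) -> (reading h steps ahead) training pairs.
import numpy as np

def make_samples(bg, d, h):
    """bg: 1-D array of CGM readings at a fixed sampling interval."""
    X = np.stack([bg[i : i + d] for i in range(len(bg) - d - h + 1)])
    y = bg[d + h - 1 :]                                # value h steps past each window
    return X, y

bg = 120 + 30 * np.sin(np.linspace(0, 20, 500))        # toy CGM trace (mg/dL)
X, y = make_samples(bg, d=7, h=6)                      # h=6 steps ~ 30 min at 5-min sampling
coef, *_ = np.linalg.lstsq(np.c_[X, np.ones(len(X))], y, rcond=None)
pred = np.c_[X, np.ones(len(X))] @ coef                # linear baseline predictor
```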
We show that, under certain smoothness conditions, a Brownian martingale, when evaluated at a fixed time, can be represented via an exponential formula at a later time. The time-dependent generator of this exponential operator only depends on the second order Malliavin derivative operator evaluated along a "frozen path". The exponential operator can be expanded explicitly to a series representation, which resembles the Dyson series of quantum mechanics. Our continuous-time martingale representation result can be proven independently by two different methods. In the first method, one constructs a time-evolution equation, by passage to the limit of a special case of a backward Taylor expansion of an approximating discrete time martingale. The exponential formula is a solution of the time-evolution equation, but we emphasize in our article that the time-evolution equation is a separate result of independent interest. In the second method, which we only highlight in this article, we use the property of denseness of exponential functions. We provide several applications of the exponential formula, and briefly highlight numerical applications of the backward Taylor expansion.
We introduce a modified rack algebra $\mathbb{Z}[X]$ for racks $X$ with finite rack rank $N$. We use representations of $\mathbb{Z}[X]$ into rings, known as rack modules, to define enhancements of the rack counting invariant for classical and virtual knots and links. We provide computations and examples to show that the new invariants are strictly stronger than the unenhanced counting invariant and are not determined by the Jones or Alexander polynomials.
We present a graph-based variational algorithm for classification of high-dimensional data, generalizing the binary diffuse interface model to the case of multiple classes. Motivated by total variation techniques, the method involves minimizing an energy functional made up of three terms. The first two terms promote a stepwise continuous classification function with sharp transitions between classes, while preserving symmetry among the class labels. The third term is a data fidelity term, allowing us to incorporate prior information into the model in a semi-supervised framework. The performance of the algorithm on synthetic data, as well as on the COIL and MNIST benchmark datasets, is competitive with state-of-the-art graph-based multiclass segmentation methods.
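Schematically, the energy has the form below (our notation, and possibly not the paper's exact scaling), where $u$ assigns each node a label vector, $L_s$ is a graph Laplacian, $W$ is a multi-well potential with one well per class, and $\mu_i$ switches the fidelity term on at labeled nodes:

```latex
E(u) \;=\; \frac{\varepsilon}{2}\,\langle u, L_s u\rangle
\;+\; \frac{1}{\varepsilon}\sum_{i} W(u_i)
\;+\; \sum_{i} \frac{\mu_i}{2}\,\|u_i - \hat u_i\|^2 .
```

The first term penalizes disagreement between linked nodes, the second forces each node toward a pure class label, and the third keeps labeled nodes near their given labels $\hat u_i$.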
Despite its importance, studying economic behavior across diverse, non-WEIRD (Western, Educated, Industrialized, Rich, and Democratic) populations presents significant challenges. We address this issue by introducing a novel methodology that uses Large Language Models (LLMs) to create synthetic cultural agents (SCAs) representing these populations. We subject these SCAs to classic behavioral experiments, including the dictator and ultimatum games. Our results demonstrate substantial cross-cultural variability in experimental behavior. Notably, for populations with available data, SCAs' behaviors qualitatively resemble those of real human subjects. For unstudied populations, our method can generate novel, testable hypotheses about economic behavior. By integrating AI into experimental economics, this approach offers an effective and ethical method to pilot experiments and refine protocols for hard-to-reach populations. Our study provides a new tool for cross-cultural economic studies and demonstrates how LLMs can help experimental behavioral research.
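A hedged sketch of the synthetic-cultural-agent idea follows; `query_llm` is a hypothetical stand-in for whatever chat-completion client is used, and the prompt wording is ours, not the paper's protocol.

```python
# Hypothetical sketch: an LLM-backed synthetic cultural agent playing the
# dictator game. `query_llm` is a placeholder, not a real client library.
def query_llm(prompt: str) -> str:
    raise NotImplementedError("plug in an actual chat-completion client")

def dictator_game_offer(culture: str, endowment: int = 100) -> str:
    prompt = (
        f"You are a person from {culture}. You have received {endowment} "
        "tokens and may share any amount with an anonymous stranger who "
        "otherwise gets nothing. How many tokens do you give? "
        "Answer with a single number."
    )
    return query_llm(prompt)
```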
This paper introduces kdiff, a novel kernel-based measure for estimating distances between instances of time series, random fields and other forms of structured data. This measure is based on the idea of matching distributions that only overlap over a portion of their region of support. Our proposed measure is inspired by MPdist, which has been previously proposed for such datasets and is constructed using Euclidean metrics, whereas kdiff is constructed using non-linear kernel distances. Also, kdiff accounts for both self and cross similarities across the instances and is defined using a lower quantile of the distance distribution. Comparing the cross similarity to the self similarity allows for measures of similarity that are more robust to noise and partial occlusions of the relevant signals. Our proposed measure kdiff is a more general form of the well-known kernel-based Maximum Mean Discrepancy (MMD) distance estimated over the embeddings. Some theoretical results are provided for separability conditions using kdiff as a distance measure for clustering and classification problems where the embedding distributions can be modeled as two-component mixtures. Applications are demonstrated for clustering of synthetic and real-life time series and image data, and the performance of kdiff is compared to competing distance measures for clustering.
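The quantile idea can be sketched as follows, using a Gaussian-kernel-induced distance; the exact way kdiff combines the self and cross distance distributions in the paper may differ from this simplified form.

```python
# Compare a lower quantile of cross-instance kernel distances to the
# corresponding self-distance quantiles (simplified illustration).
import numpy as np

def kernel_dist(a, b, gamma=0.5):
    """Gaussian-kernel-induced distance between all pairs of rows."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.sqrt(np.maximum(2 - 2 * np.exp(-gamma * d2), 0.0))

def kdiff_sketch(x, y, q=0.1):
    cross = kernel_dist(x, y).ravel()
    self_x = kernel_dist(x, x).ravel()
    self_y = kernel_dist(y, y).ravel()
    # cross similarity measured against the self-similarity baseline
    return np.quantile(cross, q) - 0.5 * (np.quantile(self_x, q) +
                                          np.quantile(self_y, q))

x = np.random.randn(50, 3)
y = np.random.randn(50, 3) + 1.0
print(kdiff_sketch(x, y))
```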
This research proposes an add-on that empowers Google Forms to act as an automatic generator of multiple-choice questions (MCQs) for online assessments. In this paper, we elaborate an add-on design mainly comprising question-formulating software and data storage. The algorithm at the intellectual core of this software can produce MCQs at an analytical level. In an experiment, we found that these MCQs could assess levels of students' knowledge comparably with those generated by human experts. The add-on can be applied generally to formulate MCQs for any rational concepts. With no effort from an instructor at runtime, the add-on can transform a few data instances describing rational concepts into varied sets of MCQs.
This research proposes an artificial intelligence algorithm comprising ontology-based design, text mining, and natural language processing for automatically generating gap-fill multiple-choice questions (MCQs). A simulation demonstrated an application of the algorithm to generating gap-fill MCQs about software testing. The simulation results revealed that, using 103 online documents as inputs, the algorithm could automatically produce more than 16,000 valid gap-fill MCQs covering a variety of topics in the software testing domain. Finally, in the discussion section of this paper we suggest how the proposed algorithm could be applied to produce gap-fill MCQs for a question pool used by a knowledge expert system.
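A toy illustration of the gap-fill idea appears below; the paper's ontology-driven pipeline is far richer, and the sentence and terms are our own example from the software testing domain.

```python
# Blank a key term in a sentence and draw distractors from related terms.
import random

sentence = "Boundary value analysis is a black-box test design technique."
key_term = "Boundary value analysis"
distractors = ["Equivalence partitioning", "Decision table testing",
               "State transition testing"]          # domain-related terms

stem = sentence.replace(key_term, "_____")
options = random.sample(distractors, 2) + [key_term]
random.shuffle(options)
print(stem)
for label, opt in zip("ABC", options):
    print(f"  {label}. {opt}")
```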
Climate change is affecting every known society, especially small farmers in Low-Income Countries, because they depend heavily on rain, seasonality patterns, and known temperature ranges. To build climate change resilient communities among rural farmers, the first step is to understand the impact of climate change on the population. This paper proposes a Climate Change Vulnerability Assessment Framework (CCVAF) to assess climate change vulnerabilities among rural farmers. The framework uses information and communication technology (ICT) and integrates both community-level and individual household-level indicators. The CCVAF was instantiated as a GIS-based web application named THRIVE that helps different decision-makers better assess how climate change is affecting rural farmers in Western Honduras. Qualitative evaluation of THRIVE showed that it is an innovative and useful tool. The CCVAF contributes not only to the knowledge base of climate change vulnerability assessment but also to the design science literature by providing guidelines for designing a class of climate change vulnerability assessment solutions.
Spectral clustering is widely used to partition graphs into distinct modules or communities. Existing methods for spectral clustering use the eigenvalues and eigenvectors of the graph Laplacian, an operator that is closely associated with random walks on graphs. We propose a new spectral partitioning method that exploits the properties of epidemic diffusion. An epidemic is a dynamic process that, unlike the random walk, simultaneously transitions to all the neighbors of a given node. We show that the replicator, an operator describing epidemic diffusion, is equivalent to the symmetric normalized Laplacian of a reweighted graph with edges reweighted by the eigenvector centralities of their incident nodes. Thus, more weight is given to edges connecting more central nodes. We describe a method that partitions the nodes based on the componentwise ratio of the replicator's second eigenvector to the first, and compare its performance to traditional spectral clustering techniques on synthetic graphs with known community structure. We demonstrate that the replicator gives preference to dense, clique-like structures, enabling it to more effectively discover communities that may be obscured by dense intercommunity linking.
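The recipe described above can be sketched compactly; this is our simplified reading (bisection at the median component-wise ratio) rather than the authors' exact procedure.

```python
# Replicator-style bisection of an undirected graph given by adjacency A.
import numpy as np

def replicator_partition(A):
    # eigenvector centrality: leading eigenvector of A (Perron vector)
    w, V = np.linalg.eigh(A)
    c = np.abs(V[:, -1])
    B = A * np.outer(c, c)                  # reweight edges by incident centralities
    d = B.sum(1)
    Ls = np.eye(len(A)) - B / np.sqrt(np.outer(d, d))  # symmetric normalized Laplacian
    w2, V2 = np.linalg.eigh(Ls)
    ratio = V2[:, 1] / V2[:, 0]             # second eigenvector over first, componentwise
    return ratio > np.median(ratio)         # bisect by the ratio

# two dense blocks joined by a few edges (toy community structure)
A = np.zeros((20, 20))
A[:10, :10] = A[10:, 10:] = 1
A[3, 14] = A[14, 3] = A[7, 12] = A[12, 7] = 1
np.fill_diagonal(A, 0)
print(replicator_partition(A).astype(int))
```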
Biological macromolecules, including nucleic acids, proteins, and glycosaminoglycans, are typically anionic and can span domains of up to hundreds of nanometers and even micron length scales. These structures exist in crowded environments dominated by weak multivalent electrostatic interactions that can be modeled using mean field continuum approaches that represent the underlying molecular nanoscale biophysics. We develop such models for glycosaminoglycan brushes using both steady state modified Poisson-Boltzmann models and transient Poisson-Nernst-Planck models that incorporate important ion-specific (Hofmeister) effects. The results quantify how electroneutrality is attained through ion electrophoresis, dielectric decrement hydration forces, and ion-specific pairing. Brush-salt interfacial profiles of the electrostatic potential, as well as of bound and unbound ions, are characterized for imposed jump conditions across the interface. The models should be applicable to many intrinsically disordered biophysical environments and are anticipated to provide insight into the design and development of therapeutics and drug-delivery vehicles to improve human health.
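Schematically, a modified Poisson-Boltzmann relation of this type augments the Boltzmann factor with an ion-specific excess chemical potential $\mu_i^{\mathrm{ex}}$ (our generic notation; the dissertation's models include additional terms):

```latex
-\nabla\cdot\bigl(\epsilon(\mathbf{x})\,\nabla\phi\bigr)
  \;=\; \rho_{\mathrm{brush}}(\mathbf{x})
  \;+\; \sum_i z_i e\, c_i^{\infty}
        \exp\!\Bigl(-\bigl(z_i e\,\phi + \mu_i^{\mathrm{ex}}(\mathbf{x})\bigr)/k_B T\Bigr),
```

where $\phi$ is the electrostatic potential, $\rho_{\mathrm{brush}}$ the fixed brush charge, and the $\mu_i^{\mathrm{ex}}$ terms carry the ion-specific (Hofmeister) physics.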