Although the performance of Temporal Action Segmentation (TAS) has improved in recent years, achieving promising results often comes at a high computational cost due to dense inputs, complex model structures, and resource-intensive post-processing requirements. To improve efficiency while preserving performance, we present a novel perspective centered on per-segment classification. By harnessing the capabilities of Transformers, we tokenize each video segment as an instance token endowed with intrinsic instance segmentation. To realize efficient action segmentation, we introduce BaFormer, a boundary-aware Transformer network. It employs instance queries for instance segmentation and a global query for class-agnostic boundary prediction, yielding continuous segment proposals. During inference, BaFormer applies a simple yet effective voting strategy to classify boundary-wise segments based on the instance segmentation. Remarkably, as a single-stage approach, BaFormer significantly reduces computational cost, using only 6% of the running time of the state-of-the-art method DiffAct while producing better or comparable accuracy on several popular benchmarks. The code for this project is publicly available at this https URL.
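The voting step lends itself to a compact illustration. Below is a minimal sketch (ours, not the authors' released code) of majority voting over boundary-delimited segments; `frame_logits`, `boundaries`, and `classify_segments` are hypothetical names standing in for BaFormer's instance-segmentation output and class-agnostic boundary predictions.

```python
# Minimal sketch of boundary-wise majority voting: frame-level class
# predictions are pooled within each boundary-delimited segment, and the
# majority class wins. Inputs are illustrative stand-ins, not BaFormer's API.
import numpy as np

def classify_segments(frame_logits: np.ndarray, boundaries: list[int]) -> list[int]:
    """frame_logits: (T, C) per-frame class scores; boundaries: sorted frame
    indices where new segments begin (must include 0)."""
    edges = boundaries + [frame_logits.shape[0]]               # close last segment
    labels = []
    for start, end in zip(edges[:-1], edges[1:]):
        frame_votes = frame_logits[start:end].argmax(axis=1)   # per-frame vote
        labels.append(int(np.bincount(frame_votes).argmax()))  # majority class
    return labels

# Example: 8 frames, 3 classes, segments starting at frames 0 and 5.
rng = np.random.default_rng(0)
print(classify_segments(rng.random((8, 3)), [0, 5]))
```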
Despite an extensive body of literature on trust in technology, designing trustworthy AI systems for high-stakes decision domains remains a significant challenge, further compounded by the lack of actionable design and evaluation tools. The Multisource AI Scorecard Table (MAST) was designed to bridge this gap by offering a systematic, tradecraft-centered approach to evaluating AI-enabled decision support systems. Expanding on MAST, we introduce an iterative design framework called \textit{Principles-based Approach for Designing Trustworthy, Human-centered AI using MAST Methodology} (PADTHAI-MM). We demonstrate this framework in our development of the Reporting Assistant for Defense and Intelligence Tasks (READIT), a research platform that leverages data visualizations and natural language processing-based text analysis, emulating an AI-enabled system supporting intelligence reporting work. To empirically assess the efficacy of MAST on trust in AI, we developed two distinct iterations of READIT for comparison: a High-MAST version, which incorporates AI contextual information and explanations, and a Low-MAST version, akin to a ``black box'' system. This iterative design process, guided by stakeholder feedback and contemporary AI architectures, culminated in a prototype that was evaluated through its use in an intelligence reporting task. We further discuss the potential benefits of employing the MAST-inspired design framework to address context-specific needs. We also explore the relationship between stakeholder evaluators' MAST ratings and three categories of information known to impact trust: \textit{process}, \textit{purpose}, and \textit{performance}. Overall, our study supports the practical benefits and theoretical validity of PADTHAI-MM as a viable method for designing trustworthy, context-specific AI systems.
We analyze Oja's algorithm for streaming $k$-PCA and prove that it achieves performance nearly matching that of an optimal offline algorithm. Given access to a sequence of i.i.d. $d \times d$ symmetric matrices, we show that Oja's algorithm can obtain an accurate approximation to the subspace of the top $k$ eigenvectors of their expectation using a number of samples that scales polylogarithmically with $d$. Previously, such a result was only known in the case where the updates have rank one. Our analysis is based on recently developed matrix concentration tools, which allow us to prove strong bounds on the tails of the random matrices which arise in the course of the algorithm's execution.
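For orientation, here is a minimal sketch of the textbook form of Oja's algorithm for streaming $k$-PCA (not necessarily the exact variant analyzed in the paper): a stochastic power-iteration step followed by QR re-orthonormalization. The function name and constant step size are illustrative choices.

```python
# Textbook Oja update for streaming k-PCA on i.i.d. symmetric samples A_t
# with E[A_t] = Sigma; returns an orthonormal basis approximating the
# top-k eigenspace of Sigma.
import numpy as np

def oja_k_pca(samples, d: int, k: int, eta: float = 0.01) -> np.ndarray:
    rng = np.random.default_rng(0)
    Q, _ = np.linalg.qr(rng.standard_normal((d, k)))  # random orthonormal start
    for A in samples:
        Q = Q + eta * (A @ Q)         # stochastic power-iteration step
        Q, _ = np.linalg.qr(Q)        # re-orthonormalize the iterate
    return Q

# Example with rank-one updates x x^T, x drawn with per-coordinate scales.
d, k = 50, 3
rng = np.random.default_rng(1)
xs = rng.standard_normal((5000, d)) * np.linspace(2.0, 0.1, d)
Q = oja_k_pca((np.outer(x, x) for x in xs), d, k)
```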
Gun violence is a critical security problem, and it is imperative for the computer vision community to develop effective gun detection algorithms for real-world scenarios, particularly in Closed Circuit Television (CCTV) surveillance data. Despite significant progress in visual object detection, detecting guns in real-world CCTV images remains a challenging and under-explored task. Firearms, especially handguns, are typically very small in size, non-salient in appearance, and often severely occluded or indistinguishable from other small objects. Additionally, the lack of principled benchmarks and the difficulty of collecting relevant datasets further hinder algorithmic development. In this paper, we present a meticulously crafted and annotated benchmark, called \textbf{CCTV-Gun}, which addresses the challenges of detecting handguns in real-world CCTV images. Our contribution is three-fold. Firstly, we carefully select and analyze real-world CCTV images from three datasets, manually annotate handguns and their holders, and label each image with relevant challenge factors such as blur and occlusion. Secondly, we propose a new cross-dataset evaluation protocol in addition to the standard intra-dataset protocol, which is vital for gun detection in practical settings. Finally, we comprehensively evaluate both classical and state-of-the-art object detection algorithms, providing an in-depth analysis of their generalization abilities. The benchmark will facilitate further research and development on this topic and ultimately enhance security. Code, annotations, and trained models are available at this https URL.
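A schematic sketch of the two evaluation protocols may help: intra-dataset pairs match train and test sources, while cross-dataset pairs do not. `train_detector` and `evaluate_map` are hypothetical stand-ins for any detector training and evaluation routine; the benchmark's actual tooling lives in the linked repository.

```python
# Schematic intra- vs cross-dataset evaluation loop over the three source
# datasets. Hypothetical callables stand in for real training/eval code.
def run_protocols(datasets: dict, train_detector, evaluate_map) -> dict:
    results = {}
    for train_name, train_split in datasets.items():
        model = train_detector(train_split["train"])
        for test_name, test_split in datasets.items():
            protocol = "intra" if train_name == test_name else "cross"
            results[(train_name, test_name, protocol)] = evaluate_map(
                model, test_split["test"])   # e.g. mAP on the handgun class
    return results
```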
The paper constructs a multivariate Hawkes process model of Bitcoin block arrivals and price jumps. Hawkes processes are self-exciting point processes that can capture the self- and cross-excitation effects of block mining and Bitcoin price volatility. We use publicly available blockchain datasets to estimate the model parameters via maximum likelihood estimation. The results show that Bitcoin price volatility boosts the block mining rate and that Bitcoin investment returns exhibit mean reversion. Quantile-Quantile plots show that the proposed Hawkes process model is a better fit to the blockchain datasets than a Poisson process model.
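As a pointer to how such a fit works, here is a minimal sketch of maximum likelihood estimation for a univariate Hawkes process with exponential kernel $\lambda(t) = \mu + \sum_{t_i < t} a\,e^{-b(t - t_i)}$. The paper fits a multivariate model to block arrivals and price jumps; this one-dimensional toy, with illustrative names and synthetic event times, only conveys the estimation principle.

```python
# Univariate Hawkes MLE with exponential kernel, using the standard
# recursion A_i = exp(-b*(t_i - t_{i-1})) * (A_{i-1} + 1) for the
# excitation sum, and the closed-form compensator integral.
import numpy as np
from scipy.optimize import minimize

def neg_log_lik(params, times, T):
    mu, a, b = params
    A, ll = 0.0, 0.0
    for i, t in enumerate(times):
        if i > 0:
            A = np.exp(-b * (t - times[i - 1])) * (A + 1.0)
        ll += np.log(mu + a * A)                              # event term
    comp = mu * T + (a / b) * np.sum(1.0 - np.exp(-b * (T - times)))
    return -(ll - comp)                                       # minus log-likelihood

times = np.sort(np.random.default_rng(2).uniform(0, 100, 200))
fit = minimize(neg_log_lik, x0=[1.0, 0.5, 1.0], args=(times, 100.0),
               bounds=[(1e-6, None)] * 3, method="L-BFGS-B")
print(fit.x)   # estimated (mu, a, b)
```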
The elliptic reconstruction property, originally introduced by Makridakis and Nochetto for linear parabolic problems, is a well-known tool for deriving optimal a posteriori error estimates. No such results are known for nonlinear and nonsmooth problems such as parabolic variational inequalities (VIs). This article establishes the elliptic reconstruction property for parabolic VIs and derives a posteriori error estimates in $L^{\infty}(0,T;L^{2}(\Omega))$. The estimator consists of discrete complementarity terms and a standard residual term. As an application, residual-type error estimates are presented.
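For orientation, here is a schematic statement of the elliptic reconstruction property in the linear parabolic case of Makridakis and Nochetto; the paper's extension to variational inequalities, with its complementarity terms, is not reproduced here.

```latex
% Given the semidiscrete solution u_h(t) in V_h with discrete elliptic
% operator A_h, define the reconstruction \tilde{u}(t) in V by
\[
  a\bigl(\widetilde{u}(t), v\bigr) = \bigl(A_h u_h(t), v\bigr)
  \qquad \forall\, v \in V,
\]
% so that u_h(t) is exactly the Ritz projection of \tilde{u}(t). The error
% then splits as
\[
  u - u_h = \underbrace{\,u - \widetilde{u}\,}_{\text{parabolic part}}
          + \underbrace{\,\widetilde{u} - u_h\,}_{\text{elliptic part}},
\]
% with the elliptic part controlled by standard elliptic a posteriori
% estimators at each time.
```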
This paper develops a non-asymptotic, local approach to quantitative propagation of chaos for a wide class of mean field diffusive dynamics. For a system of $n$ interacting particles, the relative entropy between the marginal law of $k$ particles and its limiting product measure is shown to be $O((k/n)^2)$ at each time, as long as the same is true at time zero. A simple Gaussian example shows that this rate is optimal. The main assumption is that the limiting measure obeys a certain functional inequality, which is shown to encompass many potentially irregular but not too singular finite-range interactions, as well as some infinite-range interactions. This unifies the previously disparate cases of Lipschitz versus bounded measurable interactions, improving the best prior bounds of $O(k/n)$, which were deduced from global estimates involving all $n$ particles. We also cover a class of models for which qualitative propagation of chaos and even well-posedness of the McKean-Vlasov equation were previously unknown. At the center of the new approach is a differential inequality, derived from a form of the BBGKY hierarchy, which bounds the $k$-particle entropy in terms of the $(k+1)$-particle entropy.
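The optimality claim admits a short worked version (our sketch, assuming an equicorrelated Gaussian system with correlation $\rho = c/n$, whose $k$-particle marginal is $N(0,\Sigma)$ with $\Sigma = (1-\rho)I + \rho J$, $J$ the all-ones matrix).

```latex
% Relative entropy of the k-dimensional marginal N(0, Sigma) with respect
% to the product measure N(0, I); here tr(Sigma) = k, so only the
% log-determinant survives:
\[
  H = \tfrac{1}{2}\bigl(\operatorname{tr}\Sigma - k - \log\det\Sigma\bigr)
    = -\tfrac{1}{2}\Bigl[\log\bigl(1 + (k-1)\rho\bigr) + (k-1)\log(1-\rho)\Bigr].
\]
% Expanding log(1+x) = x - x^2/2 + O(x^3) for rho = c/n small gives
\[
  H = \tfrac{\rho^{2}}{4}\,k(k-1) + O\bigl(k^{3}\rho^{3}\bigr)
    = \Theta\!\bigl((k/n)^{2}\bigr),
\]
% matching the rate above and showing it cannot be improved.
```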
We address the problem of detecting a change in the distribution of a high-dimensional multivariate normal time series. Assuming that the post-change parameters are unknown and estimated using a window of historical data, we extend the framework of quickest change detection (QCD) to the high-dimensional setting in which the number of variables increases proportionally with the size of the window used to estimate the post-change parameters. Our analysis reveals that an information theoretic quantity, which we call the Normalized High-Dimensional Kullback-Leibler divergence (NHDKL), governs the high-dimensional asymptotic performance of QCD procedures. Specifically, we show that the detection delay is asymptotically inversely proportional to the difference between the NHDKL of the true post-change versus pre-change distributions and the NHDKL of the true versus estimated post-change distributions. In cases of perfect estimation, where the latter NHDKL is zero, the delay is inversely proportional to the NHDKL between the post-change and pre-change distributions alone. Thus, our analysis is a direct generalization of the traditional fixed-dimension, large-sample asymptotic framework, where the standard KL divergence is asymptotically inversely proportional to detection delay. Finally, we identify parameter estimators that asymptotically minimize the NHDKL between the true versus estimated post-change distributions, resulting in a QCD method that is guaranteed to outperform standard approaches based on fixed-dimension asymptotics.
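To make the QCD setting concrete, the following is a minimal sketch of a CUSUM-type detector using an estimated post-change Gaussian mean. This is the textbook statistic, not the paper's high-dimensional procedure; the threshold, dimension, and noise in the estimate are illustrative.

```python
# CUSUM detector built from log-likelihood ratios of N(mu_post_hat, s2*I)
# versus N(mu_pre, s2*I), where mu_post_hat is an imperfect estimate of
# the true post-change mean (mirroring the estimation-error theme above).
import numpy as np

def cusum_detect(stream, mu_pre, mu_post_hat, s2: float, threshold: float):
    """Return the first index where the CUSUM statistic crosses `threshold`."""
    W = 0.0
    for t, x in enumerate(stream):
        llr = ((x - mu_pre) ** 2 - (x - mu_post_hat) ** 2).sum() / (2 * s2)
        W = max(0.0, W + llr)          # reflected random walk
        if W >= threshold:
            return t
    return None

rng = np.random.default_rng(3)
d, nu = 100, 300                                    # dimension, true change point
mu_post_hat = 0.2 * np.ones(d) + 0.05 * rng.standard_normal(d)   # noisy estimate
pre = rng.standard_normal((nu, d))
post = 0.2 + rng.standard_normal((200, d))
print(cusum_detect(np.vstack([pre, post]), np.zeros(d), mu_post_hat, 1.0, 50.0))
```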
We present a numerical model and a set of conservative algorithms for Non-Maxwellian plasma kinetics with inelastic collisions. These algorithms self-consistently solve for the time evolution of an isotropic electron energy distribution function interacting with an atomic state distribution function of an arbitrary number of levels through collisional excitation and de-excitation, as well as ionization and recombination. Electron-electron collisions, responsible for thermalization of the electron distribution, are also included in the model. The proposed algorithms guarantee mass/charge and energy conservation in a single step, and are applied to the case of non-uniform gridding of the energy axis in the phase space of the electron distribution function. Numerical test cases are shown to demonstrate the accuracy of the method and its conservation properties.
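To illustrate the kind of single-step conservation at stake, here is a minimal sketch (our illustration, not the paper's scheme) of a standard two-point deposition rule: when a collision moves electron density to an energy that falls between two points of a non-uniform grid, splitting it with an energy-matching weight conserves particle number and energy exactly.

```python
# Deposit number density dn at energy eps_new onto a non-uniform energy
# grid so that both sum(f) (number) and sum(grid*f) (energy) are exact.
import numpy as np

def deposit(grid: np.ndarray, f: np.ndarray, eps_new: float, dn: float):
    j = np.searchsorted(grid, eps_new) - 1            # bracketing interval
    j = int(np.clip(j, 0, len(grid) - 2))
    a = (grid[j + 1] - eps_new) / (grid[j + 1] - grid[j])  # energy-matching weight
    f[j] += a * dn                      # a*e_j + (1-a)*e_{j+1} = eps_new exactly
    f[j + 1] += (1.0 - a) * dn

grid = np.array([0.0, 0.5, 1.2, 2.0, 3.5])            # non-uniform energy axis
f = np.zeros_like(grid)
deposit(grid, f, eps_new=1.5, dn=2.0)
print(f.sum(), (grid * f).sum())                      # -> 2.0, 3.0 exactly
```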
We show that contrary to appearances, Multimodal Type Theory (MTT) over a 2-category M can be interpreted in any M-shaped diagram of categories having, and functors preserving, M-sized limits, without the need for extra left adjoints. This is achieved by a construction called "co-dextrification" that co-freely adds left adjoints to any such diagram, which can then be used to interpret the "context lock" functors of MTT. Furthermore, if any of the functors in the diagram have right adjoints, these can also be internalized in type theory as negative modalities in the style of FitchTT. We introduce the name Multimodal Adjoint Type Theory (MATT) for the resulting combined general modal type theory. In particular, we can interpret MATT in any finite diagram of toposes and geometric morphisms, with positive modalities for inverse image functors and negative modalities for direct image functors.
We consider the Selective Harmonic Modulation (SHM) problem, consisting in the design of a staircase control signal with some prescribed frequency components. In this work, we propose a novel methodology to address SHM as an optimal control problem in which the admissible controls are piecewise constant functions, taking values only in a given finite set. In order to fulfill this constraint, we introduce a cost functional with piecewise affine penalization for the control, which, by means of Pontryagin's maximum principle, makes the optimal control have the desired staircase form. Moreover, the addition of the penalization term for the control provides uniqueness and continuity of the solution with respect to the target frequencies. Another advantage of our approach is that the number of switching angles and the waveform need not be determined a priori. Indeed, the solution to the optimal control problem is the entire control signal, and therefore, it determines the waveform and the location of the switches. We also provide several numerical examples in which the SHM problem is solved by means of our approach.
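Schematically, the formulation described above can be written as follows (our sketch; the symbols $a(u)$, $a_T$, $\beta$, and $L$ are illustrative stand-ins for the paper's notation).

```latex
% Penalized optimal control form of SHM (schematic). The control u takes
% values in a finite admissible set U = {u_1 < ... < u_m}; L is a piecewise
% affine penalty minimized exactly on U, so by Pontryagin's maximum
% principle the optimizer is a staircase function valued in U:
\[
  \min_{u}\; J(u) = \tfrac{1}{2}\,\bigl\| a(u) - a_T \bigr\|^{2}
  + \beta \int_{0}^{2\pi} L\bigl(u(\tau)\bigr)\, d\tau,
\]
% where a(u) collects the prescribed frequency components of the signal u
% and a_T is the target harmonic content.
```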
Optics offers unique opportunities for reducing energy in information processing and communications while resolving the problem of interconnect bandwidth density inside machines. Such energy dissipation overall is now at environmentally significant levels; the source of that dissipation is progressively shifting from logic operations to interconnect energies. Without the prospect of substantial reduction in energy per bit communicated, we cannot continue the exponential growth of our use of information. The physics of optics and optoelectronics fundamentally addresses both interconnect energy and bandwidth density, and optics may be the only scalable solution to such problems. Here we summarize the corresponding background, status, opportunities, and research directions for optoelectronic technology and novel optics, including sub-femtojoule devices in waveguide and novel 2D-array optical systems. We compare different approaches to low-energy optoelectronic output devices and their scaling, including lasers, modulators and LEDs, optical confinement approaches (such as resonators) to enhance effects, and the benefits of different material choices, including 2D materials and other quantum-confined structures. Beyond the elimination of line charging by the use of optical connections, the next major interconnect dissipations are in the electronic circuits for receiver amplifiers, timing recovery and multiplexing. We can address these through the integration of photodetectors to reduce or eliminate receiver circuit energies, free-space optics to eliminate the need for timing and multiplexing circuits (while solving bandwidth density problems), and using optics generally to save power by running large synchronous systems. One target concept is interconnects from ~1 cm to ~10 m that have the same energy (~10 fJ/bit) and simplicity as local electrical wires on chip.
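To put the target in perspective, a back-of-the-envelope calculation (our arithmetic, using only the ~10 fJ/bit figure above and an assumed aggregate bandwidth) shows what such links would dissipate at scale.

```latex
% Interconnect power at an assumed aggregate bandwidth of 10^15 bit/s:
\[
  P = 10\,\mathrm{fJ/bit} \times 10^{15}\,\mathrm{bit/s}
    = 10^{-14}\,\mathrm{J/bit} \times 10^{15}\,\mathrm{bit/s}
    = 10\,\mathrm{W},
\]
% versus about 1 kW for the same traffic at ~1 pJ/bit, an energy scale
% typical of present-day electrical off-chip links.
```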
Building on structure observed in equivariant homotopy theory, we define an equivariant generalization of a symmetric monoidal category: a $G$-symmetric monoidal category. These record not only the symmetric monoidal products but also symmetric monoidal powers indexed by arbitrary finite $G$-sets. We then define $G$-commutative monoids to be the natural extension of ordinary commutative monoids to this new context. Using this machinery, we then describe when Bousfield localization in equivariant spectra preserves certain operadic algebra structures, and we explore the consequences of our definitions for categories of modules over a $G$-commutative monoid.
Many Gibbs measures with mean field interactions are known to be chaotic, in the sense that any $k$ particles in the $n$-particle system are asymptotically independent as $n \to \infty$, with $k$ fixed or perhaps $k = o(n)$. This paper quantifies this notion for a class of continuous Gibbs measures on Euclidean space with pairwise interactions, the main examples being systems governed by convex interactions and uniformly convex confinement potentials. The distance between the marginal law of $k$ particles and its limiting product measure is shown to be $O((k/n)^{c \wedge 2})$, with $c$ proportional to the squared temperature. In the high temperature case, this improves upon prior results based on subadditivity of entropy, which yield $O(k/n)$ at best. The bound $O((k/n)^2)$ cannot be improved, as a Gaussian example demonstrates. The results are non-asymptotic, and distance is quantified via relative Fisher information, relative entropy, or the squared quadratic Wasserstein metric. The method relies on an a priori functional inequality for the limiting measure, used to derive an estimate for the $k$-particle distance in terms of the $(k+1)$-particle distance.
Given a family $\mathcal{H}$ of graphs, we say that a graph $G$ is $\mathcal{H}$-free if no induced subgraph of $G$ is isomorphic to a member of $\mathcal{H}$. Let $W_{t\times t}$ be the $t$-by-$t$ hexagonal grid and let $\mathcal{L}_t$ be the family of all graphs $G$ such that $G$ is the line graph of some subdivision of $W_{t\times t}$. We denote by $\omega(G)$ the size of the largest clique in $G$. We prove that for every integer $t$ there exist integers $c_1(t)$, $c_2(t)$ and $d(t)$ such that every (pyramid, theta, $\mathcal{L}_t$)-free graph $G$ satisfies: i) $G$ has a tree decomposition where every bag has size at most $\omega(G)^{c_1(t)} \log(|V(G)|)$; ii) if $G$ has at least two vertices, then $G$ has a tree decomposition where every bag has independence number at most $\log^{c_2(t)}(|V(G)|)$; iii) for any weight function, $G$ has a balanced separator that is contained in the union of the neighborhoods of at most $d(t)$ vertices. These results qualitatively generalize the main theorems of Abrishami et al. (2022) and Chudnovsky et al. (2024). Additionally, we show that there exist integers $c_3(t)$, $c_4(t)$ such that for every (theta, pyramid)-free graph $G$ and every non-adjacent pair of vertices $a, b \in V(G)$: i) $a$ can be separated from $b$ by removing at most $\omega(G)^{c_3(t)}\log(|V(G)|)$ vertices; ii) $a$ can be separated from $b$ by removing a set of vertices with independence number at most $\log^{c_4(t)}(|V(G)|)$.
Let $M_n$ be a class of symmetric sparse random matrices with independent entries $M_{ij} = \delta_{ij} \xi_{ij}$ for $i \leq j$, where the $\delta_{ij}$ are i.i.d. Bernoulli random variables taking the value $1$ with probability $p \geq n^{-1+\delta}$ for any constant $\delta > 0$, and the $\xi_{ij}$ are i.i.d. centered, subgaussian random variables. We show that with high probability this class of random matrices has simple spectrum (i.e., the eigenvalues appear with multiplicity one). We can slightly modify our proof to show that the adjacency matrix of a sparse Erdős-Rényi graph has simple spectrum for $n^{-1+\delta} \leq p \leq 1 - n^{-1+\delta}$. These results are optimal in the exponent. The result for graphs has connections to the notorious graph isomorphism problem.
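A quick numerical illustration (ours, not from the paper) of the simple-spectrum phenomenon: sample a sparse symmetric matrix from this model and check that the minimal eigenvalue gap is strictly positive.

```python
# Sample a sparse symmetric Bernoulli(p)-masked Gaussian matrix and verify
# numerically that all eigenvalues are distinct (positive minimal gap).
import numpy as np

rng = np.random.default_rng(4)
n, p = 300, 0.05                       # p comfortably above n^{-1+delta}
xi = rng.standard_normal((n, n))
mask = rng.random((n, n)) < p
M = np.triu(mask * xi)                 # independent upper triangle (with diagonal)
M = M + M.T - np.diag(np.diag(M))      # symmetrize without doubling the diagonal
gaps = np.diff(np.linalg.eigvalsh(M))  # eigvalsh returns sorted eigenvalues
print(gaps.min())                      # > 0: the spectrum is simple
```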
We will show that tight frames satisfying the restricted isometry property give rise to nearly tight fusion frames which are nearly orthogonal and hence are nearly equi-isoclinic. We will also show how to replace parts of the RIP frame with orthonormal sets while maintaining the RIP property.
We investigate covariance shrinkage for Hotelling's $T^2$ in the regime where the data dimension $p$ and the sample size $n$ grow in a fixed ratio, without assuming that the population covariance matrix is spiked or well-conditioned. When $p/n \to \phi \in (0,1)$, we propose a practical finite-sample shrinker that, for any maximum-entropy signal prior and any fixed significance level, (a) asymptotically maximizes power under Gaussian data, and (b) asymptotically saturates the Hanson--Wright lower bound on power in the more general sub-Gaussian case. Our approach is to formulate and solve a variational problem characterizing the optimal limiting shrinker, and to show that our finite-sample method consistently approximates this limit by extending recent local random matrix laws. Empirical studies on simulated and real-world data, including the Crawdad UMich/RSS data set, demonstrate up to a $50\%$ gain in power over leading linear and nonlinear competitors at a significance level of $10^{-4}$.
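For concreteness, a minimal sketch of a shrinkage-based Hotelling's $T^2$ test follows. The paper's power-optimal shrinker is not reproduced; the Ledoit-Wolf linear shrinker from scikit-learn serves as an illustrative stand-in, and the $\chi^2_p$ calibration is only a rough proxy in the proportional $p/n$ regime.

```python
# Hotelling's T^2 with a shrunk covariance estimate in place of the sample
# covariance; Ledoit-Wolf is a stand-in for the paper's optimal shrinker.
import numpy as np
from scipy.stats import chi2
from sklearn.covariance import LedoitWolf

def shrunk_hotelling_t2(X: np.ndarray, mu0: np.ndarray, alpha: float = 1e-4):
    """Test H0: E[X] = mu0 using T^2 built on a shrunk covariance."""
    n, p = X.shape
    diff = X.mean(axis=0) - mu0
    sigma = LedoitWolf().fit(X).covariance_          # shrunk covariance estimate
    t2 = n * diff @ np.linalg.solve(sigma, diff)
    return t2, bool(t2 > chi2.ppf(1 - alpha, df=p))  # rough chi^2_p calibration

rng = np.random.default_rng(5)
X = rng.standard_normal((200, 80)) + 0.1             # p/n = 0.4, shifted mean
print(shrunk_hotelling_t2(X, np.zeros(80)))
```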
Given a Banach space $X$ and one of its compact sets $F$, we consider the problem of finding a good $n$-dimensional space $X_n \subset X$ which can be used to approximate the elements of $F$. The best possible error we can achieve for such an approximation is given by the Kolmogorov width $d_n(F)_X$. However, finding the space which gives this performance is typically numerically intractable. Recently, a new greedy strategy for obtaining good spaces was given in the context of the reduced basis method for solving a parametric family of PDEs. The performance of this greedy algorithm was initially analyzed in A. Buffa, Y. Maday, A.T. Patera, C. Prud'homme, and G. Turinici, "A priori convergence of the greedy algorithm for the parameterized reduced basis", M2AN Math. Model. Numer. Anal., 46 (2012), 595-603, in the case that $X = H$ is a Hilbert space. The results there were significantly improved upon in P. Binev, A. Cohen, W. Dahmen, R. DeVore, G. Petrova, and P. Wojtaszczyk, "Convergence rates for greedy algorithms in reduced basis methods", SIAM J. Math. Anal., 43 (2011), 1457-1472. The purpose of the present paper is to give a new analysis of the performance of such greedy algorithms. Our analysis not only gives improved results for the Hilbert space case but can also be applied to the same greedy procedure in general Banach spaces.
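A minimal sketch of the greedy selection in the Hilbert-space case, assuming the compact set $F$ is discretized into a finite matrix of snapshot columns (names are illustrative): at each step, add the element that is currently worst approximated by the span of the selected basis.

```python
# Greedy basis selection: repeatedly pick the snapshot with the largest
# projection error onto the current space, then extend the orthonormal
# basis with its (already orthogonal) normalized residual.
import numpy as np

def greedy_basis(F: np.ndarray, n: int) -> np.ndarray:
    """F: (d, N) snapshot columns; returns (d, n) orthonormal basis of X_n."""
    d, _ = F.shape
    Q = np.zeros((d, 0))                          # orthonormal basis so far
    for _ in range(n):
        residuals = F - Q @ (Q.T @ F)             # errors of projection onto span(Q)
        j = np.argmax(np.linalg.norm(residuals, axis=0))  # worst-approximated element
        q = residuals[:, j]
        Q = np.hstack([Q, (q / np.linalg.norm(q))[:, None]])
    return Q

rng = np.random.default_rng(6)
F = rng.standard_normal((100, 20)) @ rng.standard_normal((20, 500))  # low-rank-ish set
Q = greedy_basis(F, n=10)
```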
The family of symmetric powers of an $L$-function associated with an elliptic curve with complex multiplication has received much attention from algebraic, automorphic and $p$-adic points of view. Here we examine one explicit such family from the perspectives of classical analytic number theory and random matrix theory, especially focusing on evidence for the symmetry type of the family. In particular, we investigate the values at the central point and give evidence that this family can be modeled by ensembles of orthogonal matrices. We prove an asymptotic formula with power savings for the average of these $L$-values, which reproduces, by a completely different method, an asymptotic formula proven by Greenberg and Villegas--Zagier. We give an upper bound for the second moment which is conjecturally too large by just one logarithm. We also give an explicit conjecture for the second moment of this family, with power savings. Finally, we compute the one-level density for this family with a test function whose Fourier transform has limited support. It is known by the work of Villegas--Zagier that those $L$-functions in our family which have even functional equations never vanish; we show to what extent this result is reflected by our analytic results.