Naval Postgraduate School
Can algebraic geometry enhance the sharpness, robustness, and interpretability of modern neural reasoning models by equipping them with a mathematically grounded inductive bias? To answer this, we introduce Tropical Attention, an attention mechanism grounded in tropical geometry that lifts the attention kernel into tropical projective space, where reasoning is piecewise-linear and 1-Lipschitz, thus preserving the polyhedral decision structure inherent to combinatorial reasoning. We prove that Multi-Head Tropical Attention (MHTA) stacks universally approximate tropical circuits and realize tropical transitive closure through composition, achieving polynomial resource bounds without invoking recurrent mechanisms. These guarantees explain why the induced polyhedral decision boundaries remain sharp and scale-invariant, rather than smoothed by Softmax. Empirically, we show that Tropical Attention delivers stronger out-of-distribution generalization in both length and value, with high robustness against perturbative noise, and substantially faster inference with fewer parameters compared to Softmax-based and recurrent attention baselines. For the first time, we extend neural algorithmic reasoning beyond PTIME problems to NP-hard and NP-complete problems, paving the way toward sharper and more expressive Large Reasoning Models (LRMs) capable of tackling complex combinatorial challenges in phylogenetics, cryptography, particle physics, and mathematical discovery.
9
A central goal in the cognitive sciences is the development of numerical models for mental representations of object concepts. This paper introduces Variational Interpretable Concept Embeddings (VICE), an approximate Bayesian method for embedding object concepts in a vector space using data collected from humans in a triplet odd-one-out task. VICE uses variational inference to obtain sparse, non-negative representations of object concepts with uncertainty estimates for the embedding values. These estimates are used to automatically select the dimensions that best explain the data. We derive a PAC learning bound for VICE that can be used to estimate generalization performance or determine a sufficient sample size for experimental design. VICE rivals or outperforms its predecessor, SPoSE, at predicting human behavior in the triplet odd-one-out task. Furthermore, VICE's object representations are more reproducible and consistent across random initializations, highlighting the unique advantage of using VICE for deriving interpretable embeddings from human behavior.
We present DAVE Aquatic Virtual Environment (DAVE), an open source simulation stack for underwater robots, sensors, and environments. Conventional robotics simulators are not designed to address unique challenges that come with the marine environment, including but not limited to environment conditions that vary spatially and temporally, impaired or challenging perception, and the unavailability of data in a generally unexplored environment. Given the variety of sensors and platforms, wheels are often reinvented for specific use cases that inevitably resist wider adoption. Building on existing simulators, we provide a framework to help speed up the development and evaluation of algorithms that would otherwise require expensive and time-consuming operations at sea. The framework includes basic building blocks (e.g., new vehicles, water-tracking Doppler Velocity Logger, physics-based multibeam sonar) as well as development tools (e.g., dynamic bathymetry spawning, ocean currents), which allows the user to focus on methodology rather than software infrastructure. We demonstrate usage through example scenarios, bathymetric data import, user interfaces for data inspection and motion planning for manipulation, and visualizations.
Uncertainty is prevalent in engineering design, data-driven problems, and decision making broadly. Due to inherent risk-averseness and ambiguity about assumptions, it is common to address uncertainty by formulating and solving conservative optimization models expressed using measures of risk and related concepts. We survey the rapid development of risk measures over the last quarter century. From their beginning in financial engineering, we recount the spread to nearly all areas of engineering and applied mathematics. Solidly rooted in convex analysis, risk measures furnish a general framework for handling uncertainty with significant computational and theoretical advantages. We describe the key facts, list several concrete algorithms, and provide an extensive list of references for further reading. The survey recalls connections with utility theory and distributionally robust optimization, points to emerging applications areas such as fair machine learning, and defines measures of reliability.
Existing numerical optimizers deployed in quantum compilers use expensive O(4n)\mathcal{O}(4^n) matrix-matrix operations. Inspired by recent advances in quantum machine learning (QML), QFactor-Sample replaces matrix-matrix operations with simpler O(2n)\mathcal{O}(2^n) circuit simulations on a set of sample inputs. The simpler the circuit, the lower the number of required input samples. We validate QFactor-Sample on a large set of circuits and discuss its hyperparameter tuning. When incorporated in the BQSKit quantum compiler and compared against a state-of-the-art domain-specific optimizer, We demonstrate improved scalability and a reduction in compile time, achieving an average speedup factor of 69 for circuits with more than 8 qubits. We also discuss how improved numerical optimization affects the dynamics of partitioning-based compilation schemes, which allow a trade-off between compilation speed and solution quality.
This paper addresses zero-sum ``turn'' games, in which only one player can make decisions at each state. We show that pure saddle-point state-feedback policies for turn games can be constructed from dynamic programming fixed-point equations for a single value function or Q-function. These fixed-points can be constructed using a suitable form of Q-learning. For discounted costs, convergence of this form of Q-learning can be established using classical techniques. For undiscounted costs, we provide a convergence result that applies to finite-time deterministic games, which we use to illustrate our results. For complex games, the Q-learning iteration must be terminated before exploring the full-state, which can lead to policies that cannot guarantee the security levels implied by the final Q-function. To mitigate this, we propose an ``opponent-informed'' exploration policy for selecting the Q-learning samples. This form of exploration can guarantee that the final Q-function provides security levels that hold, at least, against a given set of policies. A numerical demonstration for a multi-agent game, Atlatl, indicates the effectiveness of these methods.
In today's rapidly evolving military landscape, advancing artificial intelligence (AI) in support of wargaming becomes essential. Despite reinforcement learning (RL) showing promise for developing intelligent agents, conventional RL faces limitations in handling the complexity inherent in combat simulations. This dissertation proposes a comprehensive approach, including targeted observation abstractions, multi-model integration, a hybrid AI framework, and an overarching hierarchical reinforcement learning (HRL) framework. Our localized observation abstraction using piecewise linear spatial decay simplifies the RL problem, enhancing computational efficiency and demonstrating superior efficacy over traditional global observation methods. Our multi-model framework combines various AI methodologies, optimizing performance while still enabling the use of diverse, specialized individual behavior models. Our hybrid AI framework synergizes RL with scripted agents, leveraging RL for high-level decisions and scripted agents for lower-level tasks, enhancing adaptability, reliability, and performance. Our HRL architecture and training framework decomposes complex problems into manageable subproblems, aligning with military decision-making structures. Although initial tests did not show improved performance, insights were gained to improve future iterations. This study underscores AI's potential to revolutionize wargaming, emphasizing the need for continued research in this domain.
Markov games and robust MDPs are closely related models that involve computing a pair of saddle point policies. As part of the long-standing effort to develop efficient algorithms for these models, the Filar-Tolwinski (FT) algorithm has shown considerable promise. As our first contribution, we demonstrate that FT may fail to converge to a saddle point and may loop indefinitely, even in small games. This observation contradicts the proof of FT's convergence to a saddle point in the original paper. As our second contribution, we propose Residual Conditioned Policy Iteration (RCPI). RCPI builds on FT, but is guaranteed to converge to a saddle point. Our numerical results show that RCPI outperforms other convergent algorithms by several orders of magnitude.
14 Jun 2019
We develop a fast and accurate algorithm for evaluating a2+b2\sqrt{a^2+b^2} for two floating point numbers aa and bb. Library functions that perform this computation are generally named {\tt hypot(a,b)}. We will compare four approaches that we will develop in this paper to the current resident library function that is delivered with Julia 1.1 and to the code that has been distributed with the C math library for decades. We will demonstrate the performance of our algorithms by simulation.
In many modern image-classification applications, understanding the cause of model's prediction can be as critical as the prediction's accuracy itself. Various feature-based local explanations generation methods have been designed to give us more insights on the decision of complex classifiers. Nevertheless, there is no consensus on evaluating the quality of different explanations. In response to this lack of comprehensive evaluation, we introduce the c-Eval metric and its corresponding framework to quantify the feature-based local explanation's quality. Given a classifier's prediction and the corresponding explanation on that prediction, c-Eval is the minimum-distortion perturbation that successfully alters the prediction while keeping the explanation's features unchanged. We then demonstrate how c-Eval can be computed using some modifications on existing adversarial generation libraries. To show that c-Eval captures the importance of input's features, we establish the connection between c-Eval and the features returned by explainers in affine and nearly-affine classifiers. We then introduce the c-Eval plot, which not only displays a strong connection between c-Eval and explainers' quality, but also helps automatically determine explainer's parameters. Since the generation of c-Eval relies on adversarial generation, we provide a demo of c-Eval on adversarial-robust models and show that the metric is applicable in those models. Finally, extensive experiments of explainers on different datasets are conducted to support the adoption of c-Eval in evaluating explainers' performance.
We analyze the stability of the network's giant connected component under impact of adverse events, which we model through the link percolation. Specifically, we quantify the extent to which the largest connected component of a network consists of the same nodes, regardless of the specific set of deactivated links. Our results are intuitive in the case of single-layered systems: the presence of large degree nodes in a single-layered network ensures both its robustness and stability. In contrast, we find that interdependent networks that are robust to adverse events have unstable connected components. Our results bring novel insights to the design of resilient network topologies and the reinforcement of existing networked systems.
Shear instabilities can be the source of significant amounts of turbulent mixing in stellar radiative zones. Past attempts at modeling their effects (either theoretically or using numerical simulations) have focused on idealized geometries where the shear is either purely vertical or purely horizontal. In stars, however, the shear can have arbitrary directions with respect to gravity. In this work, we use direct numerical simulations to investigate the nonlinear saturation of shear instabilities in a stably stratified fluid, where the shear is sinusoidal in the horizontal direction, and either constant or sinusoidal in the vertical direction. We find that, in the parameter regime studied here (non-diffusive, fully turbulent flow), the mean vertical shear does not play any role in controlling the dynamics of the resulting turbulence unless its Richardson number is smaller than one (approximately). As most stellar radiative regions have a Richardson number much greater than one, our result implies that the vertical shear can essentially be ignored in the computation of the vertical mixing coefficient associated with shear instabilities for the purpose of stellar evolution calculations, even when it is much larger than the horizontal shear (as in the solar tachocline, for instance).
The Synthetic Theater Operations Research Model (STORM) simulates theater-level conflict and requires inputs about utilization of surveillance satellites to search large geographical areas. We develop a mixed-integer linear optimization model that prescribes plans for how satellites and their sensors should be directed to best search an area of operations. It also specifies the resolution levels employed by the sensors to ensure a suitable fidelity of the resulting images. We solve large-scale instances of the model involving up to 22 million variables and 11 million constraints in scenarios derived from STORM. On average, the model yields 55% improvement in search coverage relative to an existing heuristic algorithm in STORM.
One might argue that solving a trajectory optimization problem over a million grid points is preposterous. How about solving such a problem at an incredibly fast computational time? On a small form-factor processor? Algorithmic details that make possible this trifecta of breakthroughs are presented in this paper. The computational mathematics that deliver these advancements are: (i) a Birkhoff-theoretic discretization of optimal control problems, (ii) matrix-free linear algebra leveraging Krylov-subspace methods, and (iii) a near-perfect Birkhoff preconditioner that helps achieve O(1)\mathcal{O}(1) iteration speed with respect to the grid size,~NN. A key enabler of this high performance is the computation of Birkhoff matrix-vector products at O(Nlog(N))\mathcal{O}(N\log(N)) time using fast Fourier transform techniques that eliminate traditional computational bottlenecks. A numerical demonstration of this unprecedented scale and speed is illustrated for a practical astrodynamics problem.
Deception is being increasingly explored as a cyberdefense strategy to protect operational systems. We are studying implementation of deception-in-depth strategies with initially three logical layers: network, host, and data. We draw ideas from military deception, network orchestration, software deception, file deception, fake honeypots, and moving-target defenses. We are building a prototype representing our ideas and will be testing it in several adversarial environments. We hope to show that deploying a broad range of deception techniques can be more effective in protecting systems than deploying single techniques. Unlike traditional deception methods that try to encourage active engagement from attackers to collect intelligence, we focus on deceptions that can be used on real machines to discourage attacks.
Existing methods for active topology discovery within the IPv6 Internet largely mirror those of IPv4. In light of the large and sparsely populated address space, in conjunction with aggressive ICMPv6 rate limiting by routers, this work develops a different approach to Internet-wide IPv6 topology mapping. We adopt randomized probing techniques in order to distribute probing load, minimize the effects of rate limiting, and probe at higher rates. Second, we extensively analyze the efficiency and efficacy of various IPv6 hitlists and target generation methods when used for topology discovery, and synthesize new target lists based on our empirical results to provide both breadth (coverage across networks) and depth (to find potential subnetting). Employing our probing strategy, we discover more than 1.3M IPv6 router interface addresses from a single vantage point. Finally, we share our prober implementation, synthesized target lists, and discovered IPv6 topology results.
Recent cyber incidents and the push for zero trust security underscore the necessity of monitoring host-level events. However, current host-level intrusion detection systems (IDS) lack the ability to correlate alerts and coordinate a network-wide response in real time. Motivated by advances in system-level extensions free of rebooting and network-wide orchestration of host actions, we propose using a central IDS orchestrator to remotely program the logic of each host IDS and collect the alerts generated in real time. In this paper, we make arguments for such a system concept and provide a high level design of the main system components. Furthermore, we have developed a system prototype and evaluated it using two experimental scenarios rooted from real-world attacks. The evaluation results show that the host-based IDS orchestration system is able to defend against the attacks effectively.
Over the last two decades, pseudospectral methods based on Lagrange interpolants have flourished in solving trajectory optimization problems and their flight implementations. In a seemingly unjustified departure from these highly successful methods, a new starting point for trajectory optimization is proposed. This starting point is based on the recently-developed concept of universal Birkhoff interpolants. The new approach offers a substantial computational upgrade to the Lagrange theory in completely flattening the rapid growth of the condition numbers from O(N2) to O(1), where N is the number of grid points. In addition, the Birkhoff-specific primal-dual computations are isolated to a well-conditioned linear system even for nonlinear, nonconvex problems. This is part I of a two-part paper. In part I, a new theory is developed on the basis of two hypotheses. Other than these hypotheses, the theoretical development makes no assumptions on the choices of basis functions or the selection of grid points. Several covector mapping theorems are proved to establish the mathematical equivalence between direct and indirect Birkhoff methods. In part II of this paper (with Proulx), it is shown that a select family of Gegenbauer grids satisfy the two hypotheses required for the theory to hold. Numerical examples in part II illustrate the power and utility of the new theory.
Building upon the observation that the newly defined~\cite{EFRST20} concept of cc-differential uniformity is not invariant under EA or CCZ-equivalence~\cite{SPRS20}, we showed in~\cite{SG20} that adding some appropriate linearized monomials increases the cc-differential uniformity of the inverse function, significantly, for some~cc. We continue that investigation here. First, by analyzing the involved equations, we find bounds for the uniformity of the Gold function perturbed by a single monomial, exhibiting the discrepancy we previously observed on the inverse function. Secondly, to treat the general case of perturbations via any linearized polynomial, we use characters in the finite field to express all entries in the cc-Differential Distribution Table (DDT) of an (n,n)(n,n)-function on the finite field \Fpn\F_{p^n}, and further, we use that method to find explicit expressions for all entries of the cc-DDT of the perturbed Gold function (via an arbitrary linearized polynomial).
Understanding the characteristics of swarming autonomous agents is critical for defense and security applications. This article presents a study on using supervised neural network time series classification (NN TSC) to predict key attributes and tactics of swarming autonomous agents for military contexts. Specifically, NN TSC is applied to infer two binary attributes - communication and proportional navigation - which combine to define four mutually exclusive swarm tactics. We identify a gap in literature on using NNs for swarm classification and demonstrate the effectiveness of NN TSC in rapidly deducing intelligence about attacking swarms to inform counter-maneuvers. Through simulated swarm-vs-swarm engagements, we evaluate NN TSC performance in terms of observation window requirements, noise robustness, and scalability to swarm size. Key findings show NNs can predict swarm behaviors with 97% accuracy using short observation windows of 20 time steps, while also demonstrating graceful degradation down to 80% accuracy under 50% noise, as well as excellent scalability to swarm sizes from 10 to 100 agents. These capabilities are promising for real-time decision-making support in defense scenarios by rapidly inferring insights about swarm behavior.
There are no more papers matching your filters at the moment.