Hamilton Institute
When developing Bayesian hierarchical models, selecting the most appropriate hierarchical structure can be a challenging task, and visualisation remains an underutilised tool in this context. In this paper, we consider visualisations for displaying hierarchical models in data space and for comparing multiple models via their parameter and hyper-parameter estimates. Specifically, with the aim of aiding model choice, we propose new visualisations to explore how the choice of Bayesian hierarchical modelling structure impacts parameter distributions. The visualisations are designed using a robust set of principles to provide richer comparisons that extend beyond the conventional plots and numerical summaries typically used. As a case study, we investigate five Bayesian hierarchical models fit using the brms R package, a high-level interface to Stan for Bayesian modelling, to model country-level mathematics trends from the PISA (Programme for International Student Assessment) database. Our case study demonstrates that by adhering to these principles, researchers can create visualisations that not only help them make more informed choices between Bayesian hierarchical model structures but also enable them to effectively communicate the rationale for those choices.
Extensive research on formal verification of machine learning systems indicates that learning from data alone often fails to capture underlying background knowledge, such as specifications implicitly available in the data. Various neural network verifiers have been developed to ensure that a machine-learnt model satisfies correctness and safety properties; however, they typically assume a trained network with fixed weights. A promising approach for creating machine learning models that inherently satisfy constraints after training is to encode background knowledge as explicit logical constraints that guide the learning process via so-called differentiable logics. In this paper, we experimentally compare and evaluate various logics from the literature, present our findings, and highlight open problems for future work. We evaluate differentiable logics with respect to their suitability in training, and use a neural network verifier to check their ability to establish formal guarantees. The complete source code for our experiments is available as an easy-to-use framework for training with differentiable logics at this https URL
How hard is it to guess a password? Massey showed that the Shannon entropy of the distribution from which the password is selected is a lower bound on the expected number of guesses, but one which is not tight in general. In a series of subsequent papers under ever less restrictive stochastic assumptions, an asymptotic relationship, as password length grows, between scaled moments of the guesswork and specific Rényi entropy was identified. Here we show that, when appropriately scaled, as the password length grows the logarithm of the guesswork satisfies a Large Deviation Principle (LDP), providing direct estimates of the guesswork distribution when passwords are long. The rate function governing the LDP possesses a specific, restrictive form that encapsulates underlying structure in the nature of guesswork. Returning to Massey's original observation, a corollary to the LDP shows that the expectation of the logarithm of the guesswork is the specific Shannon entropy of the password selection process.
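The corollary above can be checked numerically. The sketch below is an illustration under assumed i.i.d. letters (a two-letter Bernoulli(0.7) source, not from the paper): it brute-forces the optimal guesser, which tries words in decreasing probability order, and compares the scaled expected log-guesswork with the per-letter Shannon entropy.

```python
import itertools
import math

def guesswork_log_expectation(p, length):
    """E[log2 G] for passwords of i.i.d. letters with marginal p, against
    an optimal guesser that tries words in decreasing probability order
    (ties broken arbitrarily)."""
    probs = [math.prod(p[c] for c in w)
             for w in itertools.product(range(len(p)), repeat=length)]
    probs.sort(reverse=True)
    # The k-th guess succeeds with the probability of the k-th likeliest word.
    return sum(prob * math.log2(k) for k, prob in enumerate(probs, start=1))

p = [0.7, 0.3]                           # a skewed two-letter alphabet
H = -sum(q * math.log2(q) for q in p)    # specific Shannon entropy per letter
for n in (4, 8, 12):
    print(n, round(guesswork_log_expectation(p, n) / n, 3), round(H, 3))
```

Since the k-th most likely word has probability at most 1/k, the scaled quantity approaches H (about 0.881 here) from below as the length grows.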
Modern applications are driving demand for ultra-reliable low-latency communications, rekindling interest in the performance of short, high-rate error correcting codes. To that end, here we introduce a soft-detection variant of Guessing Random Additive Noise Decoding (GRAND) called Ordered Reliability Bits GRAND (ORBGRAND) that can decode any short, high-rate block-code. For a code of n bits, it avails of no more than ⌈log2(n)⌉ bits of code-book-independent quantized soft detection information per received bit to determine an accurate decoding while retaining the original algorithm's suitability for a highly parallelized implementation in hardware. ORBGRAND is shown to provide similar block error performance for codes of distinct classes (BCH, CA-Polar and RLC) with low complexity, while providing better block error rate performance than CA-SCL, a state of the art soft detection CA-Polar decoder.
This paper gives an overview of previous work in which the authors used NASA's Formal Requirement Elicitation Tool (FRET) to formalise requirements. We discuss four case studies where we used FRET to capture the system's requirements. These formalised requirements subsequently guided the case study specifications in a combination of formal paradigms. For each case study we summarise insights gained during this process, exploring the expressiveness and the potential interoperability of these approaches. Our experience confirms FRET's suitability as a framework for the elicitation and understanding of requirements and for providing traceability from requirements to specification.
The information-encoding molecules RNA and DNA form a combinatorially large set of secondary structures through nucleic acid base pairing. Thermodynamic prediction algorithms predict favoured, or minimum free energy (MFE), secondary structures, and can assign an equilibrium probability to any structure via the partition function: a Boltzmann-weighted sum over the set of secondary structures. MFE prediction is NP-hard in the presence of pseudoknots, base pairings that violate a restricted planarity condition. However, unpseudoknotted structures are amenable to dynamic programming: for a single DNA/RNA strand there are polynomial time algorithms for MFE and partition function. For multiple strands, the problem is more complicated due to entropic penalties. Dirks et al [SICOMP Review; 2007] showed that for O(1) strands, with N bases, there is a partition function algorithm that runs in time polynomial in N; however, their technique did not generalise to MFE, which they left open. We give the first polynomial time (O(N^4)) algorithm for unpseudoknotted multiple (O(1)) strand MFE, answering the open problem from Dirks et al. The challenge lies in considering rotational symmetry of secondary structures, a feature not immediately amenable to dynamic programming algorithms. Our proof has two main technical contributions: First, a polynomial upper bound on the number of symmetric secondary structures to be considered when computing rotational symmetry penalties. Second, that bound is leveraged by a backtracking algorithm to find the MFE in an exponential space of contenders. Our MFE algorithm has the same asymptotic run time as Dirks et al's partition function algorithm, suggesting efficient handling of rotational symmetry, although higher space complexity. It also seems reasonably tight in the number of strands since Condon, Hajiaghayi & Thachuk [DNA27, 2021] have shown that unpseudoknotted MFE is NP-hard for O(N) strands.
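The dynamic-programming flavour of these algorithms can be illustrated with a much simpler relative: the classic Nussinov recursion, which maximises base pairs for a single unpseudoknotted strand in O(N^3) time. This is a toy stand-in, not the MFE or partition function algorithm from the paper; the pairing rules and minimum loop length below are illustrative assumptions.

```python
def max_pairs(seq, min_loop=3):
    """Nussinov-style O(N^3) dynamic program: maximum number of
    complementary base pairs in an unpseudoknotted single-strand fold.
    A toy stand-in for MFE (real energy models are far richer)."""
    pair = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}
    n = len(seq)
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]                  # base j left unpaired
            for k in range(i, j - min_loop):     # base j pairs with base k
                if (seq[k], seq[j]) in pair:
                    left = dp[i][k - 1] if k > i else 0
                    best = max(best, left + 1 + dp[k + 1][j - 1])
            dp[i][j] = best
    return dp[0][n - 1]

print(max_pairs("GGGAAAUCCC"))  # 3 nested G-C pairs around the AAAU loop
```

The unpseudoknotted restriction is exactly what makes the subproblem decomposition dp[i][j] valid; pseudoknots would let pairs cross interval boundaries.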
Within neuroimaging studies it is common practice to perform repetitions of trials in an experiment when working with a noisy class of data acquisition system, such as electroencephalography (EEG) or magnetoencephalography (MEG). While this approach can be useful in some experimental designs, it presents significant limitations for certain types of analyses, such as identifying the category of an object observed by a subject. In this study we demonstrate that when trials relating to a single object are allowed to appear in both the training and testing sets, almost any classification algorithm is capable of learning the representation of an object given only category labels. This ability to learn object representations is of particular significance as it suggests that the results of several published studies which predict the category of observed objects from EEG signals may be affected by a subtle form of leakage which has inflated their reported accuracies. We demonstrate the ability of both simple classification algorithms and sophisticated deep learning models to learn object representations given only category labels. We do this using two datasets: the Kaneshiro et al. (2015) dataset and the Gifford et al. (2022) dataset. Our results raise doubts about the true generalizability of several published models and suggest that the reported performance of these models may be significantly inflated.
The 802.11e standard enables user configuration of several MAC parameters, making WLANs vulnerable to users that selfishly configure these parameters to gain throughput. In this paper we propose a novel distributed algorithm to thwart such selfish behavior. The key idea of the algorithm is for honest stations to react, upon detecting a selfish station, by using a more aggressive configuration that penalizes this station. We show that the proposed algorithm guarantees global stability while providing good response times. By conducting a game theoretic analysis of the algorithm based on repeated games, we also show its effectiveness against selfish stations. Simulation results confirm that the proposed algorithm optimizes throughput performance while discouraging selfish behavior. We also present an experimental prototype of the proposed algorithm demonstrating that it can be implemented on commodity hardware.
Surrogate-assisted evolutionary algorithms (SAEAs) use efficient computational models to approximate the fitness function in evolutionary computation systems. This area of research has been active for over two decades and has received significant attention from the specialised research community in different areas, for example, single- and many-objective optimisation or dynamic and stationary optimisation problems. An emergent and exciting area that has received little attention from the SAEAs community is neuroevolution: the use of evolutionary algorithms in the automatic configuration of artificial neural network (ANN) architectures and hyper-parameters and/or the training of ANNs. However, ANNs suffer from two major issues: (a) the substantial computational power required for their correct training, and (b) the highly specialised human expertise required to configure ANNs correctly and obtain a well-performing network. This work aims to fill this important research gap in SAEAs for neuroevolution by addressing these two issues. We demonstrate how one can use a Kriging Partial Least Squares method that allows efficient computation of good approximate surrogate models, in contrast to the well-known Kriging method, which normally cannot be used in neuroevolution due to the high dimensionality of the data.
We introduce an R package for fitting Stable Isotope Mixing Models (SIMMs) via both Markov chain Monte Carlo and Variational Bayes. The package is mainly used for estimating dietary contributions from food sources based on measurements of stable isotope ratios from animals. It can also be used to estimate the proportional contributions of a mixture from known sources, for example in the apportionment of river sediment, amongst many other use cases. The package has a simple structure that allows non-expert users to interact with it, with most of the computational complexity hidden behind the main fitting functions. In this paper we detail the background to these functions and provide case studies on how the package should be used. Further examples are available in the online package vignettes.
This study provides a comprehensive review of the utilization of Virtual Reality (VR) for visualizing Artificial Intelligence (AI) systems, drawing on 18 selected studies. The results illuminate a complex interplay of tools, methods, and approaches, notably the prominence of VR engines like Unreal Engine and Unity. However, despite these tools, a universal solution for effective AI visualization remains elusive, reflecting the unique strengths and limitations of each technique. We observed the application of VR for AI visualization across multiple domains, despite challenges such as high data complexity and cognitive load. We also briefly discuss the emerging ethical considerations pertaining to the broad integration of these technologies. Despite these challenges, the field shows significant promise, and we emphasize the need for dedicated research efforts to unlock the full potential of these immersive technologies. This review, therefore, outlines a roadmap for future research, encouraging innovation in visualization techniques, addressing identified challenges, and considering the ethical implications of VR and AI convergence.
In neural-decoding studies, recordings of participants' responses to stimuli are used to train models. In recent years, there has been an explosion of publications detailing applications of innovations from deep-learning research to neural-decoding studies. The data-hungry models used in these experiments have resulted in a demand for increasingly large datasets. Consequently, in some studies, the same stimuli are presented multiple times to each participant to increase the number of trials available for use in model training. However, when a decoding model is trained and subsequently evaluated on responses to the same stimuli, stimulus identity becomes a confounder for accuracy. We term this the repeated-stimulus confound. We identify a susceptible dataset, and 16 publications which report model performance based on evaluation procedures affected by the confound. We conducted experiments using models from the affected studies to investigate the likely extent to which results in the literature have been misreported. Our findings suggest that the decoding accuracies of these models were overestimated by between 4.46% and 7.42%. Our analysis also indicates that for each 1% increase in accuracy under the confound, the magnitude of the overestimation increases by 0.26%. The confound not only results in optimistic estimates of decoding performance but also undermines the validity of several claims made within the affected publications. We conducted further experiments to investigate the implications of the confound in alternative contexts. We found that the same methodology used within the affected studies could also be used to justify an array of pseudoscientific claims, such as the existence of extrasensory perception.
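The mechanism behind the confound is easy to reproduce on synthetic data. The sketch below uses invented random templates and labels, not the datasets from the affected studies: a nearest-neighbour decoder looks highly accurate when trials of the same stimulus fall in both splits, even though the category labels carry no signal at all, as the stimulus-disjoint split reveals.

```python
import numpy as np

rng = np.random.default_rng(0)
n_stim, n_reps, dim = 20, 10, 30
templates = rng.normal(size=(n_stim, dim))    # one fixed response pattern per stimulus
categories = rng.integers(0, 2, size=n_stim)  # labels assigned independently of the patterns

# Trials: noisy repetitions of each stimulus template.
X = np.repeat(templates, n_reps, axis=0) + 0.3 * rng.normal(size=(n_stim * n_reps, dim))
y = np.repeat(categories, n_reps)
stim_id = np.repeat(np.arange(n_stim), n_reps)

def nn_accuracy(train, test):
    """1-nearest-neighbour category accuracy on the given trial index sets."""
    correct = 0
    for i in test:
        dist = np.linalg.norm(X[train] - X[i], axis=1)
        correct += y[train][np.argmin(dist)] == y[i]
    return correct / len(test)

trials = rng.permutation(n_stim * n_reps)
leaky = nn_accuracy(trials[50:], trials[:50])       # same stimuli in train and test
clean = nn_accuracy(np.where(stim_id >= 10)[0],     # whole stimuli held out
                    np.where(stim_id < 10)[0])
print(f"leaky split: {leaky:.2f}, stimulus-disjoint split: {clean:.2f}")
```

Under the leaky split the classifier memorises stimulus identity and reads the label off a same-stimulus neighbour; holding out whole stimuli removes that shortcut.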
Error correction techniques traditionally focus on the co-design of restricted code-structures in tandem with code-specific decoders that are computationally efficient when decoding long codes in hardware. Modern applications are, however, driving demand for ultra-reliable low-latency communications (URLLC), rekindling interest in the performance of shorter, higher-rate error correcting codes, and raising the possibility of revisiting universal, code-agnostic decoders. To that end, here we introduce a soft-detection variant of Guessing Random Additive Noise Decoding (GRAND) called Ordered Reliability Bits GRAND that can accurately decode any moderate redundancy block-code. It is designed with efficient circuit implementation in mind, and determines accurate decodings while retaining the original hard detection GRAND algorithm's suitability for a highly parallelized implementation in hardware. ORBGRAND is shown to provide excellent soft decision block error performance for codes of distinct classes (BCH, CA-Polar and RLC) with modest complexity, while providing better block error rate performance than CA-SCL, a state of the art soft detection CA-Polar decoder. ORBGRAND offers the possibility of an accurate, energy efficient soft detection decoder suitable for delivering URLLC in a single hardware realization.
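The core guessing loop of ORBGRAND can be sketched in a few lines. This is a toy illustration assuming a single parity-check code and a naive sorted enumeration, not the parallel hardware design the abstract refers to: putative noise-effect patterns are tested in increasing logistic weight, i.e. the sum of the reliability ranks of the flipped bits.

```python
def orbgrand_decode(hard_bits, reliabilities, is_codeword, max_queries=1 << 12):
    """Toy ORBGRAND sketch: rank bits by reliability, then test putative
    error patterns in increasing logistic weight (sum of the ranks of the
    flipped positions).  Real implementations generate patterns with
    integer-partition logic in hardware, not by sorting an enumeration."""
    n = len(hard_bits)
    order = sorted(range(n), key=lambda i: reliabilities[i])  # least reliable first
    patterns = []
    for mask in range(1 << n):
        flips = [r for r in range(n) if mask >> r & 1]
        patterns.append((sum(r + 1 for r in flips), flips))   # (logistic weight, ranks)
    patterns.sort(key=lambda t: t[0])
    for queries, (_, flips) in enumerate(patterns[:max_queries], start=1):
        cand = list(hard_bits)
        for r in flips:
            cand[order[r]] ^= 1          # flip the r-th least reliable bit
        if is_codeword(cand):
            return cand, queries
    return None, max_queries

# Single parity-check code of length 6: valid words have even weight.
spc = lambda w: sum(w) % 2 == 0
received = [1, 0, 1, 1, 0, 0]           # odd parity -> at least one bit in error
rel = [0.9, 0.2, 0.8, 0.7, 0.6, 0.5]    # bit 1 is the least reliable
decoded, queries = orbgrand_decode(received, rel, spc)
print(decoded, queries)
```

Because membership testing is the only code-specific step, the same guessing loop works unchanged for any block code.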
We propose a Bayesian, noisy-input, spatial-temporal generalised additive model to examine regional relative sea-level (RSL) changes over time. The model provides probabilistic estimates of component drivers of regional RSL change via the combination of a univariate spline capturing a common regional signal over time, random slopes and intercepts capturing site-specific (local), long-term linear trends and a spatial-temporal spline capturing residual, non-linear, local variations. Proxy and instrumental records of RSL and corresponding measurement errors inform the model and a noisy-input method accounts for proxy temporal uncertainties. Results focus on the decomposition of RSL over the past 3000 years along the Atlantic coast of North America.
We consider the problem of quickest event detection with sleep-wake scheduling in small extent wireless sensor networks in which, at each time slot, each sensor node in the awake state observes a sample and communicates the information to the fusion centre. The sensor nodes in the sleep state do not sample or communicate any information to the fusion centre (FC), thereby conserving energy. At each time slot, the FC, after having received the samples from the sensor nodes in the wake state, makes a decision to stop (and thus declare that the event has occurred) or to continue observing. If it decides to continue, the FC also makes the decision of choosing the number of sensor nodes to be in the wake state in the next time slot. We consider three alternative approaches to the problem of choosing the number of sensor nodes to be in the wake state in time slot k+1, based on the information available at time slot k, namely: (1) optimal control of M_{k+1}, the number of sensor nodes to be in the awake state in time slot k+1, (2) optimal control of q_{k+1}, the probability of a sensor node being in the awake state in time slot k+1, and (3) an optimal fixed probability q that a sensor node is in the awake state in any time slot. In each case, we formulate the problem as a sequential decision process. We show that a sufficient statistic for the decision at time k is the a posteriori probability of change Pi_k. Also, we show that the optimal stopping rule is a threshold rule on Pi_k. The optimal policy for M_{k+1} can keep very few sensors awake during the prechange phase and then quickly increase the number of sensors in the awake state when a change is "suspected". Among the three sleep-wake algorithms described, we observe that the total cost is minimum for the optimal control of M_{k+1} and maximum for the optimal control of q.
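A minimal sketch of the sufficient-statistic recursion and threshold stopping rule described above, with invented Bernoulli sensor models and a geometric change prior; the likelihoods, hazard rate and threshold are illustrative assumptions, not values from the paper.

```python
import random

def shiryaev_update(pi, samples, rho, f0, f1):
    """One update of the change posterior Pi_k: geometric prior hazard rho
    that the change occurs this slot, then a Bayes step using each awake
    sensor's sample likelihood pre-change (f0) and post-change (f1)."""
    pi = pi + (1 - pi) * rho          # prediction: change may occur this slot
    num, den = pi, 1 - pi
    for x in samples:
        num *= f1(x)
        den *= f0(x)
    return num / (num + den)

random.seed(1)
rho, threshold = 0.05, 0.95
f0 = lambda x: 0.2 if x else 0.8      # pre-change: sensor reads 1 w.p. 0.2
f1 = lambda x: 0.8 if x else 0.2      # post-change: sensor reads 1 w.p. 0.8
change_at, awake = 30, 3              # true change time; sensors kept awake
pi, k = 0.0, 0
while pi < threshold:                 # threshold stopping rule on Pi_k
    k += 1
    p1 = 0.8 if k >= change_at else 0.2
    samples = [random.random() < p1 for _ in range(awake)]
    pi = shiryaev_update(pi, samples, rho, f0, f1)
print("change declared at slot", k)
```

Keeping more sensors awake sharpens each slot's likelihood ratio at the cost of energy, which is exactly the trade-off the three sleep-wake policies manage.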
On-demand vehicle access is a method that can reduce range-anxiety problems related to planned travel for electric vehicle owners. Using ideas from elementary queueing theory, basic QoS metrics are defined to dimension a shared fleet so as to ensure high levels of vehicle access. Using mobility data from Ireland, it is argued that the potential cost of such a system is very low.
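One elementary way to dimension such a fleet, assuming Poisson trip requests and treating an unavailable vehicle as a blocked request, is the Erlang-B formula; the offered load and blocking target below are invented for illustration and are not the Irish figures from the paper.

```python
def erlang_b(servers, load):
    """Erlang-B blocking probability for `servers` vehicles and offered
    load `load` (in Erlangs), via the standard stable recursion."""
    b = 1.0
    for k in range(1, servers + 1):
        b = load * b / (k + load * b)
    return b

def fleet_size(load, target_blocking):
    """Smallest fleet such that a trip request finds no vehicle
    available with probability below the target."""
    n = 1
    while erlang_b(n, load) > target_blocking:
        n += 1
    return n

# e.g. 4 trip requests/hour with 2-hour mean trips -> offered load of 8 Erlangs
print(fleet_size(8.0, 0.01))  # 15 vehicles keep blocking below 1%
```

The recursion avoids the overflow-prone factorial form of Erlang B and is the usual way the metric is computed.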
The last decade has witnessed substantial interest in protocols for transferring information on networks of quantum mechanical objects. A variety of control methods and network topologies have been proposed, on the basis that transfer with perfect fidelity --- i.e. deterministic and without information loss --- is impossible through unmodulated spin chains with more than a few particles. Solving the original problem formulated by Bose [Phys. Rev. Lett. 91, 207901 (2003)], we determine the exact numbers of qubits in unmodulated chains (with XY Hamiltonian) that permit transfer with fidelity arbitrarily close to 1, a phenomenon called pretty good state transfer. We prove that this happens if and only if the number of nodes is n = p-1 or n = 2p-1, where p is a prime, or n = 2^m - 1. The result highlights the potential of quantum spin system dynamics for reinterpreting questions about the arithmetic structure of integers, and, in this case, primality.
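The characterisation is easy to encode and enumerate. The sketch below simply tests the stated arithmetic condition on n; it does not verify the spin-chain dynamics themselves.

```python
def is_prime(k):
    """Trial-division primality test, adequate for small chain lengths."""
    if k < 2:
        return False
    return all(k % d for d in range(2, int(k ** 0.5) + 1))

def pretty_good_transfer(n):
    """True iff an unmodulated XY chain of n qubits admits pretty good
    state transfer: n = p - 1 or n = 2p - 1 for a prime p, or n = 2^m - 1."""
    if is_prime(n + 1):                        # n = p - 1
        return True
    if n % 2 == 1 and is_prime((n + 1) // 2):  # n = 2p - 1
        return True
    return (n + 1) & n == 0                    # n + 1 a power of two, so n = 2^m - 1

good = [n for n in range(2, 21) if pretty_good_transfer(n)]
print(good)  # chain lengths up to 20 admitting pretty good state transfer
```

Note how irregular the admissible lengths are: n = 8, 11 and 14 fail while their neighbours succeed, reflecting the primality structure in the theorem.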
This book chapter introduces the use of Continuous Time Markov Networks (CTMNs) to analytically capture the operation of Carrier Sense Multiple Access with Collision Avoidance (CSMA/CA) networks. It is tutorial in nature, aiming to provide a clear and easy-to-follow introduction to the topic. To illustrate how CTMNs can be used, we introduce a set of representative and cutting-edge scenarios, such as Vehicular Ad-hoc Networks (VANETs), Power Line Communication networks and multiple overlapping Wireless Local Area Networks (WLANs). For each scenario, we describe the specific CTMN, obtain its stationary distribution and compute the throughput achieved by each node in the network. Taking the per-node throughput as a reference, we discuss how the complex interactions between nodes using CSMA/CA impact system performance.
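For CSMA/CA CTMNs of this kind, the stationary distribution is known to take a product form over the feasible sets of simultaneously transmitting nodes, under idealised assumptions. The three-WLAN example below is invented for illustration; it shows the familiar flow-in-the-middle effect, where the middle network is starved of airtime.

```python
from itertools import combinations

def csma_ctmn(theta, feasible):
    """Stationary distribution of a CSMA/CA CTMN over feasible
    transmission states, via the product form pi(S) ~ prod_{i in S} theta_i,
    where theta_i is node i's activation/deactivation rate ratio."""
    n = len(theta)
    states = [S for r in range(n + 1) for S in combinations(range(n), r)
              if feasible(S)]
    weight = {}
    for S in states:
        w = 1.0
        for i in S:
            w *= theta[i]
        weight[S] = w
    Z = sum(weight.values())            # normalising constant
    return {S: w / Z for S, w in weight.items()}

# Three WLANs in a line: 0 and 2 cannot hear each other, but both hear 1,
# so {0, 2} is the only feasible multi-node transmission state.
feasible = lambda S: not ({0, 1} <= set(S) or {1, 2} <= set(S))
pi = csma_ctmn([1.0, 1.0, 1.0], feasible)
for i in range(3):
    share = sum(p for S, p in pi.items() if i in S)
    print(f"node {i} airtime share: {share:.2f}")  # middle node is starved
```

With identical rates, the outer nodes each transmit twice as often as the middle one, purely because of the feasible-state geometry.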
The study of semantics in Genetic Programming (GP) deals with the behaviour of a program given a set of inputs, and has been widely reported to help promote diversity in GP for a range of complex problems, ultimately improving evolutionary search. The vast majority of these studies have focused their attention on single-objective GP (SOGP), with just a few exceptions where Pareto-based dominance algorithms such as NSGA-II and SPEA2 have been used as frameworks to test whether highly popular semantics-based methods, such as Semantic Similarity-based Crossover (SSC), help or hinder evolutionary search. Surprisingly, it has been reported that the benefits exhibited by SSC in SOGP are not seen in Pareto-based dominance multi-objective GP. In this work, we are interested in studying whether the same holds in Multi-objective Evolutionary Algorithms based on Decomposition (MOEA/D). Using the MNIST dataset, well known in the machine learning community, we show how SSC in MOEA/D promotes semantic diversity, yielding better results compared to canonical MOEA/D without it.
We ask the question of how small a self-assembling set of tiles can be yet have interesting computational behaviour. We study this question in a model where supporting walls are provided as an input structure for tiles to grow along: we call it the Maze-Walking Tile Assembly Model. The model has a number of implementation prospects, one being DNA strands that attach to a DNA origami substrate. Intuitively, the model suggests a separation of signal routing and computation: the input structure (maze) supplies a routing diagram, and the programmer's tile set provides the computational ability. We ask how simple the computational part can be. We give two tiny tile sets that are computationally universal in the Maze-Walking Tile Assembly Model. The first has four tiles and simulates Boolean circuits by directly implementing NAND, NXOR and NOT gates. Our second tile set has six tiles and is called the Collatz tile set, as it produces patterns found in binary/ternary representations of iterations of the Collatz function. Using computer search, we find that the Collatz tile set is expressive enough to encode Boolean circuits using blocks of these patterns. These two tile sets give two different methods to find simple universal tile sets, and provide motivation for using pre-assembled maze structures as circuit wiring diagrams in molecular self-assembly based computing.
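The routing-versus-computation split can be mimicked in a few lines. The sketch below is a toy illustration, not the paper's actual four-tile set: one NAND tile per pair of binary input glues attaches wherever the maze provides both a west and a south neighbour, so the maze shape alone determines the circuit wiring.

```python
def grow(maze, glues):
    """Toy maze-walking assembly: a NAND tile family (one tile per pair
    of binary glues) attaches at any maze position whose west and south
    neighbours both carry glues, presenting NAND(w, s) on its east and
    north sides."""
    glues = dict(glues)                 # (x, y) -> binary glue at that cell
    changed = True
    while changed:                      # grow until no tile can attach
        changed = False
        for (x, y) in maze:
            if (x, y) in glues:
                continue
            w, s = glues.get((x - 1, y)), glues.get((x, y - 1))
            if w is not None and s is not None:
                glues[(x, y)] = 1 - (w & s)   # NAND of the two input glues
                changed = True
    return glues

# A two-cell "maze" computing NAND(NAND(a, b), c) left to right.
a, b, c = 1, 1, 0
result = grow([(1, 1), (2, 1)], {(0, 1): a, (1, 0): b, (2, 0): c})
print(result[(1, 1)], result[(2, 1)])  # 0 1
```

Since NAND is functionally complete, longer maze corridors and junctions suffice to route and evaluate arbitrary Boolean circuits in this toy model, echoing the universality argument in the abstract.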