Needle-in-a-Haystack problems exist across a wide range of applications
including rare disease prediction, ecological resource management, fraud
detection, and material property optimization. A Needle-in-a-Haystack problem
arises when there is an extreme imbalance of optimum conditions relative to the
size of the dataset. For example, only 0.82% of the 146k materials in the open-access Materials Project database have a negative Poisson's ratio.
However, current state-of-the-art optimization algorithms are not designed to find solutions to these challenging multidimensional Needle-in-a-Haystack problems, resulting in slow convergence to the global optimum or pigeonholing into a local minimum. In this paper, we present a
Zooming Memory-Based Initialization algorithm, entitled ZoMBI. ZoMBI actively extracts knowledge from the best-performing previously evaluated experiments to iteratively zoom in the sampling search bounds towards the global optimum "needle" and then prunes the memory of low-performing historical experiments to
accelerate compute times by reducing the algorithm's time complexity from O(n³) to O(ϕ³) for ϕ forward experiments per activation, which trends toward a constant O(1) over several activations. Additionally, ZoMBI
implements two custom adaptive acquisition functions to further guide the
sampling of new experiments toward the global optimum. We validate the
algorithm's optimization performance on three real-world datasets exhibiting
Needle-in-a-Haystack properties and further stress-test its performance on an
additional 174 analytical datasets. The ZoMBI algorithm demonstrates compute
time speed-ups of 400x compared to traditional Bayesian optimization and efficiently discovers optima in under 100 experiments that are up to 3x more highly optimized than those discovered by the similar methods MiP-EGO, TuRBO, and HEBO.
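
The zooming and memory-pruning mechanism summarized above can be sketched in a few lines of Python. The snippet below is an illustrative sketch only, not the reference ZoMBI implementation: it assumes a scikit-learn Gaussian-process surrogate, a toy 2-D "needle" objective, a simple lower-confidence-bound acquisition optimized over random candidates, and made-up values for the number of activations, the ϕ forward experiments per activation, and the number m of retained memory points.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

rng = np.random.default_rng(0)

def needle(x):
    """Toy 2-D objective: a sharp minimum ('needle') at (0.7, 0.3) on a nearly flat landscape."""
    return -np.exp(-np.sum((x - np.array([0.7, 0.3])) ** 2) / 1e-3)

def zooming_memory_minimize(f, dims=2, activations=4, phi=20, m=5):
    """Minimal zooming memory-based loop (illustrative only, not the reference ZoMBI code).

    Each activation runs `phi` forward experiments with a GP surrogate, then the
    search bounds are zoomed to the bounding box of the `m` best points seen so
    far and the rest of the memory is pruned, so the next GP fits O(phi) points
    instead of all n evaluations collected so far.
    """
    lower, upper = np.zeros(dims), np.ones(dims)
    X = rng.uniform(lower, upper, size=(m, dims))          # initial experiments
    y = np.array([f(x) for x in X])

    for _ in range(activations):
        for _ in range(phi):
            gp = GaussianProcessRegressor(kernel=Matern(nu=2.5),
                                          alpha=1e-6, normalize_y=True)
            gp.fit(X, y)                                    # memory-pruned training set
            # Score random candidates inside the current (zoomed) bounds with a
            # simple lower-confidence-bound acquisition and pick the best one.
            cand = rng.uniform(lower, upper, size=(512, dims))
            mu, sd = gp.predict(cand, return_std=True)
            x_next = cand[np.argmin(mu - 2.0 * sd)]
            X = np.vstack([X, x_next])
            y = np.append(y, f(x_next))
        # Zoom: shrink the bounds to the box spanned by the m best experiments.
        idx = np.argsort(y)[:m]
        lower, upper = X[idx].min(axis=0), X[idx].max(axis=0)
        # Prune memory: keep only those m best points for the next activation.
        X, y = X[idx], y[idx]

    return X[np.argmin(y)], y.min()

x_best, y_best = zooming_memory_minimize(needle)
print("best point:", x_best, "best value:", y_best)
```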
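
The two custom adaptive acquisition functions are not defined in this abstract; their exact forms are given in the paper. As a generic illustration only, the sketch below uses hypothetical names (lcb_adaptive, lcb_abrupt) and made-up parameters (beta0, decay, tol) to show two common ways an acquisition function can adapt during sampling: a lower confidence bound whose exploration weight decays with the iteration count, and a variant that switches abruptly to pure exploitation once recent best values plateau. These are assumptions for illustration, not the paper's acquisition functions.

```python
import numpy as np

def lcb_adaptive(mu, sigma, iteration, beta0=3.0, decay=0.1):
    """Lower confidence bound whose exploration weight shrinks as more
    experiments are collected (minimize the returned score)."""
    beta = beta0 * np.exp(-decay * iteration)
    return mu - beta * sigma

def lcb_abrupt(mu, sigma, iteration, recent_best, tol=1e-3, **kwargs):
    """Variant that switches abruptly to pure exploitation (score = posterior mean)
    once the best observed value has stopped improving by more than `tol`."""
    if len(recent_best) >= 3 and np.ptp(recent_best[-3:]) < tol:
        return mu
    return lcb_adaptive(mu, sigma, iteration, **kwargs)

# Example: score four candidates from a surrogate's posterior mean/std.
mu = np.array([0.20, -0.10, 0.05, -0.30])
sigma = np.array([0.50, 0.05, 0.40, 0.01])
print(lcb_adaptive(mu, sigma, iteration=0))                                     # exploration-heavy
print(lcb_abrupt(mu, sigma, iteration=40, recent_best=[-0.29, -0.30, -0.30]))   # pure exploitation
```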