Symbolic regression is the study of algorithms that automate the search for
analytic expressions that fit data. While recent advances in deep learning have
generated renewed interest in such approaches, the development of symbolic
regression methods has not been focused on physics, where we have important
additional constraints due to the units associated with our data. Here we
present Φ-SO, a Physical Symbolic Optimization framework for recovering
analytical symbolic expressions from physics data using deep reinforcement
learning techniques that learn units constraints. Our system is built from
the ground up to propose solutions whose physical units are consistent by
construction. This is useful not only because it eliminates physically impossible
solutions, but also because the "grammatical" rules of dimensional analysis
enormously restrict the freedom of the equation generator, thus vastly improving
performance. The algorithm can be used to fit noiseless data, which is
useful, for instance, when attempting to derive an analytical property of a
physical model, and it can also be used to obtain analytical approximations to
noisy data. We test our machinery on a standard benchmark of equations from the
Feynman Lectures on Physics and other physics textbooks, achieving
state-of-the-art performance in the presence of noise (exceeding 0.1%), and show
that it is robust even in the presence of substantial (10%) noise. We showcase
its abilities on a panel of examples from astrophysics.
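
To make the units argument concrete, the following is a minimal illustrative sketch of how dimensional analysis prunes candidate expressions. It is not the Φ-SO implementation: the abstract describes units consistency enforced by construction during generation, whereas this sketch only checks finished candidates after the fact. The representation of units as exponent vectors over (length, mass, time), the variable names, and the helper functions `units_of` and `consistent_candidates` are all assumptions made for illustration.

```python
# Hypothetical convention: units are exponent vectors over (length, mass, time).
UNITS = {
    "x": (1, 0, 0),   # position
    "t": (0, 0, 1),   # time
    "v": (1, 0, -1),  # velocity
    "m": (0, 1, 0),   # mass
}

DIMENSIONLESS_ONLY = {"sin", "cos", "exp", "log"}


def units_of(expr, units):
    """Return the units vector of a prefix-notation expression, or None if
    the expression violates dimensional analysis.

    An expression is either a variable name (str) or a tuple
    (operator, operand, ...), e.g. ("add", "x", ("mul", "v", "t")).
    """
    if isinstance(expr, str):
        return units[expr]
    op, *args = expr
    arg_units = [units_of(a, units) for a in args]
    if any(u is None for u in arg_units):
        return None  # an inconsistent sub-expression poisons the whole tree
    if op in ("add", "sub"):
        a, b = arg_units
        return a if a == b else None  # operands must share units
    if op == "mul":
        a, b = arg_units
        return tuple(p + q for p, q in zip(a, b))
    if op == "div":
        a, b = arg_units
        return tuple(p - q for p, q in zip(a, b))
    if op in DIMENSIONLESS_ONLY:
        (a,) = arg_units
        return a if all(p == 0 for p in a) else None  # argument must be unitless
    raise ValueError(f"unknown operator: {op}")


def consistent_candidates(candidates, target_units, units):
    """Keep only the candidates whose units match the target quantity."""
    return [c for c in candidates if units_of(c, units) == target_units]


candidates = [
    ("mul", "v", "t"),              # length: allowed
    ("add", "x", "t"),              # length + time: rejected
    ("div", "x", "t"),              # velocity, not length: rejected
    ("add", "x", ("mul", "v", "t")),  # length + length: allowed
]
kept = consistent_candidates(candidates, (1, 0, 0), UNITS)
```

Even in this toy setting, half of the candidate expressions for a length-valued target are discarded on units alone, which illustrates why such rules shrink the search space so sharply.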