Object slip perception is essential for mobile manipulation robots to perform
manipulation tasks reliably in dynamic real-world environments. Traditional
approaches to slip perception for robot arms rely on tactile or vision sensors.
However, mobile robots must additionally cope with noise in their sensor
signals caused by the robot's own movement through a changing environment. To
address this problem, we present
an anomaly detection method that utilizes multisensory data based on a deep
autoencoder model. The proposed framework integrates heterogeneous data streams
collected from various robot sensors, including RGB and depth cameras, a
microphone, and a force-torque sensor.
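As an illustrative sketch of this kind of multisensory integration (not the
authors' actual pipeline), the snippet below normalizes per-modality feature
vectors and concatenates them into a single input vector; the feature
dimensions and extraction steps are assumptions for illustration only.

```python
import numpy as np

# Hypothetical per-modality feature vectors; in practice each would come from
# a modality-specific extractor (e.g., CNN features for images, spectral
# features for audio). The dimensions here are illustrative assumptions.
rgb_feat = np.random.randn(128).astype(np.float32)    # RGB camera features
depth_feat = np.random.randn(128).astype(np.float32)  # depth camera features
audio_feat = np.random.randn(64).astype(np.float32)   # microphone features
ft_feat = np.random.randn(6).astype(np.float32)       # force-torque reading

def fuse(*modalities):
    """Scale each modality to unit norm and concatenate into one vector."""
    scaled = [m / (np.linalg.norm(m) + 1e-8) for m in modalities]
    return np.concatenate(scaled)

x = fuse(rgb_feat, depth_feat, audio_feat, ft_feat)
print(x.shape)  # (326,) -- the fused multisensory input to the autoencoder
```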
The integrated data are used to train a deep autoencoder that constructs
latent representations of the multisensory data corresponding to normal
operation. Anomalies are then identified by an error score, measured as the
difference between the latent vector of the original input and the latent
vector of its reconstruction.
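A minimal PyTorch sketch of this latent-difference scoring is shown below:
encode the input, decode it, re-encode the reconstruction, and score by the
distance between the two latent vectors. The architecture, dimensions, and
threshold are assumptions, not the authors' reported configuration.

```python
import torch
import torch.nn as nn

class AE(nn.Module):
    """Small fully connected autoencoder; layer sizes are illustrative."""
    def __init__(self, in_dim=326, latent_dim=32):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(in_dim, 128), nn.ReLU(),
            nn.Linear(128, latent_dim),
        )
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 128), nn.ReLU(),
            nn.Linear(128, in_dim),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

model = AE()
# (Training on fused sensor vectors from normal operation would go here,
#  e.g., minimizing nn.MSELoss() between x and model(x).)

@torch.no_grad()
def anomaly_score(model, x):
    """Distance between the latent vector of the input and the latent
    vector of its reconstruction; high scores indicate anomalies."""
    z = model.encoder(x)
    z_rec = model.encoder(model.decoder(z))
    return torch.norm(z - z_rec, dim=-1)

x = torch.randn(1, 326)       # a fused multisensory sample
score = anomaly_score(model, x)
is_slip = score > 0.5         # threshold is a placeholder assumption
```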
To evaluate the proposed framework, we conducted an experiment that mimics object slips with a
mobile service robot operating in a real-world environment with diverse
household objects and different motion patterns. The experimental results
verify that the proposed framework reliably detects anomalies in object-slip
situations across various object types and robot behaviors, even in the
presence of visual and auditory noise in the environment.