ICSC - Centro Nazionale di Ricerca in High Performance Computing
The Rotary Masked Autoencoder (RoMAE) extends the MAE framework by integrating continuous Rotary Positional Embeddings (RoPE), creating a versatile Transformer model capable of learning representations from irregular, multi-dimensional time-series data, images, and audio. The model achieved an F-score of 0.6770 on the DESC ELAsTiCC Challenge and an RMSE of 0.0183 on the Spirals 2D interpolation task, outperforming specialized architectures.
Applying Transformers to irregular time-series typically requires specializations to their baseline architecture, which can result in additional computational overhead and increased method complexity. We present the Rotary Masked Autoencoder (RoMAE), which utilizes the popular Rotary Positional Embedding (RoPE) method for continuous positions. RoMAE is an extension to the Masked Autoencoder (MAE) that enables interpolation and representation learning with multidimensional continuous positional information while avoiding any time-series-specific architectural specializations. We showcase RoMAE's performance on a variety of modalities including irregular and multivariate time-series, images, and audio, demonstrating that RoMAE surpasses specialized time-series architectures on difficult datasets such as the DESC ELAsTiCC Challenge while maintaining MAE's usual performance across other modalities. In addition, we investigate RoMAE's ability to reconstruct the embedded continuous positions, demonstrating that including learned embeddings in the input sequence breaks RoPE's relative position property.
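The core idea is that RoPE needs no modification to accept real-valued positions: each feature pair of a query or key vector is rotated by an angle proportional to the token's continuous position, such as its observation timestamp. Below is a minimal sketch of this continuous-position rotation, assuming a standard RoPE frequency schedule; the function and variable names are illustrative and are not taken from the paper's code.

```python
# Minimal sketch (not the authors' implementation) of rotary positional
# embeddings driven by continuous positions, e.g. irregular timestamps.
import torch


def continuous_rope(x: torch.Tensor, positions: torch.Tensor, base: float = 10000.0) -> torch.Tensor:
    """Rotate feature pairs of `x` by angles proportional to continuous positions.

    x:         (..., seq_len, dim) query or key vectors; dim must be even
    positions: (..., seq_len) real-valued positions (e.g. observation times)
    """
    dim = x.shape[-1]
    # Standard RoPE frequency schedule: each feature pair gets its own frequency.
    inv_freq = base ** (-torch.arange(0, dim, 2, dtype=x.dtype) / dim)   # (dim/2,)
    angles = positions.unsqueeze(-1) * inv_freq                          # (..., seq_len, dim/2)
    cos, sin = angles.cos(), angles.sin()

    x_even, x_odd = x[..., 0::2], x[..., 1::2]
    # 2-D rotation of each (even, odd) feature pair.
    out = torch.empty_like(x)
    out[..., 0::2] = x_even * cos - x_odd * sin
    out[..., 1::2] = x_even * sin + x_odd * cos
    return out


# Irregularly sampled sequence: 5 tokens with non-uniform observation times.
q = torch.randn(1, 5, 16)
t = torch.tensor([[0.0, 0.7, 1.3, 4.2, 9.8]])
q_rot = continuous_rope(q, t)
```

Because the rotation depends only on the position value, the same mechanism handles regular grids (images, audio frames) and arbitrary sampling times, which is what lets RoMAE avoid time-series-specific architectural changes.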