Mamba or Transformer for Time Series Forecasting? Mixture of Universals (MoU) Is All You Need

Authors: Sijia Peng, Yun Xiong, Yangyong Zhu, Zhiqiang Shen

Abstract: Time series forecasting requires balancing short-term and long-term
dependencies for accurate predictions. Existing methods mainly focus on
long-term dependency modeling, neglecting the complexities of short-term
dynamics, which may hinder performance. Transformers are superior at modeling
long-term dependencies but are criticized for their quadratic computational
cost. Mamba provides a near-linear alternative but is reported to be less
effective in long-term time series forecasting due to potential information
loss. Current architectures fall short of offering both high efficiency and
strong performance for long-term dependency modeling. To address these
challenges, we introduce Mixture of Universals (MoU), a versatile model that
captures both short-term and long-term dependencies to enhance time series
forecasting performance. MoU is composed of two novel designs: Mixture of
Feature Extractors (MoF), an adaptive method designed to improve time series
patch representations for short-term dependency, and Mixture of Architectures
(MoA), which hierarchically integrates Mamba, FeedForward, Convolution, and
Self-Attention architectures in a specialized order to model long-term
dependency from a hybrid perspective. The proposed approach achieves
state-of-the-art performance while maintaining relatively low computational
costs. Extensive experiments on seven real-world datasets demonstrate the
superiority of MoU. Code is available at https://github.com/lunaaa95/mou/.

Source: http://arxiv.org/abs/2408.15997v1
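
The sketch below is a minimal, illustrative PyTorch composition of the pieces named in the abstract: patching the series, a mixture of feature extractors (MoF) over patch embeddings, and a hybrid stack (MoA) combining a state-space layer, feed-forward, convolution, and self-attention. It is not the authors' implementation (see the linked repository for that); the routing scheme, sub-layer ordering, the GRU stand-in for the Mamba layer, and all hyperparameters are assumptions made for illustration only.

```python
# Illustrative sketch only; block internals and hyperparameters are assumptions,
# not the MoU authors' design.
import torch
import torch.nn as nn


class MoFPatchEmbed(nn.Module):
    """Hypothetical mixture of feature extractors over patches: a softmax router
    weights several candidate extractors and sums their outputs."""
    def __init__(self, patch_len: int, d_model: int, n_experts: int = 3):
        super().__init__()
        self.experts = nn.ModuleList(
            [nn.Linear(patch_len, d_model) for _ in range(n_experts)]
        )
        self.router = nn.Linear(patch_len, n_experts)

    def forward(self, patches: torch.Tensor) -> torch.Tensor:
        # patches: (batch, n_patches, patch_len)
        weights = torch.softmax(self.router(patches), dim=-1)               # (B, N, E)
        expert_out = torch.stack([e(patches) for e in self.experts], dim=-1)  # (B, N, D, E)
        return torch.einsum("bnde,bne->bnd", expert_out, weights)


class MoABlock(nn.Module):
    """Hypothetical hybrid block stacking the four sub-layers named in the
    abstract (state-space model, feed-forward, convolution, self-attention),
    each with pre-norm and a residual connection. The state-space layer is
    stubbed with a GRU here; a real implementation would use a Mamba layer."""
    def __init__(self, d_model: int, n_heads: int = 4):
        super().__init__()
        self.ssm = nn.GRU(d_model, d_model, batch_first=True)  # stand-in for Mamba
        self.ffn = nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                                 nn.Linear(4 * d_model, d_model))
        self.conv = nn.Conv1d(d_model, d_model, kernel_size=3, padding=1)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norms = nn.ModuleList([nn.LayerNorm(d_model) for _ in range(4)])

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, n_patches, d_model)
        x = x + self.ssm(self.norms[0](x))[0]
        x = x + self.ffn(self.norms[1](x))
        x = x + self.conv(self.norms[2](x).transpose(1, 2)).transpose(1, 2)
        a = self.norms[3](x)
        x = x + self.attn(a, a, a, need_weights=False)[0]
        return x


class MoUSketch(nn.Module):
    """Patch the series, embed patches with MoF, model long-range structure with
    stacked MoA blocks, then project the flattened tokens to the horizon."""
    def __init__(self, seq_len=96, pred_len=24, patch_len=16, d_model=64, depth=2):
        super().__init__()
        assert seq_len % patch_len == 0
        n_patches = seq_len // patch_len
        self.patch_len = patch_len
        self.embed = MoFPatchEmbed(patch_len, d_model)
        self.blocks = nn.Sequential(*[MoABlock(d_model) for _ in range(depth)])
        self.head = nn.Linear(n_patches * d_model, pred_len)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len) univariate series; variates handled independently
        patches = x.unfold(1, self.patch_len, self.patch_len)  # (B, N, patch_len)
        tokens = self.blocks(self.embed(patches))               # (B, N, d_model)
        return self.head(tokens.flatten(1))                     # (B, pred_len)


if __name__ == "__main__":
    model = MoUSketch()
    print(model(torch.randn(8, 96)).shape)  # torch.Size([8, 24])
```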
