Authors: Valeria Ruscio, Fabrizio Silvestri
Abstract: Rotary Positional Embeddings (RoPE) enhance positional encoding in
Transformer models, yet their full impact on model dynamics remains
underexplored. This paper studies how RoPE introduces position-dependent
rotations, causing phase shifts in token embeddings that influence
higher-frequency components within the model’s internal representations.
Through spectral analysis, we demonstrate that RoPE’s rotation matrices induce
oscillatory behaviors in embeddings, affecting information retention across
layers and shaping temporal modeling capabilities. We show that activation
functions in feed-forward networks interact with RoPE-modulated embeddings to
generate harmonics, leading to constructive or destructive interference based
on phase alignment. Our findings reveal that phase alignment amplifies
activations and sharpens attention, while misalignment weakens activations and
disrupts focus on positional patterns. This study underscores the importance of
frequency components as intrinsic elements of model behavior, offering new
insights beyond traditional analyses.
Source: http://arxiv.org/abs/2410.18067v1
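
The two mechanisms named in the abstract, position-dependent rotation and harmonic generation by a nonlinearity, can be illustrated with a minimal sketch. It is not the authors' code. It assumes the common RoPE convention of rotating consecutive coordinate pairs with frequencies `base ** (-2i/d)` and `base = 10000`, and it uses ReLU as a stand-in for the feed-forward activation. The function names (`rope_rotate`, `dft_magnitude`) and the toy vectors are hypothetical.

```python
import cmath
import math


def rope_rotate(vec, position, base=10000.0):
    """Rotate consecutive pairs of `vec` by position-dependent angles.

    Assumption: pair i uses frequency base ** (-2i/d), the usual RoPE choice.
    """
    d = len(vec)
    if d % 2 != 0:
        raise ValueError("vector length must be even")
    out = []
    for i in range(d // 2):
        theta = base ** (-2.0 * i / d)  # rotation frequency for pair i
        # Treat the pair as a complex number and multiply by a unit phasor.
        z = complex(vec[2 * i], vec[2 * i + 1]) * cmath.exp(1j * position * theta)
        out.extend([z.real, z.imag])
    return out


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def dft_magnitude(signal, bin_index):
    """Normalized magnitude of a single DFT bin (naive O(n) sum)."""
    n = len(signal)
    total = sum(
        x * cmath.exp(-2j * math.pi * bin_index * t / n)
        for t, x in enumerate(signal)
    )
    return abs(total) / n


# Part 1: the attention score depends only on the position offset.
query = [1.0, 0.5, -0.3, 2.0]  # toy vectors, not from the paper
key = [0.2, -1.0, 0.7, 0.4]
score_near = dot(rope_rotate(query, 5), rope_rotate(key, 2))      # offset 3
score_far = dot(rope_rotate(query, 105), rope_rotate(key, 102))   # offset 3

# Part 2: a nonlinearity applied to a pure oscillation creates harmonics.
n_samples, cycles = 64, 4
wave = [math.cos(2 * math.pi * cycles * t / n_samples) for t in range(n_samples)]
rectified = [max(0.0, x) for x in wave]  # ReLU as a stand-in activation
second_in = dft_magnitude(wave, 2 * cycles)        # second harmonic before ReLU
second_out = dft_magnitude(rectified, 2 * cycles)  # second harmonic after ReLU

print("scores at equal offset:", score_near, score_far)
print("second harmonic before/after ReLU:", second_in, second_out)
```

The two scores agree up to floating-point error because each rotation is a pure phase shift, so the query-key product retains only the phase difference. The pure cosine has no energy at twice its frequency, while the rectified signal does. That is the kind of harmonic the abstract attributes to activation functions acting on RoPE-modulated embeddings.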