Authors: Roman Klypa, Alberto Bietti, Sergei Grudinin
Abstract: Designing RNA molecules that interact with specific proteins is a critical
challenge in experimental and computational biology. Existing computational
approaches require either a substantial number of experimentally determined RNA
sequences for each specific protein or detailed knowledge of RNA structure,
restricting their utility in practice. To address this limitation, we develop
RNA-BAnG, a deep learning-based model designed to generate RNA sequences that
interact with a given protein without these requirements. Central to our approach is a
novel generative method, Bidirectional Anchored Generation (BAnG), which
leverages the observation that protein-binding RNA sequences often contain
functional binding motifs embedded within broader sequence contexts. We first
validate our method on generic synthetic tasks involving localized motifs
similar to those appearing in RNAs, demonstrating its benefits over existing
generative approaches. We then evaluate our model on biological sequences,
showing its effectiveness for conditional RNA sequence design given a binding
protein.
Source: http://arxiv.org/abs/2502.21274v1