Fundus2Video: Cross-Modal Angiography Video Generation from Static Fundus Photography with Clinical Knowledge Guidance

Authors: Weiyi Zhang, Siyu Huang, Jiancheng Yang, Ruoyu Chen, Zongyuan Ge, Yingfeng Zheng, Danli Shi, Mingguang He

Abstract: Fundus Fluorescein Angiography (FFA) is a critical tool for assessing retinal
vascular dynamics and aiding in the diagnosis of eye diseases. However, its
invasive nature and less accessibility compared to Color Fundus (CF) images
pose significant challenges. Current CF to FFA translation methods are limited
to static generation. In this work, we pioneer dynamic FFA video generation
from static CF images. We introduce an autoregressive GAN for smooth,
memory-saving frame-by-frame FFA synthesis. To enhance the focus on dynamic
lesion changes in FFA regions, we design a knowledge mask based on clinical
experience. Leveraging this mask, our approach integrates innovative knowledge
mask-guided techniques, including knowledge-boosted attention, knowledge-aware
discriminators, and mask-enhanced patchNCE loss, aimed at refining generation
in critical areas and addressing the pixel misalignment challenge. Our method
achieves the best FVD of 1503.21 and PSNR of 11.81 compared to other common
video generation approaches. Human assessment by an ophthalmologist confirms
its high generation quality. Notably, our knowledge mask surpasses supervised
lesion segmentation masks, offering a promising non-invasive alternative to
traditional FFA for research and clinical applications. The code is available
at https://github.com/Michi-3000/Fundus2Video.