Authors: Minh Vu, Ben Nebgen, Erik Skau, Geigh Zollicoffer, Juan Castorena, Kim Rasmussen, Boian Alexandrov, Manish Bhattarai
Abstract: As Machine Learning (ML) applications rapidly grow, concerns about
adversarial attacks compromising their reliability have gained significant
attention. One unsupervised ML method known for its resilience to such attacks
is Non-negative Matrix Factorization (NMF), an algorithm that decomposes input
data into lower-dimensional latent features. However, the introduction of
powerful computational tools such as PyTorch enables the computation of
gradients of the latent features with respect to the original data, raising
concerns about NMF’s reliability. Interestingly, naively deriving the
adversarial loss for NMF as is done for other ML methods yields the
reconstruction loss, which can be shown theoretically to be an ineffective
attack objective. In this work, we introduce a novel class of attacks on NMF
termed Latent Feature Attacks (LaFA), which aim to manipulate the latent
features produced by the NMF process. Our method applies a Feature Error
(FE) loss directly to the latent features. By employing the FE loss, we generate
perturbations of the original data that significantly alter the extracted
latent features, revealing vulnerabilities akin to those found in other ML
techniques. To handle the large peak-memory overhead of gradient back-propagation
in FE attacks, we develop a method based on implicit differentiation that
enables them to scale to larger datasets. We validate NMF's vulnerabilities and
the effectiveness of FE attacks through extensive experiments on synthetic and
real-world data.
Source: http://arxiv.org/abs/2408.03909v1
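Illustrative sketch (not the paper's implementation): one way to realize the idea described in the abstract is to solve NMF with standard multiplicative updates unrolled in PyTorch, so that gradients of the latent features with respect to the input are available, and then run a PGD-style loop that ascends a Feature Error loss on those features. The function names (nmf_torch, fe_attack), the choice of multiplicative updates, and all hyperparameters below are assumptions made for illustration; backpropagating through the unrolled solve is also the memory-heavy path that the paper's implicit-differentiation approach is designed to avoid.

import torch


def nmf_torch(X, rank, n_iter=200, eps_num=1e-9):
    """Differentiable NMF via multiplicative updates (Lee & Seung):
    X (m x n) ~ W (m x rank) @ H (rank x n), with gradients flowing back to X."""
    torch.manual_seed(0)                           # fixed init so clean/perturbed runs are comparable
    W = torch.rand(X.shape[0], rank)
    H = torch.rand(rank, X.shape[1])
    for _ in range(n_iter):
        H = H * (W.T @ X) / (W.T @ W @ H + eps_num)    # multiplicative update for H
        W = W * (X @ H.T) / (W @ H @ H.T + eps_num)    # multiplicative update for W
    return W, H


def fe_attack(X, rank, budget=0.05, alpha=0.01, steps=10):
    """PGD-style attack maximizing a Feature Error loss ||H(X + delta) - H(X)||_F."""
    with torch.no_grad():
        _, H_clean = nmf_torch(X, rank)            # latent features of the clean data
    delta = torch.zeros_like(X, requires_grad=True)
    for _ in range(steps):
        _, H_adv = nmf_torch(torch.clamp(X + delta, min=0.0), rank)
        fe_loss = torch.norm(H_adv - H_clean)      # FE loss on the latent features
        fe_loss.backward()                         # gradient flows through the unrolled NMF solve
        with torch.no_grad():
            delta += alpha * delta.grad.sign()     # ascend the FE loss
            delta.clamp_(-budget, budget)          # keep the perturbation small
            delta.grad.zero_()
    return torch.clamp(X + delta, min=0.0).detach()


# Example on synthetic non-negative data: a small input change can shift the latent features.
X = torch.rand(50, 40)
X_adv = fe_attack(X, rank=5)
print(torch.norm(X_adv - X))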