Authors: Susmit Agrawal, Deepika Vemuri, Sri Siddarth Chakaravarthy P, Vineeth N. Balasubramanian
Abstract: Concept-based methods have emerged as a promising direction to develop
interpretable neural networks in standard supervised settings. However, most
works that study them in incremental settings assume either a static concept
set across all experiences or that each experience relies on a distinct set
of concepts. In this work, we study concept-based models in a more
realistic, dynamic setting where new classes may rely on older concepts in
addition to introducing new concepts themselves. We show that concepts and
classes form a complex web of relationships, which is susceptible to
degradation and needs to be preserved and augmented across experiences. We
introduce new metrics to show that existing concept-based models cannot
preserve these relationships even when trained using methods to prevent
catastrophic forgetting, since they cannot handle forgetting at the concept,
class, and concept-class relationship levels simultaneously. To address these
issues, we propose MuCIL, a novel method that uses multimodal concepts to perform
classification without increasing the number of trainable parameters across
experiences. The multimodal concepts are aligned to concepts provided in
natural language, making them interpretable by design. Through extensive
experimentation, we show that our approach achieves state-of-the-art
classification performance among concept-based models, in some cases more than
2$\times$ that of existing methods. We also study the
ability of our model to perform interventions on concepts, and show that it can
localize visual concepts in input images, providing post-hoc interpretations.
Source: http://arxiv.org/abs/2502.20393v1
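
To make the abstract's architecture description concrete, here is a minimal sketch of a concept-bottleneck-style classifier in which image features are scored against frozen embeddings of natural-language concepts before classification. This is an illustrative assumption of how such alignment can be structured, not the paper's MuCIL implementation; the class name, dimensions, and the use of a frozen text-embedding buffer are all hypothetical choices for this sketch.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConceptAlignedClassifier(nn.Module):
    """Illustrative sketch (hypothetical, not the paper's MuCIL code).

    Image features are projected into a shared embedding space and scored
    against frozen text embeddings of natural-language concepts; a linear
    head then maps the concept scores to class logits, so predictions are
    mediated by interpretable concept activations.
    """

    def __init__(self, feat_dim: int, embed_dim: int,
                 concept_embeds: torch.Tensor, num_classes: int):
        super().__init__()
        # Project image features into the concept embedding space.
        self.proj = nn.Linear(feat_dim, embed_dim)
        # Frozen language-derived concept embeddings: (num_concepts, embed_dim).
        self.register_buffer("concepts", concept_embeds)
        # Classify from concept scores, not raw features.
        self.head = nn.Linear(concept_embeds.shape[0], num_classes)

    def forward(self, feats: torch.Tensor):
        z = F.normalize(self.proj(feats), dim=-1)
        c = F.normalize(self.concepts, dim=-1)
        scores = z @ c.T            # cosine similarity to each concept
        return self.head(scores), scores

# Usage: 512-d image features, 10 concepts, 5 classes. In practice the
# concept embeddings would come from a text encoder; random here.
concepts = torch.randn(10, 256)
model = ConceptAlignedClassifier(512, 256, concepts, num_classes=5)
logits, concept_scores = model(torch.randn(4, 512))
```

Because the classifier reads only the concept scores, one can inspect or intervene on `concept_scores` directly, which is the general property the abstract's intervention and localization studies rely on.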