FreezeAsGuard: Mitigating Illegal Adaptation of Diffusion Models via Selective Tensor Freezing

Abstract

Text-to-image diffusion models can be fine-tuned in custom domains to adapt to specific user preferences, but such unconstrained adaptability has also been exploited for illegal purposes, such as forging public figures' portraits and duplicating copyrighted artworks. Most existing work focuses on detecting illegally generated content, but cannot prevent or mitigate illegal adaptation of the diffusion models themselves. Other schemes, such as model unlearning and reinitialization, similarly cannot prevent users from relearning the knowledge needed for illegal model adaptation with custom data. In this paper, we present FreezeAsGuard, a new technique that addresses these limitations and enables irreversible mitigation of illegal adaptations of diffusion models. The basic approach is that the model publisher selectively freezes tensors in pre-trained diffusion models that are critical to illegal model adaptations, so as to mitigate the fine-tuned model's representation power in illegal domains while minimizing the impact on legal model adaptations in other domains. Such tensor freezing can be enforced via the APIs provided by the model publisher for fine-tuning, and can motivate users' adoption due to its computational savings. Experimental results with datasets in multiple domains show that FreezeAsGuard provides stronger power in mitigating illegal model adaptations for generating fake portraits of public figures, while having minimal impact on model adaptation in other, legal domains.
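To make the enforcement concrete, below is a minimal sketch, assuming the publisher distributes the frozen-tensor selection as a set of parameter names; the helper and variable names here are hypothetical, not FreezeAsGuard's actual API. The computational savings follow naturally, since frozen tensors need no gradient computation or optimizer state.

```python
import torch
from torch import nn

def freeze_selected_tensors(model: nn.Module, frozen_names: set):
    # Disable gradients for publisher-selected tensors, so fine-tuning
    # cannot update them. `frozen_names` is a hypothetical format for
    # the published freezing mask.
    for name, param in model.named_parameters():
        if name in frozen_names:
            param.requires_grad_(False)

# Hypothetical usage inside a fine-tuning API: only the remaining
# trainable tensors are handed to the optimizer.
# freeze_selected_tensors(unet, frozen_names)
# optimizer = torch.optim.AdamW(
#     (p for p in unet.parameters() if p.requires_grad), lr=1e-5)
```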

Publication
arXiv preprint

Overview

Our design of FreezeAsGuard builds on bilevel optimization, which embeds one optimization problem within another. As shown in the figure below, the lower-level problem is a simulated user loop, in which the user fine-tunes the diffusion model to convergence by minimizing the training loss over both illegal and innocent domains. The upper-level problem is a mask learning loop, in which the model publisher learns the freezing mask m to mitigate the diffusion model's representation power when it is fine-tuned in illegal domains, without affecting fine-tuning in innocent domains.

FreezeAsGuard Overview
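The sketch below illustrates one step of this bilevel loop in PyTorch, under simplifying assumptions: the mask m is relaxed to per-tensor values in [0, 1] so it can be learned by gradient descent, the gradient through the inner fine-tuning updates is truncated, and all names (w0, w, m, loss_fn, the two batches) are illustrative rather than FreezeAsGuard's actual implementation. In practice, the learned soft mask would then be binarized at a chosen freezing ratio (e.g., 30%) to select the tensors to freeze.

```python
import torch

def masked_params(w0, w, m):
    # Effective weights: tensors with mask values near 1 stay at their
    # pretrained values w0 (frozen); the rest follow the fine-tuned w.
    return [mi * wi0 + (1 - mi) * wi for mi, wi0, wi in zip(m, w0, w)]

def bilevel_step(w0, w, m, illegal_batch, innocent_batch, loss_fn,
                 inner_steps=5, lr_w=1e-5, lr_m=1e-3):
    # Lower level (simulated user loop): fine-tune w on both domains,
    # with the publisher's mask m interpolating toward pretrained w0.
    for _ in range(inner_steps):
        loss = (loss_fn(masked_params(w0, w, m), illegal_batch)
                + loss_fn(masked_params(w0, w, m), innocent_batch))
        grads = torch.autograd.grad(loss, w)
        with torch.no_grad():
            for wi, gi in zip(w, grads):
                wi -= lr_w * gi
    # Upper level (mask learning loop): update m so the fine-tuned model
    # loses representation power in the illegal domain while the
    # innocent-domain loss stays low. The gradient through the inner
    # updates is truncated -- a common bilevel approximation.
    objective = (-loss_fn(masked_params(w0, w, m), illegal_batch)
                 + loss_fn(masked_params(w0, w, m), innocent_batch))
    m_grads = torch.autograd.grad(objective, m)
    with torch.no_grad():
        for mi, gi in zip(m, m_grads):
            mi -= lr_m * gi
            mi.clamp_(0.0, 1.0)  # keep the relaxed mask in [0, 1]
```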

Qualitative Examples of Generated Images

The following figures show qualitative examples of images generated in the illegal domain, covering 10 subjects in our FF25 dataset, after applying FreezeAsGuard-30% to the fine-tuning of SD v1.5. For each prompt, all images are generated with the same random seed.

Qualitative Examples 1
Qualitative Examples 2
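For reference, a minimal diffusers sketch of this fixed-seed protocol is shown below; the checkpoint ID, prompt, and seed are illustrative, and the fine-tuned weights under comparison would be loaded in place of the base checkpoint.

```python
import torch
from diffusers import StableDiffusionPipeline

# Load SD v1.5 (the base checkpoint ID is illustrative; each compared
# fine-tuned model would be loaded here instead).
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

prompt = "a portrait photo of <subject>"  # placeholder subject prompt
# Fixing the generator seed per prompt makes images comparable across
# models: differences come from the weights, not the sampling noise.
generator = torch.Generator("cuda").manual_seed(42)
image = pipe(prompt, generator=generator).images[0]
image.save("sample.png")
```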

Team

Kai Huang
Graduated Ph.D. in Electrical and Computer Engineering

Wei Gao
Associate Professor at University of Pittsburgh