Generative Visual Prompt: Unifying Distributional Control of Pre-Trained Generative Models

Abstract

Generative models (e.g., GANs and diffusion models) learn the underlying data distribution in an unsupervised manner. However, many applications of interest require sampling from a particular region of the output space or sampling evenly over a range of characteristics. For efficient sampling in these scenarios, we propose Generative Visual Prompt (PromptGen), a framework for distributional control over pre-trained generative models by incorporating knowledge of other off-the-shelf models. PromptGen defines control as energy-based models (EBMs) and samples images in a feed-forward manner by approximating the EBM with invertible neural networks, which avoids optimization at inference. Our experiments show that PromptGen can efficiently sample from several unconditional generative models (e.g., StyleGAN2, StyleNeRF, diffusion autoencoder, NVAE) in a controlled and/or de-biased manner using various off-the-shelf models: (1) with the CLIP model as control, PromptGen can sample images guided by text, (2) with image classifiers as control, PromptGen can help de-bias generative models across a set of attributes or attribute combinations, and (3) with inverse graphics models as control, PromptGen can sample images of the same identity in different poses. (4) Finally, PromptGen reveals that the CLIP model shows a "reporting bias" when used as control, and PromptGen can further de-bias this controlled distribution in an iterative manner.

Pose Control

De-biasing a Generator

Biases in the training data carry over into generative models. For instance, CLIP-guided PromptGen with the sentence "Photo of a person without makeup" produces mostly images of females. To counteract this undesired effect, PromptGen can add a classifier as control and make the generator fairer with respect to gender.
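The idea of correcting a biased sampler with an energy function can be illustrated with a toy sketch. This is not the PromptGen implementation: the attribute sampler, the 80/20 split, and all function names below are hypothetical, the attribute marginal is assumed to be known rather than predicted by a classifier, and weighted resampling stands in for the invertible neural network that PromptGen uses to sample in a feed-forward manner.

```python
import math
import random


def sample_biased_attribute(rng, p_female=0.8):
    """Hypothetical stand-in for 'generate an image, then classify it'.

    Returns 1 (female) with probability p_female, else 0 (male).
    """
    return 1 if rng.random() < p_female else 0


def debias_energy(attr, p_female=0.8):
    """Energy E(a) = log p(a) - log 0.5.

    The weight exp(-E(a)) = 0.5 / p(a) cancels the assumed bias, so the
    reweighted distribution over the attribute is uniform.
    """
    p = p_female if attr == 1 else 1.0 - p_female
    return math.log(p) - math.log(0.5)


def resample_with_energy(samples, energy_fn, rng, k):
    """Draw k samples with probability proportional to exp(-energy)."""
    weights = [math.exp(-energy_fn(s)) for s in samples]
    return rng.choices(samples, weights=weights, k=k)


rng = random.Random(0)
raw = [sample_biased_attribute(rng) for _ in range(20000)]
balanced = resample_with_energy(raw, debias_energy, rng, k=20000)

raw_rate = sum(raw) / len(raw)                 # fraction labeled female before
balanced_rate = sum(balanced) / len(balanced)  # fraction labeled female after
print(f"before: {raw_rate:.2f}, after: {balanced_rate:.2f}")
```

Under these assumptions the fraction of samples labeled female moves from roughly the biased rate toward one half. Resampling of this kind discards and repeats samples, which is one reason a feed-forward sampler that approximates the EBM directly is attractive.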

BibTeX

@inproceedings{promptgen2022,
  title={Generative Visual Prompt: Unifying Distributional Control of Pre-Trained Generative Models},
  author={Chen Henry Wu and Saman Motamed and Shaunak Srivastava and Fernando De la Torre},
  booktitle={Thirty-Sixth Conference on Neural Information Processing Systems},
  year={2022},
  url={https://openreview.net/forum?id=Gsbnnc--bnw}
}
