ReVersion: Diffusion-Based Relation Inversion from Images

Abstract

Diffusion models have gained increasing popularity for their generative capabilities. Recently, there has been a surging need to generate customized images by inverting diffusion models from exemplar images, and existing inversion methods mainly focus on capturing object appearances (i.e., the "look"). However, how to invert object relations, another important pillar of the visual world, remains unexplored. In this work, we propose the Relation Inversion task, which aims to learn a specific relation (represented as a "relation prompt") from exemplar images. Specifically, we learn a relation prompt with a frozen pre-trained text-to-image diffusion model. The learned relation prompt can then be applied to generate relation-specific images with new objects, backgrounds, and styles.

To tackle the Relation Inversion task, we propose the ReVersion Framework. Specifically, we propose a novel "relation-steering contrastive learning" scheme to steer the relation prompt towards relation-dense regions and disentangle it from object appearances. We further devise "relation-focal importance sampling" to emphasize high-level interactions over low-level appearances (e.g., texture, color). To comprehensively evaluate this new task, we contribute the ReVersion Benchmark, which provides various exemplar images with diverse relations. Extensive experiments validate the superiority of our approach over existing methods across a wide range of visual relations. Our proposed task and method could inspire future research in domains such as generative inversion, few-shot learning, and visual relation detection.
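The two components named in the abstract can be illustrated with a minimal sketch. The abstract gives no formulas, so everything specific here is an assumption made for illustration: the cosine-shaped timestep weighting, the InfoNCE-style form of the contrastive loss, the temperature value, and the hypothetical function names `timestep_weights`, `sample_timesteps`, and `steering_loss`. It is not the authors' implementation.

```python
import math
import random


def timestep_weights(num_steps, alpha=0.5):
    """Unnormalised sampling weight for each timestep t = 0..num_steps-1.

    Assumed form: w(t) = 1 - alpha * cos(pi * t / num_steps). The weight
    grows with t, so noisier (larger-t) steps, where coarse layout rather
    than texture is decided, are drawn more often.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return [1.0 - alpha * math.cos(math.pi * t / num_steps)
            for t in range(num_steps)]


def sample_timesteps(num_steps, batch_size, alpha=0.5, rng=None):
    """Draw a batch of timesteps according to the skewed weights above."""
    rng = rng or random.Random()
    weights = timestep_weights(num_steps, alpha)
    return rng.choices(range(num_steps), weights=weights, k=batch_size)


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def steering_loss(relation, positives, negatives, temperature=0.07):
    """InfoNCE-style loss (assumed form) for a relation prompt embedding.

    `positives` stand in for embeddings from a relation-dense region and
    `negatives` for embeddings tied to object appearance. The loss falls
    as `relation` aligns with positives and moves away from negatives.
    """
    pos = sum(math.exp(cosine(relation, p) / temperature) for p in positives)
    neg = sum(math.exp(cosine(relation, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))
```

In a training loop, a batch of timesteps from `sample_timesteps` would replace uniform timestep sampling for the denoising loss, and `steering_loss` would be added as a regulariser on the learnable relation prompt while the diffusion model stays frozen. Setting `alpha=0` recovers uniform sampling.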

BibTeX

If you find our work useful, please consider citing our paper:

@inproceedings{huang2023reversion,
  title={{ReVersion}: Diffusion-Based Relation Inversion from Images},
  author={Huang, Ziqi and Wu, Tianxing and Jiang, Yuming and Chan, Kelvin C.K. and Liu, Ziwei},
  booktitle={SIGGRAPH Asia 2024 Conference Papers},
  year={2024}
}

ReVersion Framework

Diverse Relations

Inverting diverse relations and applying them to new objects.

More results.

Diverse Backgrounds and Styles

The relation inverted by ReVersion can be applied robustly to related entities in scenes with diverse backgrounds or styles.

Robust Entity Combinations

The inverted relation can be robustly applied to arbitrary entity combinations.
