LoRA-Composer: Leveraging Low-Rank Adaptation for Multi-Concept Customization in Training-Free Diffusion Models
1State Key Lab of CAD&CG, Zhejiang University, China | 2The School of Software Technology, Zhejiang University, China |
3Fabu Inc. China | 4Tencent Inc. China |
📖TL;DR: LoRA-Composer allows users to generate multi-concept images with fewer conditions, using readily accessible LoRA techniques (only a prompt and a layout condition are required).
For example, our method supports the following 16 customized concepts. 👇 👇 👇
Abstract
Customization generation techniques have significantly advanced the synthesis of specific concepts across varied contexts. Multi-concept customization emerges as a challenging task within this domain. Existing approaches often rely on training a fusion matrix over multiple Low-Rank Adaptations (LoRAs) to merge various concepts into a single image. However, we find that this straightforward method faces two major challenges: 1) concept confusion, which occurs when the model cannot preserve distinct individual characteristics, and 2) concept vanishing, where the model fails to generate the intended subjects. To address these issues, we introduce LoRA-Composer, a training-free framework designed to seamlessly integrate multiple LoRAs, thereby enhancing the harmony among different concepts within generated images. LoRA-Composer addresses concept vanishing through Concept Injection Constraints, enhancing concept visibility via an expanded cross-attention mechanism. To combat concept confusion, Concept Isolation Constraints are introduced, refining the self-attention computation. Furthermore, Latent Re-initialization is proposed to effectively stimulate concept-specific latents within designated regions. Our extensive testing shows a notable improvement in LoRA-Composer's performance over standard baselines, especially when image-based conditions such as canny edges or pose estimations are removed.
Main Observation
Our method distinguishes itself from Mix-of-Show by eliminating both the image-based conditions (the sketch shown above) and the requirement to train a LoRA fusion matrix. Furthermore, we highlight the limitations of Mix-of-Show through failure cases. In the top row, we illustrate two key issues: concept vanishing, marked by the absence of intended concepts in the image, and concept confusion, where the model mistakenly merges distinct concepts.
Method Overview
LoRA-Composer utilizes textual, layout, and image-based conditions (optional) to integrate and customize multiple concepts through Latent Re-initialization for precise layout generation.
Modifications to the Stable Diffusion U-Net in the LoRA-Composer block include Concept Isolation in self-attention and Concept Injection in cross-attention, optimizing for accurate concept placement while preventing feature leakage across concepts.
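The concept-isolation idea in self-attention can be sketched as restricting each concept's spatial tokens to attend only within that concept's layout box, while background tokens attend globally. This is a minimal single-head illustration under our own assumptions (the function name, the uniform background handling, and the mask semantics are not taken from the paper's implementation):

```python
import torch
import torch.nn.functional as F

def isolated_self_attention(q, k, v, concept_masks):
    """Region-restricted self-attention (illustrative sketch).

    q, k, v:        (N, d) flattened spatial tokens.
    concept_masks:  (C, N) boolean, one mask per concept layout box.
    """
    N, d = q.shape
    # By default every token may attend everywhere (background behavior).
    allow = torch.ones(N, N, dtype=torch.bool)
    for m in concept_masks:
        # Tokens inside a concept box attend only within that same box,
        # preventing feature leakage from other concepts.
        allow[m] = m
    attn = (q @ k.T) / d ** 0.5
    attn = attn.masked_fill(~allow, float('-inf'))
    return F.softmax(attn, dim=-1) @ v
```

With zeroed queries and keys the attention is uniform over the allowed set, so a token inside a box averages only the values inside that box, while a background token averages all values.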
Three Highlights of LoRA-Composer
Multi-Concept Generation (without image-based conditions)
Multi-Concept Generation (with image-based conditions)
Yellow boxes emphasize the issue of concept confusion, while red boxes underscore instances of concept vanishing.
BibTeX