Diffusion Art or Digital Forgery? Investigating Data Replication in Diffusion Models

CVPR 2023

Understanding and Mitigating Copying in Diffusion Models

NeurIPS 2023

Abstract

Cutting-edge diffusion models produce images with high quality and customizability, enabling them to be used for commercial art and graphic design purposes. But do diffusion models create unique works of art, or are they replicating content directly from their training sets? In this work, we study image retrieval frameworks that enable us to compare generated images with training samples and detect when content has been replicated. Applying our frameworks to diffusion models trained on multiple datasets including Oxford flowers, Celeb-A, ImageNet, and LAION, we discuss how factors such as training set size impact rates of content replication. We also identify cases where diffusion models, including the popular Stable Diffusion model, blatantly copy from their training data.
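The retrieval framework in the abstract boils down to embedding each generated image and each training image with a feature extractor, then flagging a generation whose nearest training neighbor exceeds a similarity cutoff. The sketch below assumes precomputed feature vectors and an illustrative threshold; the function names and the 0.5 cutoff are not from the papers.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two feature vectors (plain lists of floats)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def nearest_training_match(gen_feat, train_feats, threshold=0.5):
    """Find the training feature closest to a generated image's feature.

    Returns (index, score, flagged). `threshold` is an illustrative
    replication cutoff, not a value reported in the papers.
    """
    scores = [cosine_similarity(gen_feat, t) for t in train_feats]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best], scores[best] >= threshold

# Toy example with 2-D stand-ins for real image embeddings:
idx, score, flagged = nearest_training_match(
    [1.0, 0.0],
    [[0.0, 1.0], [1.0, 0.1]],
)
```

In practice the features would come from a learned copy-detection or self-supervised encoder rather than raw pixels, since replication can survive crops, flips, and color shifts that pixel-space comparison misses.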

Cite us


  @inproceedings{somepalli2022diffusion,
    title={Diffusion Art or Digital Forgery? Investigating Data Replication in Diffusion Models},
    author={Somepalli, Gowthami and Singla, Vasu and Goldblum, Micah and Geiping, Jonas and Goldstein, Tom},
    booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
    year={2023}
  }

  @article{somepalli2023understanding,
    title={Understanding and Mitigating Copying in Diffusion Models},
    author={Somepalli, Gowthami and Singla, Vasu and Goldblum, Micah and Geiping, Jonas and Goldstein, Tom},
    journal={Advances in Neural Information Processing Systems},
    volume={36},
    pages={47783--47803},
    year={2023}
  }

Last updated Apr 2, 2024