Guest Blog
When OpenAI launched the new, paid version of its AI tool DALL·E 2, its licensing terms changed as well. In this short post we explain why we love the new advances in text-to-image generation technology, but also why we encourage you to be careful if you plan to use any of these images for commercial purposes.
The technology used in DALL·E 2 (sorry if this becomes a bit geeky)
Have you heard about DALL·E 2 from OpenAI, the AI tool that magically creates images from text? If you have, then you need to read this too. On the surface it seems pretty amazing. Ask DALL·E 2 to generate a photo of a flight attendant, and here is what you get:
source: DALL·E 2 (openai.com)
Some have argued that too many of the DALL·E 2 results are stereotyped and biased, but that is not the real problem here. The real problem is the legality of these images.
Some of the images created are as close to old-school stock photos as you can get, and for a reason. Even though OpenAI refuses to disclose all of the datasets it has used to train its AI, it’s clear that these images are not an artistic AI imagining what a flight attendant would look like. They are images from old image databases and poor stock photos, replicated almost 1:1.
If you look deeper into the documentation of DALL·E 2, you find a number of research papers:
[2204.06125] Hierarchical Text-Conditional Image Generation with CLIP Latents (arxiv.org)
[2112.10741] GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models (arxiv.org)
https://cdn.openai.com/papers/dall-e-2.pdf
According to the papers, the only public dataset that has been used to train DALL·E 2 is the so-called COCO dataset (there might be more that the company doesn’t want to disclose).
This dataset was collected as part of a project sponsored by Microsoft and others. Sorry to get a bit technical now, but the dataset consists of two things:
1) The annotations on the images. An annotation basically means a note, text and/or description added to each image. These annotations have been released under the so-called Creative Commons Attribution 4.0 License.
2) The images themselves. These images do NOT belong to the COCO project. They have been sourced from Flickr and must therefore abide by the Flickr Terms of Use.
And as it says on the COCO website: “The users of the images accept full responsibility for the use of the dataset, including but not limited to the use of any copies of copyrighted images that they may create from the dataset.”
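For the geeky readers: the COCO annotation files are JSON, and they record a license for each image through a "licenses" list and a "license" id on every image entry. The Python sketch below shows one way you might tally how many images fall under each license. The sample data is a hand-written, hypothetical stand-in in the same shape, not real COCO content, and the function name is our own.

```python
from collections import Counter

# Hypothetical sample in the shape of a COCO annotation file.
# Real files are much larger and would be loaded with json.load().
sample = {
    "licenses": [
        {"id": 1, "name": "Attribution-NonCommercial-ShareAlike License"},
        {"id": 4, "name": "Attribution License"},
    ],
    "images": [
        {"id": 101, "license": 1, "flickr_url": "https://example.com/a.jpg"},
        {"id": 102, "license": 4, "flickr_url": "https://example.com/b.jpg"},
        {"id": 103, "license": 1, "flickr_url": "https://example.com/c.jpg"},
    ],
}


def count_images_per_license(coco):
    """Return a Counter mapping license name to number of images."""
    # Map each license id to its human-readable name.
    names = {lic["id"]: lic["name"] for lic in coco["licenses"]}
    # Look up the license name of every image; unknown ids are flagged.
    return Counter(names.get(img["license"], "unknown") for img in coco["images"])


counts = count_images_per_license(sample)
for name, n in counts.most_common():
    print(f"{n} image(s): {name}")
```

A tally like this only tells you which license each photographer chose on Flickr. It does not tell you whether an AI-generated image that closely resembles one of those photos is safe to use commercially.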
So what does this mean for you as a normal user of photos, videos, etc., and for your daily marketing and communication? You need to be careful and think twice!
Read further here: Be careful using DALLE-2 images for commercial purposes (jumpstory.com)