AI, Data Protection, Copyright

Who owns the image? Who is responsible for the images? Do AI-generated images have to be labeled? What were AI models trained on? What do I need to consider?

Symbolic image for AI. Credits: Nora Prinz

Who owns the image? Who is responsible for the images?
Under the current Copyright Act (UrhG), responsibility lies with whoever created the work and/or published it (e.g., on their website). Anyone who produces or shares violent or otherwise illegal content with the help of AI can be held legally accountable. An AI-generated image is protected by copyright only if the AI played a subordinate role in the creative process. As of 2025, a prompt alone is not considered sufficient for this. This means that images generated purely from a prompt with tools such as DALL·E, Midjourney, or Stable Diffusion are not protected by copyright. How a sufficient creative contribution can be proven has not yet been resolved.

Do AI-generated images have to be labeled as such?
AI-generated images that depict real people, or that could be mistaken for real people, must always be labeled as AI-generated. Providers of image-generation AI also set their own labeling rules, which should be observed. In most cases, it is sufficient to name the software provider below the image.

What were the models trained on?
Image-generation models need millions of images for training. For example, the dataset on which Stable Diffusion was trained contains more than 5.8 billion references to images and their descriptions. These were collected by “crawlers”, programs that search the web and gather large quantities of images and descriptions as raw material for training AI models.

These images include personal data, such as nude images and bank information, as well as copyrighted works by designers and creatives. Because models are trained on individual artists' image files, the public can create “mimicries”, reproductions of a specific artist's style. You can protect your own images using a technique known as “data poisoning.” An article on two tools for protecting your own images can be found here: Link

The Stable Diffusion training dataset is one of the few that is publicly accessible. It can be searched on the website haveibeentrained.com to check whether personal photos or works have been used. As of 2025, the EU AI Act is a first step towards mandating the disclosure of AI training data for all providers.

What do I need to consider?

Anyone who wants to train their own models should either use only their own images or obtain permission from the copyright holders. It is also possible to use images that are licensed for such use (e.g., under Creative Commons licenses).
AI-generated images are usually free of copyright. However, different AI services have their own rules regarding the use of generated images. Therefore, the guidelines of the respective providers should be checked.
If AI-generated images reference existing works such as pictures, films, novels, or fictional characters, they may infringe the copyright of the original creators. However, the pastiche exception (“Pastiche-Schranke”) applies: parodies, caricatures, and pastiches (e.g., homage or satire) are permitted, even when they reference copyrighted works.

Sources
https://www.zdf.de/nachrichten/wirtschaft/urheberrecht-kuenstliche-intelligenz-ki-internet-100.html
Lecture: “(Neue) Regeln zu künstlicher Intelligenz (KI): Spotlight Urheberrecht”, https://vimeo.com/889103149/8870a48e7d
https://interaktiv.br.de/ki-trainingsdaten/index.html
