January 12, 2024 | Posted in News
No doubt, the generative AI space has been exploding since 2022. It left the research stage and became mainstream in 2023. It is getting harder and harder – at least for the human brain – to tell generated content (images, videos, text, audio) from genuine content nowadays. You won’t even know whether this blog post was created by me reflecting on the topic and actually writing the text, or by me throwing a prompt at some text generation service and copying the result here.
It started with the first attempts to create images with Generative Adversarial Networks (GANs) in 2014 (Ian Goodfellow, University of Montreal). All of this was made possible by the advent of deep learning and (convolutional) neural networks, utilizing massive GPU processing power and the vast amount of data available today. Most people do not understand what is going on under the hood or grasp the concept, unless they are really into computer science and AI. The results are not transparent at all; eventually this may improve through regulation and legislation (remains to be proven!).
I enjoy using the tools, and a couple of my blog posts talk about GenAI or about using the tools for content creation – mostly images, created with DALL-E, Midjourney, or a local Stable Diffusion setup.
I believe I can (still?) distinguish and spot generated images, especially ones used unaltered straight out of the image generator, mostly created from simple text prompts. More and more images in my LinkedIn stream (presented to me) are artificially created. That is no problem if an image serves a decorative purpose as a design element; the problem starts when the image is supposed to document facts, actual events, or people.
A few hints on how to spot these images, using an actual, more or less randomly chosen, website as an example. I am not picking on the content or the authors; I just want to highlight how you can identify generated images. Let’s have a look at this website about travel and mobility technology (TNMT). I came across this article about passenger frustration in my LinkedIn stream, and the lead image grabbed my attention (doing a good job, because that’s its purpose). See the screenshot of the post below.
At first sight it looks real: a passenger in a crowd or queue, a frustrating scene (delay, cancellation, ..). The primary subject is very photorealistic, but the background reveals that this is not a real photo. (Quite often it is the background that tips me off.)
Usually generated images are not highlighted as such and there is no mention of the photographer or the image source.
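Related to the missing source attribution, there is a quick technical check you can run yourself (a heuristic, not proof): camera photos almost always carry Exif metadata (camera model, exposure, timestamps), while images straight out of a generator usually carry none. Below is a minimal sketch in Python that scans raw JPEG bytes for an Exif APP1 segment; the byte strings are fabricated stand-ins for illustration, not real images.

```python
def has_exif(jpeg_bytes: bytes) -> bool:
    """Rough check: does a JPEG byte stream contain an Exif APP1 segment?

    Absence of Exif is only a hint that an image never saw a camera;
    metadata is trivially stripped (e.g. by social media platforms)
    or faked, so treat the result as one clue among several.
    """
    return b"\xff\xe1" in jpeg_bytes and b"Exif\x00\x00" in jpeg_bytes

# Fabricated byte snippets for illustration (not valid full images):
camera_like = b"\xff\xd8\xff\xe1\x00\x20Exif\x00\x00MM\x00\x2a"
generator_like = b"\xff\xd8\xff\xdb\x00\x43\x00"

print(has_exif(camera_like), has_exif(generator_like))  # True False
```

In practice, a dedicated tool like exiftool gives you the full metadata picture instead of this rough marker scan.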
Some artefacts in the image that support the suspicion:
Another important tool is reverse image search: check whether the image is used anywhere else. In the past, editorial images were dominated by stock photos, and a reverse search would return dozens of references – but not for generated images. Why bother to copy or steal a generated image if you can create a new one with five mouse clicks? You can use TinEye or the respective Google service. For the above image:
I recommend doing this whenever you doubt the source of images in the news, especially ones coming from social media platforms.
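As an aside on how such services find near-duplicates rather than only exact copies: they typically rely on perceptual hashing, where visually similar images produce similar fingerprints. The toy sketch below implements a difference hash (dHash) over a plain grayscale pixel grid; it is purely illustrative, as real reverse-search engines use far more robust matching.

```python
def dhash(pixels):
    """Difference hash over a row-major grayscale grid.

    Each pixel is compared to its right-hand neighbour; a 9x8 grid
    yields a 64-bit fingerprint. Illustrative toy implementation.
    """
    bits = 0
    for row in pixels:
        for left, right in zip(row, row[1:]):
            bits = (bits << 1) | (1 if left > right else 0)
    return bits

def hamming(a, b):
    """Count differing bits; a small distance means a near-duplicate."""
    return bin(a ^ b).count("1")

# Two almost identical 9x8 grids; one pixel is brightened so that
# exactly one neighbour comparison flips.
grid_a = [[(x * 7 + y * 13) % 256 for x in range(9)] for y in range(8)]
grid_b = [row[:] for row in grid_a]
grid_b[0][0] = 10

print(hamming(dhash(grid_a), dhash(grid_b)))  # → 1 (near-duplicate)
```

A small Hamming distance between fingerprints flags two images as variants of the same picture, which is why re-encoded or slightly cropped copies still show up in a reverse search.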
Additionally, you can try online tools that claim to identify generated images, but be aware that you will get very different responses from them. Some tests, in no particular order:
Let’s dissect a few more images from the same website. It seems they started using generated images instead of stock images in November 2022; before that, all images were stock photos.
My conclusion and recommendation: feel free to use generated images for decorative or editorial purposes, but consider marking them as generated, at least somewhere in the text. And avoid image rubbish and obvious artefacts.