Artificial intelligence has been making headlines, especially since the release of ChatGPT a few months ago. Beyond the technical prowess, what has impressed most is the arrival of generative artificial intelligence and its ability to create text, images and other content of remarkable quality. But should we call AI a revolution, or is it just an expected technological evolution? Is it a threat to the creative industries, or a potential ally for creators and the art industry? Let’s start by understanding what’s behind this AI “thing”.
Simulation, not expression
We have to remember that, even if we talk about intelligence, what we are really looking at is very well-built machinery. Artificial intelligence today is still algorithms: mathematical formulas translated into 0s and 1s by very highly skilled computer engineers. When we “chat” with ChatGPT, the answers are not so much the expression of an intelligence as the result of a complex mathematical computation. The system simulates, very well, but does not express itself in the human sense.
It still looks a lot like the “Mechanical Turk” of the 18th century: a human-sized automaton, apparently gifted with the ability to play chess. It fooled all the courts of Europe, up to Napoleon. Hidden under its base was in fact a real chess player, in flesh and blood, who both animated the automaton and, of course, made the decisions. Simulation.
The big difference with artificial intelligence is that there is no human behind the machine in real time to help it make decisions. Humans have written the code and logic upstream so that the machine can autonomously give the impression of a spontaneous answer.
Continuous technological progress
Today, the machine is capable of learning and making decisions thanks to complex mathematical models. Some of the most popular ones today mimic the structure of our brain by mathematically reproducing its neural networks.
The first attempts date back to 1957, and the concept of machine learning dates back to 1959. Beyond major advances in the algorithms themselves, recent progress is mainly linked to the computing power of our machines, which makes it possible to build very complex neural networks, and to the volume of data available (thanks to the internet), which makes it possible to build training datasets composed of billions of data points.
When we interact in natural language, that is, in perfectly ordinary sentences, with a conversational AI like ChatGPT, our words are analyzed one by one. Statistical algorithms that have absorbed a huge volume of information estimate the probability of each word that will make up the response, choose the information to include, summarize, and keep track of the messages previously exchanged.
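To make the idea of "the probability of each word" more tangible, here is a deliberately tiny sketch in Python. It is not how ChatGPT works internally (which relies on a large neural network); it simply counts which word follows which in a made-up corpus and turns those counts into probabilities. The corpus and the function names are assumptions invented for this illustration.

```python
from collections import Counter, defaultdict

# Toy corpus invented for this example; a real model learns from billions of words.
corpus = "the cat sat on the mat . the cat ate the fish .".split()

# Count how often each word follows each other word.
follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def next_word_probabilities(prev):
    """Return {word: probability} for the word that follows `prev`."""
    counts = follows[prev]
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}

def most_likely_next(prev):
    """Pick the most probable next word (assumes `prev` appears in the corpus)."""
    probs = next_word_probabilities(prev)
    return max(probs, key=probs.get)

probs_after_the = next_word_probabilities("the")
best_after_the = most_likely_next("the")
```

In this toy corpus, "the" is followed by "cat" twice and by "mat" and "fish" once each, so "cat" gets a probability of 0.5 and is chosen as the most likely next word. A conversational AI does something comparable in spirit, but at a vastly larger scale and taking the whole preceding conversation into account.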
Existing and impactful solutions
What can artificial intelligence bring us? Let’s take a concrete example from a project carried out in part by Google engineers: the analysis of an archive of Japanese print illustrations from the National Institute of Japanese Literature, the Kyoto University Rare Materials Digital Archive and the Keio University Media Center (KaoKore dataset, etc.; Yingtao Tian et al., arXiv:2002.08595).
The archive comprises several thousand images, so extracting all the information contained in its texts and images by hand was difficult, if not impossible. Thanks to artificial intelligence, the analysis of illustrations and texts could be automated.
For the texts, existing OCR (optical character recognition) algorithms had to be adapted so that the machine could interpret the handwritten characters and transcribe the texts.
For the images, a database of faces had to be created. 8,573 portraits of 250×250 pixels were extracted from the prints and catalogued by gender (male, female) and status (warrior, nobleman, etc.) to train the machine to recognize and classify faces.
And the result? New statistical data to better understand this set of prints. How many women were represented? Were most of the women noble? Were the warriors always men? One of the powers of artificial intelligence is to increase our ability to classify and to extract meaning and knowledge.
Analyzing the text, analyzing the image: that is the easy part. But with these datasets and models, engineers have also been able to create: new algorithms that mimic the style of the prints, with their flat colors and brush strokes, and that can then be instructed to produce prints of their own. What is most interesting in this example is how the power of today’s algorithms makes it possible to tackle problems that until now seemed very difficult to solve.
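Once every portrait carries labels, answering questions like these becomes simple counting. The sketch below, in Python, uses five labelled portraits invented purely for illustration; they are not taken from the actual dataset, and the field names are assumptions.

```python
from collections import Counter

# Hypothetical labels, invented for illustration; not taken from the dataset.
portraits = [
    {"gender": "female", "status": "noble"},
    {"gender": "male", "status": "warrior"},
    {"gender": "male", "status": "noble"},
    {"gender": "female", "status": "noble"},
    {"gender": "male", "status": "warrior"},
]

# How many portraits per gender, and per (gender, status) pair?
by_gender = Counter(p["gender"] for p in portraits)
by_pair = Counter((p["gender"], p["status"]) for p in portraits)

# How many women were represented?
women = by_gender["female"]
# What share of the women were noble?
share_noble_women = by_pair[("female", "noble")] / women
# Were the warriors always men?
warriors_all_men = by_pair[("female", "warrior")] == 0
```

The hard part, and the part where AI helps, is producing reliable labels for thousands of images. The statistics themselves then follow almost for free.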
Available to all
We are not all computer engineers, but everyone can already access these tools in a very intuitive and spontaneous way. And companies can fund their own developments at a well-controlled cost.
We talked about ChatGPT and OpenAI, but there are many others. Stable Diffusion, DALL-E and Midjourney are among the most famous examples of generative AI for images. There are now so many solutions that directories have been created to list them, such as https://topai.tools/ai-assist or https://www.aisearchtool.com.
All of them work on the same model: algorithms trained on extremely large datasets, sometimes of several billion images. At the heart of the interaction is the “prompt”: a more or less precise textual description that gives all the details of what its author expects as a result.
For example, you can ask Midjourney for “a portrait of the Little Prince and his fox, painted by Fra Angelico”, and then try with other famous artists (we tried with Lucian Freud, Arthur Rackham and others; it’s amazing). In all these cases, Midjourney was able to create an image that was largely faithful to the artist’s style.
Why is that? Because this artificial intelligence has in its training data enough images that are referenced and identified as being created by these artists. Again, we are talking here about an ability to copy, rather than true creative potential.
Limited for all?
We must also bear in mind that each artificial intelligence, each engine available online today, has its own dataset. It will therefore be more or less reliable and efficient depending on the type of data it was given during its learning phase.
Generative AIs do not only create images; they can also expand an image, improve its resolution, remove elements, etc. We can easily imagine a company like Adobe increasing the capabilities of its artificial intelligence engines to improve the creative tools it offers artists. This is already the case. You might just not have noticed.
The first big challenge for any creator is to master these new tools. Generative AI relies heavily on “prompts”, whose mechanics you have to understand. It’s a new way of conversing with the machine, just as we have become used to searching on Google with keywords. You write prompts, detailed descriptions that the machine will interpret to produce the final picture, and iterate until the result suits your needs. Learning takes time, is not yet well documented and may differ from one tool to another.
But there are other limitations. General-purpose models can be very limited when it comes to answering… specific questions. OpenAI has launched its “plug-ins” to bring new kinds of data into the system and make its answers more relevant. The future of AI will surely be personalized.
Special models to meet specific needs. Hopefully the answers will be reliable… and unfiltered…
Still many issues to solve
Since AI is developed by humans (for how long?), these algorithms are subject to human biases: gender, race, opinions. Biases that may be unintentional, or not. Try asking certain Chinese AIs about Tian’anmen… It’s also hard to trust what AI tells us in general. No algorithm is yet 100% reliable, and some, including ChatGPT, have “hallucinations”: completely wrong results despite good-quality source information. The results do not cite their sources (though Bing does offer some).
And who chooses the data sources for the learning phase? Which data has been considered “true”, or “fake”? The stakes go beyond AI and its pure technological prowess.
Another challenge, already being intensely debated, is the protection and attribution of copyright. What kind of data can AI use for its learning? Can styles be reproduced? Who can be considered the author of the images produced: the person who writes the prompt, the company that provides the generative engine, or the person who created the reference images, often used without their consent? The question remains complex, even if the first rules are emerging.
An American court has ruled that images produced by AI cannot be protected by copyright, while American artists have sued Midjourney for copyright infringement after discovering that their images were part of the training dataset.
The European Union itself is in the process of finalizing what it has called the Artificial Intelligence Act. Just as it already regulated the use of our personal data with the GDPR, it is now putting in place a regulation on the uses and risks of artificial intelligence, which of course does not only concern the arts but raises many other issues across all industries.
Progress for all, whether we like it or not
Artificial intelligence already exists. It is progressing at an accelerating pace, as did most of the industrial revolutions that have changed our daily lives in the past.
It is a technological revolution that will push the limits of how we create, consume, distribute and sell goods and artistic creations. And everything else.
In fact, it’s already happening. You’ve probably already interacted with AI without knowing it.
For the art world, rather than talking about artificial intelligence, I think we should talk about augmented intelligence, or augmented creativity. There is still a lot of room for human beings, and in the end artificial intelligence will be one more tool that will surely revolutionize our practices. As was the case for video and cinema, before that for sound and radio, and for printing and photography, it is a new tool for creation.
Rather than fighting against revolutions, we must learn to live with them, understand them better and, why not, regulate them. And above all, steer them so that their impact is more positive than negative. Augmented intelligence, rather than artificial.