Professional artists and photographers upset about generative AI companies using their work to train their technology may soon have an effective way to respond without having to go to court.
Generative AI began gaining popularity with the launch of OpenAI’s ChatGPT chatbot almost a year ago. These tools are adept at communicating in a natural, human-like way, but to gain those capabilities they must be trained on vast amounts of data scraped from the web.
Similar generative AI tools can also create images from text prompts, but like ChatGPT, they are trained on images scraped from the web.
This means that the work of artists and photographers is being used – without approval or compensation – by technology companies to develop their generative AI tools.
To address this, a team of researchers has developed a tool called Nightshade that can confuse the training model, causing it to output incorrect images in response to prompts.
Outlined recently in an article by MIT Technology Review, Nightshade “poisons” the training data by making invisible changes to the pixels of a work of art before it is uploaded to the web.
“Using it to ‘poison’ this training data could harm future iterations of image-generating AI models, such as DALL-E, Midjourney, and Stable Diffusion, by rendering some of their output useless — dogs into cats, cars into cows, and so on,” MIT Technology Review reports, adding that the research behind Nightshade has been submitted for peer review.
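The core idea of keeping pixel changes too small for a human to notice can be sketched in a few lines. This toy example is not Nightshade’s actual algorithm — Nightshade optimizes its perturbations to shift a model’s concept associations — it only illustrates how a perturbation can be strictly bounded so the altered image looks unchanged to the eye (the function name and the `epsilon` bound here are hypothetical, chosen for illustration):

```python
import numpy as np

def add_imperceptible_noise(image: np.ndarray, epsilon: int = 2, seed: int = 0) -> np.ndarray:
    """Perturb each pixel by at most `epsilon` intensity levels on a 0-255 scale.

    A real poisoning attack would compute the perturbation deliberately;
    here we just use bounded random noise to show the imperceptibility constraint.
    """
    rng = np.random.default_rng(seed)
    # Random integer noise in [-epsilon, epsilon] for every pixel and channel.
    noise = rng.integers(-epsilon, epsilon + 1, size=image.shape)
    # Apply the noise and clip back into the valid 8-bit pixel range.
    poisoned = np.clip(image.astype(int) + noise, 0, 255).astype(np.uint8)
    return poisoned

# A flat gray 64x64 RGB test image: the perturbed copy differs from the
# original by no more than `epsilon` at any pixel.
img = np.full((64, 64, 3), 128, dtype=np.uint8)
poisoned = add_imperceptible_noise(img)
max_diff = int(np.max(np.abs(poisoned.astype(int) - img.astype(int))))
print(max_diff)
```

To a viewer, a change of one or two intensity levels per pixel is invisible, yet across billions of training images such deliberate, structured changes are what lets a poisoning tool steer a model’s behavior.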
While image-generating tools are impressive and continually improving, the way they are trained has proven controversial, with many makers of such tools currently facing lawsuits from artists who claim their work has been used without permission or payment.
University of Chicago professor Ben Zhao, who leads the research team behind Nightshade, said such tools could help shift the balance of power back into the hands of artists, and provide a warning to tech companies that ignore copyright and intellectual property.
“Data sets for large AI models can consist of billions of images, so the more toxic images fed into the model, the greater the damage caused by the technique,” MIT Technology Review said in its report.
When Nightshade is released, the team plans to open source it so others can improve it and make it more effective.
Aware of its disruptive potential, the team behind Nightshade says it should be used as “a last defense for content creators against web scrapers” who don’t respect their rights.
In an effort to address the problem, DALL-E creator OpenAI recently began allowing artists to remove their work from its training data. However, the process has been described as particularly burdensome: artists must submit a copy of each image they want removed, along with a description of that image, and each image requires its own separate request.
Making the removal process easier might discourage artists from turning to tools like Nightshade, which could otherwise cause more problems for OpenAI and others in the long run.