A study reveals that many of the workers who train AI models appear to use AI themselves in the course of their work. The finding calls into question how these training tasks, often paid extremely little, are valued.
MIT Technology Review, the technology magazine of MIT, reports on a study by the École Polytechnique Fédérale de Lausanne into how artificial intelligences are trained. It seems they are increasingly being trained… by other artificial intelligences. The cause is low pay, which pushes workers to automate their tasks.
To be reliable, AI needs training
Let us recall how artificial intelligence models are trained. As you surely know, they need gigantic amounts of data. But not all data is equal: it must be as accurate and reliable as possible, since it shapes the capabilities of the final AI.
As MIT Technology Review points out, “many companies pay casual workers on platforms”, citing Amazon’s Mechanical Turk, the best known of them. Captcha solving, data labeling, text annotation: these are all “micro-tasks”, most often carried out by people living in poor or developing countries. It is a way of working highlighted in particular by Antonio Casilli in his book Waiting for Robots.
To earn more, you have to work faster: the solution is AI
These workers are paid by the task, a few cents each time. To reach a decent hourly rate, they have an incentive to work as fast as possible. To understand the mechanics of this training work, a team of researchers from the École Polytechnique Fédérale de Lausanne hired 44 people via the Mechanical Turk platform to summarize 16 excerpts from medical research articles.
The researchers analyzed the resulting summaries with an AI model they had trained themselves to determine whether or not a text was generated by ChatGPT. The telltale signs are varied: similar sentence structures, little variety in word choice. They also checked keystrokes to find out whether the hired workers had copied and pasted their summaries.
The resulting estimate is that between 33 and 46 percent of the 44 workers appear to have used text generation models such as OpenAI’s ChatGPT. According to researchers Veniamin Veselovsky, Manoel Horta Ribeiro and Robert West, this percentage could rise in the coming years as AIs become more powerful and more accessible. Co-author Robert West clarified his thinking: “I don’t think this is the end of crowdsourcing platforms. It just changes the dynamic.”
Why AI training work needs to be better valued
The problem with using ChatGPT in AI training is that it could gradually introduce further errors into models that are already error-prone, as is plain to see with ChatGPT or Midjourney.
Two AIs, represented by robots, compete // Source: Image created by CssTricks with Midjourney
The aforementioned study shows the need to verify whether data was produced by an AI or by a human, all the more so on training platforms. Checks on these platforms should therefore be strengthened, and AI companies would have more to gain from bringing this training phase in-house. Outsourcing it leads to the exploitation of poor workers, as an investigation by Time showed last January: Kenyan workers were paid less than two dollars an hour to train AI models developed by OpenAI.
If a system is built on exploiting and underpaying workers, it’s always vulnerable to “cheating” (=from their perspective, stopping wage theft). No need to ban text generation tools. We do need to pay decent wages and to acknowledge the status of workers. https://t.co/I5N8S75xtv
— Casilli | @firstname.lastname@example.org (@AntonioCasilli) June 27, 2023
For sociologist Antonio Casilli, the problem is quite simple: “If a system is built on exploiting and underpaying workers, it’s always vulnerable to ‘cheating’.” According to him, there is “no need to ban text generation tools. We do need to pay decent wages and to acknowledge the status of workers.” He is in favor of extending employee status, rather than self-employed status, to those who train AI, in particular on specialized platforms.