As debate and concern about the impact of AI on our everyday lives, jobs, and leisure continue to grow, one aspect is often overlooked: for AI systems to mimic human behavior, they need to learn from us. And, in reality, virtually everyone plays a part in that process.
Every day, we all unknowingly contribute to the training of proprietary AI systems through hundreds of small "tasks" we perform on our phones and computers, such as browsing the web or logging into online services. AI companies capture this effort, without our explicit knowledge, to teach their systems human-like behavior.
The Invisible Workforce: How We Train AI Without Knowing It
Our research group examined these practices. The conclusion of our paper, The unwitting labourer: extracting humanness in AI training, was explicit: everyone who acts as an unwitting AI trainer is currently being exploited as an unpaid worker.
It’s common knowledge that most AI applications employ a process known as machine learning. Through this process, machines learn to execute human tasks (like playing chess, composing songs, or painting) after consuming vast amounts of data relevant to the human capability they are being programmed to replicate. However, data is more multifaceted than most perceive. Data includes readily available information, such as our age, gender, location, and buying behavior. But it also encapsulates the outcomes of countless activities that are uniquely human and that companies covertly compel us to perform.
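The core mechanism is simple to illustrate: a model imitates human judgment only by generalizing from examples that humans have already labeled. Below is a minimal sketch using a 1-nearest-neighbour rule; the feature vectors and labels are invented for illustration and stand in for any human-supplied training data.

```python
# A toy "model" that imitates human judgements purely by generalising
# from human-labelled examples (1-nearest-neighbour; data is invented).

def nearest_neighbour(train, query):
    """Return the human-assigned label of the closest training example."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(train, key=lambda ex: dist(ex[0], query))[1]

# Each example pairs a feature vector with a label a human supplied.
human_labelled = [
    ((0.9, 0.1), "spam"),
    ((0.8, 0.2), "spam"),
    ((0.1, 0.9), "not spam"),
    ((0.2, 0.8), "not spam"),
]

print(nearest_neighbour(human_labelled, (0.85, 0.15)))  # → spam
```

Without the human-labelled examples, the function can decide nothing: every prediction is borrowed from a judgment a person already made.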
Take, for instance, the tool reCAPTCHA. It’s the irritating online test we encounter when attempting to log into an online service or access a webpage, which asks us to confirm that we’re not robots. The tool was initially devised to distinguish humans from bots and keep bots out of online services. Since 2014, however, Google has been using it to amass training data for AI-driven applications, such as autonomous vehicles.
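A plausible sketch of how such challenge responses could become training labels is simple aggregation: many people answer the same challenge, and a majority vote turns their small judgments into a label for the image. The tile names, labels, and voting scheme below are hypothetical, not Google's actual pipeline.

```python
from collections import Counter, defaultdict

# Hypothetical challenge responses: each person who passes a CAPTCHA-style
# test also labels an image tile (all data invented for illustration).
responses = [
    ("tile_1", "traffic light"),
    ("tile_1", "traffic light"),
    ("tile_1", "tree"),
    ("tile_2", "crosswalk"),
    ("tile_2", "crosswalk"),
]

# Tally every human answer per tile.
votes = defaultdict(Counter)
for tile, label in responses:
    votes[tile][label] += 1

# Majority vote turns many small human judgements into training labels.
labels = {tile: counts.most_common(1)[0][0] for tile, counts in votes.items()}
print(labels)  # → {'tile_1': 'traffic light', 'tile_2': 'crosswalk'}
```

Each individual answer takes a second of human effort; aggregated at scale, the result is a labeled image dataset of the kind a self-driving system needs.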
From Music to reCAPTCHA: How Everyday Activities Feed AI Systems
Content recommendation systems offer another prime example of how AI companies exploit human activities. These systems, such as Spotify’s, rely on human-generated data to suggest new music to users. Spotify employs numerous bots to analyze a wide range of music blogs and reviews to understand how music is characterized and what associations people make. Ultimately, recommendation systems become profitable by suggesting things they’ve learned from other people, extracting that human input without obtaining explicit permission.
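One common family of techniques behind such systems is co-occurrence-based recommendation: if many people who listen to one artist also listen to another, the second is suggested to fans of the first. The sketch below uses invented listening histories and is a generic illustration, not Spotify's actual algorithm.

```python
from collections import Counter
from itertools import combinations

# Invented listening histories stand in for human-generated data.
histories = [
    ["artist_a", "artist_b", "artist_c"],
    ["artist_a", "artist_b"],
    ["artist_b", "artist_c"],
    ["artist_a", "artist_c"],
]

# Count how often two artists appear in the same person's history.
cooc = Counter()
for h in histories:
    for x, y in combinations(sorted(set(h)), 2):
        cooc[(x, y)] += 1
        cooc[(y, x)] += 1

def recommend(artist, seen):
    """Suggest the artist most often co-listened with `artist`."""
    candidates = {b: n for (a, b), n in cooc.items()
                  if a == artist and b not in seen}
    return max(candidates, key=candidates.get) if candidates else None

print(recommend("artist_a", {"artist_a"}))
```

Every suggestion the system makes is mined from choices other listeners made; the "intelligence" is an aggregate of their unpaid activity.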
AI systems that produce text, music, visual art, and other outputs from textual prompts have undoubtedly been in the news. ChatGPT has made headlines this year, but there are numerous systems creating human-like content. All of this is possible because we all unknowingly train these systems.
Publicly available image datasets on the internet, like ImageNet, serve as the training data for these systems. These datasets, scraped from sites such as Flickr, YouTube, and Instagram, contain photos taken and art created by humans. Even our interactions with spam filters contribute training data: every time we mark a message as spam, we help the filter classify future messages. This is a form of collaborative filtering, used here for content removal rather than promotion.
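The spam example can be made concrete with a classic approach: a naive Bayes classifier trained on messages users have reported. The messages below are invented, and this is a minimal sketch of the general technique, not any particular provider's filter.

```python
from collections import Counter
import math

# User reports ("mark as spam") become training labels (messages invented).
reports = [
    ("win a free prize now", "spam"),
    ("claim your free prize", "spam"),
    ("meeting moved to noon", "ham"),
    ("lunch at noon today", "ham"),
]

word_counts = {"spam": Counter(), "ham": Counter()}
label_counts = Counter()
for text, label in reports:
    label_counts[label] += 1
    word_counts[label].update(text.split())

def classify(text):
    """Naive Bayes with add-one smoothing over the reported messages."""
    vocab = set(word_counts["spam"]) | set(word_counts["ham"])
    scores = {}
    for label in ("spam", "ham"):
        total = sum(word_counts[label].values())
        score = math.log(label_counts[label] / sum(label_counts.values()))
        for w in text.split():
            score += math.log((word_counts[label][w] + 1) / (total + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)

print(classify("free prize inside"))  # → spam
```

The filter's accuracy comes entirely from the accumulated clicks of people flagging unwanted mail, another stream of unpaid classification work.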
The ethical dilemma here is that people interacting with these systems are generally unaware of these data harvesting practices and the value they’re providing. Moreover, they receive no compensation despite the substantial profits these corporations reap from their data. The claim we put forward is that this unpaid labor, generating surplus value for private companies without consent or compensation, equates to labor exploitation.
These corporations might prefer to label us as “users” or “customers,” but this language effectively hides the exploitation taking place. The power disparity between unsuspecting workers and technology companies is vast. Our societies have allowed a business model based on exploitation to thrive. It should be incumbent upon governments to rectify this power imbalance.
Our paper is published in the Springer journal AI & Society, where it can be accessed freely.