Why human oversight will always be important in machine learning



Dora Moldovan, Managing Director - Braidr

Read Time: 5 Minutes

04 Jan 2017 (Last Updated: 13 May 2024)

Although much of the news coverage on machine learning and artificial intelligence focuses on whether or not future technologies will destroy humankind, back in the real world machine learning is already in use, and is becoming common in online recommendation engines and news and content production.

The technology is still in its infancy and has plenty of scope for further development. However, one certainty that has arisen is that human oversight will always be important in machine learning. Ideally machine learning would be a fully automated process, but recent examples have shown that the input data is too flawed and unpredictable for that, because the input data comes from humans.

Tay, the racist Tweetbot

No case better highlights the need for human oversight in machine learning than Tay, Microsoft’s disgracefully racist Tweetbot that caused international horror and amusement. In March 2016 Microsoft launched a chat bot on Twitter with the aim of engaging 18- to 24-year-olds with machine learning. The chat bot was programmed to learn how to interact on Twitter like a real human by reading and processing real tweets.

Within hours Tay had transformed from an innocent bot into an offensive, loud-mouthed racist. Some Twitter users learned to play the system by telling Tay to “repeat after me…”, which explains some of the most offensive remarks. However, a lot of Tay’s questionable opinions were unprompted and data driven, such as its response to the question “Is Ricky Gervais an atheist?”: “Ricky Gervais learned totalitarianism from Adolf Hitler, the inventor of atheism.”

Tay demonstrates that although chat bots are capable of learning to communicate like humans, their lack of a moral compass is a real problem that can only be corrected with human oversight.

Facebook News Algorithms

Social media plays a huge part in how people consume news. While it is a positive that people can spread news about issues they care about across their network, verifying these news sources is becoming increasingly difficult.

Many people cannot tell fake news from real news, and when fake news spreads and is taken as truth, it can have real-world consequences. Facebook has come under fire for its possible influence over the result of the 2016 US election, as its algorithms pushed fake news.

In August 2016, Facebook sacked the human moderators of its trending topics after accusations of political bias. Soon afterwards, a fake news story about Fox News anchor Megyn Kelly was trending on Facebook, with no human moderators to remove it.

Facebook’s algorithms are built around engagement, and in politically charged times fake political news is particularly engaging. The algorithms themselves are unable to discern real from fake, so it is up to Facebook users to highlight any fake news. Unfortunately, many users do not click the links and verify the sources; instead they just share the emotive headline.

Fake news is a profitable business, and without any human oversight of the machine learning algorithms it is likely to spread further across social media and continue to influence real-world politics.
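To make the point concrete, here is a minimal sketch of an engagement-based trending list with a human review step bolted on. It is not Facebook’s algorithm: the `Story` fields, the engagement weighting and the `FLAG_THRESHOLD` value are all assumptions made up for illustration. It only shows how stories that users have reported could be held back for a moderator instead of being promoted automatically.

```python
from dataclasses import dataclass


@dataclass
class Story:
    headline: str
    shares: int
    comments: int
    user_flags: int = 0      # reports from users who think the story is fake
    reviewed: bool = False   # True once a human moderator has cleared it


# Hypothetical cut-off: this many user reports sends a story to a human.
FLAG_THRESHOLD = 3


def engagement(story):
    """Toy engagement score (assumed weighting): comments count double."""
    return story.shares + 2 * story.comments


def trending(stories, limit=3):
    """Rank by engagement, but hold flagged, unreviewed stories for a human."""
    publish, review_queue = [], []
    for story in sorted(stories, key=engagement, reverse=True):
        if story.user_flags >= FLAG_THRESHOLD and not story.reviewed:
            review_queue.append(story)
        elif len(publish) < limit:
            publish.append(story)
    return publish, review_queue
```

Without the `review_queue` branch, the most engaging story always wins, whether it is true or not. With it, the ranking still runs automatically, but a person makes the final call on anything users have questioned.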

News-Writing Bots

Machine learning is not just used to display news on different platforms. Since 2014 it has also been used to write the news. Through platforms such as Wordsmith, numerical data can be fed in at one end and a perfectly formed news story will come out the other.

If a story is formed around data in a spreadsheet, then a news writing bot is able to produce an article about it. For example, a spreadsheet containing financial data can be turned into an article by a bot that has been programmed to understand the data.

Most articles have a formulaic approach to them, so provided a bot has the correct facts to insert into the article, most readers cannot tell the difference between a bot-written article and a journalist-written article.

Although news-writing bots can save a lot of time, they still need human oversight to ensure they are correctly interpreting the data. As with Tay, news-writing bots will only be as good as the data that they are given by humans.
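The idea of turning a spreadsheet row into a formulaic article, with a human checking anything suspicious, can be sketched in a few lines. This is not how Wordsmith works internally. The column names (`company`, `quarter`, `revenue`, `prior_revenue`), the sentence template and the 50% swing limit are all hypothetical, chosen only to illustrate the pattern.

```python
def percent_change(row):
    """Year-on-year revenue change as a percentage."""
    return (row["revenue"] - row["prior_revenue"]) / row["prior_revenue"] * 100


def check_row(row, max_swing_pct=50.0):
    """Return a list of reasons a human editor should look at this row."""
    if row["prior_revenue"] <= 0 or row["revenue"] < 0:
        return ["revenue figures must be positive"]
    change = percent_change(row)
    if abs(change) > max_swing_pct:
        # A huge swing may be real news, or a typo in the spreadsheet.
        return [f"revenue moved {change:.0f}%, possible data-entry error"]
    return []


def draft_story(row):
    """Return (story, problems); story is None when a human must review."""
    problems = check_row(row)
    if problems:
        return None, problems
    change = percent_change(row)
    direction = "up" if change >= 0 else "down"
    story = (
        f"{row['company']} reported revenue of ${row['revenue']:,.0f} "
        f"for {row['quarter']}, {direction} {abs(change):.1f}% "
        f"on the same period a year earlier."
    )
    return story, []
```

A clean row produces a publishable sentence straight away. A row with an extra zero typed into the revenue column produces no story at all, just a note for an editor, which is exactly the kind of mistake a bot would otherwise report as fact.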

Machine Learning and Google Algorithms

No organisation is more enthusiastic about machine learning than Google. Running regular training for its developers, it is keen to push machine learning into its products.

Google search users already benefit from machine learning, with personalised searches, local searches and quick answers in the Knowledge Graph. However, as more people move away from traditional desktop internet use and prefer to stay on mobile using apps, Google is having to adapt in order to keep its dominant position.

Google is looking to take search to the next level with its products Google Now and Google Now On Tap. Google Now is operated by voice command and allows the user to ask Google for information. Google uses knowledge scraped from numerous sources in order to deliver an answer.

However, Google Now is much more than a pocket information tool. It also interacts with apps, and uses a user’s location, Google Calendar and user history to deliver relevant ‘cards’ without being prompted. The more a user interacts with Google Now, the more relevant these cards will be. For example, it may deliver a card from the Uber app when you arrive at an airport. So far it only works with a select number of apps, but there are plans to open the API up to thousands of developers.

Google Now On Tap takes this a stage further by seamlessly navigating between apps with the required information and taking into account what a user is currently doing on their mobile to deliver results. As this technology develops, human oversight is going to become more and more important to monitor the accuracy of results and to filter out any inaccurate, illegal or damaging content.

By providing content directly to users via machine learning, rather than taking them to the source, Google has to rely on sources being accurate and the bots understanding the sources correctly. Without user feedback on errors and human oversight from Google, there is nothing to stop the machine-learning bots from learning the wrong things and continuing down that path.

Machine learning is a fast-developing technology and one that all developers should have on their radar. However, it won’t be able to operate without human oversight any time soon.