In A.I. We Trust? Tackling Fake News Through Big Data

"The real problem is not misinformation per se as jokes can also be categorised as misinformation," Twitter chief Jack Dorsey addressed a packed audience at the Indian Institute of Technology, Delhi last year. "But misinformation that is spread with the intent to mislead people is a real problem."

But as fake news continues to rear its ugly head again and again, companies like Facebook and Google have kept investing in fact-checking, and their methods have been largely reactive.

In the aftermath of the 2016 U.S. presidential election, Facebook CEO Mark Zuckerberg dismissed the notion that fake news swayed voters as "a pretty crazy idea," failing to account for the fact that a significant chunk of users get their news from social media, and that more than 82 percent of students have trouble distinguishing fake news from real news. What's more, a Pew Research Center study found that 20 percent of social media users have modified their stance on a social or political issue because of content seen on social media.

Facebook may not have taken responsibility for the content posted on the social network, but it did eventually begin asking users to flag articles for their use of "misleading language," before announcing partnerships with outside fact-checkers like Snopes, FactCheck.org, PolitiFact, ABC News and the AP to mark articles "as disputed," with "a link to the corresponding article explaining why." (Snopes pulled out of the partnership last week, citing disagreements over the lack of financial benefits for the journalists involved.)

But these efforts have largely borne little fruit. That's because fake news is not an easy problem to solve. Not only is fake news difficult to detect, but fact-checking itself is time-consuming, and there is constant pressure to expose false claims before they do too much damage.

With artificial intelligence itself contributing to the flood of fake news, in the form of AI bots that generate, amplify and disseminate false news to different audiences, it seems only fitting that artificial intelligence and big data have emerged as powerful tools to track news stories and identify fake news.

Artificial intelligence (AI) algorithms today make it easy to identify underlying patterns in large volumes of data and to make decisions with minimal human intervention. Leveraging this capability, a system can spot fake news by sifting through stories that users of a platform, say Facebook, have previously reported as inaccurate.
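The simplest version of this idea is to compare an incoming headline against stories users have already reported. Here is a minimal sketch using fuzzy string matching; the headline list, function names and the 0.8 threshold are all illustrative assumptions, not any platform's actual system.

```python
from difflib import SequenceMatcher

# Hypothetical headlines that users previously reported as inaccurate.
REPORTED_FALSE = [
    "scientists confirm chocolate cures all known diseases",
    "celebrity secretly replaced by body double, insiders say",
]

def similarity(a: str, b: str) -> float:
    """Ratio in [0, 1] of how closely two headlines match, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def matches_reported_story(headline: str, threshold: float = 0.8) -> bool:
    """Flag a headline if it closely resembles a previously reported story."""
    return any(similarity(headline, r) >= threshold for r in REPORTED_FALSE)

print(matches_reported_story("Scientists confirm chocolate cures all known diseases!"))
```

A production system would use learned text embeddings rather than character-level matching, but the principle is the same: new stories are checked against a growing corpus of known-bad ones.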

In addition, the reputation of a news source can be of great importance in determining whether a story is fake. An AI model trained on the reputations of known websites, along with features such as the domain name and Alexa web rank, can proactively predict the reliability of a new site.

Sensational wording helps disinformation gain traction. Artificial intelligence can once again come in handy here, using keywords as an input to help determine the veracity of a news story.
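At its crudest, keyword-based screening is just a weighted lexicon lookup. The sketch below is a toy illustration, assuming a hand-picked word list and made-up weights; a real system would learn both from labelled data.

```python
# Hypothetical lexicon: sensational words mapped to made-up severity weights.
SENSATIONAL_WORDS = {
    "shocking": 3,
    "miracle": 3,
    "unbelievable": 3,
    "exposed": 2,
    "secret": 2,
}

def sensationalism_score(headline: str) -> int:
    """Sum the weights of sensational words found in a headline."""
    words = headline.lower().split()
    return sum(SENSATIONAL_WORDS.get(w.strip(".,!?"), 0) for w in words)

def needs_review(headline: str, threshold: int = 3) -> bool:
    """Route a headline to human fact-checkers if its score is high enough."""
    return sensationalism_score(headline) >= threshold

print(sensationalism_score("Shocking secret exposed!"))  # 3 + 2 + 2 = 7
```

Note that such a filter only triages: a high score is a reason to look closer, not proof that a story is false.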

Last November, the Massachusetts Institute of Technology (MIT), working with the Qatar Computing Research Institute (QCRI), employed this strategy to rate the value of news by rating the sources producing it.

Researchers at MIT's Computer Science and Artificial Intelligence Lab (CSAIL) and QCRI developed a machine learning system that examines a range of sources and rates them on factors such as language, sentence structure, complexity, and emphasis on features like fairness or loyalty. The project used data from Media Bias/Fact Check (MBFC), whose human fact-checkers rate the accuracy of about 2,000 large and small sites, and fed this information to the algorithm "and programmed it to classify news sites the same way as MBFC."

That's not all. The system also draws on other variables: articles from the site, its Wikipedia page and Twitter account, its URL structure and web traffic, as well as keywords and linguistic features that indicate strong political bias or misinformation. The result is an open-source database of more than 1,000 sources with ratings for accuracy and bias.
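Conceptually, these variables are combined into a single credibility judgment per source. The following is a minimal sketch of that idea, assuming invented feature names and hand-set weights; it is not the CSAIL/QCRI model, which learns its weights from the MBFC-labelled training data.

```python
from dataclasses import dataclass

@dataclass
class SourceFeatures:
    sensational_word_rate: float    # fraction of sensational words in articles
    has_wikipedia_page: bool        # established outlets tend to have one
    has_verified_twitter: bool
    url_mimics_known_outlet: bool   # e.g. a lookalike domain
    traffic_rank: int               # lower = more popular (Alexa-style)

def credibility_score(f: SourceFeatures) -> float:
    """Weighted sum over feature groups; higher means more credible.
    All weights here are illustrative, not learned."""
    score = 0.5
    score -= 2.0 * f.sensational_word_rate
    score += 0.2 if f.has_wikipedia_page else -0.1
    score += 0.2 if f.has_verified_twitter else 0.0
    score -= 0.4 if f.url_mimics_known_outlet else 0.0
    score += 0.1 if f.traffic_rank < 10_000 else -0.1
    return score

def label(f: SourceFeatures) -> str:
    """Binary triage label, mirroring a fact-checker-facing rating."""
    return "likely reliable" if credibility_score(f) > 0.5 else "flag for fact-checkers"
```

The real system replaces this hand-tuned linear rule with a classifier trained to reproduce MBFC's human ratings, but the input — one feature vector per source — has the same shape.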

Although the project is still in its initial stages, there is no doubt that such a tool will help existing fact-checking services, allowing them to "instantly check our 'fake news' scores for those outlets to determine how much validity to give to different perspectives."

Accurate ratings hinge on reliable training data, which in this case came from MBFC. But the algorithm, as we have seen, can identify suspect websites in advance, alerting fact-checkers and media watchdogs when a news outlet starts publishing fake news or distorting facts with persuasive, sensational language.

Along similar lines, researchers from Arizona State University and Michigan State University have developed machine learning models for "early fake news detection," identifying potential factors to spot specific social media accounts and source websites that spread fake news.

Efforts to combat fake news are making progress, but the battle can only be won by a combination of efforts that harnesses the power of citizen journalism alongside artificial intelligence. The takeaway is this: people will always have to be involved in fact-checking in one form or another, helping others understand what's true and what's not. It's ultimately a very human-centric system, but AI and machine learning algorithms can make the process a lot more efficient.

As Preslav Nakov, a senior scientist at QCRI and one of the researchers on the study, said, "It is like fighting spam. We will never stop fake news completely, but we can put them under control."
