The spread of misinformation and disinformation has profound and far-reaching consequences. The economic implications can be quantified in terms of lost revenues or market instabilities, but the social and human costs, such as erosion of trust, deepening societal divides, and mental stress, are equally serious and often less tangible. Politically, the intentional spread of falsehoods can disrupt the very core of democratic processes. In the age of digital information, combating these challenges requires collective efforts from governments, organizations, and individuals to promote media literacy, critical thinking, and open dialogue.
The Impact of Misinformation and Disinformation
Project Alternus Vera
Welcome to the Alternus Vera Research Project, an ongoing open source project supervised by Dr. Ali Arsanjani.
This project creates and provides open source capabilities for students at the universities Dr. Arsanjani advises, including the Computer Engineering Department at San Jose State University and the Halicioglu Data Science Institute (HDSI) at the University of California, San Diego, who work on this problem as part of Machine Learning and Deep Learning courses.
For UCSD Capstone, check here.
For SJSU Master’s Projects, check here.
The objective of this project is to assist in the detection, ranking, and mitigation of misinformation and disinformation: for example, by building a “fake news” detection model that detects and scores the degree of intended misinformation, or fakeness, of a natural language corpus.
With the expanded scope of fake news [2], intentional misinformation, etc., it is important to leverage AI and ML to mitigate this risk to various facets of people’s lives, communities, nations and the world as a whole.
One of the major sources of training data has been the Liar Liar dataset [4], amalgamated with several others [5].
This project aims to compute and provide: a) a veracity score for an input corpus written by a set of authors and published on some medium, and b) a tensor that represents the AlternusVera score for that corpus.
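The two outputs described above can be illustrated with a minimal sketch. It is not the project's implementation: it assumes the tensor is a simple rank-1 vector with one entry per factor, that each factor has already been scored in [0, 1] (1.0 meaning most credible), and that the scalar score is a weighted mean. The factor subset, the neutral default of 0.5 for unscored factors, and the names `FACTORS`, `VeracityResult`, and `veracity` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Hypothetical subset of factor names, borrowed from the factor table in this document.
FACTORS: List[str] = ["Sensationalism", "ClickBait", "Stance", "Source Reputation"]


@dataclass
class VeracityResult:
    vector: List[float]  # one entry per factor, each clamped to [0, 1]
    score: float         # weighted mean of the vector, also in [0, 1]


def veracity(factor_scores: Dict[str, float],
             weights: Optional[Dict[str, float]] = None) -> VeracityResult:
    """Build a per-factor vector and a scalar summary from individual factor scores."""
    weights = weights or {}
    # Assumption: a factor that was not scored is treated as neutral (0.5).
    vector = [min(1.0, max(0.0, factor_scores.get(name, 0.5))) for name in FACTORS]
    # Assumption: a factor without an explicit weight gets weight 1.0.
    w = [weights.get(name, 1.0) for name in FACTORS]
    total = sum(w)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    score = sum(wi * vi for wi, vi in zip(w, vector)) / total
    return VeracityResult(vector=vector, score=score)


if __name__ == "__main__":
    result = veracity({"Sensationalism": 0.2, "ClickBait": 0.4,
                       "Stance": 0.6, "Source Reputation": 0.8})
    print(result.vector, result.score)
```

Keeping the per-factor vector alongside the scalar means a low overall score can be traced back to the factors that caused it, which is in line with the project's stated interest in explainability.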
It is a confluence of several disciplines [3] and areas of research in AI and Machine / Deep Learning, namely Explainability, NLP, NLU, Distillation, Amalgamation, and Fake News detection. It provides a score for the veracity of information content across the following 35 factors, which include elements of cognitive bias, topic modeling, sentiment analysis, entity resolution, media content, and source and site reliability/credibility.
We categorize methods of detecting, ranking and mitigating Veracity in the AlternusVera Project into:
Factuality Factors — see table below.
NodeRank | Political affiliation | Sentimental Analysis | Topic features | Post/Social media activities |
Spam | Visual based | Sensationalism | Writing Style | Reliable source |
Context Veracity | Verifiable Authenticity | Misleading intentions | Confirmation Bias | Psychology utility |
Content Statistics | Social credibility | Frequency heuristic | Credibility and Reliability | Stance |
Source Reputation | Echo Chamber | Location / Geography | Education | Biases |
News Coverage | Malicious Account | Naive Realism | Network-based | Corpus Structure |
Political bias | Event Coverage | Title vs Body | ClickBait | Micro-patterns of Misinformation |
See the full list of Factuality Factors and Veracity Vectors here. For implementation strategies, look here.
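To give a concrete feel for what a single factor might compute, here is a minimal sketch for the "Title vs Body" factor. It is an illustrative assumption, not the project's implementation: it measures what fraction of the distinct words in a title also appear in the body, on the premise that a headline sharing little vocabulary with its article may be misleading. The function names `tokens` and `title_body_overlap` are hypothetical.

```python
import re
from typing import Set


def tokens(text: str) -> Set[str]:
    """Lowercase the text and return its distinct alphanumeric word tokens."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def title_body_overlap(title: str, body: str) -> float:
    """Fraction of distinct title words that also occur in the body (0.0 to 1.0)."""
    title_tokens = tokens(title)
    if not title_tokens:
        return 0.0  # assumption: an empty title carries no supporting evidence
    return len(title_tokens & tokens(body)) / len(title_tokens)


if __name__ == "__main__":
    print(title_body_overlap("Aliens land in Ohio",
                             "Officials in Ohio deny reports."))
```

A practical version would likely remove stop words and compare meaning rather than surface tokens; the word-overlap measure is used here only because it is short and has no dependencies.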
“A lie gets halfway around the world before the truth has a chance to get its pants on.” (attributed to Winston Churchill)
The increasingly widespread dissemination of “fake news”, i.e., news of dubious quality promoting intentionally false information, calls for mitigation, as it threatens to shake the foundations of trust and ethics that drive social institutions.
The extensive spread of fake news has the potential for extremely negative impacts on individuals and society. Fake news detection on social media has recently become an emerging research area that is attracting tremendous attention.
“Fake news” is intentionally written to mislead its readers into believing false information or accusations, or to extrapolate from misleading or false data.
This malicious intent makes fake news difficult to detect from content alone. It is therefore paramount to amalgamate additional contextual features and factors, such as users’ social engagements on social media, sentiment, and political stance, to help make a more definitive and authoritative determination.
However, exploiting this contextual information is challenging in and of itself, as users’ social engagements with fake news produce data that is big, incomplete, unstructured, and noisy.
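The idea of combining a content-based signal with contextual signals, while tolerating incomplete data, can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's method: each signal is assumed to be pre-computed in [0, 1] with higher values meaning more suspicious, missing signals are passed as `None`, and the weights in `WEIGHTS` are hypothetical placeholders rather than learned values.

```python
from typing import Dict, Optional

# Hypothetical, hand-picked weights; a real system would learn these from data.
WEIGHTS: Dict[str, float] = {
    "content": 2.0,
    "sentiment": 0.5,
    "stance": 1.0,
    "engagement": 1.5,
}


def fuse_signals(signals: Dict[str, Optional[float]]) -> float:
    """Weighted average of the signals that are present.

    Missing (None) or unknown signals are skipped and the weights are
    renormalised, so incomplete data does not drag the result toward zero.
    """
    present = {k: v for k, v in signals.items()
               if k in WEIGHTS and v is not None}
    if not present:
        return 0.5  # assumption: no evidence at all means "undetermined"
    total_weight = sum(WEIGHTS[k] for k in present)
    return sum(WEIGHTS[k] * v for k, v in present.items()) / total_weight


if __name__ == "__main__":
    # Stance and sentiment are unavailable for this item.
    print(fuse_signals({"content": 0.8, "sentiment": None,
                        "stance": None, "engagement": 0.4}))
```

Renormalising over the available signals is one simple way to cope with the incompleteness noted above; it does nothing for noise, which would call for smoothing or a learned model.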
"Because the issue of fake news detection on social media is both challenging and relevant, we conducted this survey to further facilitate research on the problem. In this survey, we present a comprehensive review of detecting fake news on social media, including fake news characterizations on psychology and social theories, existing algorithms from a data mining perspective, evaluation metrics and representative datasets. We also discuss related research areas, open problems, and future research directions for fake news detection on social media." [1]
[1] Shu, Kai et al. “Fake News Detection on Social Media: A Data Mining Perspective.” SIGKDD Explorations 19 (2017): 22-36.
[2] Sharma, Karishma et al. “Combating Fake News: A Survey on Identification and Mitigation Techniques.” CoRR abs/1901.06437 (2019): n. pag.
[3] Oshikawa, Ray et al. “A Survey on Natural Language Processing for Fake News Detection.” CoRR abs/1811.00770 (2018): n. pag.
[4] Wang, William Yang. “‘Liar, Liar Pants on Fire’: A New Benchmark Dataset for Fake News Detection.”
[5] Arsanjani et al. “AlternusVera: A Veracity Tensor for Fake News and Misinformation Mitigation.” Submitted, not yet accepted, to Proceedings of EMNLP-IJCNLP 2019: Conference on Empirical Methods in Natural Language Processing & International Joint Conference on Natural Language Processing 2019.
[6] Shu, Kai, 2019, “FakeNewsNet”, https://doi.org/10.7910/DVN/UEMMHS, Harvard Dataverse, V1, UNF:6:f65WJbC1l58CmjSZdPw5ew== [fileUNF].