Browsing by Author "Srinivasa, K."

Clustering and bootstrapping based framework for news knowledge base completion
(Slovak Academy of Sciences, 2021) Srinivasa, K.; Santhi Thilagam, P.S.
Extracting facts, namely entities and relations, from unstructured sources is an essential step in any knowledge base construction. It is also necessary to ensure the completeness of the knowledge base by incrementally extracting new facts from various sources. To date, knowledge base completion has been studied as a problem of knowledge refinement, where missing facts are inferred by reasoning about the information already present in the knowledge base. However, facts missed while extracting information from multilingual sources are ignored. Hence, this work proposes a generic framework for knowledge base completion that enriches a knowledge base of crime-related facts extracted from English-language online news articles with facts extracted from news articles in Hindi, a low-resourced Indian language. Using the framework, information can be extracted from news articles in any low-resourced language without language-specific tools such as POS taggers, provided an appropriate machine translation tool is available. To achieve this, a clustering algorithm is proposed that exploits the redundancy in the bilingual collection of news articles by representing the clusters with knowledge base facts, unlike the existing Bag of Words representation. Within each cluster, the facts extracted from English-language articles are used to bootstrap the extraction of facts from comparable Hindi-language articles. Bootstrapping within the cluster helps to identify sentences in the low-resourced language that carry new information related to the facts extracted from a high-resourced language such as English. The empirical results show that the proposed clustering algorithm produces more accurate and high-quality clusters for monolingual and cross-lingual facts, respectively. Experiments also show that the proposed framework achieves a high recall rate in extracting new facts from Hindi news articles. © 2021 Slovak Academy of Sciences. All rights reserved.
Multi-layer perceptron based fake news classification using knowledge base triples
(Springer, 2023) Srinivasa, K.; Santhi Thilagam, P.S.
Recent attempts to detect fake news have relied on machine learning or deep learning models trained on text. These models, however, are insufficient for classifying knowledge base facts, or triples, as fake or true, even though it is critical to assess the credibility of facts before they are added to the knowledge base. Hence, this paper suggests using a Multi-layer Perceptron to categorize a given triple as fake or true. Furthermore, existing works embed the features using either frequency-based or prediction-based word embedding models, and thus do not capture both document-level and word-level features. To address this issue, a data modeling approach is proposed that vectorizes the triples using two word embedding models, Word2Vec and GloVe, as well as TF-IDF and Count Vectorizer. Empirical results show that the Multi-layer Perceptron with GloVe and Count Vectorizer outperforms the baseline model in terms of accuracy. Moreover, named entity tags associated with the entities, such as PERSON, add an extra feature for training the models. As a result, an algorithm that jointly extracts the triples along with named entity tags is also proposed. Experiments demonstrate that models trained on triples with named entity tags produce high accuracy. © 2022, The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature.