Machine Learning Based Data Quality Model for COVID-19 Related Big Data
Date
2022
Publisher
Springer Science and Business Media Deutschland GmbH
Abstract
Big data is used across many areas of technology. The quality of that data is essential: it must be accurate, reliable, and free of defects. The difficulty of improving the quality of big data can be overcome by leveraging computing resources and advanced techniques. In this paper, we propose a solution that combines a machine learning (ML) model with a data quality model to improve the quality of data. The ML model is an autoencoder neural network that detects anomalies in the data. The data quality model is then applied to ensure that the data meets appropriate data quality characteristics. The results obtained from our solution show that the quality of data can be improved efficiently and with little effort, which in turn helps researchers achieve better results. © 2022, The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
Keywords
Anomaly detection, COVID-19, Data encoders, Data quality, Machine learning
Citation
Lecture Notes on Data Engineering and Communications Technologies, 2022, Vol. 91, pp. 561-571
