BENN: Balanced Ensemble Neural Network for Handling Class Imbalance in Big Data
Date
2025
Publisher
John Wiley and Sons Inc
Abstract
Class imbalance is a critical challenge in big data analytics. Many machine learning models aim to minimise overall error, so they tend to favour the majority class and perform poorly on the minority class. This paper presents the balanced ensemble neural network (BENN), a novel approach to class imbalance in big data. BENN combines neural networks with ensemble learning and incorporates class balancing strategies to ensure fair representation of minority classes. The methodology integrates multiple neural networks, each trained on a balanced subset of the data produced with techniques such as the Synthetic Minority Over-sampling Technique (SMOTE) and random undersampling. This integration aims to leverage the strengths of the individual networks while reducing their inherent biases. Extensive experiments across various datasets show that BENN achieves an AUC-ROC score of 0.94, surpassing random forest (0.88), support vector machine (0.84) and a single neural network (0.80). BENN also outperforms traditional neural network models and standard ensemble methods on key metrics: accuracy, precision, recall, F1-score and AUC-ROC. The results specifically highlight BENN's effectiveness in accurately classifying minority-class instances, a notable challenge for many existing models. These findings underscore BENN's potential as a substantial advancement in handling class imbalance within big data environments, offering a promising direction for future research and application in machine learning. © 2024 John Wiley & Sons Ltd.
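The abstract describes the general recipe (several neural networks, each trained on a balanced subset, then combined) without giving implementation details. The following is a minimal illustrative sketch of that idea in plain Python, not the authors' implementation. Everything named here is an assumption made for illustration: the `TinyNet` class, the helper functions, the choice to undersample the majority to twice the minority size and to oversample the minority to match, the hyperparameters, and the synthetic two-cluster data. The oversampling helper follows the core SMOTE idea (interpolating between a minority point and one of its nearest minority neighbours) but is a simplified stand-in for a full SMOTE implementation.

```python
import math
import random


def smote_like(minority_rows, n_new, rng, k=3):
    """SMOTE-style oversampling: interpolate between a minority row and
    one of its k nearest minority neighbours (simplified, hypothetical helper)."""
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority_rows)
        neighbours = sorted((r for r in minority_rows if r is not base),
                            key=lambda r: math.dist(base, r))[:k]
        other = rng.choice(neighbours)
        gap = rng.random()
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


def balanced_subset(X, y, rng):
    """Assumed balancing scheme: undersample the majority (label 0) to twice the
    minority size and oversample the minority (label 1) up to the same size.
    Requires at least twice as many majority rows as minority rows."""
    minority = [x for x, label in zip(X, y) if label == 1]
    majority = [x for x, label in zip(X, y) if label == 0]
    target = 2 * len(minority)
    kept_majority = rng.sample(majority, target)  # random undersampling
    grown_minority = minority + smote_like(minority, target - len(minority), rng)
    rows = [(x, 0) for x in kept_majority] + [(x, 1) for x in grown_minority]
    rng.shuffle(rows)
    return [x for x, _ in rows], [label for _, label in rows]


class TinyNet:
    """One-hidden-layer network (tanh hidden units, sigmoid output) trained with SGD."""

    def __init__(self, n_in, n_hidden, rng):
        self.W1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.W2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def _forward(self, x):
        h = [math.tanh(sum(w * v for w, v in zip(row, x)) + b)
             for row, b in zip(self.W1, self.b1)]
        z = sum(w * v for w, v in zip(self.W2, h)) + self.b2
        z = max(-60.0, min(60.0, z))  # clamp to keep exp() well-behaved
        return h, 1.0 / (1.0 + math.exp(-z))

    def predict_proba(self, x):
        return self._forward(x)[1]

    def fit(self, X, y, epochs=200, lr=0.1):
        for _ in range(epochs):
            for x, t in zip(X, y):
                h, p = self._forward(x)
                d_out = p - t  # gradient of log loss w.r.t. the output pre-activation
                for j in range(len(h)):
                    d_h = d_out * self.W2[j] * (1.0 - h[j] ** 2)
                    self.W2[j] -= lr * d_out * h[j]
                    self.b1[j] -= lr * d_h
                    for i in range(len(x)):
                        self.W1[j][i] -= lr * d_h * x[i]
                self.b2 -= lr * d_out
        return self


def train_ensemble(X, y, n_members=5, seed=0):
    """Train each member on its own freshly drawn balanced subset."""
    rng = random.Random(seed)
    members = []
    for _ in range(n_members):
        Xb, yb = balanced_subset(X, y, rng)
        members.append(TinyNet(len(X[0]), 4, rng).fit(Xb, yb))
    return members


def ensemble_proba(members, x):
    """Combine members by averaging their predicted minority-class probabilities."""
    return sum(m.predict_proba(x) for m in members) / len(members)


# Synthetic imbalanced data (assumption): 200 majority rows near (0, 0),
# 20 minority rows near (3, 3).
data_rng = random.Random(42)
X = [[data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(200)]
X += [[data_rng.gauss(3, 1), data_rng.gauss(3, 1)] for _ in range(20)]
y = [0] * 200 + [1] * 20

members = train_ensemble(X, y)
print("P(minority) near minority cluster:", ensemble_proba(members, [3.0, 3.0]))
print("P(minority) near majority cluster:", ensemble_proba(members, [0.0, 0.0]))
```

Because every member sees a different balanced draw of the majority class and different synthetic minority points, averaging their outputs is one plausible way to reduce the bias of any single network. The paper itself may combine members differently (for example by voting or weighting).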
Keywords
Contrastive Learning, Data assimilation, Federated learning, Neural network models, Class imbalance, Concept drifts, Critical challenges, Data analytics, Decision tree regression, Ensemble neural network, Machine-learning, National health dataset, Predictive models, Random forests, Adversarial machine learning
Citation
Expert Systems, 2025, 42, 2, pp. -
