Title: Exploring Various Data Mining Techniques to Predict Heart Disease
Authors: Makam, S.K.; Hiranmayi, M.Y.; Kumar, P.; Bhat, P.; Patil, N.
Date: 2026-02-06
Year: 2025
Citation: Lecture Notes in Networks and Systems, 2025, Vol. 1265 LNNS, p. 65-77
ISSN: 2367-3370
DOI: https://doi.org/10.1007/978-981-96-2299-3_5
URI: https://idr.nitk.ac.in/handle/123456789/28649
Keywords: Cardiovascular disease (CVD); Data mining techniques; Ensemble classifiers; Exhaustive feature space search; Hybrid algorithms; Synthetic minority oversampling technique (SMOTE)
Rights: © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2025.

Abstract: Cardiovascular disease (CVD), commonly called heart disease, is one of the main causes of death worldwide. Early detection of CVD risk is a major area of interest in clinical data analysis. This study focuses on devising strategies to improve the predictive ability of CVD risk detection algorithms. We experiment with binary and multiclass classification techniques on public datasets from the UCI machine learning repository: Cleveland for training, and Statlog and Hungarian for evaluation. The techniques include feature selection by best subset generation and data balancing using Binary and Multiclass SMOTE and their variants. Every technique is assessed by tenfold cross-validation on six classifiers: K-Nearest Neighbors (KNN), Naive Bayes, Logistic Regression (LR), Support Vector Machine (SVM), Neural Network, and Vote (a hybrid technique combining Naive Bayes and Logistic Regression). Experimental results show a rise of 4.36% in the average classifier F1-score after feature selection and Binary SMOTE. The top-performing models are Logistic Regression, Neural Network, and Vote. KNN shows a significant rise in accuracy of 8.5% and 5.05% after applying Binary and Multiclass SMOTE, respectively. The Multiclass SMOTE results can serve as a benchmark but leave scope for further research and enhancement.
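
The data-balancing step named in the abstract, Binary SMOTE, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `smote` and `balance`, the parameters `k` and `seed`, and the toy points are all assumptions made for illustration, and the toy data is not drawn from the Cleveland, Statlog, or Hungarian datasets. The sketch shows only the core idea of the technique: each synthetic sample is placed at a random position on the line segment between a minority-class point and one of its k nearest minority-class neighbours.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority samples.

    Hypothetical helper for illustration; numeric features only.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other points
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k nearest minority neighbours of `base` (Euclidean).
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        other = minority[rng.choice(neighbours)]
        gap = rng.random()  # position along the segment base -> other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


def balance(X, y, minority_label, k=5, seed=0):
    """Oversample the minority class of a binary problem up to the majority size."""
    minority = [x for x, label in zip(X, y) if label == minority_label]
    n_majority = len(y) - len(minority)
    new_points = smote(minority, n_majority - len(minority), k=k, seed=seed)
    return X + new_points, y + [minority_label] * len(new_points)


# Toy example (made-up points): three minority samples, five majority samples.
X = [[0, 0], [1, 0], [0, 1], [5, 5], [6, 5], [5, 6], [6, 6], [7, 7]]
y = [1, 1, 1, 0, 0, 0, 0, 0]
X_bal, y_bal = balance(X, y, minority_label=1)
```

When oversampling is combined with cross-validation, it is usually applied to the training folds only, so that synthetic points do not leak into the held-out fold.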