Title: Generating Synthetic Text Data for Improving Class Balance in Personality Prediction
Authors: Lakhtaria, D.; L, D.H.; Chhabra, R.; Taparia, R.; Anand Kumar, M.A.
Date: 2026-02-08
Year: 2024
Source: Signals and Communication Technology, 2024, Vol. Part F2556, p. 59-70
ISSN: 1860-4862
DOI: https://doi.org/10.1007/978-3-031-89771-9_70
Handle: https://idr.nitk.ac.in/handle/123456789/33567

Abstract: The growing popularity of social media as a means of self-expression and self-discovery has increased interest in using the Myers–Briggs Type Indicator (MBTI) to investigate human personality. Although word-embedding techniques, machine learning algorithms, and imbalanced-data handling techniques are increasingly used to predict MBTI personality types, further research is needed on how these approaches can improve the accuracy of the results. Our research aimed to use the GPT model to address the problem of class imbalance. We implemented several machine learning models: RCNN, LSTM, XGBoost, and Random Forest. We also tried two word-embedding techniques, Word2Vec and GloVe. According to our findings, the approach can attain a considerably high F1-score for the prediction and classification of MBTI personality types, although the score depends on the model selected. Accurate prediction and classification of MBTI personality types through this approach has the potential to improve our understanding of MBTI.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024.

Keywords: GPT; MBTI; Oversampling; Predictive models; Text generation
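The abstract describes using generated text to offset class imbalance but gives no implementation details. The following is a minimal sketch of the general oversampling idea only, not the authors' pipeline: it counts how many extra samples each class needs to match the largest class and fills the gap with generated text. The names `synthetic_quota`, `oversample`, and `echo_generator` are hypothetical, and `echo_generator` is a placeholder standing in for a real text generator such as a GPT model.

```python
from collections import Counter
from typing import Callable, Dict, List, Tuple


def synthetic_quota(labels: List[str]) -> Dict[str, int]:
    """Number of synthetic samples each class needs to match the largest class."""
    counts = Counter(labels)
    target = max(counts.values())  # assumes at least one label is present
    return {label: target - n for label, n in counts.items()}


def oversample(
    texts: List[str],
    labels: List[str],
    generate: Callable[[str, List[str]], str],
) -> Tuple[List[str], List[str]]:
    """Append generated texts until every class is as large as the largest one.

    `generate(label, examples)` is a hypothetical stand-in for a text
    generator, e.g. a language model prompted with real posts of that class.
    """
    # Group the real texts by class so the generator can see examples.
    by_class: Dict[str, List[str]] = {}
    for text, label in zip(texts, labels):
        by_class.setdefault(label, []).append(text)

    out_texts, out_labels = list(texts), list(labels)
    for label, missing in synthetic_quota(labels).items():
        for _ in range(missing):
            out_texts.append(generate(label, by_class[label]))
            out_labels.append(label)
    return out_texts, out_labels


def echo_generator(label: str, examples: List[str]) -> str:
    # Placeholder only: a real pipeline would call a language model here.
    return f"[synthetic {label}] {examples[0]}"


# Toy data: three posts of one type, one post of another.
texts = ["a", "b", "c", "d"]
labels = ["INTJ", "INTJ", "INTJ", "ENFP"]
new_texts, new_labels = oversample(texts, labels, echo_generator)
```

Passing the generator in as a callable keeps the balancing logic independent of any particular model, so the same function could be reused with a different generator or with simple duplication as a baseline.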