Generating Synthetic Text Data for Improving Class Balance in Personality Prediction

dc.contributor.author: Lakhtaria, D.
dc.contributor.author: L, D.H.
dc.contributor.author: Chhabra, R.
dc.contributor.author: Taparia, R.
dc.contributor.author: Anand Kumar, M.A.
dc.date.accessioned: 2026-02-08T16:50:00Z
dc.date.issued: 2024
dc.description.abstract: The growing popularity of social media as a means of self-expression and self-discovery has sparked heightened interest in using the Myers–Briggs Type Indicator (MBTI) to investigate human personality. Despite the increasing use of word-embedding techniques, machine learning algorithms, and imbalanced-data handling techniques to predict MBTI personality types, further research is needed to explore how these approaches can improve the accuracy of the results. Our research aimed to use the GPT model to address the problem of class imbalance. We implemented several machine learning models, including RCNN, LSTM, XGBoost, and Random Forest, and tried two word-embedding techniques, Word2Vec and GloVe. According to our findings, this approach can attain a considerably high F1-score, which depends on the model selected for predicting and classifying MBTI personality types. The ability to accurately predict and classify MBTI personality types through our approach has the potential to improve our understanding of MBTI. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2024.
dc.identifier.citation: Signals and Communication Technology, 2024, Vol. Part F2556, p. 59-70
dc.identifier.issn: 18604862
dc.identifier.uri: https://doi.org/10.1007/978-3-031-89771-9_70
dc.identifier.uri: https://idr.nitk.ac.in/handle/123456789/33567
dc.publisher: Springer Science and Business Media Deutschland GmbH
dc.subject: GPT
dc.subject: MBTI
dc.subject: Oversampling
dc.subject: Predictive models
dc.subject: Text generation
dc.title: Generating Synthetic Text Data for Improving Class Balance in Personality Prediction
