Generating Synthetic Text Data for Improving Class Balance in Personality Prediction

Date

2024

Publisher

Springer Science and Business Media Deutschland GmbH

Abstract

The growing popularity of social media as a means of self-expression and self-discovery has heightened interest in using the Myers–Briggs Type Indicator (MBTI) to study human personality. Although word-embedding techniques, machine learning algorithms, and methods for handling imbalanced data are increasingly used to predict MBTI personality types, further research is needed on how these approaches can improve prediction accuracy. Our research used a GPT model to generate synthetic text in order to address the problem of class imbalance. We implemented several machine learning models, including RCNN, LSTM, XGBoost, and Random Forest, and tried two word-embedding techniques, Word2Vec and GloVe. According to our findings, this approach can attain a considerably high F1-score for predicting and classifying MBTI personality types, although the score depends on the model selected. Accurate prediction and classification of MBTI personality types through this approach has the potential to improve our understanding of MBTI. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2024.
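
The abstract does not say how many synthetic samples were generated per class or which F1 averaging was reported. As a minimal sketch under those assumptions, the Python below computes how many synthetic texts each minority class would need to match the largest class, and a macro-averaged F1 score. The function names `synthetic_quota` and `macro_f1` and the label counts are hypothetical illustrations, not taken from the paper; the text generation step itself (a GPT model in the paper) is not shown.

```python
from collections import Counter


def synthetic_quota(labels):
    """Number of synthetic samples each class needs to match the largest class.

    Assumption: balancing means topping every class up to the majority count.
    """
    counts = Counter(labels)
    target = max(counts.values())
    return {cls: target - n for cls, n in counts.items()}


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (macro averaging is an assumption)."""
    classes = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Made-up label distribution for illustration only
labels = ["INFP"] * 5 + ["ESTJ"] * 2 + ["ENTP"] * 3
quota = synthetic_quota(labels)
print(quota)
```

Macro averaging weights every personality type equally, which is one common choice when classes are imbalanced; a weighted or micro average would give different numbers.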

Keywords

GPT, MBTI, Oversampling, Predictive models, Text generation

Citation

Signals and Communication Technology, 2024, Vol. Part F2556, pp. 59-70
