Fine-Tuned Sentence Transformer for Multilingual Job Title Matching

dc.contributor.author: Bhangale, C.S.
dc.contributor.author: Gabhane, P.A.
dc.contributor.author: Anand Kumar, M.
dc.date.accessioned: 2026-02-06T06:33:18Z
dc.date.issued: 2025
dc.description.abstract: Matching job titles is a critical task in applications such as resume evaluation and job recommendation platforms. Companies often use different job titles for similar roles, which creates ambiguity. This research addresses the issue with a machine learning-based approach that uses a Sentence Transformer model, paraphrase-multilingual-mpnet-base-v2, fine-tuned for the job title matching task. The training dataset consists of job titles paired with their corresponding similar job titles in three languages (English, Spanish, and German), while the validation data comprises a query file and a corpus file, each containing job titles in the same languages. To ensure data consistency, preprocessing steps are applied, such as handling missing values, normalizing text, and removing special characters. Cached Multiple Negatives Ranking Loss is used to improve retrieval accuracy by helping the model distinguish between similar and dissimilar job titles. After training, embeddings are generated for each job title in the query and corpus files. Cosine similarity is used to compute similarity scores between the query and corpus job title embeddings. Finally, for each query job title, the corpus job titles are ranked by their similarity scores. The model's performance is evaluated using standard retrieval metrics, including Mean Average Precision (MAP), Mean Reciprocal Rank (MRR), and Precision@K. The fine-tuned model achieved an average MAP score of 0.49 across English, Spanish, and German on the validation data, and 0.45 on the test data. © 2025 Copyright for this paper by its authors.
dc.identifier.citation: CEUR Workshop Proceedings, 2025, Vol. 4038, p. 4411-4420
dc.identifier.issn: 16130073
dc.identifier.uri: https://doi.org/
dc.identifier.uri: https://idr.nitk.ac.in/handle/123456789/28584
dc.publisher: CEUR-WS
dc.subject: Cached Multiple Negatives Ranking Loss
dc.subject: Job title
dc.subject: Sentence Transformer
dc.title: Fine-Tuned Sentence Transformer for Multilingual Job Title Matching
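
The retrieval and evaluation steps described in the abstract (cosine-similarity scoring, ranking of corpus titles per query, and MAP / MRR / Precision@K) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the identifiers `c1`..`c3`, and the 2-dimensional toy vectors are all hypothetical stand-ins for the embeddings a fine-tuned Sentence Transformer would produce.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def rank_corpus(query_vec, corpus_vecs):
    """Return corpus ids sorted by descending cosine similarity to the query."""
    scores = {cid: cosine(query_vec, vec) for cid, vec in corpus_vecs.items()}
    return sorted(scores, key=scores.get, reverse=True)

def average_precision(ranked, relevant):
    """Mean of precision values taken at each rank where a relevant id appears."""
    hits, total = 0, 0.0
    for rank, cid in enumerate(ranked, start=1):
        if cid in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0

def reciprocal_rank(ranked, relevant):
    """1 / rank of the first relevant id, or 0.0 if none is retrieved."""
    for rank, cid in enumerate(ranked, start=1):
        if cid in relevant:
            return 1.0 / rank
    return 0.0

def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k ranked ids that are relevant."""
    return sum(1 for cid in ranked[:k] if cid in relevant) / k

# Toy data (assumption): real embeddings would come from the fine-tuned model.
corpus = {"c1": [1.0, 0.0], "c2": [0.0, 1.0], "c3": [1.0, 1.0]}
query = [1.0, 0.1]
relevant = {"c1", "c2"}  # hypothetical gold matches for this query

ranked = rank_corpus(query, corpus)
ap = average_precision(ranked, relevant)
rr = reciprocal_rank(ranked, relevant)
p_at_2 = precision_at_k(ranked, relevant, 2)
```

MAP and MRR are then the means of `ap` and `rr` over all queries; a per-language average, as reported in the abstract, would repeat this for each language's query and corpus files.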
