Leveraging LIME explainability and Gustafson-Kessel fuzzy clustering for resume grouping and text summarization
Authors: Ravi Mudavath, Atul Negi
Journal: Knowledge-Based Systems, volume 330, Article 114621 (impact factor 7.6; JCR Q1, Computer Science, Artificial Intelligence)
Published: 2025-10-13
DOI: 10.1016/j.knosys.2025.114621
URL: https://www.sciencedirect.com/science/article/pii/S0950705125016600
Citations: 0
Abstract
Over the years, a very large number of classification methods have been developed, which are now referred to as “classical machine learning”. However, a noticeable gap remains in research linking unsupervised learning techniques with explainable artificial intelligence (XAI) methods. In this study, we address this gap by proposing a novel method to enhance the interpretability of unsupervised learning, particularly for textual data. We integrate XAI techniques with the Gustafson-Kessel (GK) fuzzy clustering algorithm to better capture semantic relationships in text, specifically in employment resumes. Our approach uses the lightweight Sentence-BERT model to generate contextual embeddings that offer a deeper semantic understanding of resume data and provide richer representations than traditional textual feature extraction methods. Similar resumes are clustered using the GK fuzzy clustering algorithm to identify common patterns across resumes. Informative summaries are then used for employment purposes, enhancing resume categorization and job matching. The GK algorithm was chosen because it is especially effective at handling complex structures compared to other clustering methods. As is standard in clustering practice, we evaluate clustering quality: we conduct statistical analysis and ablation studies, and assess performance using various clustering metrics. We also incorporate Local Interpretable Model-Agnostic Explanations (LIME) and SHapley Additive exPlanations (SHAP) to interpret the cluster memberships of individual resumes, thereby enhancing the transparency and trustworthiness of the clustering process. Our approach, when applied correctly, provides potential employers with clear and interpretable insights into how resumes are grouped. We present results on a resume data set summarized by our proposed method, and show that the clustering is effective and interpretable in comparison with other clustering methods. The outcome is expected to improve the efficiency of processing applicant profiles and to be helpful for human resource management.
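To make the clustering step in the abstract concrete, the following is a minimal sketch of Gustafson-Kessel fuzzy clustering in NumPy. It is not the authors' implementation: the function name `gk_cluster`, the fuzzifier `m=2`, unit cluster volumes, the covariance regularizer `reg`, and the synthetic 2-D data standing in for Sentence-BERT embeddings are all assumptions made for illustration. Because the determinant of a covariance matrix is numerically fragile in high dimensions, a real pipeline would likely need to reduce the embedding dimensionality first, which is also an assumption here.

```python
import numpy as np


def gk_cluster(X, n_clusters, m=2.0, n_iter=100, tol=1e-6, reg=1e-6, seed=0):
    """Hypothetical helper: Gustafson-Kessel fuzzy clustering sketch.

    Returns a membership matrix U of shape (n_samples, n_clusters) whose rows
    sum to 1, and the cluster centers of shape (n_clusters, n_features).
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    # Random initial fuzzy memberships, normalized per sample.
    U = rng.random((n, n_clusters))
    U /= U.sum(axis=1, keepdims=True)

    for _ in range(n_iter):
        W = U ** m                                      # fuzzified weights
        centers = (W.T @ X) / W.sum(axis=0)[:, None]    # weighted means
        dist2 = np.empty((n, n_clusters))
        for i in range(n_clusters):
            diff = X - centers[i]
            # Fuzzy covariance of cluster i, regularized to stay invertible.
            F = (W[:, i, None] * diff).T @ diff / W[:, i].sum()
            F += reg * np.eye(d)
            # Norm-inducing matrix with unit cluster volume (assumed rho = 1).
            A = np.linalg.det(F) ** (1.0 / d) * np.linalg.inv(F)
            # Squared cluster-adaptive (Mahalanobis-like) distances.
            dist2[:, i] = np.einsum("nd,de,ne->n", diff, A, diff)
        dist2 = np.maximum(dist2, 1e-12)                # avoid division by zero
        inv = dist2 ** (-1.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)    # membership update
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return U, centers


if __name__ == "__main__":
    # Synthetic stand-in for resume embeddings: two elongated 2-D groups.
    rng = np.random.default_rng(1)
    X = np.vstack([
        rng.normal([0.0, 0.0], [3.0, 0.3], size=(60, 2)),
        rng.normal([0.0, 6.0], [0.3, 3.0], size=(60, 2)),
    ])
    U, centers = gk_cluster(X, n_clusters=2)
    print(U[:3])      # soft memberships of the first three points
    print(centers)
```

Each row of `U` gives one item's degree of membership in every cluster, which is the quantity a local explainer such as LIME or SHAP would be asked to interpret. The per-cluster covariance is what lets GK adapt to elongated clusters, in contrast to fuzzy c-means with a fixed Euclidean norm.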
About the journal:
Knowledge-Based Systems, an international and interdisciplinary journal in artificial intelligence, publishes original, innovative, and creative research results in the field. It focuses on systems built on knowledge-based and other artificial intelligence techniques. The journal aims to support human prediction and decision-making through data science and computation techniques, provide balanced coverage of theory and practical study, and encourage the development and implementation of knowledge-based intelligence models, methods, systems, and software tools. Applications in business, government, education, engineering, and healthcare are emphasized.