Efficient opinion mining for imbalanced customer reviews in last-mile services
Sangbaek Kim, Hongchul Lee, Jiho Kim
Data & Knowledge Engineering, volume 160, Article 102466, published 2025-05-21
DOI: 10.1016/j.datak.2025.102466
URL: https://www.sciencedirect.com/science/article/pii/S0169023X25000618
Journal impact factor: 2.7; JCR: Q3 (Computer Science, Artificial Intelligence); open access: no
Citations: 0
Abstract
Last-mile (LM) service manages the final stage of delivering products to customers in supply chains and logistics. Consumer opinion mining has recently become essential for providing high-quality LM service. However, existing methods face challenges with domain-specific terminology and class imbalance. Therefore, we propose LM-BERT, a BERT-based text classification model specialized in LM service sentiment analysis. In addition, we introduce a teacher–student LM-BERT framework that alleviates data imbalance in online e-commerce reviews through high-confidence pseudo-labeling. After evaluating six Transformer models, we identified KLUE-BERT as the most suitable baseline. Experimental results demonstrate that domain-specific knowledge transfer improves performance by 1.78% on seen data and 1.31% on unseen data. Statistical verification and explainable artificial intelligence techniques confirm that our approach reliably enhances qualitative performance and expands domain knowledge. We also conducted an ablation study confirming that high-confidence pseudo-labeling (t = 0.99) outperforms the traditional resampling method. The proposed LM-BERT model can effectively support LM service quality evaluation and management based on the voice of the customer in e-commerce.
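The high-confidence pseudo-labeling step mentioned in the abstract can be illustrated with a minimal sketch. The abstract gives only the threshold t = 0.99; everything else here is an assumption made for illustration, including the function name `select_pseudo_labels`, the two-class (negative, positive) probability layout, and the sample teacher outputs. It is not the authors' implementation.

```python
from typing import List, Sequence, Tuple


def select_pseudo_labels(
    probs: Sequence[Sequence[float]],
    threshold: float = 0.99,  # t = 0.99 is the value quoted in the abstract
) -> List[Tuple[int, int]]:
    """Return (row_index, pseudo_label) pairs for confident teacher predictions.

    `probs` holds one row of class probabilities per unlabeled review
    (hypothetical teacher outputs; the layout is an assumption).
    """
    selected: List[Tuple[int, int]] = []
    for i, row in enumerate(probs):
        # The pseudo-label is the class with the highest teacher probability.
        label = max(range(len(row)), key=lambda j: row[j])
        # Keep the review only if the teacher is at least `threshold` confident.
        if row[label] >= threshold:
            selected.append((i, label))
    return selected


# Hypothetical teacher outputs for four unlabeled reviews: (negative, positive).
teacher_probs = [
    [0.995, 0.005],
    [0.60, 0.40],
    [0.002, 0.998],
    [0.985, 0.015],
]
pseudo = select_pseudo_labels(teacher_probs)
print(pseudo)  # [(0, 0), (2, 1)]
```

In a teacher–student setup of this kind, the selected reviews would typically be added to the training set of the student model with their pseudo-labels, while low-confidence reviews are left out.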
About the journal:
Data & Knowledge Engineering (DKE) stimulates the exchange of ideas and interaction between the two related fields of data engineering and knowledge engineering. DKE reaches a worldwide audience of researchers, designers, managers and users. The major aim of the journal is to identify, investigate and analyze the underlying principles in the design and effective use of data and knowledge systems.