Uncovering the roots of customer dissatisfaction via Amazon reviews: a hybrid ensemble-deep learning approach for E-commerce quality management

IF 4.5 | Zone 3, Management | JCR Q1: Operations Research & Management Science

Annals of Operations Research Pub Date : 2025-09-08 DOI:10.1007/s10479-025-06770-x

Rahul Kumar, Shubhadeep Mukherjee, Divya Choudhary

Volume 353, issue 2, pages 545-574. Journal article, not open access. https://link.springer.com/article/10.1007/s10479-025-06770-x

Citations: 0

Abstract

Multi-class labelling in the absence of ground truth is a known hard problem in computational intelligence. The problem is amplified in e-commerce by both the high volume and the high velocity of information. Specifically, labels are hard to obtain for mass online reviews, which renders the reviews unsuitable for supervised learning. To date, the most common solution is manual labelling, which remains labour-intensive and time-consuming. The purpose of this study is to develop an end-to-end approach for identifying the sources of quality-related customer dissatisfaction and assigning them automatically in the context of e-commerce. This objective is achieved with a novel ensemble-based semi-supervised pseudo-labelling technique applied to a large corpus of Amazon.com reviews. As a first step, a subset is manually labelled; an ensemble approach then retains the (pseudo) class on which the ensemble members agree and iteratively labels the entire dataset. We then apply Large Language Models (LLMs) and Deep Learning (DL) architectures to the (pseudo-)labelled data to perform multi-class classification. We compare these against baseline machine learning models and show statistically significant improvement, with the pre-trained transformer models performing best. Our approach offers a roadmap for automatically identifying sources of quality-related dissatisfaction in e-commerce channels using a combination of ensemble and sophisticated computational techniques. We believe that our approach, if adopted, can bolster grievance redressal for online customers.
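The consensus-based pseudo-labelling loop described in the abstract can be illustrated with a minimal sketch. The abstract does not name the base learners, the dissatisfaction categories, or the stopping rule, so everything below is an assumption made for illustration: two toy classifiers (`NaiveBayes`, `OverlapCentroid`), three hypothetical categories (`delivery`, `defect`, `description`), and a loop that stops when no new review receives a unanimous label. It is not the authors' implementation.

```python
import math
from collections import Counter, defaultdict


def tokens(text):
    """Lower-case whitespace tokenizer (a deliberately crude assumption)."""
    return text.lower().split()


class NaiveBayes:
    """Toy multinomial Naive Bayes with add-one smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokens(text))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, text):
        n_docs = sum(self.class_counts.values())

        def score(label):
            counts = self.word_counts[label]
            total = sum(counts.values()) + len(self.vocab)
            prior = math.log(self.class_counts[label] / n_docs)
            return prior + sum(math.log((counts[w] + 1) / total) for w in tokens(text))

        # sorted() makes tie-breaking deterministic.
        return max(sorted(self.class_counts), key=score)


class OverlapCentroid:
    """Toy learner: picks the class whose pooled vocabulary has the highest Jaccard overlap."""

    def fit(self, texts, labels):
        self.class_vocab = defaultdict(set)
        for text, label in zip(texts, labels):
            self.class_vocab[label].update(tokens(text))
        return self

    def predict(self, text):
        words = set(tokens(text))

        def jaccard(label):
            union = words | self.class_vocab[label]
            return len(words & self.class_vocab[label]) / len(union) if union else 0.0

        return max(sorted(self.class_vocab), key=jaccard)


def consensus_pseudo_label(seed, unlabelled, model_factories, max_rounds=5):
    """Iteratively keep only the pseudo-labels on which every ensemble member agrees."""
    labelled, pending = list(seed), list(unlabelled)
    for _ in range(max_rounds):
        texts, labels = zip(*labelled)
        models = [make().fit(texts, labels) for make in model_factories]
        still_pending, added = [], 0
        for text in pending:
            votes = {model.predict(text) for model in models}
            if len(votes) == 1:  # unanimous: accept as a pseudo-label
                labelled.append((text, votes.pop()))
                added += 1
            else:  # disagreement: retry in the next round
                still_pending.append(text)
        pending = still_pending
        if added == 0 or not pending:
            break
    return labelled[len(seed):], pending


# Hypothetical seed labels and reviews, invented for this sketch.
seed = [
    ("package arrived late and box was crushed", "delivery"),
    ("screen cracked after one week of use", "defect"),
    ("size runs small and colour differs from photo", "description"),
]
unlabelled = [
    "box was crushed when package arrived",
    "screen cracked after light use",
    "colour differs from the photo",
]
pseudo, pending = consensus_pseudo_label(seed, unlabelled, [NaiveBayes, OverlapCentroid])
for text, label in pseudo:
    print(f"{label}: {text}")
```

Requiring unanimity trades coverage for label precision: reviews on which the ensemble disagrees stay unlabelled and are retried after the models are refitted on the enlarged labelled set. The resulting pseudo-labelled corpus is what a downstream classifier, such as a pre-trained transformer, would then be trained on.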

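Source journal

Annals of Operations Research (Management Science: Operations Research & Management Science)

CiteScore: 7.90
Self-citation rate: 16.70%
Articles published: 596
Review time: 8.4 months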

About the journal: The Annals of Operations Research publishes peer-reviewed original articles dealing with key aspects of operations research, including theory, practice, and computation. The journal publishes full-length research articles, short notes, expositions and surveys, reports on computational studies, and case studies that present new and innovative practical applications. In addition to regular issues, the journal publishes periodic special volumes that focus on defined fields of operations research, ranging from the highly theoretical to the algorithmic and the applied. These volumes have one or more Guest Editors who are responsible for collecting the papers and overseeing the refereeing process.