{"title":"Learning-to-Spell: Weak Supervision based Query Correction in E-Commerce Search with Small Strong Labels","authors":"Madhura Pande, Vishal Kakkar, M. Bansal, Surender Kumar, Chinmay Sharma, Himanshu Malhotra, Praneet Mehta","doi":"10.1145/3511808.3557113","DOIUrl":null,"url":null,"abstract":"For an E-commerce search engine, users finding the right product critically depend on spell correction. A misspelled query can fetch totally unrelated results which in turn leads to a bad customer experience. Around 32% of queries have spelling mistakes on our e-commerce search engine. The spell problem becomes more challenging when most spell errors arise from customers with little or no exposure to the English language besides the usual source of accidental mistyping on keyboard. These spell errors are heavily influenced by the colloquial and spoken accents of the customers. This limits the benefit from using generic spell correction systems which are learnt from cleaner English sources like Brown Corpus and Wikipedia with a very low focus on phonetic/vernacular spell errors. In this work, we present a novel approach towards spell correction that effectively solves a very diverse set of spell errors and outperforms several state-of-the-art systems in the domain of E-commerce search. Our strategy combines Learning-to-Rank on a small strongly labelled data with multiple learners trained with weakly labelled data. We report the effectiveness of our solution WellSpell (Weak and strong Labels for Learning to Spell) with both the offline evaluations and online A/B experiment.","PeriodicalId":389624,"journal":{"name":"Proceedings of the 31st ACM International Conference on Information & Knowledge Management","volume":"46 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 31st ACM International Conference on Information & Knowledge Management","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3511808.3557113","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 1
Abstract
For an E-commerce search engine, users finding the right product critically depend on spell correction. A misspelled query can fetch totally unrelated results which in turn leads to a bad customer experience. Around 32% of queries have spelling mistakes on our e-commerce search engine. The spell problem becomes more challenging when most spell errors arise from customers with little or no exposure to the English language besides the usual source of accidental mistyping on keyboard. These spell errors are heavily influenced by the colloquial and spoken accents of the customers. This limits the benefit from using generic spell correction systems which are learnt from cleaner English sources like Brown Corpus and Wikipedia with a very low focus on phonetic/vernacular spell errors. In this work, we present a novel approach towards spell correction that effectively solves a very diverse set of spell errors and outperforms several state-of-the-art systems in the domain of E-commerce search. Our strategy combines Learning-to-Rank on a small strongly labelled data with multiple learners trained with weakly labelled data. We report the effectiveness of our solution WellSpell (Weak and strong Labels for Learning to Spell) with both the offline evaluations and online A/B experiment.