Semantic Embeddings for Food Search Using Siamese Networks

Rutvik Vijjali, Anurag Mishra, Srinivas Nagamalla, Jairaj Sathyanarayna
DOI: 10.1145/3443279.3443303
Proceedings of the 4th International Conference on Natural Language Processing and Information Retrieval, published 2020-12-18
Efficient and effective search is a key driver of business in e-commerce. Functionally, most search systems consist of retrieval and ranking phases. While methods such as Learning to Rank (LTR) for (re)ranking have been studied widely, most retrieval systems in industry are still predominantly based on variants of text matching. Because text matching cannot capture the semantic intent of a query, most out-of-vocabulary (OOV) queries are either not handled at all or handled poorly by matching to similarly spelled entities. For niche e-commerce applications such as food delivery apps operating on phonetically spelled, non-Western dish names, this problem is even more acute. Pre-trained word embedding models are of limited help because the majority of dish names occur rarely or not at all in most openly available vocabularies. In this work, we present experiments and efficient Siamese-network-based models that learn dish embeddings from scratch. We demonstrate that, compared to current baselines, these models yield a 3-5% improvement in Mean Reciprocal Rank (MRR) and Recall@k. Using a combination of an in-house Food Taxonomy and the Davies-Bouldin (DB) index, we also show that the new embeddings capture semantic information with an improvement of up to 20% over baseline.
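The abstract reports gains in Mean Reciprocal Rank (MRR) and Recall@k. As a reference for how these standard retrieval metrics are computed, here is a minimal sketch; the dish-name queries and ranked results below are invented for illustration and are not from the paper's dataset:

```python
def mean_reciprocal_rank(ranked_lists, relevant):
    """MRR: average over queries of 1/rank of the first relevant result."""
    total = 0.0
    for results, rel in zip(ranked_lists, relevant):
        for rank, item in enumerate(results, start=1):
            if item == rel:
                total += 1.0 / rank
                break  # only the first relevant hit counts
    return total / len(ranked_lists)

def recall_at_k(ranked_lists, relevant, k):
    """Recall@k: fraction of queries whose relevant item appears in the top k."""
    hits = sum(1 for results, rel in zip(ranked_lists, relevant)
               if rel in results[:k])
    return hits / len(ranked_lists)

# Hypothetical retrieval output for three dish-name queries.
ranked = [
    ["biryani", "pulao", "fried rice"],  # relevant item at rank 1
    ["naan", "dosa", "idli"],            # relevant item at rank 2
    ["samosa", "kachori", "pakora"],     # relevant item at rank 3
]
gold = ["biryani", "dosa", "pakora"]

print(mean_reciprocal_rank(ranked, gold))  # (1 + 1/2 + 1/3) / 3
print(recall_at_k(ranked, gold, 2))        # 2 of 3 queries hit in top 2
```

A 3-5% improvement in these metrics, as reported, means the relevant dish surfaces earlier in the ranked list and within the top k more often across the query set.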