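Multilingual Named Entity Recognition Solution for Optimizing Parcel Delivery in Online Commerce: Identifying Person and Organization Names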

M. Pajas, Aleksander Radovan, I. O. Biskupic
DOI: 10.23919/MIPRO57284.2023.10159789 (https://doi.org/10.23919/MIPRO57284.2023.10159789)
Published in: 2023 46th MIPRO ICT and Electronics Convention (MIPRO)
Publication date: 2023-05-22
Citations: 0

Abstract

This paper presents a solution for improving parcel delivery in online commerce through multilingual named entity recognition. The solution is designed to identify person and organization names accurately, with a primary emphasis on correctly identifying recipients. The ultimate goal is to use this information to validate recipients automatically and select the most accurate one, improving data accuracy and reliability for parcel delivery. The process begins by collecting a large dataset of online commerce data, including customer search queries, and annotating it with person and organization names. The data is then preprocessed, cleaned of irrelevant information, and prepared for training a named entity recognition model. Next, the model is trained and evaluated on this data to verify that it identifies named entities and extracts recipients from queries accurately. The approach combines iterative training with data generation techniques. Because noisy data and iterative training can introduce unwanted patterns, the model is retrained on a subset of the original annotated dataset. Our experiments show a consistent increase in F1 score over both the baseline and the best iteration when using this method of training and fine-tuning.
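The abstract reports its results as F1 scores but does not state the tagging scheme or the evaluation granularity. As an illustration only, the sketch below assumes BIO tags with hypothetical `PER` and `ORG` labels and exact-match, entity-level scoring; the function names `bio_to_spans` and `entity_f1` are ours and do not come from the paper.

```python
from typing import List, Set, Tuple

# (start index, end index exclusive, entity type); an assumed representation
Span = Tuple[int, int, str]


def bio_to_spans(tags: List[str]) -> Set[Span]:
    """Collect entity spans from one BIO tag sequence."""
    spans: Set[Span] = set()
    start, label = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel closes a trailing entity
        # An I- tag continues the open entity only if the types match.
        inside = tag.startswith("I-") and label == tag[2:]
        if start is not None and not inside:
            spans.add((start, i, label))
            start, label = None, None
        if tag.startswith("B-"):
            start, label = i, tag[2:]
    return spans


def entity_f1(gold: List[List[str]], pred: List[List[str]]) -> float:
    """Micro-averaged F1 over exactly matching (span, type) pairs."""
    tp = fp = fn = 0
    for g_tags, p_tags in zip(gold, pred):
        g, p = bio_to_spans(g_tags), bio_to_spans(p_tags)
        tp += len(g & p)   # predicted spans that match gold exactly
        fp += len(p - g)   # predicted spans with no gold match
        fn += len(g - p)   # gold spans the prediction missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy example: the prediction finds the person but misses the organization.
gold = [["B-PER", "I-PER", "O", "B-ORG"]]
pred = [["B-PER", "I-PER", "O", "O"]]
score = entity_f1(gold, pred)  # precision 1.0, recall 0.5
```

Exact-match scoring is strict: a predicted span that covers only part of a recipient name counts as both a false positive and a false negative. Whether the authors score this way is not stated in the abstract.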