C. Hirway, Enda Fallon, Paul Connolly, Kieran Flanagan, D. Yadav
2022 International Conference on Computer and Applications (ICCA), published 2022-12-20. DOI: https://doi.org/10.1109/ICCA56443.2022.10039606
A Deep Learning Approach for Minimizing False Negatives in Predicting Receipt Emails
Businesses issue receipts to their customers that record information such as the products purchased, their cost, the date and time of purchase, and the store ID. After an online purchase, a receipt is often emailed to the buyer. For this evaluation, a labeled database of receipt and non-receipt emails was available. Machine Learning (ML) algorithms for determining receipt validity had previously been implemented on this test database, and the results showed that the Random Forest technique performed better than Naive Bayes and Support Vector Machine. In this paper, a Deep Learning algorithm, Long Short-Term Memory (LSTM), is implemented and its results are compared with the previous implementation. One factor in the success of this recurrent network is its capacity to handle the exploding/vanishing gradient problem, which is a challenge when training recurrent or very deep neural networks. LSTM was found to be more accurate than the previous ML approach, and it also produced fewer false negatives. In the classification of receipt emails, processing an email without receipt data incurs a relatively low cost, whereas failing to detect a receipt email results in the loss of important data. Because the cost of a false negative in this setting is substantially higher than that of a false positive, the system needs to be tuned to minimize false negatives while permitting a wider tolerance for false positives.
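The trade-off described in the abstract can be illustrated with a small sketch. The abstract does not say how the system was tuned, so the following is only one common way to favor false positives over false negatives: choose the decision threshold on a classifier's predicted receipt probability that minimizes a weighted error cost. The function names (`confusion_counts`, `pick_threshold`), the 10:1 cost ratio, and the toy labels and scores are all hypothetical and are not taken from the paper.

```python
from typing import List, Tuple


def confusion_counts(labels: List[int], scores: List[float],
                     threshold: float) -> Tuple[int, int]:
    """Count (false_negatives, false_positives) when every email whose
    score is >= threshold is predicted to be a receipt (label 1)."""
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    return fn, fp


def pick_threshold(labels: List[int], scores: List[float],
                   fn_cost: float = 10.0,
                   fp_cost: float = 1.0) -> Tuple[float, int, int]:
    """Return (threshold, false_negatives, false_positives) minimizing
    fn_cost * FN + fp_cost * FP over the candidate thresholds.

    The cost weights are illustrative assumptions, not values from the paper.
    """
    # Each distinct score is a candidate; infinity means "predict no receipts".
    candidates = sorted(set(scores)) + [float("inf")]
    best = None
    for t in candidates:
        fn, fp = confusion_counts(labels, scores, t)
        cost = fn_cost * fn + fp_cost * fp
        # Ties go to the lower threshold, which never has more false negatives.
        if best is None or cost < best[0]:
            best = (cost, t, fn, fp)
    _, threshold, fn, fp = best
    return threshold, fn, fp


# Hypothetical validation data: 1 = receipt email, 0 = non-receipt email.
labels = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.35, 0.4, 0.2, 0.1, 0.6, 0.55]

cost_sensitive = pick_threshold(labels, scores, fn_cost=10.0, fp_cost=1.0)
symmetric = pick_threshold(labels, scores, fn_cost=1.0, fp_cost=1.0)
print(cost_sensitive)  # (0.35, 0, 2)
print(symmetric)       # (0.6, 1, 0)
```

On this toy data, weighting false negatives ten times more heavily lowers the threshold from 0.6 to 0.35: no receipt is missed, at the price of two non-receipt emails being processed unnecessarily. The same selection step could be applied to the output probabilities of any of the classifiers mentioned above.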