Feature Extraction aligned Email Classification based on Imperative Sentence Selection through Deep Learning

Journal of Artificial Intelligence and Systems Pub Date : 1900-01-01 DOI:10.33969/ais.2021.31007

Nashit Ali, Anum Fatima, Hureeza Shahzadi, Aman Ullah, K. Polat


Citations: 3

Abstract

Email is the most commonly used channel of communication. When everyone is busy with routine and work, it is very difficult to check every message when one receives a large volume of email. Previous research on email categorization has mostly addressed spam filtering. The problem with spam filtering is that a person sometimes mistakenly marks an important email from a senior authority as spam; under previous approaches such email is then filtered as spam, which can seriously threaten an employee's job. In this research, we introduce a methodology that classifies email text into three categories, i.e. order, request, and general, on the basis of imperative sentences. We use Word2Vec to convert words into vectors and two deep learning approaches, a convolutional neural network (CNN) and a recurrent neural network (RNN), for email classification. We conduct experiments on a dataset of 1000 emails collected from a personal Gmail account and Enron. The experimental results show that the RNN gives better accuracy than the CNN. We also compare our methods with the previously used Fuzzy ANN method; our proposed CNN and RNN methods both give better results than Fuzzy ANN. This research also includes experiments in which the CNN and RNN are applied to different ratios of training and testing data. These experiments show that increasing the proportion of training data increases the accuracy of the algorithm.
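The pipeline the abstract describes (word vectors fed to a recurrent network that outputs one of three labels) can be illustrated with a minimal sketch. This is not the authors' implementation: the embedding and hidden sizes, the class name `ToyEmailRNN`, and the sample vocabulary are assumptions made for illustration, random vectors stand in for trained Word2Vec embeddings, and the weights are untrained, so the predicted label is arbitrary.

```python
import math
import random

LABELS = ["order", "request", "general"]  # the three categories named in the abstract
EMBED_DIM = 8    # assumed embedding size (not stated in the abstract)
HIDDEN_DIM = 6   # assumed RNN hidden size (not stated in the abstract)


def make_matrix(rows, cols, rng):
    """Return a rows x cols matrix of small random weights."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class ToyEmailRNN:
    """Untrained Elman-style RNN over word vectors; hypothetical, illustrative only."""

    def __init__(self, vocabulary, seed=0):
        rng = random.Random(seed)
        # Random vectors stand in for trained Word2Vec embeddings.
        self.embeddings = {
            word: [rng.uniform(-1, 1) for _ in range(EMBED_DIM)]
            for word in vocabulary
        }
        self.w_in = make_matrix(HIDDEN_DIM, EMBED_DIM, rng)
        self.w_rec = make_matrix(HIDDEN_DIM, HIDDEN_DIM, rng)
        self.w_out = make_matrix(len(LABELS), HIDDEN_DIM, rng)

    def predict_proba(self, text):
        """Run the RNN over the tokens and return one probability per label."""
        hidden = [0.0] * HIDDEN_DIM
        for token in text.lower().split():
            vector = self.embeddings.get(token)
            if vector is None:
                continue  # skip out-of-vocabulary words
            from_input = matvec(self.w_in, vector)
            from_state = matvec(self.w_rec, hidden)
            hidden = [math.tanh(x + y) for x, y in zip(from_input, from_state)]
        return softmax(matvec(self.w_out, hidden))

    def predict(self, text):
        """Return the label with the highest probability."""
        probs = self.predict_proba(text)
        return LABELS[probs.index(max(probs))]


model = ToyEmailRNN(["please", "send", "the", "report", "today"])
probabilities = model.predict_proba("Please send the report today")
print(dict(zip(LABELS, probabilities)))
```

In a real system the random embedding table would be replaced by vectors from a trained Word2Vec model, and the three weight matrices would be fitted on labelled emails before any prediction is meaningful.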
