Effective topic modeling for email
Hiep Hong, Teng-Sheng Moh
2015 International Conference on High Performance Computing & Simulation (HPCS)
Published: 2015-07-20
DOI: https://doi.org/10.1109/HPCSim.2015.7237060
Citations: 6
Abstract
Emails have become increasingly popular and are now an indispensable tool for communication and document exchange. Because email is convenient, people use it every day at work, at school, and for personal matters. Consequently, the number of emails people receive each day keeps growing, and they spend more and more time organizing them. People often need to classify emails and move them into folders so that they can return to them later. Most email clients available today let users filter and organize messages by defining rules for handling incoming email. However, this manual process requires users to anticipate the emails they will receive, and to make good use of these tools they must understand how filtering rules work and how to apply them correctly. In reality, most users do not know what their incoming emails will be. The work described in this paper aims to relieve users of the burden of organizing emails by using Latent Dirichlet Allocation (LDA) [10] to automatically extract topics from emails and group the emails into folders by common topic. Experiments show that the proposed method groups emails into appropriate topics with 77% accuracy.
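The general idea of extracting topics with LDA and then filing each email under its dominant topic can be illustrated with a minimal sketch. The abstract does not specify the preprocessing, the inference method, the number of topics, or the hyperparameters used in the paper, so everything below is an assumption made for illustration: a small collapsed Gibbs sampler written with the Python standard library only, a hypothetical six-message corpus, a hand-picked stopword list, and two topics. It is not the authors' implementation.

```python
import random
from collections import defaultdict

def tokenize(text, stopwords):
    """Lowercase, split on whitespace, strip punctuation, drop stopwords."""
    words = (w.strip(".,:;!?()\"'").lower() for w in text.split())
    return [w for w in words if w and w not in stopwords]

def lda_gibbs(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA (assumed inference method).

    Returns one topic distribution per document.
    """
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * num_topics for _ in docs]                 # count of topic k in doc d
    topic_word = [defaultdict(int) for _ in range(num_topics)]   # count of word w in topic k
    topic_total = [0] * num_topics                               # tokens assigned to topic k
    assignments = []

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_doc.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assignments.append(z_doc)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = assignments[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Resample its topic from the conditional distribution.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + beta * vocab_size)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1

    theta = []
    for d, doc in enumerate(docs):
        denom = len(doc) + alpha * num_topics
        theta.append([(doc_topic[d][t] + alpha) / denom for t in range(num_topics)])
    return theta

def group_into_folders(theta):
    """Put each document index into the folder of its most probable topic."""
    folders = defaultdict(list)
    for d, dist in enumerate(theta):
        folders[max(range(len(dist)), key=dist.__getitem__)].append(d)
    return dict(folders)

# Hypothetical example corpus, not data from the paper.
emails = [
    "Project meeting moved to Monday, please update the project schedule.",
    "Reminder: the budget report for the project is due before the meeting.",
    "Please review the schedule and the budget report.",
    "Your flight to Boston is confirmed, hotel booking attached.",
    "Hotel reservation and flight itinerary for your Boston trip.",
    "Trip update: flight delayed, hotel check-in is unchanged.",
]
STOPWORDS = {"the", "to", "is", "for", "and", "your", "please", "before", "of"}

docs = [tokenize(e, STOPWORDS) for e in emails]
theta = lda_gibbs(docs, num_topics=2)
folders = group_into_folders(theta)
print(folders)  # maps topic id -> list of email indices
```

Assigning each email to its single most probable topic is the simplest grouping rule; an email whose topic distribution is nearly uniform could instead be left unfiled or placed in several folders, depending on the application.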