{"title":"Hello & Goodbye: Conversation Boundary Identification Using Text Classification","authors":"J. Dunne, David Malone","doi":"10.1109/ISSC.2018.8585376","DOIUrl":null,"url":null,"abstract":"One of the main challenges in discourse analysis is the process of segmenting text into meaningful topic segments. While this problem has been studied over the past thirty years, previous topic segmentation studies ignore crucial elements of a conversation: an opening and closing remark. Our motivation to revisit this problem space is the rise of instant message usage. We consider the problem of topic segmentation as a machine learning classification one. Using both enterprise and open source datasets, we address the question as to whether a machine learning algorithm can be trained to identify salutations and valedictions within multi-party real-time chat conversations. Our results show that both Naïve Bayes (NB) and Support Vector Machine (SVM) algorithms provide a reasonable degree of precision(mean F1 score: 0.58).","PeriodicalId":174854,"journal":{"name":"2018 29th Irish Signals and Systems Conference (ISSC)","volume":"17 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2018-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2018 29th Irish Signals and Systems Conference (ISSC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ISSC.2018.8585376","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0
Abstract
One of the main challenges in discourse analysis is the process of segmenting text into meaningful topic segments. While this problem has been studied over the past thirty years, previous topic segmentation studies ignore crucial elements of a conversation: an opening and closing remark. Our motivation to revisit this problem space is the rise of instant message usage. We consider the problem of topic segmentation as a machine learning classification one. Using both enterprise and open source datasets, we address the question as to whether a machine learning algorithm can be trained to identify salutations and valedictions within multi-party real-time chat conversations. Our results show that both Naïve Bayes (NB) and Support Vector Machine (SVM) algorithms provide a reasonable degree of precision(mean F1 score: 0.58).