T. A. Le. Proceedings of the 4th International Conference on Machine Learning and Soft Computing. Published 2020-01-17. DOI: https://doi.org/10.1145/3380688.3380703
Sequence Labeling Approach to the Task of Sentence Boundary Detection
A key to enabling chatbots to communicate with humans more naturally is the ability to handle long and complex user utterances. To this end, we propose integrating a Sentence Boundary Detection (SBD) module into the chatbot architecture. The module takes as input a user utterance from an automatic speech recognition device, in which sentence boundaries are not available, and outputs the corresponding list of punctuated sentences for downstream modules such as Intent Detection, Topic Classification, Sentiment Analysis, Named Entity Recognition, and Coreference Recognition. To address the SBD task, we reformulate it as a sequence labeling task, so that both deep neural network models (e.g., Bi-directional Long Short-Term Memory, Convolutional Neural Network) and structured prediction models (e.g., Hidden Markov Model, Maximum Entropy Model, Conditional Random Field) can be leveraged. Using this formulation, we built a hybrid deep neural network model and achieved good performance on both the CornellMovie-Dialog and DailyDialog datasets.
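To make the reformulation concrete, the sketch below shows one way SBD can be cast as sequence labeling: each token receives either "O" (no boundary follows) or the punctuation mark that closes the sentence. The label scheme, the lowercasing step (to mimic unpunctuated speech recognition output), and the function names `to_labels` and `from_labels` are illustrative assumptions; the abstract does not specify the label set actually used.

```python
from typing import List, Tuple

# Assumed label set: "O" for non-final tokens, otherwise the closing mark.
BOUNDARY_MARKS = {".", "?", "!"}


def to_labels(sentences: List[str]) -> Tuple[List[str], List[str]]:
    """Turn punctuated sentences into one token stream plus per-token labels."""
    tokens: List[str] = []
    labels: List[str] = []
    for sent in sentences:
        text = sent.strip()
        if not text:
            continue
        # Use the final character as the boundary label; default to "." if absent.
        mark = text[-1] if text[-1] in BOUNDARY_MARKS else "."
        # Strip trailing punctuation and lowercase to imitate ASR-style input.
        words = text.rstrip(".?!").lower().split()
        if not words:
            continue
        tokens.extend(words)
        labels.extend(["O"] * (len(words) - 1) + [mark])
    return tokens, labels


def from_labels(tokens: List[str], labels: List[str]) -> List[str]:
    """Rebuild punctuated sentences from tokens and predicted labels."""
    sentences: List[str] = []
    current: List[str] = []
    for tok, lab in zip(tokens, labels):
        current.append(tok)
        if lab != "O":
            sentences.append(" ".join(current) + lab)
            current = []
    if current:
        # Trailing tokens without a predicted boundary are kept unpunctuated.
        sentences.append(" ".join(current))
    return sentences


tokens, labels = to_labels(["How are you?", "I am fine."])
restored = from_labels(tokens, labels)
```

With training pairs in this form, any token-level tagger (for example a BiLSTM or a CRF, as named in the abstract) can be trained to predict `labels` from `tokens`, and `from_labels` converts its predictions back into the list of punctuated sentences passed to downstream modules.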