Question Classification for the Travel Domain using Deep Contextualized Word Embedding Models
Charmy Weerakoon, Surangika Ranathunga
2021 Moratuwa Engineering Research Conference (MERCon), pp. 573-578. Published 2021-07-27.
DOI: 10.1109/MERCon52712.2021.9525789 (https://doi.org/10.1109/MERCon52712.2021.9525789)
Question answering is a key area of Natural Language Processing and Information Retrieval, in which users pose queries in natural language and receive suitable answers in return. In the travel domain, most questions are “content questions”, where the expected answer is factual information rather than the equivalent of “yes” or “no”. Answering a free-form factual question from a large collection of text is challenging. Previous research has shown that the accuracy of question answering systems can be improved by adding a classification phase based on the expected answer type. This paper implements a multi-level, multi-class question classification system for the travel domain. Existing research for the travel domain relies on language-specific features and traditional Machine Learning models. In contrast, this research employs state-of-the-art transformer-based deep contextualized word embedding models for question classification. The proposed method improves the coarse-class Micro F1-Score by 5.43% over the baseline, and the fine-grained Micro F1-Score by 3.8%. We also present an empirical analysis of the effectiveness of different transformer-based deep contextualized word embedding models for multi-level, multi-class classification.
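To make the reported metric concrete, the following is a minimal sketch of how a Micro F1-Score could be computed at both the coarse and the fine level of a two-level label scheme. It is not the authors' evaluation code. The `COARSE:fine` label format, the label names, and the helper functions `micro_f1` and `coarse` are all assumptions made for illustration and are not taken from the paper's taxonomy or data.

```python
def micro_f1(gold, pred):
    """Micro-averaged F1: pool TP, FP and FN over all classes, then compute F1."""
    tp = fp = fn = 0
    for g, p in zip(gold, pred):
        if g == p:
            tp += 1          # correct prediction for class g
        else:
            fp += 1          # class p was predicted but is wrong
            fn += 1          # class g was missed
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def coarse(label):
    """Map a hypothetical 'COARSE:fine' label to its coarse class."""
    return label.split(":")[0]


# Hypothetical gold and predicted fine-grained labels (illustrative only).
gold = ["LOC:city", "LOC:country", "NUM:price", "NUM:distance", "DESC:reason"]
pred = ["LOC:city", "LOC:city", "NUM:price", "DESC:reason", "DESC:reason"]

fine_f1 = micro_f1(gold, pred)
coarse_f1 = micro_f1([coarse(g) for g in gold], [coarse(p) for p in pred])

print(f"fine-grained micro F1: {fine_f1:.2f}")
print(f"coarse micro F1:       {coarse_f1:.2f}")
```

In this single-label setting every misclassification counts once as a false positive and once as a false negative, so the Micro F1-Score coincides with plain accuracy. The coarse score can exceed the fine-grained one because a prediction with the wrong fine class may still fall under the correct coarse class.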