{"title":"多任务学习作为BERT的问题回答","authors":"Shishir Roy, Nayeem Ehtesham, Md Saiful Islam, Sabir Ismail","doi":"10.1109/ICCIT54785.2021.9689900","DOIUrl":null,"url":null,"abstract":"Question Answering demands a deep understanding of semantic relations among question, answer, and context. Multi-Task Learning (MTL) and Meta Learning with deep neural networks have recently shown impressive performance in many Natural Language Processing (NLP) tasks, particularly when there is inadequate data for training. But a little work has been done for a general NLP architecture that spans over many NLP tasks. In this paper, we present a model that can generalize to ten different NLP tasks. We demonstrate that multi-pointer-generator decoder and pre-trained language model is key to success and suppress all previous state-of-the-art baselines by 74 decaScore which is more than 12% absolute improvement over all of the datasets.","PeriodicalId":166450,"journal":{"name":"2021 24th International Conference on Computer and Information Technology (ICCIT)","volume":"17 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Multitask Learning as Question Answering with BERT\",\"authors\":\"Shishir Roy, Nayeem Ehtesham, Md Saiful Islam, Sabir Ismail\",\"doi\":\"10.1109/ICCIT54785.2021.9689900\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Question Answering demands a deep understanding of semantic relations among question, answer, and context. Multi-Task Learning (MTL) and Meta Learning with deep neural networks have recently shown impressive performance in many Natural Language Processing (NLP) tasks, particularly when there is inadequate data for training. But a little work has been done for a general NLP architecture that spans over many NLP tasks. In this paper, we present a model that can generalize to ten different NLP tasks. We demonstrate that multi-pointer-generator decoder and pre-trained language model is key to success and suppress all previous state-of-the-art baselines by 74 decaScore which is more than 12% absolute improvement over all of the datasets.\",\"PeriodicalId\":166450,\"journal\":{\"name\":\"2021 24th International Conference on Computer and Information Technology (ICCIT)\",\"volume\":\"17 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2021-12-18\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2021 24th International Conference on Computer and Information Technology (ICCIT)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICCIT54785.2021.9689900\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 24th International Conference on Computer and Information Technology (ICCIT)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICCIT54785.2021.9689900","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Multitask Learning as Question Answering with BERT
Question Answering demands a deep understanding of the semantic relations among question, answer, and context. Multi-Task Learning (MTL) and Meta Learning with deep neural networks have recently shown impressive performance on many Natural Language Processing (NLP) tasks, particularly when training data is scarce. However, little work has been done on a general NLP architecture that spans many NLP tasks. In this paper, we present a model that generalizes to ten different NLP tasks. We demonstrate that a multi-pointer-generator decoder and a pre-trained language model are key to success, surpassing all previous state-of-the-art baselines by 74 decaScore, an absolute improvement of more than 12% across all of the datasets.
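To make the "every task as question answering" framing concrete, the following is a minimal illustrative sketch, not the authors' implementation: a task instance is cast as a (question, context) pair, encoded with a pre-trained BERT model from the Hugging Face transformers library, and answered with a simple span-pointer head. The SpanPointerHead class and the sentiment example are hypothetical stand-ins for the paper's more elaborate multi-pointer-generator decoder.

```python
# Sketch only: cast an NLP task as QA over a (question, context) pair,
# encode with pre-trained BERT, and point to an answer span in the context.
import torch
import torch.nn as nn
from transformers import BertModel, BertTokenizer


class SpanPointerHead(nn.Module):
    """Hypothetical head that scores answer start/end positions over the context."""

    def __init__(self, hidden_size=768):
        super().__init__()
        self.span_logits = nn.Linear(hidden_size, 2)  # one score each for start and end

    def forward(self, hidden_states):
        logits = self.span_logits(hidden_states)            # (batch, seq_len, 2)
        start_logits, end_logits = logits.split(1, dim=-1)  # split into two (batch, seq_len, 1)
        return start_logits.squeeze(-1), end_logits.squeeze(-1)


tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
encoder = BertModel.from_pretrained("bert-base-uncased")
head = SpanPointerHead(encoder.config.hidden_size)

# Any task can be phrased as QA; here sentiment analysis becomes a question whose
# candidate answers ("positive" / "negative") are appended to the context.
question = "Is this review positive or negative?"
context = "The movie was a delight from start to finish. positive negative"

inputs = tokenizer(question, context, return_tensors="pt")
with torch.no_grad():
    hidden = encoder(**inputs).last_hidden_state  # (1, seq_len, hidden_size)
    start_logits, end_logits = head(hidden)

start = start_logits.argmax(dim=-1).item()
end = end_logits.argmax(dim=-1).item()
answer = tokenizer.decode(inputs["input_ids"][0][start:end + 1])
print(answer)  # the head is untrained here, so the predicted span is arbitrary until fine-tuned
```

In this framing, fine-tuning the encoder and head jointly on many such (question, context, answer) triples from different datasets is what turns a single model into a multi-task learner.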