{"title":"多语言答题方法","authors":"Dmytro Dashenkov, Kirill Smelyakov, Oleksii Turuta","doi":"10.1109/PICST54195.2021.9772145","DOIUrl":null,"url":null,"abstract":"In this paper we explore different approaches to solving the question-answering task. The task is approached with the goal of targeting multiple natural languages at once in mind. Presently, Ukrainian and English languages are considered, and training data in both languages is sourced. The research considers different machine learning models of natural language processing, based on the state-of-the-art models presented by researchers in the last few years. We compare the outputs of different models fine-tuned on different data to improve the precision of the predictions. We show how fine-tuning a language model on one language may increase the precision of that model's predictions in other languages on the example of fine-tuning a model on English in order to increase performance on Ukrainian. At last, we point out the drawbacks of such an approach and emphasize the need for a large dedicated Ukrainian language dataset, that would be assembled specifically targeting the question-answering task.","PeriodicalId":391592,"journal":{"name":"2021 IEEE 8th International Conference on Problems of Infocommunications, Science and Technology (PIC S&T)","volume":"355 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-10-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"Methods of Multilanguage Question Answering\",\"authors\":\"Dmytro Dashenkov, Kirill Smelyakov, Oleksii Turuta\",\"doi\":\"10.1109/PICST54195.2021.9772145\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In this paper we explore different approaches to solving the question-answering task. The task is approached with the goal of targeting multiple natural languages at once in mind. 
Presently, Ukrainian and English languages are considered, and training data in both languages is sourced. The research considers different machine learning models of natural language processing, based on the state-of-the-art models presented by researchers in the last few years. We compare the outputs of different models fine-tuned on different data to improve the precision of the predictions. We show how fine-tuning a language model on one language may increase the precision of that model's predictions in other languages on the example of fine-tuning a model on English in order to increase performance on Ukrainian. At last, we point out the drawbacks of such an approach and emphasize the need for a large dedicated Ukrainian language dataset, that would be assembled specifically targeting the question-answering task.\",\"PeriodicalId\":391592,\"journal\":{\"name\":\"2021 IEEE 8th International Conference on Problems of Infocommunications, Science and Technology (PIC S&T)\",\"volume\":\"355 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2021-10-05\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2021 IEEE 8th International Conference on Problems of Infocommunications, Science and Technology (PIC S&T)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/PICST54195.2021.9772145\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 IEEE 8th International Conference on Problems of Infocommunications, Science and Technology (PIC 
S&T)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/PICST54195.2021.9772145","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
In this paper we explore different approaches to solving the question-answering task, with the goal of targeting multiple natural languages at once. At present, Ukrainian and English are considered, and training data is sourced in both languages. The research examines different machine learning models for natural language processing, based on the state-of-the-art models presented by researchers in recent years. We compare the outputs of different models fine-tuned on different data to improve the precision of the predictions. We show how fine-tuning a language model on one language may increase the precision of its predictions in other languages, using the example of fine-tuning a model on English in order to increase its performance on Ukrainian. Finally, we point out the drawbacks of this approach and emphasize the need for a large dedicated Ukrainian-language dataset assembled specifically for the question-answering task.
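Comparisons of question-answering model outputs like the one described above are conventionally scored with SQuAD-style exact-match and token-level F1 metrics. The sketch below is a minimal illustration of those standard metrics, assuming lowercased whitespace tokenization; it is not the authors' exact evaluation code, and real evaluation scripts additionally strip punctuation and articles.

```python
from collections import Counter


def exact_match(prediction: str, reference: str) -> float:
    """1.0 if the predicted answer equals the gold answer (case-insensitive)."""
    return float(prediction.lower().strip() == reference.lower().strip())


def token_f1(prediction: str, reference: str) -> float:
    """SQuAD-style token-level F1 between a predicted and a gold answer span."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        # Both empty counts as a match; one-sided empty is a miss.
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most as often
    # as it occurs in both answers.
    common = Counter(pred_tokens) & Counter(ref_tokens)
    num_same = sum(common.values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```

For example, the prediction "the capital is Kyiv" against the gold answer "Kyiv" scores 0.0 exact match but a token F1 of 0.4 (precision 1/4, recall 1/1), which is why F1 is reported alongside exact match when answer spans are only partially recovered.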