Question Answering System for low resource language using Transfer Learning
Aarushi Phade, Y. Haribhakta
2021 International Conference on Computational Intelligence and Computing Applications (ICCICA), published 2021-11-26
DOI: https://doi.org/10.1109/iccica52458.2021.9697268
Citation count: 2
Abstract
This paper proposes a Question Answering System (QAS) for the Marathi language using Transfer Learning. The performance of a Question Answering system depends heavily on the word embeddings it uses. Producing word embeddings for a language from scratch is a drawn-out task that requires a very large dataset and substantial computing resources, and word embeddings created from a limited dataset give only average performance in NLP tasks. Using word embeddings from pre-trained models instead saves a lot of time and gives strong performance, since these models have more learnable parameters and are trained on huge datasets. Our framework uses the Multilingual BERT model, which has 110M parameters, as the pre-trained source model, leading to effective word representations. We have fine-tuned this BERT model for QAS on a small, custom dataset similar to SQuAD, built for this framework. The system is evaluated with BERTScore and F1-score. It achieves an F1-score of 56.7% and a BERTScore of 69.08%. As the first system of its kind for the Marathi language, it lays the groundwork for future research.
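
The abstract does not say how the F1-score is computed. Since the dataset is described as similar to SQuAD, one plausible reading is the SQuAD-style token-overlap F1 between a predicted answer span and a reference answer. The sketch below illustrates that metric under this assumption; the function name `token_f1` and the plain whitespace tokenization are illustrative choices, not details taken from the paper.

```python
from collections import Counter


def token_f1(prediction: str, reference: str) -> float:
    """Token-overlap F1 between a predicted and a reference answer span.

    Assumption: tokens are obtained by whitespace splitting, with no
    language-specific normalization.
    """
    pred_tokens = prediction.split()
    ref_tokens = reference.split()
    # If either side is empty, score 1.0 only when both are empty.
    if not pred_tokens or not ref_tokens:
        return float(pred_tokens == ref_tokens)
    # Multiset intersection: a shared token counts as often as it
    # appears in both the prediction and the reference.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```

A corpus-level score would then typically be the mean of `token_f1` over all question-answer pairs. The original SQuAD script also strips English articles and punctuation before comparing tokens; whether and how the authors normalized Marathi text is not stated.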