Performance Analysis of Machine Learning and Deep Learning Techniques for Answer Type extraction of Marathi Questions

Dhiraj Amin, S. Govilkar, Sagar Kulkarni
Published in: 2023 5th Biennial International Conference on Nascent Technologies in Engineering (ICNTE), 2023-01-20
DOI: 10.1109/ICNTE56631.2023.10146625 (https://doi.org/10.1109/ICNTE56631.2023.10146625)

Abstract

Question answering systems extract correct answers to natural language questions provided as input. Question classification is an important part of the question processing phase, in which natural language questions are categorized into predefined classes that specify the expected answer type. Answer type extraction can be performed using machine learning and deep learning classification techniques, which helps narrow the list of candidate answers for a question. In this paper, we compare various classification techniques that can be used to build a Marathi question classification system. Additionally, we created a Marathi question classification dataset by translating the existing English-language TREC dataset. We observed that fine-tuning a RoBERTa-based monolingual language model was the best-performing technique, with an accuracy of 91% on coarse-grained question classification and 85% on fine-grained question classification.
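The distinction between coarse-grained and fine-grained accuracy can be illustrated with a minimal sketch. It assumes labels follow the common TREC convention of a coarse class and a fine class joined by a colon (for example `LOC:city`); the function names and the sample labels below are hypothetical and are not taken from the paper, whose evaluation code is not shown here.

```python
from typing import List, Tuple


def split_label(label: str) -> Tuple[str, str]:
    """Split a TREC-style label such as 'LOC:city' into (coarse, fine)."""
    coarse, _, fine = label.partition(":")
    return coarse, fine


def accuracy(gold: List[str], pred: List[str]) -> float:
    """Fraction of positions where the predicted label equals the gold label."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    if not gold:
        return 0.0
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def coarse_and_fine_accuracy(gold: List[str], pred: List[str]) -> Tuple[float, float]:
    """Return (coarse accuracy, fine accuracy) for full 'COARSE:fine' labels."""
    coarse_gold = [split_label(g)[0] for g in gold]
    coarse_pred = [split_label(p)[0] for p in pred]
    # Fine-grained accuracy requires the full label to match.
    return accuracy(coarse_gold, coarse_pred), accuracy(gold, pred)


# Hypothetical gold and predicted answer types for four questions.
gold = ["LOC:city", "HUM:ind", "NUM:date", "ENTY:animal"]
pred = ["LOC:country", "HUM:ind", "NUM:date", "DESC:def"]

coarse_acc, fine_acc = coarse_and_fine_accuracy(gold, pred)
print(f"coarse accuracy: {coarse_acc:.2f}")
print(f"fine accuracy:   {fine_acc:.2f}")
```

In this toy example the first prediction gets the coarse class right (`LOC`) but the fine class wrong, which is why coarse-grained accuracy is typically at least as high as fine-grained accuracy on the same predictions.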