Poet Attribution for Urdu: Finding Optimal Configuration for Short Text

M. A. Rao, Tafseer Ahmed
KIET Journal of Computing and Information Sciences. DOI: 10.51153/kjcis.v4i2.58 (https://doi.org/10.51153/kjcis.v4i2.58)
Citations: 2

Abstract

This study presents a machine learning system that identifies the poet of a given poetic piece consisting of two lines (i.e., a couplet) or more. The task is more difficult than general author attribution because verses and poems usually contain far fewer words than the articles found in author attribution datasets. We ran several experiments applying classification algorithms with different feature configurations and found that the system performs best when a support vector machine is used with a combination of unigram and bigram features. The best system (for 5 Urdu poets) achieves an accuracy of 88.7%.
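To illustrate what a combined unigram and bigram feature set looks like for a short text, here is a minimal sketch using only the Python standard library. It is not the authors' implementation: the function name `ngram_features`, the whitespace tokenization, the choice to count n-grams within each line rather than across line breaks, and the placeholder input are all assumptions made for illustration. The abstract does not describe the preprocessing actually used.

```python
from collections import Counter


def ngram_features(text, n_values=(1, 2)):
    """Count word n-grams per line for each n in n_values.

    Assumptions (not from the paper): tokens are split on whitespace,
    and n-grams never span a line break.
    """
    features = Counter()
    for line in text.splitlines():
        tokens = line.split()
        for n in n_values:
            # Slide a window of n tokens across the line
            for i in range(len(tokens) - n + 1):
                features[" ".join(tokens[i:i + n])] += 1
    return features


# Hypothetical two-line input with placeholder tokens (not real verse)
couplet = "a b c\na b d"
features = ngram_features(couplet)

for ngram, count in sorted(features.items()):
    print(ngram, count)
```

Count vectors of this kind, built over a shared vocabulary, could then be passed to a classifier such as a support vector machine. With only a couplet per sample, each vector is very sparse, which is one reason short-text attribution is harder than attribution of full articles.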