Extractive text summarisation in hindi

S. Vijay, V. Rai, Sorabh Gupta, Anshuman Vijayvargia, D. Sharma
{"title":"Extractive text summarisation in hindi","authors":"S. Vijay, V. Rai, Sorabh Gupta, Anshuman Vijayvargia, D. Sharma","doi":"10.1109/IALP.2017.8300607","DOIUrl":null,"url":null,"abstract":"With immense amount of data growing on web in Hindi, a text summariser would be helpful in summarising Government data, medical reports, news, and research articles. Hindi is the fourth most-spoken first language in the world. Hindi written in the Devanagari script is the official language of the Government of India. There is no public dataset for extractive summarisation available in Hindi and thus a dataset of 24253 News articles was extracted and the extractive summaries results were evaluated on various parameters with manual gold summaries of exactly 60 words each.","PeriodicalId":183586,"journal":{"name":"2017 International Conference on Asian Language Processing (IALP)","volume":"65 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2017-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"9","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2017 International Conference on Asian Language Processing (IALP)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/IALP.2017.8300607","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 9

Abstract

With immense amount of data growing on web in Hindi, a text summariser would be helpful in summarising Government data, medical reports, news, and research articles. Hindi is the fourth most-spoken first language in the world. Hindi written in the Devanagari script is the official language of the Government of India. There is no public dataset for extractive summarisation available in Hindi and thus a dataset of 24253 News articles was extracted and the extractive summaries results were evaluated on various parameters with manual gold summaries of exactly 60 words each.
印地语摘录文本摘要
随着印度语网络上大量数据的增长,文本摘要器将有助于总结政府数据、医疗报告、新闻和研究文章。印地语是世界上第四大母语。用Devanagari书写的印地语是印度政府的官方语言。在印地语中没有可用于提取摘要的公共数据集,因此提取了24253篇新闻文章的数据集,提取摘要结果在各种参数上进行了评估,每个摘要的手动黄金摘要正好为60个单词。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术官方微信