Extractive Text Summarisation in Hindi

2017 International Conference on Asian Language Processing (IALP). Pub Date: 2017-12-01. DOI: 10.1109/IALP.2017.8300607

S. Vijay, V. Rai, Sorabh Gupta, Anshuman Vijayvargia, D. Sharma

Citations: 9

Abstract

With an immense and growing amount of Hindi data on the web, a text summariser would help in summarising government data, medical reports, news, and research articles. Hindi is the fourth most-spoken first language in the world. Hindi written in the Devanagari script is the official language of the Government of India. No public dataset for extractive summarisation is available in Hindi, so a dataset of 24,253 news articles was extracted, and the resulting extractive summaries were evaluated on various parameters against manual gold summaries of exactly 60 words each.


