Construction of Narrative Text Component Recognition Corpus

Feng Zhang, Yingqi Han, Jiong Wang, Jie Liu
2022 IEEE 5th International Conference on Computer and Communication Engineering Technology (CCET)
Published: 2022-08-19
DOI: 10.1109/CCET55412.2022.9906339 (https://doi.org/10.1109/CCET55412.2022.9906339)

Abstract

Textual structure analysis is an important part of Automated Essay Scoring (AES) and an important research direction in Natural Language Processing. Research on the structure of narrative text in China remains limited, and one of the main reasons is the lack of data available for research. To address this problem, this paper proposes and constructs a corpus for identifying the textual components of narrative essays. The text structure of narrative essays is divided into components, and the corpus is annotated accordingly. In total, 3024 articles comprising 21128 sentences were annotated. The corpus was built by combining manual annotation with automatic annotation by a model, and statistical analyses were conducted on the distribution of the corpus content and the consistency of the annotation. Experiments show that text component recognition achieves an F1 score of 80.75%. The work provides basic data for AES research.
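
The two quantities the abstract reports, an F1 score for component recognition and a measure of annotation consistency, can be illustrated with a small sketch. The abstract does not say how F1 is averaged or which agreement statistic is used, so macro-averaged F1 and Cohen's kappa are assumptions here. The component labels ("opening", "body", "ending") and the toy data are also hypothetical and are not taken from the corpus.

```python
from collections import Counter


def f1_per_label(gold, pred):
    """Return {label: F1} for two parallel lists of sentence labels."""
    scores = {}
    for label in sorted(set(gold) | set(pred)):
        tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
        fp = sum(1 for g, p in zip(gold, pred) if g != label and p == label)
        fn = sum(1 for g, p in zip(gold, pred) if g == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        scores[label] = 2 * precision * recall / denom if denom else 0.0
    return scores


def macro_f1(gold, pred):
    """Unweighted mean of per-label F1 (assumed averaging scheme)."""
    scores = f1_per_label(gold, pred)
    return sum(scores.values()) / len(scores)


def cohen_kappa(a, b):
    """Cohen's kappa between two annotators (assumed agreement measure)."""
    n = len(a)
    observed = sum(1 for x, y in zip(a, b) if x == y) / n
    count_a, count_b = Counter(a), Counter(b)
    # Chance agreement from each annotator's label distribution.
    expected = sum(count_a[l] * count_b[l] for l in set(a) | set(b)) / (n * n)
    return (observed - expected) / (1 - expected) if expected != 1 else 1.0


# Hypothetical sentence-level component labels for one short essay.
gold = ["opening", "body", "body", "body", "ending", "ending"]
pred = ["opening", "body", "body", "ending", "ending", "body"]

if __name__ == "__main__":
    print(f"macro F1: {macro_f1(gold, pred):.4f}")
    print(f"kappa:    {cohen_kappa(gold, pred):.4f}")
```

The same `cohen_kappa` function works for two human annotators' label sequences. For more than two annotators a different statistic such as Fleiss' kappa would be needed.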