Construction of Narrative Text Component Recognition Corpus
Feng Zhang, Yingqi Han, Jiong Wang, Jie Liu
2022 IEEE 5th International Conference on Computer and Communication Engineering Technology (CCET), 2022-08-19
DOI: https://doi.org/10.1109/CCET55412.2022.9906339
Citations: 0
Abstract
Textual structure analysis is an important part of Automated Essay Scoring (AES) and an important research direction in Natural Language Processing. Research on the structure of narrative texts in China remains underdeveloped, and one of the main reasons is the lack of data available for research. To address this problem, this paper proposes and constructs a corpus for identifying the textual components of narrative essays. The paper defines a division of the text structure of narrative essays and uses it to form a corpus for narrative essay component identification. The final corpus contains 3024 annotated articles with 21128 sentences in total. The corpus is built by combining manual annotation with automatic annotation by a model, and the paper reports a statistical analysis of the distribution of the corpus content and of the consistency of the corpus annotation. In the experiment, text component recognition achieves an F1 score of 80.75%. The work provides basic data for AES research.
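
The abstract mentions two quantities without defining them: the consistency of the annotation and the F1 score of component recognition. The sketch below shows one common way such figures are computed for sentence-level labels. It is an illustration under stated assumptions, not the authors' procedure: the abstract does not say which agreement measure or which F1 averaging was used, so Cohen's kappa and macro-averaged F1 are assumed here, and the label names (`opening`, `event`, `ending`) and function names are hypothetical.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa between two annotators' label sequences.

    Assumption: the abstract does not name its consistency measure;
    kappa is used here only as a common choice.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of sentences given the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement from each annotator's label distribution.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label
    return (observed - expected) / (1.0 - expected)


def macro_f1(gold, pred):
    """Macro-averaged F1 over all labels seen in gold or pred.

    Assumption: the averaging scheme behind the reported score is not
    stated in the abstract.
    """
    if len(gold) != len(pred) or not gold:
        raise ValueError("label sequences must be non-empty and of equal length")
    scores = []
    for label in sorted(set(gold) | set(pred)):
        tp = sum(g == label and p == label for g, p in zip(gold, pred))
        fp = sum(g != label and p == label for g, p in zip(gold, pred))
        fn = sum(g == label and p != label for g, p in zip(gold, pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical sentence labels for one four-sentence essay.
annotator_a = ["opening", "event", "event", "ending"]
annotator_b = ["opening", "event", "ending", "ending"]

kappa = cohen_kappa(annotator_a, annotator_b)
f1 = macro_f1(annotator_a, annotator_b)
```

On this toy input the two annotators agree on three of four sentences, which gives a kappa of 7/11 (about 0.64) once chance agreement is subtracted. Treating the first annotator as gold and the second as predictions gives a macro F1 of 7/9 (about 0.78).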