Measures of Syntactic Complexity and their Change over Time (the Case of Russian)

Tatiana Y. Sherstinova, Evgenia Ushakova, Aleksey Mel'nik
2020 27th Conference of Open Innovations Association (FRUCT), published 2020-09-01
DOI: https://doi.org/10.23919/fruct49677.2020.9211027
Citations: 6

Abstract

Syntactic complexity is an important feature of any text, written or oral. Information about syntactic complexity is crucial to many practical NLP tasks, from intelligent text understanding to automatic machine translation. For this reason, syntactic complexity and its measures are a focus of attention for NLP developers. A number of measures of syntactic complexity have been developed so far; this paper considers 10 syntactic measures that have been proposed for syntactic stylometric analysis. The pilot experiment described here was carried out on automatic syntactic annotation produced by the UDPipe parser and then manually corrected. Particular attention is paid to the stability of individual measures of syntactic complexity and to their variation. We thus try to evaluate which syntactic properties of Russian texts may be considered inherent to the language as a whole and which of them undergo change. To this end, we analyze a corpus of Russian literary texts drawn from three decades. Owing to their high stylistic variability, fiction texts may be considered excellent data for assessing different levels of complexity. The results show the effectiveness of different measures for estimating the syntactic complexity of a text and reveal the correlations between them.
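
The abstract does not list the 10 measures, so the following is only a minimal sketch of how complexity measures of this general kind might be computed from UDPipe-style output. It assumes the annotation is available in CoNLL-U format and computes three commonly used illustrative measures (sentence length, maximum dependency-tree depth, mean dependency distance); these are not confirmed to be among the measures used in the paper. The sample sentence and the function names `read_heads`, `depth` and `measures` are hypothetical.

```python
from statistics import mean

# Tiny hand-written CoNLL-U sample (hypothetical, not taken from the paper's corpus).
SAMPLE = """\
# text = Мама мыла раму.
1\tМама\tмама\tNOUN\t_\t_\t2\tnsubj\t_\t_
2\tмыла\tмыть\tVERB\t_\t_\t0\troot\t_\t_
3\tраму\tрама\tNOUN\t_\t_\t2\tobj\t_\t_
4\t.\t.\tPUNCT\t_\t_\t2\tpunct\t_\t_
"""


def read_heads(conllu):
    """Yield one {token_id: head_id} dict per sentence of a CoNLL-U string."""
    heads = {}
    for line in conllu.splitlines():
        line = line.strip()
        if not line:  # a blank line closes the current sentence
            if heads:
                yield heads
                heads = {}
            continue
        if line.startswith("#"):  # sentence-level comment
            continue
        cols = line.split("\t")
        if not cols[0].isdigit():  # skip multiword ranges (1-2) and empty nodes (1.1)
            continue
        heads[int(cols[0])] = int(cols[6])  # column 7 is HEAD; 0 marks the root
    if heads:
        yield heads


def depth(token, heads):
    """Number of arcs between a token and the root token (assumes a valid tree)."""
    steps = 0
    while token != 0:
        token = heads[token]
        steps += 1
    return steps - 1  # the root token itself has depth 0


def measures(heads):
    """Three illustrative per-sentence measures; punctuation is counted as a token."""
    distances = [abs(tok - head) for tok, head in heads.items() if head != 0]
    return {
        "length": len(heads),
        "max_depth": max(depth(tok, heads) for tok in heads),
        "mean_dep_distance": mean(distances) if distances else 0.0,
    }


if __name__ == "__main__":
    for sentence in read_heads(SAMPLE):
        print(measures(sentence))
```

Averaging such per-sentence values over the texts of each period would give one simple way to compare periods; whether punctuation should count as a token is a design choice that the sketch leaves at the simplest setting.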