Differences in syntactic annotation affect retrieval
Eva Zehentner, M. Hundt, G. Schneider, M. Röthlisberger
International Journal of Corpus Linguistics, published 2023-05-16
DOI: 10.1075/ijcl.21104.zeh (https://doi.org/10.1075/ijcl.21104.zeh)
Abstract
Prepositional phrases (PPs) play an important part in English argument structure constructions but pose considerable challenges for linguistic investigation. PP-attachment is notoriously difficult to model computationally, and a further, particularly striking methodological challenge in investigating verb-dependent PPs across (synchronic and/or diachronic) corpora is that such cross-corpus studies may have to rely on material annotated with different tools. This study uses data from Early and Late Modern English corpora to evaluate the impact that such differences in corpus annotation may have on the retrieval of verb-attached PPs. Our intrinsic (recall/precision) and extrinsic parser evaluation shows that annotation does play a role, but that the noise introduced is negligible as far as frequency developments are concerned.
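For readers unfamiliar with the intrinsic measures named in the abstract, the following is a minimal sketch of how precision and recall could be computed for retrieved verb-attached PPs against a manually checked gold standard. It is not the authors' evaluation code: the function name `precision_recall`, the representation of each PP as a `(sentence_id, token_index)` pair, and the toy data are all assumptions made for illustration.

```python
def precision_recall(retrieved, gold):
    """Return (precision, recall) for retrieved items against a gold set.

    precision = correct retrieved items / all retrieved items
    recall    = correct retrieved items / all gold items
    """
    retrieved, gold = set(retrieved), set(gold)
    true_positives = len(retrieved & gold)
    precision = true_positives / len(retrieved) if retrieved else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical toy data (not from the study): each verb-attached PP is
# identified by a (sentence_id, token_index) pair.
gold = {(1, 4), (2, 7), (3, 2), (4, 5)}   # PPs a human annotator marked
retrieved = {(1, 4), (2, 7), (5, 3)}      # PPs a query on parsed data returned

p, r = precision_recall(retrieved, gold)
print(f"precision={p:.2f} recall={r:.2f}")
```

Running the same comparison once per annotation scheme would, under these assumptions, show how much each parser's output over- or under-retrieves relative to the gold standard.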
About the journal:
The International Journal of Corpus Linguistics (IJCL) publishes original research covering methodological, applied and theoretical work in any area of corpus linguistics. Through its focus on empirical language research, IJCL provides a forum for the presentation of new findings and innovative approaches in any area of linguistics (e.g. lexicology, grammar, discourse analysis, stylistics, sociolinguistics, morphology, contrastive linguistics), applied linguistics (e.g. language teaching, forensic linguistics), and translation studies. Based on its interest in corpus methodology, IJCL also invites contributions on the interface between corpus and computational linguistics.