Authors: Vladimír Petkevic, Hana Skoumalová
Journal: Journal of Linguistics/Jazykovedný casopis, volume 21 1, pages 234 - 243
Publication date: 2023-06-01
Publication type: Journal Article
DOI: 10.2478/jazcas-2023-0041 (https://doi.org/10.2478/jazcas-2023-0041)
Annotation of Analytic Verb Forms in Czech – Complex Cases
Abstract: The article deals with complex cases of determining the attribute verbtag, which contains the values of the morphosyntactic categories of analytic verb forms. The latest corpora of contemporary written Czech from the SYN series are tagged with this attribute. We focus on cases where the values of the verbtag categories are difficult to identify. These include, for example, identifying the auxiliary verb být 'to be', recognizing the mood and tense of coordinated participles, and determining the number of compound forms whose individual parts differ in morphological number. Some of the problems are theoretical in nature, since it is not clear what the correct solution should be; in these cases we have arbitrarily chosen one of the available options. Other problems stem from imperfections in the algorithms we use for annotation, and the solution there is to improve these algorithms.