A. Miletic, M. Bras, Louise Esher, J. Sibille, Marianne Vergez-Couret
Proceedings of the Third Workshop on Universal Dependencies (UDW, SyntaxFest 2019), 2019-08-01. DOI: 10.18653/v1/W19-8002 (https://doi.org/10.18653/v1/W19-8002)
Building a treebank for Occitan: what use for Romance UD corpora?
This paper describes the application of delexicalized cross-lingual parsing to Occitan, with a view to building the first dependency treebank for the language. Occitan is a Romance language spoken in the south of France and in parts of Italy and Spain. It is relatively low-resourced and does not yet have a syntactically annotated corpus. To facilitate manual annotation, we train parsing models on the existing Romance corpora from the Universal Dependencies project and apply them to Occitan. Special attention is given to the effect of this cross-lingual annotation on the work of human annotators, in terms of annotation speed and ease.
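To make "delexicalized" concrete: one common way to prepare training data for delexicalized parsing is to blank out the word forms and lemmas in a CoNLL-U file, so that the parser learns only from part-of-speech tags, morphological features and dependency structure. The sketch below illustrates that idea under stated assumptions; it is not the authors' pipeline, which the abstract does not detail. The function name `delexicalize_conllu` and the three-token sample sentence are hypothetical and serve only as illustration.

```python
def delexicalize_conllu(text: str) -> str:
    """Blank the FORM and LEMMA columns of CoNLL-U token lines.

    Assumption: input follows the standard 10-column, tab-separated
    CoNLL-U layout (ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL,
    DEPS, MISC). Lines with any other shape are left untouched.
    """
    out = []
    for line in text.splitlines():
        # Comment lines and blank sentence separators pass through unchanged.
        if not line or line.startswith("#"):
            out.append(line)
            continue
        cols = line.split("\t")
        if len(cols) == 10:
            cols[1] = "_"  # FORM: the surface word
            cols[2] = "_"  # LEMMA: the dictionary form
        out.append("\t".join(cols))
    return "\n".join(out)


# Hypothetical sample sentence, for illustration only.
sample = (
    "# sent_id = example-1\n"
    "1\tLe\tle\tDET\t_\t_\t2\tdet\t_\t_\n"
    "2\tchat\tchat\tNOUN\t_\t_\t3\tnsubj\t_\t_\n"
    "3\tdort\tdormir\tVERB\t_\t_\t0\troot\t_\t_\n"
)

delexicalized = delexicalize_conllu(sample)
print(delexicalized)
```

After this step the tags, heads and relation labels are intact while every word and lemma is replaced by `_`, which is what lets a model trained on one Romance language be applied to another. Note that the sketch leaves comment lines alone, so a `# text = ...` comment would still carry the original words; a parser ignores such lines, but they can be dropped if the file itself must be free of lexical material.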