Mitchell P. Marcus, Beatrice Santorini, Mary Ann Marcinkiewicz
{"title":"构建大型带注释的英语语料库:宾州树库","authors":"Mitchell P. Marcus, Beatrice Santorini, Mary Ann Marcinkiewicz","doi":"10.21236/ada273556","DOIUrl":null,"url":null,"abstract":"Abstract : As a result of this grant, the researchers have now published oil CDROM a corpus of over 4 million words of running text annotated with part-of- speech (POS) tags, with over 3 million words of that material assigned skeletal grammatical structure. This material now includes a fully hand-parsed version of the classic Brown corpus. About one half of the papers at the ACL Workshop on Using Large Text Corpora this past summer were based on the materials generated by this grant.","PeriodicalId":360119,"journal":{"name":"Comput. Linguistics","volume":"53 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"1993-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"8621","resultStr":"{\"title\":\"Building a Large Annotated Corpus of English: The Penn Treebank\",\"authors\":\"Mitchell P. Marcus, Beatrice Santorini, Mary Ann Marcinkiewicz\",\"doi\":\"10.21236/ada273556\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Abstract : As a result of this grant, the researchers have now published oil CDROM a corpus of over 4 million words of running text annotated with part-of- speech (POS) tags, with over 3 million words of that material assigned skeletal grammatical structure. This material now includes a fully hand-parsed version of the classic Brown corpus. About one half of the papers at the ACL Workshop on Using Large Text Corpora this past summer were based on the materials generated by this grant.\",\"PeriodicalId\":360119,\"journal\":{\"name\":\"Comput. Linguistics\",\"volume\":\"53 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"1993-06-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"8621\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Comput. Linguistics\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.21236/ada273556\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Comput. Linguistics","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.21236/ada273556","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Building a Large Annotated Corpus of English: The Penn Treebank
Abstract : As a result of this grant, the researchers have now published oil CDROM a corpus of over 4 million words of running text annotated with part-of- speech (POS) tags, with over 3 million words of that material assigned skeletal grammatical structure. This material now includes a fully hand-parsed version of the classic Brown corpus. About one half of the papers at the ACL Workshop on Using Large Text Corpora this past summer were based on the materials generated by this grant.