{"title":"基于C&C和Boxer的语言动机大规模NLP","authors":"J. Curran, S. Clark, Johan Bos","doi":"10.3115/1557769.1557781","DOIUrl":null,"url":null,"abstract":"The statistical modelling of language, together with advances in wide-coverage grammar development, have led to high levels of robustness and efficiency in NLP systems and made linguistically motivated large-scale language processing a possibility (Matsuzaki et al., 2007; Kaplan et al., 2004). This paper describes an NLP system which is based on syntactic and semantic formalisms from theoretical linguistics, and which we have used to analyse the entire Gigaword corpus (1 billion words) in less than 5 days using only 18 processors. This combination of detail and speed of analysis represents a break-through in NLP technology.","PeriodicalId":429352,"journal":{"name":"Proceedings of the 45th Annual Meeting of the ACL on Interactive Poster and Demonstration Sessions - ACL '07","volume":"74 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2007-06-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"330","resultStr":"{\"title\":\"Linguistically Motivated Large-Scale NLP with C&C and Boxer\",\"authors\":\"J. Curran, S. Clark, Johan Bos\",\"doi\":\"10.3115/1557769.1557781\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The statistical modelling of language, together with advances in wide-coverage grammar development, have led to high levels of robustness and efficiency in NLP systems and made linguistically motivated large-scale language processing a possibility (Matsuzaki et al., 2007; Kaplan et al., 2004). This paper describes an NLP system which is based on syntactic and semantic formalisms from theoretical linguistics, and which we have used to analyse the entire Gigaword corpus (1 billion words) in less than 5 days using only 18 processors. This combination of detail and speed of analysis represents a break-through in NLP technology.\",\"PeriodicalId\":429352,\"journal\":{\"name\":\"Proceedings of the 45th Annual Meeting of the ACL on Interactive Poster and Demonstration Sessions - ACL '07\",\"volume\":\"74 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2007-06-25\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"330\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 45th Annual Meeting of the ACL on Interactive Poster and Demonstration Sessions - ACL '07\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.3115/1557769.1557781\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 45th Annual Meeting of the ACL on Interactive Poster and Demonstration Sessions - ACL '07","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.3115/1557769.1557781","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 330
摘要
语言的统计建模,以及广泛覆盖语法发展的进步,导致了NLP系统的高水平鲁棒性和效率,并使语言动机的大规模语言处理成为可能(Matsuzaki et al., 2007;Kaplan et al., 2004)。本文描述了一个基于理论语言学的句法和语义形式的自然语言处理系统,我们用它在不到5天的时间里只用了18个处理器就分析了整个Gigaword语料库(10亿个单词)。这种细节和分析速度的结合代表了NLP技术的突破。
Linguistically Motivated Large-Scale NLP with C&C and Boxer
The statistical modelling of language, together with advances in wide-coverage grammar development, have led to high levels of robustness and efficiency in NLP systems and made linguistically motivated large-scale language processing a possibility (Matsuzaki et al., 2007; Kaplan et al., 2004). This paper describes an NLP system which is based on syntactic and semantic formalisms from theoretical linguistics, and which we have used to analyse the entire Gigaword corpus (1 billion words) in less than 5 days using only 18 processors. This combination of detail and speed of analysis represents a break-through in NLP technology.