Michael J. Cafarella, Doug Downey, S. Soderland, Oren Etzioni
{"title":"KnowItNow","authors":"Michael J. Cafarella, Doug Downey, S. Soderland, Oren Etzioni","doi":"10.3115/1220575.1220646","DOIUrl":"https://doi.org/10.3115/1220575.1220646","url":null,"abstract":"Numerous NLP applications rely on search-engine queries, both to extract information from and to compute statistics over the Web corpus. But search engines often limit the number of available queries. As a result, query-intensive NLP applications such as Information Extraction (IE) distribute their query load over several days, making IE a slow, offline process.This paper introduces a novel architecture for IE that obviates queries to commercial search engines. The architecture is embodied in a system called KnowItNow that performs high-precision IE in minutes instead of days. We compare KnowItNow experimentally with the previously-published KnowItAll system, and quantify the tradeoff between recall and speed. KnowItNow's extraction rate is two to three orders of magnitude higher than KnowItAll's.","PeriodicalId":252878,"journal":{"name":"Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing - HLT '05","volume":"87 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125609348","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}