Pierre-Henri Paris, Syrine El Aoud, Fabian M. Suchanek
{"title":"The Vagueness of Vagueness in Noun Phrases","authors":"Pierre-Henri Paris, Syrine El Aoud, Fabian M. Suchanek","doi":"10.24432/C5T884","DOIUrl":"https://doi.org/10.24432/C5T884","url":null,"abstract":"Natural language text has a great potential to feed knowledge bases. However, natural language is not always precise – and sometimes intentionally so. In this position paper, we study vagueness in noun phrases. We manually analyze the frequency of vague noun phrases in a Wikipedia corpus, and find that 1/4 of noun phrases exhibit some form of vagueness. We report on their nature and propose a categorization. We then conduct a literature review and present different definitions of vagueness, and different existing methods to deal with the detection and modeling of vagueness. We find that, despite its frequency, vagueness has not yet been addressed in its entirety.","PeriodicalId":371465,"journal":{"name":"Conference on Automated Knowledge Base Construction","volume":"90 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134636799","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
J. Winn, M. Venanzi, T. Minka, Ivan Korostelev, J. Guiver, Elena Pochernina, Pavel Mishkov, Alex Spengler, Denise J. Wilkins, Siân E. Lindley, Richard Banks, Sam Webster, Yordan Zaykov
{"title":"Enterprise Alexandria: Online High-Precision Enterprise Knowledge Base Construction with Typed Entities","authors":"J. Winn, M. Venanzi, T. Minka, Ivan Korostelev, J. Guiver, Elena Pochernina, Pavel Mishkov, Alex Spengler, Denise J. Wilkins, Siân E. Lindley, Richard Banks, Sam Webster, Yordan Zaykov","doi":"10.24432/C5JS3X","DOIUrl":"https://doi.org/10.24432/C5JS3X","url":null,"abstract":"We present Enterprise Alexandria, a new system for automatically constructing a knowledge base with high-precision and typed entities from private enterprise data such as emails, documents and intranet pages. Built as an extension of Alexandria [Winn et al., 2019], the key novelty of Enterprise Alexandria is the ability to process both the textual information and the structured metadata available in each document in an online learning fashion, making use of any manual curations that have happened in the interim. This task is performed entirely eyes-off to respect the privacy of the user and the restricted access to their documents. The knowledge discovery process uses a probabilistic program defining the process of generating the data item from a set of unknown typed entities. Using probabilistic inference, Enterprise Alexandria can jointly discover a large set of entities with custom types specific to the organization. Experiments on three real-world datasets show that the system outperforms alternative methods with the ability to work effectively at large scale.","PeriodicalId":371465,"journal":{"name":"Conference on Automated Knowledge Base Construction","volume":"01 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131705728","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}