Markus Bundschus, A. Bauer-Mehren, Volker Tresp, L. Furlong, H. Kriegel
{"title":"Digging for knowledge with information extraction: a case study on human gene-disease associations","authors":"Markus Bundschus, A. Bauer-Mehren, Volker Tresp, L. Furlong, H. Kriegel","doi":"10.1145/1871437.1871744","DOIUrl":null,"url":null,"abstract":"We present the information extraction system Text2SemRel. The system (semi-) automatically constructs knowledge bases from textual data consisting of facts about entities using semantic relations. An integral part of the system is a graph-based interactive visualization and search layer. The second contribution in this paper is the presentation of a case study on the (semi-) automatic construction of a knowledge base consisting of gene-disease associations. The resulting knowledge base, the Literature-derived Human Gene-Disease Network (LHGDN), is now an integral part of the Linked Life Data initiative and represents currently the largest publicly available gene-disease repository. The LHGDN is compared against several curated state of the art databases. A unique feature of the LHGDN is that the semantics of the associations constitute a wide variety of biomolecular conditions.","PeriodicalId":310611,"journal":{"name":"Proceedings of the 19th ACM international conference on Information and knowledge management","volume":"12 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2010-10-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"8","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 19th ACM international conference on Information and knowledge management","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/1871437.1871744","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 8
Abstract
We present the information extraction system Text2SemRel. The system (semi-) automatically constructs knowledge bases from textual data consisting of facts about entities using semantic relations. An integral part of the system is a graph-based interactive visualization and search layer. The second contribution in this paper is the presentation of a case study on the (semi-) automatic construction of a knowledge base consisting of gene-disease associations. The resulting knowledge base, the Literature-derived Human Gene-Disease Network (LHGDN), is now an integral part of the Linked Life Data initiative and represents currently the largest publicly available gene-disease repository. The LHGDN is compared against several curated state of the art databases. A unique feature of the LHGDN is that the semantics of the associations constitute a wide variety of biomolecular conditions.