James R. Johnson, Anita Miller, L. Khan, B. Thuraisingham, Murat Kantarcioglu
Identification of related information of interest across free text documents
Proceedings of 2011 IEEE International Conference on Intelligence and Security Informatics, 2011-07-01
DOI: 10.1109/ISI.2011.5984058 (https://doi.org/10.1109/ISI.2011.5984058)
Citation count: 7
Abstract
An approach is presented for finding information of interest in a free text document and then identifying and presenting related information of interest from other free text documents. The goal is to find specific related items of interest within documents whether the documents are of the same category or not. Information of interest is defined with respect to expanded entity phrases and their ontology mappings. Powerful techniques requiring minimal training are described for expanding an entity phrase to include attributes from components of a complex sentence; for measuring relatedness of same-name expanded entity phrases; and for detecting related expanded entity phrases through ontology inferences. A representative dataset is described and preliminary measurements of performance against ground truth are provided.
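The abstract does not specify how the relatedness measure or the ontology inference is computed, so the following is only a minimal sketch under stated assumptions, not the authors' method. It assumes an expanded entity phrase can be represented as an entity name plus a set of attribute strings, swaps in Jaccard overlap as a stand-in relatedness score for same-name phrases, and uses a toy child-to-parent dictionary in place of a real ontology. The names `ExpandedEntityPhrase`, `same_name_relatedness`, `related_by_ontology`, and `TOY_ONTOLOGY` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExpandedEntityPhrase:
    """Hypothetical container: an entity name plus attributes drawn from its sentence."""
    name: str
    attributes: frozenset = frozenset()
    concept: str = ""  # hypothetical ontology concept the phrase maps to


# Toy ontology as child -> parent links (illustrative only, not from the paper).
TOY_ONTOLOGY = {
    "truck": "vehicle",
    "car": "vehicle",
    "vehicle": "physical_object",
}


def ancestors(concept, ontology):
    """Return the concept followed by all of its ancestors in the toy ontology."""
    chain = []
    while concept and concept not in chain:
        chain.append(concept)
        concept = ontology.get(concept)
    return chain


def same_name_relatedness(a, b):
    """Assumed measure: Jaccard overlap of attribute sets, 0.0 if the names differ."""
    if a.name.lower() != b.name.lower():
        return 0.0
    union = a.attributes | b.attributes
    if not union:
        return 0.0
    return len(a.attributes & b.attributes) / len(union)


def related_by_ontology(a, b, ontology):
    """Assumed inference: related when one mapped concept subsumes the other."""
    return (a.concept in ancestors(b.concept, ontology)
            or b.concept in ancestors(a.concept, ontology))


if __name__ == "__main__":
    doc1 = ExpandedEntityPhrase("truck", frozenset({"white", "stolen", "dallas"}), "truck")
    doc2 = ExpandedEntityPhrase("Truck", frozenset({"white", "abandoned", "dallas"}), "truck")
    doc3 = ExpandedEntityPhrase("vehicle", frozenset({"abandoned"}), "vehicle")

    print(same_name_relatedness(doc1, doc2))
    print(related_by_ontology(doc1, doc3, TOY_ONTOLOGY))
```

In this sketch the two "truck" phrases are compared by shared attributes, while the "truck" and "vehicle" phrases, which do not share a name, are linked only through the subsumption check. A real system would replace both the attribute sets and the toy ontology with the outputs of sentence parsing and an actual ontology mapping.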