Extraction of expanded entity phrases
James R. Johnson, Anita Miller, L. Khan, B. Thuraisingham, Murat Kantarcioglu
Proceedings of 2011 IEEE International Conference on Intelligence and Security Informatics, 2011-07-10
DOI: 10.1109/ISI.2011.5984059 (https://doi.org/10.1109/ISI.2011.5984059)
Citations: 8
Abstract
This research is part of a larger integrated approach for extracting information of interest from free text and visualizing the semantic relatedness between phrases of interest. This paper defines a new structure that is a key component of that approach, the expanded entity phrase (EPx), and presents a method for extracting EPxs from free text. The structure of an EPx facilitates quantitative comparison with other EPxs. A combination of part-of-speech-based template matching and ontology-driven NLP provides an effective technique for extracting complex entity structures that cross clause boundaries. The approach also uses ontology-based inferences to lay the groundwork for linking EPxs in semantic relatedness assessments involving different named entities not explicitly stated in the text. The real-world data used in this research were derived from a collection of law enforcement email messages submitted by hundreds of investigators seeking or posting information about crimes, incidents, requests, and announcements. Performance data are presented for the approaches used to extract EPxs and links from this data set.
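The abstract does not specify the templates, the ontology, or the internal layout of an EPx, so the following is only a minimal Python sketch of the general idea of pairing part-of-speech template matching with an ontology lookup. The tagged sentence, the `DT? JJ* NN+` template, the `ONTOLOGY` dictionary, and the function names are all assumptions made for illustration and are not taken from the paper. Unlike the technique described above, this toy version does not cross clause boundaries.

```python
# Hypothetical pre-tagged sentence: (token, Penn Treebank-style POS tag).
# In practice the tags would come from a POS tagger; they are hard-coded here.
TAGGED = [
    ("Police", "NNP"), ("seek", "VBP"), ("a", "DT"), ("white", "JJ"),
    ("male", "NN"), ("driving", "VBG"), ("a", "DT"), ("red", "JJ"),
    ("pickup", "NN"), ("truck", "NN"), (".", "."),
]

# Toy ontology (assumption): maps a phrase's head noun to a coarse entity class.
ONTOLOGY = {"male": "Person", "truck": "Vehicle"}


def match_noun_phrase(tags, i):
    """Match the template DT? JJ* NN+ starting at index i.

    Returns the exclusive end index of the match, or None if no noun is found.
    """
    j = i
    if j < len(tags) and tags[j] == "DT":      # optional determiner
        j += 1
    while j < len(tags) and tags[j] == "JJ":   # zero or more adjectives
        j += 1
    first_noun = j
    while j < len(tags) and tags[j].startswith("NN"):  # one or more nouns
        j += 1
    return j if j > first_noun else None


def extract_phrases(tagged, ontology):
    """Scan left to right, collecting template matches and their ontology class."""
    tags = [tag for _, tag in tagged]
    results = []
    i = 0
    while i < len(tagged):
        end = match_noun_phrase(tags, i)
        if end is None:
            i += 1
            continue
        words = [word for word, _ in tagged[i:end]]
        head = words[-1].lower()  # treat the last noun as the head
        results.append(
            {"text": " ".join(words), "head": head, "class": ontology.get(head)}
        )
        i = end
    return results


phrases = extract_phrases(TAGGED, ONTOLOGY)
for phrase in phrases:
    print(phrase)
```

Phrases whose head noun is missing from the toy ontology get a class of `None`. A fuller system would presumably resolve such cases through richer ontology inference, which this sketch does not attempt.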