A New Framework for Textual Information Mining over Parse Trees
Hamid Mousavi, Deirdre Kerr, Markus R Iseli
2011 IEEE Fifth International Conference on Semantic Computing
Published: 2011-09-01
DOI: 10.1109/ICSC.2011.19 (https://doi.org/10.1109/ICSC.2011.19)
Citation count: 10
Abstract
This paper introduces a new text mining framework built on a tree-based Linguistic Query Language called LQL. The framework uses a probabilistic parser to generate more than one parse tree for each sentence, and annotates each node of these parse trees with main-parts information: a set of key terms drawn from the node's branch according to the branch's linguistic structure. Using the main-parts-annotated parse trees, the system can efficiently answer individual queries as well as mine the text for a given set of queries. The framework can also handle grammatical ambiguity through probabilistic rules and linguistic exceptions.
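
To make the idea of main-parts annotation concrete, here is a minimal sketch in Python. It is an illustration only: the `Node` class, the `HEAD_LABELS` table, and the `annotate` and `find` functions are hypothetical names, and the toy head rules are assumptions standing in for whatever linguistic rules the framework actually applies. It does not reproduce LQL syntax or the paper's algorithm, and it handles a single parse tree rather than several candidate trees per sentence.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional, Set


@dataclass
class Node:
    """A parse tree node; leaves carry a word, inner nodes carry children."""
    label: str
    children: List["Node"] = field(default_factory=list)
    word: Optional[str] = None
    main_parts: Set[str] = field(default_factory=set)


# Hypothetical rules (an assumption, not taken from the paper): for each
# phrase label, the child labels whose key terms are propagated upward.
HEAD_LABELS = {
    "S": {"NP", "VP"},
    "NP": {"NN", "NNS", "NNP", "NP"},
    "VP": {"VB", "VBD", "VBZ", "VP", "NP"},
}


def annotate(node: Node) -> Set[str]:
    """Annotate every node bottom-up with a set of key terms."""
    if node.word is not None:
        node.main_parts = {node.word.lower()}
        return node.main_parts
    heads = HEAD_LABELS.get(node.label, set())
    parts: Set[str] = set()
    for child in node.children:
        child_parts = annotate(child)  # always recurse so all nodes get annotated
        if child.label in heads:
            parts |= child_parts
    node.main_parts = parts
    return parts


def find(node: Node, label: str, term: str) -> Iterator[Node]:
    """Yield nodes with the given label whose main parts contain the term."""
    if node.label == label and term in node.main_parts:
        yield node
    for child in node.children:
        yield from find(child, label, term)


# (S (NP (DT The) (NN parser)) (VP (VBZ builds) (NP (JJ annotated) (NNS trees))))
root = Node("S", [
    Node("NP", [Node("DT", word="The"), Node("NN", word="parser")]),
    Node("VP", [
        Node("VBZ", word="builds"),
        Node("NP", [Node("JJ", word="annotated"), Node("NNS", word="trees")]),
    ]),
])
annotate(root)
```

Because each node already stores its key terms, a query such as `find(root, "NP", "trees")` only has to compare labels and test set membership at each node, without re-reading the words under it. This is one plausible reading of why precomputed annotations could make querying efficient.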