Automatic Acquisition of Formal Concepts from Text
Pablo Gamallo, J. Lopes, Alexandre Agustini
LDV Forum, 2008-07-01 (Journal Article)
DOI: 10.21248/jlcl.23.2008.102 (https://doi.org/10.21248/jlcl.23.2008.102)
Citation count: 6
Abstract
This paper describes an unsupervised method for extracting concepts from Part-Of-Speech annotated corpora. The method, based on Formal Concept Analysis (FCA), builds bidimensional clusters of words and their lexico-syntactic contexts. Each generated cluster is defined as a formal concept: a set of words forms the extension of the concept, and a set of contexts forms its intension, that is, the attributes (or properties) valid for all the words in the extension. The clustering process relies on two concept operations: abstraction and specification. Abstraction builds a more generic concept by intersecting the intensions of the merged concepts and taking the union of their extensions. Specification, by contrast, takes the union of the intensions and intersects the extensions. The result is a concept lattice that describes the domain-specific ontology underlying the training corpus.
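The two operations can be illustrated with a minimal Python sketch. It covers only the set operations named in the abstract; the `Concept` class, the function names, and the toy words and contexts are hypothetical, and the paper's actual clustering procedure (how concepts are selected for merging, any thresholds or closure steps) is not described here and is not reproduced.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Concept:
    """A concept as a pair of sets (illustrative structure, not the paper's)."""
    extension: frozenset  # words covered by the concept
    intension: frozenset  # lexico-syntactic contexts shared by those words


def abstraction(a: Concept, b: Concept) -> Concept:
    """More generic concept: union of extensions, intersection of intensions."""
    return Concept(a.extension | b.extension, a.intension & b.intension)


def specification(a: Concept, b: Concept) -> Concept:
    """More specific concept: intersection of extensions, union of intensions."""
    return Concept(a.extension & b.extension, a.intension | b.intension)


# Toy data, invented for illustration only.
c1 = Concept(frozenset({"wine", "beer"}), frozenset({"drink_obj", "glass_of"}))
c2 = Concept(frozenset({"beer", "water"}), frozenset({"drink_obj", "pour_obj"}))

general = abstraction(c1, c2)
specific = specification(c1, c2)

print(sorted(general.extension), sorted(general.intension))
print(sorted(specific.extension), sorted(specific.intension))
```

In this toy case, abstraction keeps only the context shared by both inputs while covering all three words, and specification keeps only the word common to both inputs while accumulating all of their contexts. This mirrors the inverse relation between extension and intension that orders a concept lattice.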