Towards the Automatic Learning of Ontologies
Isidra Ocampo-Guzman, I. Lopez-Arevalo, E. Tello-Leal, V. Sosa-Sosa
2009 Seventh Brazilian Symposium in Information and Human Language Technology, published 2009-09-08
DOI: 10.1109/STIL.2009.23 (https://doi.org/10.1109/STIL.2009.23)
Abstract
This paper proposes a methodology for the automatic learning of ontologies from a text corpus. The concepts (topics) in the documents of the corpus are identified using the Latent Dirichlet Allocation model. Based on the set of identified topics, a taxonomy is constructed for each concept from the highest-probability terms that contribute to defining it. WordNet is used in the construction of these partial topic taxonomies by obtaining the similarity and relatedness between the terms that constitute each topic. The resulting taxonomies are joined to structure the final ontology. The methodology is evaluated with the Lonely Planet corpus.
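
The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The topic-term probabilities in `TOPICS` are made-up values standing in for Latent Dirichlet Allocation output, and the `HYPERNYM` table is a toy stand-in for WordNet lookups. The sketch links terms only through parent (hypernym) chains and does not compute the similarity and relatedness measures mentioned above. All function names are hypothetical.

```python
# Hypothetical topic-term probabilities (illustrative values, not LDA output).
TOPICS = {
    "topic_0": {"hotel": 0.30, "hostel": 0.25, "price": 0.05},
    "topic_1": {"beach": 0.35, "island": 0.30, "bus": 0.02},
}

# Toy child -> parent table standing in for WordNet (assumption, not real data).
HYPERNYM = {
    "hotel": "building",
    "hostel": "building",
    "building": "structure",
    "structure": "entity",
    "beach": "landform",
    "island": "landform",
    "landform": "entity",
}


def top_terms(term_probs, k):
    """Return the k terms with the highest probability in one topic."""
    ranked = sorted(term_probs.items(), key=lambda kv: kv[1], reverse=True)
    return [term for term, _ in ranked[:k]]


def hypernym_chain(term, hypernym):
    """Walk upward from a term to the root of the toy hierarchy."""
    chain = [term]
    while chain[-1] in hypernym:
        chain.append(hypernym[chain[-1]])
    return chain


def build_taxonomy(terms, hypernym):
    """Collect (parent, child) edges linking each term to its ancestors."""
    edges = set()
    for term in terms:
        chain = hypernym_chain(term, hypernym)
        for child, parent in zip(chain, chain[1:]):
            edges.add((parent, child))
    return edges


def merge_taxonomies(taxonomies):
    """Join partial taxonomies by taking the union of their edges."""
    merged = set()
    for edges in taxonomies:
        merged |= edges
    return merged


# One partial taxonomy per topic, built from its two most probable terms.
partials = {
    name: build_taxonomy(top_terms(probs, 2), HYPERNYM)
    for name, probs in TOPICS.items()
}
# The final structure is the union of the partial taxonomies.
ontology = merge_taxonomies(partials.values())
```

In this toy setup the two partial taxonomies meet at the shared ancestor `entity`, which is what allows them to be joined into a single structure. With real WordNet data, a term usually has several senses and several hypernym paths, so a sense-selection step would be needed before chains like these could be built.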