Title: Estimating Frequency Counts of Concepts in Multiple-Inheritance Hierarchies
Author: A. Wagner
Journal: LDV Forum
Publication type: Journal Article
Published: 2004-07-01
DOI: 10.21248/jlcl.19.2004.59
URL: https://doi.org/10.21248/jlcl.19.2004.59
Citations: 5
Abstract
This paper examines methods for estimating the frequencies of concepts in wordnets from corpus data. In particular, it addresses the problems that multiple-inheritance structures in wordnets raise for this task. One of the approaches discussed, tree cut, is problematic in this respect because it requires a pure tree hierarchy, so applying it to a wordnet requires that the wordnet's directed acyclic graph (DAG) structure be transformed into a tree. I propose a mathematically sound method for this transformation and compare it to a commonly used ad hoc strategy. The ad hoc strategy biases the estimated frequencies; the proposed approach avoids these biases. Experiments with GermaNet demonstrate that the biases have a significant impact.
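To make the multiple-inheritance issue concrete, here is a minimal Python sketch. It is not the method proposed in the paper, nor necessarily the ad hoc strategy the abstract refers to, since the abstract specifies neither. The toy hierarchy, the counts, and all function names are hypothetical. The sketch only shows that a tree-style recursion over a DAG counts a concept with two parents once per path, whereas counting each descendant exactly once does not.

```python
# Hypothetical toy hierarchy (child -> parents); concept "d" has two parents.
PARENTS = {
    "root": [],
    "a": ["root"],
    "b": ["root"],
    "d": ["a", "b"],
}

# Hypothetical corpus counts attached directly to each concept (15 in total).
OWN_COUNT = {"root": 0, "a": 2, "b": 3, "d": 10}


def children_of(parents):
    """Invert the child -> parents map into a parent -> children map."""
    children = {node: [] for node in parents}
    for child, node_parents in parents.items():
        for parent in node_parents:
            children[parent].append(child)
    return children


CHILDREN = children_of(PARENTS)


def naive_count(node):
    """Own count plus the cumulative count of every child (tree-style recursion).

    In a DAG, a shared descendant is added once for each path that reaches it.
    """
    return OWN_COUNT[node] + sum(naive_count(child) for child in CHILDREN[node])


def descendants(node):
    """Return the node and every concept below it, each exactly once."""
    seen = set()
    stack = [node]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(CHILDREN[current])
    return seen


def dedup_count(node):
    """Sum the own counts of all distinct descendants of the node."""
    return sum(OWN_COUNT[n] for n in descendants(node))


if __name__ == "__main__":
    print(naive_count("root"))  # 25: the 10 occurrences of "d" are counted twice
    print(dedup_count("root"))  # 15: equals the total of the direct counts
```

In this toy case the naive recursion reports 25 occurrences at the root although only 15 were observed, which illustrates the kind of bias that multiple inheritance can introduce when a tree-based procedure is applied to a DAG.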