Collocation Extraction Method Using Mutual Information Contents
Iori Fukumura, Sanggyu Shin
2022 12th International Congress on Advanced Applied Informatics (IIAI-AAI), published 2022-07-01
DOI: 10.1109/IIAIAAI55812.2022.00134 (https://doi.org/10.1109/IIAIAAI55812.2022.00134)
Abstract
In this paper, we extracted noun-verb pairs linked by dependency relations from a corpus and identified collocations among them using a statistical measure, mutual information. We compared the extracted collocations with the NINJAL corpus (a Japanese corpus published by the National Institute for Japanese Language and Linguistics). Each extracted collocation was evaluated on two criteria: whether it was a correct idiomatic expression and whether it appeared in the NINJAL corpus. The results show that the method can estimate collocations with a certain degree of accuracy.
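
The abstract does not give the exact scoring formula, so the following is only a minimal sketch of how noun-verb pairs might be ranked by mutual information. It assumes the pointwise form commonly used in collocation extraction, PMI(n, v) = log2(P(n, v) / (P(n) P(v))), with probabilities estimated from pair counts. The function names `pmi_scores` and `top_collocations`, the `min_count` threshold, and the toy pairs are hypothetical and are not taken from the paper; dependency parsing is assumed to have already produced the (noun, verb) pairs.

```python
import math
from collections import Counter


def pmi_scores(pairs, min_count=1):
    """Score each (noun, verb) pair by pointwise mutual information.

    `pairs` is a list of (noun, verb) tuples, one per dependency
    relation observed in the corpus (assumed to be extracted upstream).
    Pairs seen fewer than `min_count` times are skipped, because PMI
    is known to overrate very rare pairs.
    """
    pair_counts = Counter(pairs)
    noun_counts = Counter(noun for noun, _ in pairs)
    verb_counts = Counter(verb for _, verb in pairs)
    total = len(pairs)

    scores = {}
    for (noun, verb), count in pair_counts.items():
        if count < min_count:
            continue
        p_pair = count / total
        p_noun = noun_counts[noun] / total
        p_verb = verb_counts[verb] / total
        scores[(noun, verb)] = math.log2(p_pair / (p_noun * p_verb))
    return scores


def top_collocations(pairs, k=10, min_count=1):
    """Return the k highest-scoring pairs as (pair, score) tuples."""
    scores = pmi_scores(pairs, min_count=min_count)
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return ranked[:k]


if __name__ == "__main__":
    # Toy data invented for illustration only; not from the paper's corpus.
    toy_pairs = (
        [("腹", "立つ")] * 3
        + [("腹", "減る")] * 2
        + [("本", "読む")] * 4
        + [("本", "立つ")] * 1
    )
    for pair, score in top_collocations(toy_pairs, k=3):
        print(pair, round(score, 3))
```

Candidates ranked this way could then be checked against a reference resource such as the NINJAL corpus, as the abstract describes; that comparison step is not sketched here because the abstract does not specify how it was carried out.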