Tomoya Shirai, Kyoko Kanzaki, Hirofumi Yabumoto, H. Isahara
2015 2nd International Conference on Advanced Informatics: Concepts, Theory and Applications (ICAICTA), published 2015-11-23. DOI: https://doi.org/10.1109/ICAICTA.2015.7335351
Compilation and evaluation of paraphrase representation list of compound verbs: Toward development of “Control language for action”
To realize friendly man-machine communication, machines must understand not only the surface expressions of human utterances but also the deeper meanings of human behavior. As a first step toward investigating and standardizing the lexical items that form part of a “control language for action”, we began compiling a “paraphrase representation list of compound verbs”. We processed the corpus and vectorized the data with Word2Vec. Using the resulting vectors, we computed the cosine similarity between the compound verbs and the verbs in the corpus, and created the paraphrase representation list from these similarities. We obtained paraphrase expressions for 1899 of the 3289 compound verbs (including orthographic variants) stored in the compound verb lexicon. This method also found words that do not exist in the Japanese WordNet. Investigating the words that appear only in the automatic extraction results, we found 213 unknown words and 227 new synonymous relationships. Notably, these two counts differ by 14, which means we found 14 words that are stored in the Japanese WordNet but are not treated there as synonyms of the word in question. We conclude that the proposed method is useful for expanding paraphrase relationships that have so far been listed by human intuition.
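The similarity step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the `top_n` and `threshold` parameters, and the three-dimensional toy vectors are all assumptions made for illustration, and in practice the vectors would come from a Word2Vec model trained on the corpus. The abstract does not state how candidates were cut off, so the threshold here is purely hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # a zero vector has no direction; treat as dissimilar
    return dot / (norm_u * norm_v)


def paraphrase_candidates(compound_verb, vectors, verbs, top_n=3, threshold=0.5):
    """Rank corpus verbs by cosine similarity to a compound verb.

    `top_n` and `threshold` are hypothetical cut-offs, not values
    taken from the paper.
    """
    if compound_verb not in vectors:
        return []  # no vector, so no paraphrase can be proposed
    target = vectors[compound_verb]
    scored = []
    for verb in verbs:
        if verb == compound_verb or verb not in vectors:
            continue
        score = cosine_similarity(target, vectors[verb])
        if score >= threshold:
            scored.append((verb, score))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


# Toy vectors invented for illustration only (not real Word2Vec output).
toy_vectors = {
    "tobidasu": [0.9, 0.1, 0.2],  # compound verb, "jump out"
    "deru": [0.8, 0.2, 0.1],      # "go out"
    "taberu": [0.1, 0.9, 0.3],    # "eat"
    "hashiru": [0.6, 0.1, 0.7],   # "run"
}

candidates = paraphrase_candidates(
    "tobidasu", toy_vectors, ["deru", "taberu", "hashiru"]
)
for verb, score in candidates:
    print(f"{verb}\t{score:.3f}")
```

With these toy vectors, "deru" and "hashiru" pass the assumed threshold and "taberu" does not. A compound verb with no vector yields an empty list, which is one plausible reason why only part of a lexicon would receive paraphrase expressions.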