Authors: Liu Rong, Zhang Zhiping, Pang Ning
Venue: 2010 International Conference on Artificial Intelligence and Education (ICAIE)
Published: 2010-11-18
DOI: 10.1109/ICAIE.2010.5641103 (https://doi.org/10.1109/ICAIE.2010.5641103)
Efficient unsupervised extraction of words categories using symmetric patterns and high frequency words
This paper presents a novel approach for discovering and extracting sets of words that share semantic meaning. We use meta-patterns of high-frequency words and content words to discover pattern candidates. Symmetric patterns are then identified using graph-based measures, and word categories are built from sets of graph cliques. The method is pattern-based and requires no manually provided seed patterns or words. For Chinese, only part-of-speech (POS) tagging is carried out in advance. Computation time is linear in the size of the corpus. Manual evaluation judged the results favorably.