Yuan Sun, Xiaodong Yan, Xiaobing Zhao, Guosheng Yang
2012 Fifth International Conference on Intelligent Networks and Intelligent Systems, published 2012-11-01. DOI: 10.1109/ICINIS.2012.61 (https://doi.org/10.1109/ICINIS.2012.61)
Automatic Extraction Method of Tibetan New Valid Words
This paper proposes a model for automatically extracting new valid Tibetan words. We build a dynamic Tibetan corpus covering 2009 to 2012, drawn from more than 18 Tibetan online media outlets in Tibet, Qinghai, Sichuan, Gansu and Yunnan, and investigate three key techniques for new valid word extraction: (1) a statistical method for establishing a knowledge base of new Tibetan words, (2) information entropy for filtering new valid words, and (3) similarity calculation in the vector space model for extracting new valid words.
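The abstract does not say how the entropy filter or the similarity calculation is defined, so the following is only a minimal sketch under stated assumptions. It assumes the entropy in step (2) is the Shannon entropy of the syllables adjacent to a candidate, a common choice in new word detection, and that step (3) compares context vectors by cosine similarity. The function names `neighbor_entropy` and `cosine_similarity`, the syllable-list input, and the placeholder tokens are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def neighbor_entropy(tokens, candidate, side="right"):
    """Shannon entropy (in bits) of the tokens adjacent to `candidate`.

    `tokens` is a list of syllables and `candidate` a list of consecutive
    syllables. A high value means the candidate appears in varied contexts,
    which is often read as evidence that it is a free-standing word.
    (Assumed formulation, not confirmed by the paper.)
    """
    n = len(candidate)
    neighbors = Counter()
    for i in range(len(tokens) - n + 1):
        if tokens[i:i + n] == candidate:
            j = i + n if side == "right" else i - 1
            if 0 <= j < len(tokens):
                neighbors[tokens[j]] += 1
    total = sum(neighbors.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in neighbors.values())


def cosine_similarity(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Placeholder syllables stand in for a segmented Tibetan text.
tokens = ["x", "a", "b", "y", "z", "a", "b", "w", "x", "a", "b", "y"]
candidate = ["a", "b"]
right_h = neighbor_entropy(tokens, candidate, side="right")
left_h = neighbor_entropy(tokens, candidate, side="left")

# Context vectors here are simple neighbor counts (an illustrative choice).
vec_1 = Counter({"x": 2, "y": 2})
vec_2 = Counter({"x": 1, "y": 1})
similarity = cosine_similarity(vec_1, vec_2)
```

In a filter of this kind, a candidate would typically be kept only if both its left and right entropy exceed a threshold; the threshold value is a tuning choice that the abstract does not specify.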