Research of intelligent word segmentation and information retrieval
Authors: Xiaofei Li, X. Xie
Published in: 2010 2nd International Conference on Education Technology and Computer, 2010-07-29
DOI: 10.1109/ICETC.2010.5529961 (https://doi.org/10.1109/ICETC.2010.5529961)
Citation count: 4
Abstract
Chinese information retrieval differs somewhat from English information retrieval. To address the existing problems and difficulties of Chinese language information processing, this paper uses Hibernate Search to build an information retrieval engine. A Chinese language analyzer based on a word stock (dictionary) processes the Chinese text, so the analyzer can be kept current by updating the word stock at any time. However, ambiguity errors produced by the analyzer reduce the accuracy of the results. A secondary word segmentation algorithm is therefore applied at retrieval time to improve the precision of Chinese information retrieval. The result list presented in the paper shows that the intelligent Chinese segmentation algorithm improves system performance.
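
The abstract does not describe the analyzer or the secondary segmentation algorithm in detail. As a rough illustration only, the sketch below assumes a dictionary-driven maximum-matching segmenter, a common textbook technique and not necessarily the authors' method. It shows how updating the word stock changes the output, and how a forward and a backward pass can disagree on an ambiguous string. The function names, the sample dictionary, and the `max_len` parameter are all hypothetical.

```python
def forward_max_match(text, dictionary, max_len=4):
    """Greedy left-to-right segmentation: take the longest dictionary word at each position."""
    tokens, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            # A single character is always accepted so the scan cannot stall.
            if size == 1 or piece in dictionary:
                tokens.append(piece)
                i += size
                break
    return tokens


def backward_max_match(text, dictionary, max_len=4):
    """Same idea, scanning from the end of the text toward the start."""
    tokens, j = [], len(text)
    while j > 0:
        for size in range(min(max_len, j), 0, -1):
            piece = text[j - size:j]
            if size == 1 or piece in dictionary:
                tokens.insert(0, piece)
                j -= size
                break
    return tokens


# Hypothetical word stock; a real system would load it from a file or database.
words = {"研究", "研究生", "生命", "起源"}

# Ambiguity: the two scan directions split the same string differently.
text = "研究生命起源"
fwd = forward_max_match(text, words)    # ['研究生', '命', '起源']
bwd = backward_max_match(text, words)   # ['研究', '生命', '起源']
is_ambiguous = fwd != bwd               # True: a second pass would have to choose

# Updating the word stock changes segmentation without touching the code.
before = forward_max_match("云计算", words)   # ['云', '计', '算']
words.add("云计算")
after = forward_max_match("云计算", words)    # ['云计算']
```

A disagreement between the two passes is one simple signal that a string is ambiguous and may need a second segmentation step. How the paper resolves such cases is not stated in the abstract.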