Searching Overseas Taiwanese Nationals by Web Content Mining
C. Hsu, Yuh Tzong Liu, Jen-Shin Hong
2018 3rd International Conference on Computer and Communication Systems (ICCCS), published 2018-04-01
DOI: 10.1109/CCOMS.2018.8463239 (https://doi.org/10.1109/CCOMS.2018.8463239)
Citation count: 0
Abstract
This article proposes a novel approach to retrieving overseas nationals of a given country using web content mining techniques. The approach is a three-step process: (1) key phrase composition, (2) query constraint imposition, and (3) web search result filtering. Based on this approach, we develop a framework for a web retrieval system that searches for professional overseas nationals. The framework comprises five modules: (1) Query Agent, (2) Web Search Engine, (3) Snippet Parser, (4) Page Filter, and (5) Metadata Generator. The prototype implementation demonstrates that the proposed approach can efficiently retrieve a large number of pages containing potential overseas nationals. Experiments show that the precision rate varies across different combinations of key phrases and query constraints. In certain cases, the precision rate reaches as high as 29%, which is much better than typical web searches using simple query terms.
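The three-step process can be illustrated with a minimal Python sketch. The abstract does not give the actual key phrases, query constraints, or filtering rules, so everything here is an assumption made for illustration: the `Snippet` type, the function names, the example phrases, the `site:` constraints, and the substring-based filter are all hypothetical stand-ins, not the authors' implementation. The sketch also omits the search engine call and works on already-parsed snippets.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Snippet:
    """Hypothetical parsed search result (what a Snippet Parser might emit)."""
    url: str
    title: str
    text: str


def compose_queries(key_phrases, constraints):
    """Steps 1-2: pair every quoted key phrase with every query constraint."""
    return [
        f'"{phrase}" {constraint}'
        for phrase, constraint in product(key_phrases, constraints)
    ]


def filter_snippets(snippets, required_terms, blocked_terms=()):
    """Step 3: keep snippets mentioning any required term and no blocked term.

    Plain case-insensitive substring matching is an assumed, simplistic rule.
    """
    kept = []
    for snippet in snippets:
        haystack = f"{snippet.title} {snippet.text}".lower()
        has_required = any(term.lower() in haystack for term in required_terms)
        has_blocked = any(term.lower() in haystack for term in blocked_terms)
        if has_required and not has_blocked:
            kept.append(snippet)
    return kept


def precision(retrieved, relevant_urls):
    """Fraction of retrieved snippets whose URL was judged relevant."""
    if not retrieved:
        return 0.0
    hits = sum(1 for snippet in retrieved if snippet.url in relevant_urls)
    return hits / len(retrieved)


if __name__ == "__main__":
    # Illustrative inputs only; none of these come from the paper.
    queries = compose_queries(
        ["National Taiwan University", "born in Taiwan"],
        ["site:edu", "professor"],
    )
    for query in queries:
        print(query)

    results = [
        Snippet("https://example.edu/a", "Faculty profile",
                "She received her B.S. from National Taiwan University."),
        Snippet("https://example.com/b", "Travel guide",
                "Visit the campus of National Taiwan University."),
        Snippet("https://example.edu/c", "Lab members",
                "Our lab studies robotics."),
    ]
    kept = filter_snippets(results, ["National Taiwan University"], ["travel"])
    print(precision(kept, {"https://example.edu/a"}))
```

Taking the Cartesian product of phrases and constraints mirrors the abstract's observation that precision depends on the specific combination used: each combination becomes its own query whose results can be scored separately with `precision`.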