A Domain-Specific Web Document Re-ranking Algorithm
Authors: Grace Zhao, Xiaowen Zhang
Published in: 2017 6th IIAI International Congress on Advanced Applied Informatics (IIAI-AAI), 2017-07-01
DOI: 10.1109/IIAI-AAI.2017.125 (https://doi.org/10.1109/IIAI-AAI.2017.125)
To build a domain-specific knowledge hub for learning on the web, the web resources crawled by generic search engines need to be sifted and sorted before use. We propose a re-ranking algorithm that identifies highly domain-relevant web data to feed into the domain knowledge learning hub. The algorithm studies the structure and semantics of the domain ontology (a graph) and constructs computational relations among its nodes. By mining matching terms between the ontology dictionary and the textual content (text and metadata) of documents retrieved by reputable web search engines, we calculate a three-dimensional information score (distance, direction, and attributes) for each document and then re-rank the retrieved documents to provide learners with more meaningful knowledge in the domain space they are exploring.
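The abstract does not define how the three scores are computed, so the following Python sketch is only an illustration of the general idea, not the authors' method. Everything in it is an assumption: the toy ontology (`PARENT`, `ATTRIBUTES`), the choice of a focus concept, the formulas for the distance, direction, and attribute scores, and the weights used to combine them.

```python
from collections import deque

# Hypothetical toy ontology (assumed connected): child -> parent edges,
# plus attribute terms attached to some nodes.
PARENT = {
    "algebra": "mathematics",
    "geometry": "mathematics",
    "matrix": "algebra",
    "triangle": "geometry",
}
ATTRIBUTES = {
    "matrix": {"determinant", "rank"},
    "triangle": {"angle", "hypotenuse"},
}


def neighbors(node):
    """Undirected neighbors of a node in the ontology tree."""
    result = {child for child, parent in PARENT.items() if parent == node}
    if node in PARENT:
        result.add(PARENT[node])
    return result


def hops(source, target):
    """Breadth-first hop count between two nodes of the connected ontology."""
    seen, queue = {source}, deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == target:
            return dist
        for nxt in neighbors(node) - seen:
            seen.add(nxt)
            queue.append((nxt, dist + 1))
    raise ValueError(f"{target!r} is not reachable from {source!r}")


def is_descendant(node, ancestor):
    """True if `ancestor` lies on the parent chain above `node`."""
    while node in PARENT:
        node = PARENT[node]
        if node == ancestor:
            return True
    return False


def score(text, focus):
    """Return assumed (distance, direction, attribute) scores for one document."""
    tokens = set(text.lower().split())
    nodes = set(PARENT) | set(PARENT.values())
    matched = tokens & nodes  # ontology terms found in the document
    if not matched:
        return (0.0, 0.0, 0.0)
    # Distance: matched terms closer to the focus concept count more.
    distance = sum(1.0 / (1 + hops(focus, m)) for m in matched) / len(matched)
    # Direction: share of matched terms at or below the focus concept.
    direction = sum(m == focus or is_descendant(m, focus) for m in matched) / len(matched)
    # Attributes: share of the matched nodes' attribute terms present in the text.
    attrs = set().union(*(ATTRIBUTES.get(m, set()) for m in matched))
    attribute = len(attrs & tokens) / len(attrs) if attrs else 0.0
    return (distance, direction, attribute)


def rerank(docs, focus, weights=(0.5, 0.3, 0.2)):
    """Sort documents by a weighted sum of the three scores, highest first."""
    def total(text):
        return sum(w * s for w, s in zip(weights, score(text, focus)))
    return sorted(docs, key=total, reverse=True)
```

Calling `rerank(docs, "algebra")` on a list of document strings returns them ordered by the combined score. A real system would need proper tokenization, multi-word term matching, and the score definitions from the paper itself.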