Taeseok Lee, Do-Heon Jeong, Young-Su Moon, Minsoo Park, mi-hwan Hyun
{"title":"基于自动词分类的查询分类分析研究","authors":"Taeseok Lee, Do-Heon Jeong, Young-Su Moon, Minsoo Park, mi-hwan Hyun","doi":"10.3745/KIPSTD.2012.19D.2.133","DOIUrl":null,"url":null,"abstract":"Queries entered in a search box are the results of users' activities to actively seek information. Therefore, search logs are important data which represent users' information needs. The purpose of this study is to examine if there is a relationship between the results of queries automatically classified and the categories of documents accessed. Search sessions were identified in 2009 NDSL(National Discovery for Science Leaders) log dataset of KISTI (Korea Institute of Science and Technology Information). Queries and items used were extracted by session. The queries were processed using an automatic classifier. The identified queries were then compared with the subject categories of items used. As a result, it was found that the average similarity was 58.8% for the automatic classification of the top 100 queries. Interestingly, this result is a numerical value lower than 76.8%, the result of search evaluated by experts. The reason for this difference explains that the terms used as queries are newly emerging as those of concern in other fields of research.","PeriodicalId":348746,"journal":{"name":"The Kips Transactions:partd","volume":"28 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2012-04-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":"{\"title\":\"An Analytic Study on the Categorization of Query through Automatic Term Classification\",\"authors\":\"Taeseok Lee, Do-Heon Jeong, Young-Su Moon, Minsoo Park, mi-hwan Hyun\",\"doi\":\"10.3745/KIPSTD.2012.19D.2.133\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Queries entered in a search box are the results of users' activities to actively seek information. Therefore, search logs are important data which represent users' information needs. The purpose of this study is to examine if there is a relationship between the results of queries automatically classified and the categories of documents accessed. Search sessions were identified in 2009 NDSL(National Discovery for Science Leaders) log dataset of KISTI (Korea Institute of Science and Technology Information). Queries and items used were extracted by session. The queries were processed using an automatic classifier. The identified queries were then compared with the subject categories of items used. As a result, it was found that the average similarity was 58.8% for the automatic classification of the top 100 queries. Interestingly, this result is a numerical value lower than 76.8%, the result of search evaluated by experts. The reason for this difference explains that the terms used as queries are newly emerging as those of concern in other fields of research.\",\"PeriodicalId\":348746,\"journal\":{\"name\":\"The Kips Transactions:partd\",\"volume\":\"28 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2012-04-30\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"2\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"The Kips Transactions:partd\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.3745/KIPSTD.2012.19D.2.133\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"The Kips Transactions:partd","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.3745/KIPSTD.2012.19D.2.133","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
An Analytic Study on the Categorization of Query through Automatic Term Classification
Queries entered in a search box are the results of users' activities to actively seek information. Therefore, search logs are important data which represent users' information needs. The purpose of this study is to examine if there is a relationship between the results of queries automatically classified and the categories of documents accessed. Search sessions were identified in 2009 NDSL(National Discovery for Science Leaders) log dataset of KISTI (Korea Institute of Science and Technology Information). Queries and items used were extracted by session. The queries were processed using an automatic classifier. The identified queries were then compared with the subject categories of items used. As a result, it was found that the average similarity was 58.8% for the automatic classification of the top 100 queries. Interestingly, this result is a numerical value lower than 76.8%, the result of search evaluated by experts. The reason for this difference explains that the terms used as queries are newly emerging as those of concern in other fields of research.