Large-scale analysis of query logs to profile users for dataset search

Romina Sharifpour, Mingfang Wu, Xiuzhen Zhang
J. Documentation. Published 2022-04-27. DOI: 10.1108/jd-12-2021-0245 (https://doi.org/10.1108/jd-12-2021-0245)
Citation count: 6

Abstract

Purpose: With an explosion of datasets available on the Web, dataset search has gained attention as an emerging research domain. Understanding users' dataset search behaviour is imperative for providing effective data discovery services. In this paper, the authors present a study of users' dataset search behaviour through the analysis of search logs from a research data discovery portal.

Design/methodology/approach: Using query- and session-based features, the authors apply cluster analysis to discover distinct user profiles with different search behaviours. One behavioural construct of particular interest is users' expertise, which the authors estimate by computing the semantic similarity between users' search queries and the titles of the metadata records in the displayed search results.

Findings: The findings reveal six distinct classes of user behaviour for dataset search, namely Expert Research, Expert Search, Expert Explore, Novice Research, Novice Search and Novice Explore.

Research limitations/implications: The user profiles are derived from analysis of the search log of the research data catalogue used in this study. Further research is needed to generalise the user profiles to other dataset search settings. Future research can take a confirmatory approach to verify these user groups and establish a deeper understanding of their information needs.

Practical implications: The findings have implications for designing search systems that tailor search results to the diverse information needs of different user groups.

Originality/value: The authors propose, for the first time, a taxonomy of users for dataset search based on their domain expertise and search behaviour.
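The approach described in the abstract can be illustrated with a minimal sketch. The abstract does not name the similarity model, the feature set or the clustering algorithm, so everything below is an assumption made for illustration only: token-overlap cosine similarity stands in for a real semantic similarity measure, plain k-means stands in for the cluster analysis, and the names `expertise_score` and `kmeans`, the session features and the sample data are all hypothetical. The sketch also assumes that a query closer to the displayed record titles signals more expertise.

```python
import math
import re
from collections import Counter


def tokens(text):
    """Lowercase word tokens; a crude stand-in for a real semantic model."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counter objects)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def expertise_score(query, result_titles):
    """Mean similarity between a query and the record titles displayed for it.

    Hypothetical proxy: the paper's actual expertise measure is not specified here.
    """
    if not result_titles:
        return 0.0
    q = Counter(tokens(query))
    sims = [cosine(q, Counter(tokens(title))) for title in result_titles]
    return sum(sims) / len(sims)


def kmeans(points, centroids, iterations=10):
    """Plain k-means over feature tuples, starting from the given centroids."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Recompute each centroid as the mean of its members; keep it if empty.
        centroids = [
            tuple(sum(dim) / len(c) for dim in zip(*c)) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    labels = [
        min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
        for p in points
    ]
    return labels, centroids


# Made-up example data: two displayed record titles and two queries.
titles = ["Soil moisture observations Australia", "Coral reef survey data"]
specific = expertise_score("soil moisture observations", titles)
vague = expertise_score("data about dirt", titles)

# Hypothetical session features: (expertise score, queries per session).
sessions = [(0.9, 1), (0.8, 2), (0.1, 6), (0.2, 5)]
labels, centres = kmeans(sessions, centroids=[(0.9, 1), (0.1, 6)])
print(specific > vague, labels)
```

In practice the features would normally be scaled before clustering, and an embedding-based similarity would replace the token overlap used here; the sketch only shows how a per-query expertise proxy and session features could feed a clustering step.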