A data preprocessing framework of geoscience data sharing portal for user behavior mining

Mo Wang, Juanle Wang
2015 23rd International Conference on Geoinformatics. Published 2015-06-19. DOI: 10.1109/GEOINFORMATICS.2015.7378637 (https://doi.org/10.1109/GEOINFORMATICS.2015.7378637)

Abstract

Sharing scientific data benefits both research and education. Understanding how participants in science data sharing behave supports informed decisions on data sharing policy and on the design of data sharing websites. Data sharing is now carried out mainly over the Internet, and web usage mining is a well-suited approach to uncovering user behavior. This paper presents a data preprocessing framework for subsequent user behavior mining of a geoscience data sharing portal (geodata.cn). The preprocessing steps comprise data cleaning, user identification, session identification, and data modeling. Web server logs served as the major data source. Heuristic algorithms were employed for data cleaning and user identification, and different session identification methods were applied for comparison. Users' geolocations were identified using an online Geo-IP lookup tool, which provides the geographical coordinates of an IP address. On the basis of these preprocessing procedures, a web usage data model of the science data sharing portal was proposed for further user behavior mining, such as user classification and spatial association rule mining.
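The abstract does not spell out the heuristics used, so the following is only a minimal sketch of what the first three preprocessing steps might look like. It rests on assumptions that are not from the paper: logs in the Apache combined format, cleaning by dropping static resources, non-2xx responses, and user agents containing bot-like markers, users identified by the (IP address, user agent) pair, and sessions split by a 30-minute inactivity timeout. All names (`parse_line`, `is_relevant`, `sessionize`, the constants) are hypothetical.

```python
import re
from datetime import datetime, timedelta

# Assumed input: Apache "combined" log format (not confirmed by the paper).
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) [^"]*" (?P<status>\d{3}) \S+ '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)
# Illustrative cleaning rules and timeout; the paper's actual values are unknown.
STATIC_SUFFIXES = (".css", ".js", ".png", ".jpg", ".gif", ".ico")
BOT_MARKERS = ("bot", "spider", "crawler")
SESSION_TIMEOUT = timedelta(minutes=30)


def parse_line(line):
    """Parse one log line into a dict, or return None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    record = match.groupdict()
    record["time"] = datetime.strptime(record["time"], "%d/%b/%Y:%H:%M:%S %z")
    return record


def is_relevant(record):
    """Data cleaning: keep successful page requests from non-robot agents."""
    path = record["url"].split("?", 1)[0].lower()
    agent = record["agent"].lower()
    return (
        record["status"].startswith("2")
        and not path.endswith(STATIC_SUFFIXES)
        and not any(marker in agent for marker in BOT_MARKERS)
    )


def sessionize(lines):
    """Group cleaned requests by user, then split each user's requests into sessions."""
    records = [r for r in map(parse_line, lines) if r and is_relevant(r)]
    records.sort(key=lambda r: r["time"])
    sessions = {}  # (ip, agent) -> list of sessions, each a list of records
    for record in records:
        user = (record["ip"], record["agent"])  # user identification heuristic
        user_sessions = sessions.setdefault(user, [])
        last = user_sessions[-1][-1]["time"] if user_sessions else None
        if last is not None and record["time"] - last <= SESSION_TIMEOUT:
            user_sessions[-1].append(record)  # continue the current session
        else:
            user_sessions.append([record])  # start a new session
    return sessions
```

A fixed inactivity timeout is only one of several session identification methods; the abstract states that different methods were compared, so the timeout rule here stands in for whichever methods the authors evaluated.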