Jian Zhang, Chaomei Chen, M. Vogeley, Danny Pan, Anirudha Thakar, M. Raddick
SDSS Log Viewer: visual exploratory analysis of large-volume SQL log data
Visualization and data analysis, 82940D. Published 2012-01-24. DOI: 10.1117/12.907097 (https://doi.org/10.1117/12.907097)
Citation count: 10
Abstract
User-generated Structured Query Language (SQL) queries are a rich source of information for database analysts,
information scientists, and the end users of databases. In this study, astronomers and computer and
information scientists work together to analyze a large volume of SQL log data generated by users of the Sloan Digital
Sky Survey (SDSS) data archive in order to better understand users' data-seeking behavior. While statistical analysis of
such logs is useful at aggregated levels, efficiently exploring specific patterns of queries is often a challenging task due
to the typically large volume of the data, its multivariate features, and the data requirements specified in SQL queries. To
support effective and efficient exploration of the SDSS log data, we designed an interactive visualization
tool, the SDSS Log Viewer, which integrates time-series visualization, text visualization, and dynamic query
techniques. We describe two scenarios of visual exploration of SDSS log data: understanding
unusually high daily query traffic and modeling the types of data-seeking behavior of massive query generators. The
two scenarios demonstrate that the SDSS Log Viewer provides a novel and potentially valuable approach to support these
targeted tasks.