{"title":"High-Volume Hypothesis Testing for Large-Scale Web Log Analysis","authors":"Sana Malik, Eunyee Koh","doi":"10.1145/2851581.2892487","journal":{"name":"Proceedings of the 2016 CHI Conference Extended Abstracts on Human Factors in Computing Systems"},"publicationDate":"2016-05-07","citationCount":"10","platform":"Semanticscholar","ListUrlMain":"https://doi.org/10.1145/2851581.2892487"}
High-Volume Hypothesis Testing for Large-Scale Web Log Analysis
Time-stamped event sequence data is being generated across many domains: shopping transactions, web traffic logs, medical histories, etc. Analysts often want to compare two or more groups of event sequences to better understand the processes that lead to different outcomes (e.g., a customer did or did not make a purchase). CoCo is a visual analytics tool for Cohort Comparison that combines automated high-volume hypothesis testing (HVHT) with an interactive visualization and user interface for improved exploratory data analysis. This paper covers the first case study of CoCo for large-scale web log analysis and the challenges that arise when scaling a visual analytics tool to large datasets. The direct contributions of this paper are: (1) solutions to 7 challenges of scaling a visual analytics tool to larger datasets, and (2) a case study in which three real-world analysts used CoCo with these solutions implemented.
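To make the idea of high-volume hypothesis testing concrete, the following is a minimal sketch, not CoCo's actual implementation. The abstract does not say which tests or metrics CoCo runs, so everything here is an assumption chosen for illustration: each cohort is a list of event-name sequences, the only metric tested is whether an event occurs in a sequence, the test is a pooled two-proportion z-test, and a Bonferroni correction accounts for running many tests at once. The function names `two_proportion_p_value` and `compare_cohorts` are hypothetical.

```python
import math
from typing import Dict, List


def two_proportion_p_value(hits_a: int, n_a: int, hits_b: int, n_b: int) -> float:
    """Two-sided p-value of a pooled two-proportion z-test."""
    pooled = (hits_a + hits_b) / (n_a + n_b)
    variance = pooled * (1 - pooled) * (1 / n_a + 1 / n_b)
    if variance == 0:
        # Both cohorts have identical rates of 0% or 100%: no evidence of a difference.
        return 1.0
    z = (hits_a / n_a - hits_b / n_b) / math.sqrt(variance)
    # For a standard normal Z, P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))


def compare_cohorts(
    cohort_a: List[List[str]],
    cohort_b: List[List[str]],
    alpha: float = 0.05,
) -> Dict[str, Dict[str, object]]:
    """Run one test per event type: does its prevalence differ between cohorts?"""
    events = sorted(set().union(*cohort_a, *cohort_b))
    if not events:
        return {}
    # Bonferroni correction (an assumption): divide alpha by the number of tests.
    threshold = alpha / len(events)
    results: Dict[str, Dict[str, object]] = {}
    for event in events:
        # Count sequences in each cohort that contain the event at least once.
        hits_a = sum(event in seq for seq in cohort_a)
        hits_b = sum(event in seq for seq in cohort_b)
        p = two_proportion_p_value(hits_a, len(cohort_a), hits_b, len(cohort_b))
        results[event] = {"p_value": p, "significant": p < threshold}
    return results
```

In a web log setting, `cohort_a` might hold the sessions of customers who made a purchase and `cohort_b` those who did not. A real tool would also test event orderings and timing, which multiplies the number of hypotheses and is part of why scaling becomes difficult.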