Explaining cube measures through Intentional Analytics
Authors: Matteo Francia, Stefano Rizzi, Patrick Marcel
Journal: Information Systems, volume 121, Article 102338 (published 2023-12-18)
DOI: 10.1016/j.is.2023.102338
URL: https://www.sciencedirect.com/science/article/pii/S0306437923001746
Impact factor: 3.0; JCR: Q2 (Computer Science, Information Systems)
Abstract
The Intentional Analytics Model (IAM) has been devised to couple OLAP and analytics by (i) letting users express their analysis intentions on multidimensional data cubes and (ii) returning enhanced cubes, i.e., multidimensional data annotated with knowledge insights in the form of models (e.g., correlations). Five intention operators were proposed to this end; of these, describe and assess have been investigated in previous papers. In this work we enrich the IAM picture by focusing on the explain operator, whose goal is to provide an answer to the user asking “why does measure m show these values?”; specifically, we consider models that explain m in terms of one or more other measures. We propose a syntax for the operator and discuss how enhanced cubes are built by (i) finding the relationship between m and the other cube measures via regression analysis and cross-correlation, and (ii) highlighting the most interesting one. Finally, we test the operator implementation in terms of efficiency and effectiveness.
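The idea of explaining a measure m through the other cube measures can be illustrated with a minimal sketch. It is not the authors' implementation: the cube layout (a dictionary of measure name to per-cell values), the function names `linear_fit` and `explain`, the sample data, and the use of R² as the "interestingness" score are all assumptions made for illustration, and the cross-correlation step mentioned in the abstract is left out.

```python
from statistics import mean


def linear_fit(xs, ys):
    """Least-squares fit ys ~ a * xs + b; returns (a, b, r2)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        # A constant series cannot explain (or be explained by) anything.
        return 0.0, my, 0.0
    a = sxy / sxx                    # slope
    b = my - a * mx                  # intercept
    r2 = (sxy * sxy) / (sxx * syy)   # coefficient of determination
    return a, b, r2


def explain(cube, target):
    """Fit `target` against every other measure and pick the best fit.

    Assumption: "most interesting" is approximated here by the highest R².
    """
    others = [name for name in cube if name != target]
    models = {name: linear_fit(cube[name], cube[target]) for name in others}
    best = max(models, key=lambda name: models[name][2])
    return best, models


# Hypothetical cube: one value per cell (e.g. per month), keyed by measure.
cube = {
    "revenue":  [10.0, 12.0, 14.0, 16.0, 18.0],
    "quantity": [1.0, 2.0, 3.0, 4.0, 5.0],
    "discount": [3.0, 1.0, 4.0, 1.0, 5.0],
}

best, models = explain(cube, "revenue")
slope, intercept, r2 = models[best]
print(f"revenue is best explained by {best}: "
      f"revenue ~ {slope:.2f} * {best} + {intercept:.2f} (R^2 = {r2:.3f})")
```

In this toy cube, revenue is an exact linear function of quantity, so quantity is the measure that gets highlighted; a fuller version would also consider multi-measure models and lagged relationships.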
About the journal:
Information systems are the software and hardware systems that support data-intensive applications. The journal Information Systems publishes articles concerning the design and implementation of languages, data models, process models, algorithms, software and hardware for information systems.
Subject areas include data management issues as presented in the principal international database conferences (e.g., ACM SIGMOD/PODS, VLDB, ICDE and ICDT/EDBT) as well as data-related issues from the fields of data mining/machine learning, information retrieval coordinated with structured data, internet and cloud data management, business process management, web semantics, visual and audio information systems, scientific computing, and data science. Implementation papers having to do with massively parallel data management, fault tolerance in practice, and special purpose hardware for data-intensive systems are also welcome. Manuscripts from application domains, such as urban informatics, social and natural science, and Internet of Things, are also welcome. All papers should highlight innovative solutions to data management problems, such as new data models and performance enhancements, and show how those innovations contribute to the goals of the application.