VIS顶点地址:我能相信我所看到的吗?-信息论算法验证

J. Buhmann
{"title":"VIS顶点地址:我能相信我所看到的吗?-信息论算法验证","authors":"J. Buhmann","doi":"10.1109/VAST.2018.8802482","DOIUrl":null,"url":null,"abstract":"Data Science promises us a methodology and algorithms to gain insights in ubiquitous Big Data. Sophisticated algorithmic techniques seek to identify and visualize non-accidental patterns that may be (causally) linked to mechanisms in the natural sciences, but also in the social sciences, medicine, technology, and governance. When we use machine learning algorithms to inspect the often high-dimensional, uncertain, and high-volume data to filter out and visualize relevant information, we aim to abstract from accidental factors in our experiments and thereby generalize over data fluctuations. Doing this, we often rely on highly nonlinear algorithms. This talk presents arguments advocating an information theoretic framework for algorithm analysis, where an algorithm is characterized as a computational evolution of a posterior distribution on the output space with a quantitative stopping criterion. The method allows us to investigate complex data analysis pipelines, such as those found in computational neuroscience, neurology, and molecular biology. I will demonstrate this concept for the validation of algorithms using the example of a statistical analysis of diffusion tensor imaging data. In addition, on the example of gene expression data, I will demonstrate how different spectral clustering methods can be validated by showing their robustness to data fluctuations and yet sufficient sensitivity to changes in the data. All in all, an information-theoretical method is presented for validating data analysis algorithms, offering the potential of more trustful results in Visual Analytics.","PeriodicalId":168094,"journal":{"name":"IEEE Conference on Visual Analytics Science and Technology","volume":"22 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2018-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"VIS Capstone Address : Can I believe what I see?-Information theoretic algorithm validation\",\"authors\":\"J. Buhmann\",\"doi\":\"10.1109/VAST.2018.8802482\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Data Science promises us a methodology and algorithms to gain insights in ubiquitous Big Data. Sophisticated algorithmic techniques seek to identify and visualize non-accidental patterns that may be (causally) linked to mechanisms in the natural sciences, but also in the social sciences, medicine, technology, and governance. When we use machine learning algorithms to inspect the often high-dimensional, uncertain, and high-volume data to filter out and visualize relevant information, we aim to abstract from accidental factors in our experiments and thereby generalize over data fluctuations. Doing this, we often rely on highly nonlinear algorithms. This talk presents arguments advocating an information theoretic framework for algorithm analysis, where an algorithm is characterized as a computational evolution of a posterior distribution on the output space with a quantitative stopping criterion. The method allows us to investigate complex data analysis pipelines, such as those found in computational neuroscience, neurology, and molecular biology. I will demonstrate this concept for the validation of algorithms using the example of a statistical analysis of diffusion tensor imaging data. In addition, on the example of gene expression data, I will demonstrate how different spectral clustering methods can be validated by showing their robustness to data fluctuations and yet sufficient sensitivity to changes in the data. All in all, an information-theoretical method is presented for validating data analysis algorithms, offering the potential of more trustful results in Visual Analytics.\",\"PeriodicalId\":168094,\"journal\":{\"name\":\"IEEE Conference on Visual Analytics Science and Technology\",\"volume\":\"22 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2018-10-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Conference on Visual Analytics Science and Technology\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/VAST.2018.8802482\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Conference on Visual Analytics Science and Technology","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/VAST.2018.8802482","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

摘要

数据科学为我们提供了一种方法和算法,以获得无处不在的大数据的洞察力。复杂的算法技术寻求识别和可视化非偶然模式,这些模式可能与自然科学中的机制(因果关系)有关,也与社会科学、医学、技术和治理有关。当我们使用机器学习算法来检查通常是高维、不确定和大容量的数据以过滤和可视化相关信息时,我们的目标是从实验中的偶然因素中抽象出来,从而对数据波动进行概括。要做到这一点,我们通常依赖于高度非线性的算法。本次演讲提出了支持算法分析的信息理论框架的论点,其中算法的特征是输出空间上具有定量停止准则的后验分布的计算进化。该方法允许我们研究复杂的数据分析管道,例如在计算神经科学、神经学和分子生物学中发现的数据分析管道。我将使用扩散张量成像数据的统计分析示例来演示算法验证的这个概念。此外,在基因表达数据的例子上,我将演示如何通过展示它们对数据波动的鲁棒性和对数据变化的足够敏感性来验证不同的谱聚类方法。总而言之,本文提出了一种验证数据分析算法的信息理论方法,为可视化分析提供了更可信的结果。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
VIS Capstone Address : Can I believe what I see?-Information theoretic algorithm validation
Data Science promises us a methodology and algorithms to gain insights in ubiquitous Big Data. Sophisticated algorithmic techniques seek to identify and visualize non-accidental patterns that may be (causally) linked to mechanisms in the natural sciences, but also in the social sciences, medicine, technology, and governance. When we use machine learning algorithms to inspect the often high-dimensional, uncertain, and high-volume data to filter out and visualize relevant information, we aim to abstract from accidental factors in our experiments and thereby generalize over data fluctuations. Doing this, we often rely on highly nonlinear algorithms. This talk presents arguments advocating an information theoretic framework for algorithm analysis, where an algorithm is characterized as a computational evolution of a posterior distribution on the output space with a quantitative stopping criterion. The method allows us to investigate complex data analysis pipelines, such as those found in computational neuroscience, neurology, and molecular biology. I will demonstrate this concept for the validation of algorithms using the example of a statistical analysis of diffusion tensor imaging data. In addition, on the example of gene expression data, I will demonstrate how different spectral clustering methods can be validated by showing their robustness to data fluctuations and yet sufficient sensitivity to changes in the data. All in all, an information-theoretical method is presented for validating data analysis algorithms, offering the potential of more trustful results in Visual Analytics.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信