Exploring Scalable Computing Architectures for Interactions Analysis

Taruna Seth, Chao Feng, M. Ramanathan, V. Chaudhary
2018 27th International Conference on Computer Communication and Networks (ICCCN), published 2018-07-01
DOI: 10.1109/ICCCN.2018.8487405 (https://doi.org/10.1109/ICCCN.2018.8487405)
Citations: 0

Abstract

Characterization of the pharmacological signal transductions that lead to drug-induced expression of genes and proteins requires the capability to identify interactions among different potential predictor components, e.g., genomic data, clinical data, and environmental data. The detection of these gene-gene and gene-environment interactions remains challenging due to the exponential computational complexity and high dimensionality of the interaction problem. The problem is further exacerbated by the involvement of very large-scale epidemiological datasets. Efficient high-order interaction analysis of such large-scale data is not feasible with traditional frameworks. Parallel implementations of such applications in traditional cluster environments are often inefficient due to storage bandwidth and network I/O limitations. Scalable distributed platforms can offer better scalability for such problems than cluster architectures. Moreover, such data- and compute-intensive problems can benefit even further from data-intensive supercomputing (DISC) architectures, which have been shown to yield superior performance compared to the regularly used cluster platforms. In this paper, we evaluate the applicability of different architectures to the interaction analysis problem, including traditional server-based distributed architectures supported on commodity hardware and shared-nothing architectures with massively parallel processing capabilities. Our experiments show that the benefits of the massively parallel processing, shared-nothing architecture outweigh those often realized through traditional server-based and even distributed computing architectures. We conclude that the rapidly growing class of shared-nothing architectures offers a potentially efficient and viable alternative for high-order interaction analysis involving extremely large-scale biological datasets, and is well suited to this category of data- and compute-intensive problems that cannot be addressed effectively using traditional frameworks.
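
As a rough illustration of why the search space described in the abstract grows so quickly, the sketch below counts the candidate predictor subsets at a given interaction order and splits them across independent workers in a shared-nothing style. The function names, the predictor counts, and the round-robin sharding scheme are assumptions made for illustration; they are not taken from the paper and do not represent the authors' implementation.

```python
from itertools import combinations, islice
from math import comb


def candidate_count(num_predictors: int, order: int) -> int:
    """Number of distinct predictor subsets of size `order` (n choose k)."""
    return comb(num_predictors, order)


def worker_shard(num_predictors: int, order: int,
                 worker_id: int, num_workers: int) -> list:
    """Subsets assigned to one worker under round-robin sharding.

    Hypothetical scheme: worker i takes every num_workers-th subset,
    starting at position i, so no state is shared between workers.
    """
    all_subsets = combinations(range(num_predictors), order)
    return list(islice(all_subsets, worker_id, None, num_workers))


if __name__ == "__main__":
    # Search-space size for an assumed 1,000 predictors.
    for order in (2, 3, 4):
        print(order, candidate_count(1000, order))

    # Tiny sharding example: 6 predictors, pairwise, 4 workers.
    for worker_id in range(4):
        print(worker_id, worker_shard(6, 2, worker_id, 4))
```

With 1,000 predictors there are 499,500 candidate pairs and 166,167,000 candidate triples, which is why each additional interaction order quickly becomes expensive. Because each worker derives its shard from its own identifier alone, the shards are disjoint and need no coordination, which is the property a shared-nothing design relies on.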