A Parallel Software Pipeline for DMET Microarray Genotyping Data Analysis.

Q2 Biochemistry, Genetics and Molecular Biology

High-Throughput, Pub Date: 2018-06-14, DOI: 10.3390/ht7020017

Giuseppe Agapito, Pietro Hiram Guzzi, Mario Cannataro


Citations: 1

Abstract


Personalized medicine is one aspect of P4 medicine (predictive, preventive, personalized and participatory), based on customizing all medical characteristics of each subject. In personalized medicine, the development of medical treatments and drugs is tailored to the individual characteristics and needs of each subject, informed by the study of diseases at different scales, from genotype to phenotype. To make the goal of personalized medicine concrete, it is necessary to employ high-throughput methodologies such as Next Generation Sequencing (NGS), Genome-Wide Association Studies (GWAS), Mass Spectrometry or Microarrays, which can investigate a single disease from a broader perspective. A side effect of high-throughput methodologies is the massive amount of data produced by each experiment, which poses several challenges (e.g., long execution times and large memory requirements) to bioinformatics software. A main requirement of modern bioinformatics software is therefore the use of good software engineering methods and efficient programming techniques that can meet those challenges, including parallel programming and efficient, compact data structures. This paper presents the design and experimental evaluation of a comprehensive software pipeline, named microPipe, for the preprocessing, annotation and analysis of microarray-based Single Nucleotide Polymorphism (SNP) genotyping data. A use case in pharmacogenomics is presented. The main advantages of using microPipe are: fewer errors of the kind that arise when making data compatible among different tools; the ability to analyze huge datasets in parallel; and easy annotation and integration of data. microPipe is available under a Creative Commons license and is freely downloadable for academic and not-for-profit institutions.
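The abstract names two techniques, compact data structures and parallel analysis, without showing either. The sketch below illustrates how they might combine for SNP genotype data. It is not microPipe's code: the function names (`encode_snp`, `count_genotypes`, `count_all`) and the `A/A`, `A/B`, `B/B`, `NoCall` labels are assumptions made for this example. A thread pool is used to keep the sketch portable; for CPU-bound work in Python a process pool would be the usual substitute.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical compact encoding: one byte per genotype call.
CODES = {"A/A": 0, "A/B": 1, "B/B": 2, "NoCall": 3}
LABELS = {code: label for label, code in CODES.items()}


def encode_snp(calls):
    """Pack one SNP's genotype calls (one per sample) into a bytes object."""
    return bytes(CODES[c] for c in calls)


def count_genotypes(packed):
    """Count how often each genotype occurs for a single SNP."""
    counts = Counter(packed)  # iterating over bytes yields integers
    return {LABELS[code]: n for code, n in counts.items()}


def count_all(snps, workers=4):
    """Count genotypes for every SNP, spreading SNPs across a worker pool.

    `snps` maps a SNP identifier to its packed genotype row.
    """
    names = list(snps)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(count_genotypes, (snps[n] for n in names))
        return dict(zip(names, results))


# Toy input: two SNPs genotyped on four samples (made-up values).
table = {
    "snp_1": encode_snp(["A/A", "A/B", "A/B", "NoCall"]),
    "snp_2": encode_snp(["B/B", "B/B", "A/A", "A/A"]),
}
summary = count_all(table, workers=2)
```

Each SNP row is independent of the others, which is what makes this kind of per-SNP statistic straightforward to distribute across workers.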


Source journal

High-Throughput Biochemistry, Genetics and Molecular Biology-Biotechnology

CiteScore

3.60

Self-citation rate

0.00%

Articles published

Review time

9 weeks

About the journal: High-Throughput (formerly Microarrays, ISSN 2076-3905) is a multidisciplinary peer-reviewed scientific journal that provides an advanced forum for the publication of studies reporting high-dimensional approaches and developments in Life Sciences, Chemistry and related fields. Our aim is to encourage scientists to publish their experimental and theoretical results based on high-throughput techniques as well as computational and statistical tools for data analysis and interpretation. The full experimental or methodological details must be provided so that the results can be reproduced. There is no restriction on the length of the papers. High-Throughput invites submissions covering several topics, including, but not limited to:
- Microarrays
- DNA Sequencing
- RNA Sequencing
- Protein Identification and Quantification
- Cell-based Approaches
- Omics Technologies
- Imaging
- Bioinformatics
- Computational Biology/Chemistry
- Statistics
- Integrative Omics
- Drug Discovery and Development
- Microfluidics
- Lab-on-a-chip
- Data Mining
- Databases
- Multiplex Assays