Detection of gene copy number change in array CGH data

Jingke Hu, Jianbo Gao, Yinhe Cao, Weijia Zhang
Published in: 2006 IEEE/NLM Life Science Systems and Applications Workshop
Publication date: 2006-11-30
DOI: 10.1109/LSSA.2006.250402 (https://doi.org/10.1109/LSSA.2006.250402)

Abstract

Effective methods for analyzing array-CGH data to detect chromosomal aberrations are important for diagnosing the pathogenesis of cancer and other diseases. Current analysis methods, which rely largely on smoothing and/or segmentation, cannot detect both the aberration regions and their boundary breakpoints with high accuracy. This is undesirable, since each point in the array represents a gene. Furthermore, evaluations of the accuracy of array-CGH analysis algorithms commonly assume that noise in the data follows a normal distribution. A fundamental question is whether noise in array-CGH data is indeed Gaussian and, if not, whether its characteristics can be exploited to develop novel analysis methods that accurately detect the aberration regions and the boundary breakpoints simultaneously. By analyzing bacterial artificial chromosome (BAC) arrays, oligonucleotide arrays, and high-density NimbleGen data, we show that when aberrations are present, noise in all three types of arrays is highly non-Gaussian and possesses long-range spatial correlations, and that existing methods for detecting aberrations in array-CGH data perform worse under such noise than under Gaussian noise. We further develop a novel method that optimally exploits these noise characteristics and identifies both the aberration regions and the boundary breakpoints very accurately.
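As a rough illustration of the Gaussianity question raised in the abstract, the Python sketch below computes two simple diagnostics on a sequence of log2 ratios: excess kurtosis (near 0 for Gaussian data) and lag-k autocorrelation (near 0 for uncorrelated noise). This is not the authors' method, which the abstract does not describe. The function names, the synthetic data, and the AR(1) generator (a short-memory stand-in, not a true long-range correlated process) are all assumptions made for illustration.

```python
import random


def excess_kurtosis(values):
    """Sample excess kurtosis (population moments); about 0 for Gaussian data."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m4 / (m2 ** 2) - 3.0


def autocorrelation(values, lag):
    """Sample autocorrelation at the given lag (biased estimator)."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values)
    cov = sum((values[i] - mean) * (values[i + lag] - mean)
              for i in range(n - lag))
    return cov / var


def ar1_noise(n, phi, sigma, rng):
    """Hypothetical correlated-noise generator: x[t] = phi * x[t-1] + e[t]."""
    series, prev = [], 0.0
    for _ in range(n):
        prev = phi * prev + rng.gauss(0.0, sigma)
        series.append(prev)
    return series


# Synthetic stand-ins for log2-ratio noise (assumed values, not real array data).
rng = random.Random(0)
white = [rng.gauss(0.0, 0.2) for _ in range(5000)]  # independent Gaussian noise
correlated = ar1_noise(5000, 0.9, 0.2, rng)         # spatially correlated noise

for name, series in (("white", white), ("correlated", correlated)):
    print(name,
          round(excess_kurtosis(series), 3),
          round(autocorrelation(series, 5), 3))
```

On real array-CGH profiles, these diagnostics would more sensibly be applied to residuals after subtracting the estimated level of each segment, since copy number shifts themselves induce correlation along the chromosome.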