*K-means and cluster models for cancer signatures

Q1 Biochemistry, Genetics and Molecular Biology
Zura Kakushadze, Willie Yu
Biomolecular Detection and Quantification, Volume 13, Pages 7-31. Published 2017-09-01. DOI: 10.1016/j.bdq.2017.07.001
https://www.sciencedirect.com/science/article/pii/S2214753517302061
Citations: 35

Abstract

We present the *K-means clustering algorithm and its source code, expanding on the statistical clustering methods applied to quantitative finance in https://ssrn.com/abstract=2802753. *K-means is statistically deterministic and does not require specifying initial centers, etc. We apply *K-means to extract cancer signatures from genome data without using nonnegative matrix factorization (NMF). The computational cost of *K-means is a fraction of that of NMF. Using 1389 published samples for 14 cancer types, we find that 3 cancers (liver cancer, lung cancer and renal cell carcinoma) stand out and do not have cluster-like structures. Two clusters have especially high within-cluster correlations with 11 other cancers, indicating common underlying structures. Our approach opens a novel avenue for studying such structures. *K-means is universal and can be applied in other fields. We discuss some potential applications in quantitative finance.
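The abstract does not spell out how *K-means avoids dependence on initial centers, so the sketch below is only an illustration of the general idea and not the authors' algorithm or source code. It assumes one simple way to get a statistically stable answer: run ordinary k-means many times from random initial centers and keep the partition that occurs most often. The function names (`kmeans_labels`, `canonical`, `most_frequent_partition`), the parameter defaults, and the toy data are all hypothetical.

```python
from collections import Counter

import numpy as np


def kmeans_labels(X, k, rng, n_iter=100):
    """Plain Lloyd's k-means started from k randomly chosen data points."""
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Assign every point to its nearest center (squared Euclidean distance).
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        # Recompute centers; an empty cluster keeps its previous center.
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels


def canonical(labels):
    """Relabel clusters by order of first appearance so equal partitions compare equal."""
    mapping = {}
    return tuple(mapping.setdefault(int(l), len(mapping)) for l in labels)


def most_frequent_partition(X, k, n_runs=50, seed=0):
    """Run k-means n_runs times; return the most common partition and its share of runs."""
    rng = np.random.default_rng(seed)
    counts = Counter(canonical(kmeans_labels(X, k, rng)) for _ in range(n_runs))
    partition, hits = counts.most_common(1)[0]
    return np.array(partition), hits / n_runs


# Toy data (hypothetical, not the paper's genome data): two well-separated blobs.
data_rng = np.random.default_rng(42)
X = np.vstack([
    data_rng.normal(loc=0.0, scale=0.5, size=(20, 2)),
    data_rng.normal(loc=20.0, scale=0.5, size=(20, 2)),
])
partition, share = most_frequent_partition(X, k=2)
print(partition, share)
```

On harder data, individual k-means runs can settle into different local optima; counting partitions over many runs is one way to make the reported clustering insensitive to any single random start, at the cost of running the base algorithm repeatedly.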


Source journal: Biomolecular Detection and Quantification (Biochemistry, Genetics and Molecular Biology: Biochemistry)
CiteScore: 14.20
Self-citation rate: 0.00%
Articles published: 0
Review time: 8 weeks