Plasma metabolite based clustering of breast cancer survivors and identification of dietary and health related characteristics: an application of unsupervised machine learning.
Ga-Eun Yie, Woojin Kyeong, Sihan Song, Zisun Kim, Hyun Jo Youn, Jihyoung Cho, Jun Won Min, Yoo Seok Kim, Jung Eun Lee
Nutrition Research and Practice, 19(2):273-291. Published 2025-04-01 (Epub 2025-03-19). DOI: 10.4162/nrp.2025.19.2.273. Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11982688/pdf/
Abstract
Background/objectives: This study aimed to use plasma metabolites to identify clusters of breast cancer survivors and to compare their dietary characteristics and health-related factors across the clusters using unsupervised machine learning.
Subjects/methods: A total of 419 breast cancer survivors were included in this cross-sectional study. We considered 30 plasma metabolites, quantified by high-throughput nuclear magnetic resonance metabolomics. Clusters were obtained based on metabolites using 4 different unsupervised clustering methods: k-means (KM), partitioning around medoids (PAM), self-organizing maps (SOM), and hierarchical agglomerative clustering (HAC). The t-test, χ² test, and Fisher's exact test were used to compare sociodemographic, lifestyle, clinical, and dietary characteristics across the clusters. P-values were adjusted for multiple comparisons by controlling the false discovery rate (FDR).
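The FDR adjustment mentioned in the methods can be illustrated with a short sketch. The abstract does not say which FDR procedure was used; the code below assumes the common Benjamini-Hochberg step-up method. The function name `benjamini_hochberg` and the raw p-values are hypothetical and are not taken from the study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Assumption: the study's FDR procedure is not named in the abstract;
    Benjamini-Hochberg is used here purely for illustration.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest, scaling each by
    # m / rank and enforcing that adjusted values never increase as
    # the raw p-value decreases.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values from five between-cluster comparisons.
raw = [0.001, 0.008, 0.039, 0.041, 0.60]
adj = benjamini_hochberg(raw)

for p, q in zip(raw, adj):
    print(f"raw p = {p:.3f}  ->  FDR-adjusted p = {q:.5f}")
```

In this sketch each adjusted value is at least as large as its raw p-value, so a comparison that looks significant before adjustment may no longer be significant afterwards.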
Results: Two clusters were identified using the 4 methods. Participants in cluster 2 had lower concentrations of apolipoprotein A1 and large high-density lipoprotein (HDL) particles and smaller HDL particle sizes, but higher concentrations of chylomicrons and extremely large very-low-density-lipoprotein (VLDL) particles and glycoprotein acetyls, a higher ratio of monounsaturated fatty acids to total fatty acids, and larger VLDL particle sizes compared with cluster 1. Body mass index was significantly higher in cluster 2 compared with cluster 1 (FDR-adjusted P < 0.001 for KM; P = 0.001 for PAM; P < 0.001 for SOM; and P = 0.043 for HAC).
Conclusion: The breast cancer survivors clustered on the basis of plasma metabolites had distinct characteristics. Further prospective studies are needed to investigate the associations between metabolites, obesity, dietary factors, and breast cancer prognosis.
About the journal:
Nutrition Research and Practice (NRP) is an official journal, jointly published by the Korean Nutrition Society and the Korean Society of Community Nutrition since 2007. The journal was initially published quarterly and has been published bimonthly since 2010.
NRP aims to stimulate research and practice across diverse areas of human nutrition. The Journal publishes peer-reviewed original manuscripts on nutrition biochemistry and metabolism, community nutrition, nutrition and disease management, nutritional epidemiology, nutrition education, and foodservice management in the following categories: Original Research Articles, Notes, Communications, and Reviews. Reviews are accepted by invitation of the editors only. Statements made and opinions expressed in the manuscripts published in this Journal represent the views of the authors and do not necessarily reflect the opinion of the Societies.