{"title":"基于多元统计分析探讨乳腺癌患者年龄、BMI及血液成分的特征","authors":"R. Dong","doi":"10.11648/j.acm.20200904.15","DOIUrl":null,"url":null,"abstract":"In this paper, through a series of analysis and testing of breast cancer detection data, the statistical rules of multiple objects and multiple indicators are analyzed in the case of their correlation. First of all, univariate diagnosis and multivariate diagnosis were performed on the data. Among them, when studying the correlation between variables, it was found that HOMA had a clear linear positive correlation with insulin content in blood. It is worth noting that some patients with breast cancer show a high degree of insulin resistance and blood insulin content, which is a feature not found in samples without breast cancer. Then, through single factor analysis of variance, we believe that there were significant differences in blood test conditions, ages, and BMI indicators of samples of different health conditions. Next, the principal component analysis was used to reduce the dimension of the data. In this study, the differences in age, BMI, and blood component content between the two groups with different health conditions can be summarized by these two independent factors. Among them, the absolute value of the MCP-1 (monocyte chemoattractant protein 1) coefficient in the main component 1 is large, reflecting the characteristics of the blood component of the sample; the load values of glucose and leptin in the main component 2 are large, reflecting similar results. Then, assuming the use of m = 3 factor model and the use of maximum likelihood method and principal component method, the original data and factor rotation data are re-analyzed, so that the variables are reduced to 3 factors for analysis. Among them, the maximum likelihood method is used to estimate the factor rotation data. The first factor reflects the insulin resistance factor attributed to insulin and HOMA indicators, and the second factor reflects the body fat and thin factor attributed to BMI and leptin. The third factor reflects the glucose content in the blood. Finally, by setting different misjudgment costs for discriminant analysis, the obtained APER is 0.1638 and EAER is 0.1872. Among them, the probability of discriminating patients with breast cancer from not having breast cancer is 0.09375, which is a low rate of misjudgment and also means the model established in this paper is efficient.","PeriodicalId":55503,"journal":{"name":"Applied and Computational Mathematics","volume":"1 1","pages":"130"},"PeriodicalIF":4.6000,"publicationDate":"2020-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Explore the Characteristics of Age, BMI and Blood Composition of Breast Cancer Patients Based on Multivariate Statistical Analysis\",\"authors\":\"R. Dong\",\"doi\":\"10.11648/j.acm.20200904.15\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In this paper, through a series of analysis and testing of breast cancer detection data, the statistical rules of multiple objects and multiple indicators are analyzed in the case of their correlation. First of all, univariate diagnosis and multivariate diagnosis were performed on the data. Among them, when studying the correlation between variables, it was found that HOMA had a clear linear positive correlation with insulin content in blood. It is worth noting that some patients with breast cancer show a high degree of insulin resistance and blood insulin content, which is a feature not found in samples without breast cancer. Then, through single factor analysis of variance, we believe that there were significant differences in blood test conditions, ages, and BMI indicators of samples of different health conditions. Next, the principal component analysis was used to reduce the dimension of the data. In this study, the differences in age, BMI, and blood component content between the two groups with different health conditions can be summarized by these two independent factors. Among them, the absolute value of the MCP-1 (monocyte chemoattractant protein 1) coefficient in the main component 1 is large, reflecting the characteristics of the blood component of the sample; the load values of glucose and leptin in the main component 2 are large, reflecting similar results. Then, assuming the use of m = 3 factor model and the use of maximum likelihood method and principal component method, the original data and factor rotation data are re-analyzed, so that the variables are reduced to 3 factors for analysis. Among them, the maximum likelihood method is used to estimate the factor rotation data. The first factor reflects the insulin resistance factor attributed to insulin and HOMA indicators, and the second factor reflects the body fat and thin factor attributed to BMI and leptin. The third factor reflects the glucose content in the blood. Finally, by setting different misjudgment costs for discriminant analysis, the obtained APER is 0.1638 and EAER is 0.1872. Among them, the probability of discriminating patients with breast cancer from not having breast cancer is 0.09375, which is a low rate of misjudgment and also means the model established in this paper is efficient.\",\"PeriodicalId\":55503,\"journal\":{\"name\":\"Applied and Computational Mathematics\",\"volume\":\"1 1\",\"pages\":\"130\"},\"PeriodicalIF\":4.6000,\"publicationDate\":\"2020-08-22\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Applied and Computational Mathematics\",\"FirstCategoryId\":\"100\",\"ListUrlMain\":\"https://doi.org/10.11648/j.acm.20200904.15\",\"RegionNum\":2,\"RegionCategory\":\"数学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"MATHEMATICS, APPLIED\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Applied and Computational Mathematics","FirstCategoryId":"100","ListUrlMain":"https://doi.org/10.11648/j.acm.20200904.15","RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"MATHEMATICS, APPLIED","Score":null,"Total":0}
Explore the Characteristics of Age, BMI and Blood Composition of Breast Cancer Patients Based on Multivariate Statistical Analysis
In this paper, through a series of analysis and testing of breast cancer detection data, the statistical rules of multiple objects and multiple indicators are analyzed in the case of their correlation. First of all, univariate diagnosis and multivariate diagnosis were performed on the data. Among them, when studying the correlation between variables, it was found that HOMA had a clear linear positive correlation with insulin content in blood. It is worth noting that some patients with breast cancer show a high degree of insulin resistance and blood insulin content, which is a feature not found in samples without breast cancer. Then, through single factor analysis of variance, we believe that there were significant differences in blood test conditions, ages, and BMI indicators of samples of different health conditions. Next, the principal component analysis was used to reduce the dimension of the data. In this study, the differences in age, BMI, and blood component content between the two groups with different health conditions can be summarized by these two independent factors. Among them, the absolute value of the MCP-1 (monocyte chemoattractant protein 1) coefficient in the main component 1 is large, reflecting the characteristics of the blood component of the sample; the load values of glucose and leptin in the main component 2 are large, reflecting similar results. Then, assuming the use of m = 3 factor model and the use of maximum likelihood method and principal component method, the original data and factor rotation data are re-analyzed, so that the variables are reduced to 3 factors for analysis. Among them, the maximum likelihood method is used to estimate the factor rotation data. The first factor reflects the insulin resistance factor attributed to insulin and HOMA indicators, and the second factor reflects the body fat and thin factor attributed to BMI and leptin. The third factor reflects the glucose content in the blood. Finally, by setting different misjudgment costs for discriminant analysis, the obtained APER is 0.1638 and EAER is 0.1872. Among them, the probability of discriminating patients with breast cancer from not having breast cancer is 0.09375, which is a low rate of misjudgment and also means the model established in this paper is efficient.
期刊介绍:
Applied and Computational Mathematics (ISSN Online: 2328-5613, ISSN Print: 2328-5605) is a prestigious journal that focuses on the field of applied and computational mathematics. It is driven by the computational revolution and places a strong emphasis on innovative applied mathematics with potential for real-world applicability and practicality.
The journal caters to a broad audience of applied mathematicians and scientists who are interested in the advancement of mathematical principles and practical aspects of computational mathematics. Researchers from various disciplines can benefit from the diverse range of topics covered in ACM. To ensure the publication of high-quality content, all research articles undergo a rigorous peer review process. This process includes an initial screening by the editors and anonymous evaluation by expert reviewers. This guarantees that only the most valuable and accurate research is published in ACM.