Rolando García, Shankar Srinivasan, Mehta Shashi, Frederick Coffman, Prasad Koduru
Annals of Hematology. Published 2025-07-19. DOI: 10.1007/s00277-025-06508-6 (https://doi.org/10.1007/s00277-025-06508-6)
Distinct structural and numerical chromosome abnormalities determine the MYC status in diffuse large B-Cell lymphoma and help differentiate from Burkitt lymphoma: a cytogenetic data analysis using unsupervised and AI-driven prediction models.
The aim of this study was to identify recurrent chromosome abnormalities (RCAs) that distinguish diffuse large B-cell lymphoma (DLBCL) from Burkitt lymphoma (BL) and to test their specificity in a set of predictor models. The study analyzed publicly available cytogenetic data to construct models to predict DLBCL and BL. The two-tailed Fisher exact test was used to assess the significance of differences in the number of aberrations between groups and to determine associations between RCAs and the two entities. A p-value < 0.05 was considered significant. Discrimination was assessed with the receiver operating characteristic (ROC) curve. Analyses were performed in R; the SAS software package was used to develop a logistic regression model. Two subsequent supervised models were constructed using a larger dataset (n = 515) to confirm the initial findings. Several RCAs were associated with DLBCL, including 1p-, 1q-, -2, +3, -4, +5, 6p gain, 6q-, +7, -8, 9q-, -10/-15, -10/-14, +11, +12, 14q-, 15q-, +16, 16q-, 17p-, +18, 19p-, and 22q-. Of these, +7, 15q-, +16, and +18 were more prevalent in MYC+ DLBCL than in BL, whereas 1q gain and 13q- were consistent with BL. The specificity of the supervised models ranged from 90 to 100%, whereas the accuracy of the unsupervised logistic regression model was 85%. Our findings revealed unique RCAs that may be used in combination with model classifiers to augment diagnostic accuracy and help clinicians better manage these patients.
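As an illustration of the two-tailed Fisher exact test named in the abstract, the following is a minimal Python sketch and not the authors' R or SAS code. The function name `fisher_exact_two_sided` and the counts in the example table are hypothetical; the counts are made up for demonstration and are not taken from the study.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, col1, col2, n = a + b, a + c, b + d, a + b + c + d
    denom = comb(n, row1)  # number of ways to fill the first row

    def weight(x):
        # Unnormalised hypergeometric weight of a table whose top-left cell is x
        return comb(col1, x) * comb(col2, row1 - x)

    observed = weight(a)
    lo, hi = max(0, row1 - col2), min(row1, col1)
    # Sum every table with the same margins that is no more probable
    # than the observed one (integer comparison, so no rounding issues)
    extreme = sum(weight(x) for x in range(lo, hi + 1) if weight(x) <= observed)
    return extreme / denom


# Hypothetical counts (NOT from the paper): an abnormality present in
# 8 of 10 cases in one group and 1 of 6 cases in the other.
p_value = fisher_exact_two_sided(8, 2, 1, 5)
print(f"p = {p_value:.4f}, significant at 0.05: {p_value < 0.05}")
```

With these made-up counts the p-value falls below the 0.05 threshold stated in the abstract; with real data, each RCA would be tested against its own 2x2 table of presence or absence by diagnosis.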
About the journal:
Annals of Hematology covers the whole spectrum of clinical and experimental hematology, hemostaseology, blood transfusion, and related aspects of medical oncology, including diagnosis and treatment of leukemias, lymphatic neoplasias and solid tumors, and transplantation of hematopoietic stem cells. Coverage includes general aspects of oncology, molecular biology and immunology as pertinent to problems of human blood disease. The journal is associated with the German Society for Hematology and Medical Oncology, and the Austrian Society for Hematology and Oncology.