ManiNeg: Manifestation-guided multimodal pretraining for mammography screening
Authors: Xujun Li, Xin Wei, Jing Jiang, Danxiang Chen, Wei Zhang, Jinpeng Li
Journal: Computers in Biology and Medicine, volume 186, 109628 (published 2025-03-01; Epub 2025-01-26)
DOI: https://doi.org/10.1016/j.compbiomed.2024.109628
Breast cancer poses a significant health threat worldwide. Contrastive learning has emerged as an effective method for extracting critical lesion features from mammograms, offering a potent tool for breast cancer screening and analysis. A crucial aspect of contrastive learning is negative sampling, where the selection of hard negative samples is essential for driving representations to retain detailed lesion information. In large-scale contrastive learning applied to natural images, it is often assumed that extracted features can sufficiently capture semantic content, and that each mini-batch inherently includes ideal hard negative samples. However, the unique characteristics of breast lumps challenge these assumptions in mammographic data. In response, we introduce ManiNeg, a novel approach that uses manifestations as proxies to select hard negative samples. As a condensed representation of a physician's domain knowledge, manifestations are the observable symptoms or signs of a disease and can provide a robust basis for choosing hard negative samples. Because manifestations do not change as the model is optimized, sampling based on them is efficient. We tested ManiNeg on the task of distinguishing between benign and malignant breast lumps. Our results demonstrate that ManiNeg not only improves representations in both unimodal and multimodal contexts but also offers benefits that extend to datasets beyond the one used for pretraining. To support ManiNeg and future research, we have developed the MVKL mammographic dataset. This dataset includes multi-view mammograms, corresponding reports, meticulously annotated manifestations, and pathologically confirmed benign-malignant outcomes for each case. The MVKL dataset and our code are publicly available at https://github.com/wxwxwwxxx/ManiNeg to foster further research within the community.
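The idea of using manifestations as proxies for hard negative selection can be illustrated with a minimal sketch. The abstract does not specify how manifestations are encoded or compared, so everything below is an assumption made for illustration: manifestations are represented as binary attribute vectors, similarity is measured by Hamming distance, and the names `hamming`, `select_hard_negatives`, and the toy data are hypothetical rather than taken from the ManiNeg repository. What the sketch does reflect from the abstract is that the distances depend only on the annotations, not on learned features, so they could be computed once before training.

```python
from typing import List, Sequence


def hamming(a: Sequence[int], b: Sequence[int]) -> int:
    """Count the attribute positions where two manifestation vectors differ."""
    if len(a) != len(b):
        raise ValueError("manifestation vectors must have equal length")
    return sum(x != y for x, y in zip(a, b))


def select_hard_negatives(
    anchor: int,
    manifestations: Sequence[Sequence[int]],
    k: int,
) -> List[int]:
    """Return indices of the k cases whose manifestations are closest to,
    but not identical to, those of the anchor case (assumed selection rule)."""
    target = manifestations[anchor]
    candidates = []
    for idx, vec in enumerate(manifestations):
        if idx == anchor:
            continue
        dist = hamming(target, vec)
        if dist == 0:
            # Identical manifestations: skipped here as a likely false negative.
            continue
        candidates.append((dist, idx))
    candidates.sort()  # smallest distance first, ties broken by index
    return [idx for _, idx in candidates[:k]]


# Hypothetical toy annotations: one row per case, one column per binary attribute.
toy = [
    [1, 0, 1, 0],  # case 0 (anchor)
    [1, 0, 1, 1],  # case 1: differs in one attribute
    [0, 1, 0, 1],  # case 2: differs in all four attributes
    [1, 0, 1, 0],  # case 3: identical to the anchor, skipped
    [1, 1, 1, 0],  # case 4: differs in one attribute
]
hard = select_hard_negatives(anchor=0, manifestations=toy, k=2)
print(hard)  # [1, 4]
```

In this toy setting, cases 1 and 4 are chosen because they differ from the anchor in a single attribute, which makes them harder to separate than case 2. A real pipeline would feed the selected indices into the contrastive loss; that step is omitted here.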
About the journal:
Computers in Biology and Medicine is an international forum for sharing groundbreaking advancements in the use of computers in bioscience and medicine. This journal serves as a medium for communicating essential research, instruction, ideas, and information regarding the rapidly evolving field of computer applications in these domains. By encouraging the exchange of knowledge, we aim to facilitate progress and innovation in the utilization of computers in biology and medicine.