Automatic prediction of depth of invasion in oral tongue squamous cell carcinoma using a multimodal regression network fusing prior text and anatomical knowledge
Authors: Jiangchang Xu, Weiqing Tang, Pheng-Ann Heng, Xiaojun Chen
Journal: Medical Image Analysis, volume 107, Article 103824 (published 2025-09-30)
DOI: 10.1016/j.media.2025.103824
Article page: https://www.sciencedirect.com/science/article/pii/S1361841525003706
Journal metrics: impact factor 11.8; JCR Q1 (Computer Science, Artificial Intelligence); category: Medicine, region 1
Oral tongue squamous cell carcinoma (OTSCC) is one of the most common malignant tumors among oral cancers. Its depth of invasion (DOI) is a crucial indicator for evaluating tumor invasiveness, predicting the risk of lymph node metastasis, and assessing patient prognosis. Compared with invasive measurement on pathology specimens, DOI measurement on magnetic resonance imaging (MRI) is non-invasive and can provide a timely reference for preoperative surgical planning. However, MRI-based measurement has several limitations: the measurement process is cumbersome, the result is strongly subjective, it demands considerable experience, and its predictions are unstable. To address these issues, we propose an automatic prediction algorithm for OTSCC DOI using a multimodal regression network that fuses prior text and anatomical knowledge. First, OTSCC is segmented automatically using 3D nnUNet on multimodal MRI. Second, an automatic DOI measurement method that combines the detection of basement membrane landmarks with anatomical relationships is proposed to obtain 3D heatmap landmarks and prior DOI text. These elements are then fused in the proposed multimodal regression network to predict OTSCC DOI automatically. Experimental results demonstrate that our method achieves a mean absolute error (MAE) of 2.11 mm, a root mean square error (RMSE) of 2.97 mm, and a mean squared error (MSE) of 8.81 mm², which are markedly better than several state-of-the-art (SOTA) methods. The correlation with the pathological ground truth reaches a Pearson correlation coefficient (PCC) of 0.869, indicating high consistency. Additionally, our method outperforms the manual measurements of a resident doctor and of a radiologist with six years of clinical experience. We expect the method to have good prospects for clinical application in OTSCC DOI prediction. The source code is available at https://github.com/Lambater/Depth-of-invasion-prediction.
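The four evaluation metrics named in the abstract (MAE, RMSE, MSE, PCC) can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function name `regression_metrics` and the sample DOI values are hypothetical and chosen only to show how the metrics relate to each other.

```python
import numpy as np


def regression_metrics(pred_mm, truth_mm):
    """Return MAE, MSE, RMSE and Pearson correlation for paired DOI values in mm.

    Hypothetical helper for illustration; not taken from the paper's repository.
    """
    pred = np.asarray(pred_mm, dtype=float)
    truth = np.asarray(truth_mm, dtype=float)
    err = pred - truth

    mae = float(np.mean(np.abs(err)))    # mean absolute error, in mm
    mse = float(np.mean(err ** 2))       # mean squared error, in mm^2
    rmse = float(np.sqrt(mse))           # root mean square error, in mm
    # Off-diagonal entry of the 2x2 correlation matrix is the Pearson coefficient.
    pcc = float(np.corrcoef(pred, truth)[0, 1])
    return {"MAE": mae, "MSE": mse, "RMSE": rmse, "PCC": pcc}


# Made-up example values (mm); these are NOT data from the study.
predicted = [4.0, 7.5, 10.0, 12.5]
pathological = [5.0, 7.0, 11.0, 12.0]
metrics = regression_metrics(predicted, pathological)
print(metrics)
```

Because RMSE is the square root of MSE, the two reported figures can be cross-checked: 2.97² ≈ 8.82, which agrees with the reported 8.81 mm² up to rounding.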
About the journal:
Medical Image Analysis serves as a platform for sharing new research findings in the realm of medical and biological image analysis, with a focus on applications of computer vision, virtual reality, and robotics to biomedical imaging challenges. The journal prioritizes the publication of high-quality, original papers contributing to the fundamental science of processing, analyzing, and utilizing medical and biological images. It welcomes approaches utilizing biomedical image datasets across all spatial scales, from molecular/cellular imaging to tissue/organ imaging.