Dynamic data balancing strategy-based Xception-dual-channel LSTM model for laparoscopic cholecystectomy phase recognition
Mingzhou Liu, Feiya Duan, Lin Ling, Jing Hu, Maogen Ge, Xi Zhang, Shanbao Cai
International Journal of Computer Assisted Radiology and Surgery, published 2025-09-07
DOI: 10.1007/s11548-025-03509-8 (https://doi.org/10.1007/s11548-025-03509-8)
Citation count: 0
Abstract
Purpose: To enhance the temporal feature learning capability of the laparoscopic cholecystectomy phase recognition model and address the class imbalance issue in the training data, this paper proposes an Xception-dual-channel LSTM fusion model based on a dynamic data balancing strategy.
Methods: The model dynamically adjusts the undersampling rate for each surgical phase, extracting short video clips from the original data as training samples to balance the data distribution and mitigate biased learning. The Xception model, utilizing depthwise separable convolutions, extracts fundamental visual features frame by frame, which are then passed to a dual-channel LSTM network. This network is composed of a temporal mapping bidirectional LSTM structure and a sequence embedding LSTM structure, both working in parallel. The dual-channel LSTM network models the temporal dependencies between adjacent frames, capturing the contextual temporal information to perceive the dynamic feature changes of the surgical phases. Finally, the surgical phase is determined by combining the prediction scores from both channels.
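The abstract does not state the exact balancing rule, so the following is only a minimal sketch of one way per-phase undersampling of short clips could work. It assumes that clips are non-overlapping windows containing a single phase, and that each phase's keep rate is the rarest phase's clip count divided by that phase's clip count, recomputed whenever the clip pool changes. The function names, the phase labels "P1" to "P3", and the clip length are all hypothetical.

```python
import random
from collections import Counter


def extract_clips(frame_labels, clip_len):
    """Cut a frame-label sequence into non-overlapping clips.

    Only windows whose frames all share one phase are kept (an assumption;
    the abstract does not say how phase boundaries are handled).
    Returns a list of (start_index, phase) tuples.
    """
    clips = []
    for start in range(0, len(frame_labels) - clip_len + 1, clip_len):
        window = frame_labels[start:start + clip_len]
        if len(set(window)) == 1:
            clips.append((start, window[0]))
    return clips


def undersampling_rates(clip_phases):
    """Keep-probability per phase so expected counts match the rarest phase.

    Hypothetical rule: rate = (count of rarest phase) / (count of this phase).
    """
    counts = Counter(clip_phases)
    target = min(counts.values())
    return {phase: target / n for phase, n in counts.items()}


def balanced_sample(clips, rng):
    """Draw one undersampled training set; call again each epoch to resample."""
    rates = undersampling_rates([phase for _, phase in clips])
    return [clip for clip in clips if rng.random() < rates[clip[1]]]


# Toy video: three phases of very different duration (labels are made up).
frame_labels = ["P1"] * 40 + ["P2"] * 200 + ["P3"] * 80
clips = extract_clips(frame_labels, clip_len=10)
rates = undersampling_rates([phase for _, phase in clips])
epoch_sample = balanced_sample(clips, random.Random(0))
```

In this toy case the longest phase is kept at the lowest rate, while every clip of the rarest phase is retained, so the expected number of clips per phase is equal in each epoch.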
Results: On the public Cholec80 dataset, the proposed model outperforms traditional single-channel LSTM models. Moreover, compared with the same model trained without the dynamic data balancing strategy, the F1-score improves for every surgical phase.
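For reference, a per-phase F1-score of the kind reported above can be computed from frame-level labels as F1 = 2TP / (2TP + FP + FN). The sketch below uses the standard definition; the function name and the toy labels are illustrative and are not taken from the paper.

```python
def per_phase_f1(y_true, y_pred):
    """Return {phase: F1} using the standard one-vs-rest definition."""
    scores = {}
    for phase in sorted(set(y_true) | set(y_pred)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == phase and p == phase)
        fp = sum(1 for t, p in pairs if t != phase and p == phase)
        fn = sum(1 for t, p in pairs if t == phase and p != phase)
        denom = 2 * tp + fp + fn
        # Define F1 as 0.0 when the phase never appears in either sequence.
        scores[phase] = 2 * tp / denom if denom else 0.0
    return scores


# Toy frame-level labels (made up for illustration).
y_true = ["A", "A", "A", "B", "B", "C"]
y_pred = ["A", "A", "B", "B", "B", "C"]
f1 = per_phase_f1(y_true, y_pred)
```

Reporting the score per phase, instead of a single overall accuracy, makes it visible whether short, under-represented phases benefit from balancing.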
Conclusion: The experimental results validate the effectiveness of this strategy in extracting temporal feature information, alleviating the data class imbalance issue, and enhancing the overall detection performance of the model.
About the journal:
The International Journal for Computer Assisted Radiology and Surgery (IJCARS) is a peer-reviewed journal that provides a platform for closing the gap between medical and technical disciplines, and encourages interdisciplinary research and development activities in an international environment.