Dynamic data balancing strategy-based Xception-dual-channel LSTM model for laparoscopic cholecystectomy phase recognition.

IF 2.3, Zone 3 (Medicine), JCR Q3 (ENGINEERING, BIOMEDICAL)
Mingzhou Liu, Feiya Duan, Lin Ling, Jing Hu, Maogen Ge, Xi Zhang, Shanbao Cai
Journal: International Journal of Computer Assisted Radiology and Surgery
Publication date: 2025-09-07
Publication type: Journal Article
DOI: 10.1007/s11548-025-03509-8 (https://doi.org/10.1007/s11548-025-03509-8)
Citations: 0

Abstract

Purpose: To enhance the temporal feature learning capability of the laparoscopic cholecystectomy phase recognition model and address the class imbalance issue in the training data, this paper proposes an Xception-dual-channel LSTM fusion model based on a dynamic data balancing strategy.

Methods: The model dynamically adjusts the undersampling rate for each surgical phase, extracting short video clips from the original data as training samples to balance the data distribution and mitigate biased learning. The Xception model, utilizing depthwise separable convolutions, extracts fundamental visual features frame by frame, which are then passed to a dual-channel LSTM network. This network is composed of a temporal mapping bidirectional LSTM structure and a sequence embedding LSTM structure, both working in parallel. The dual-channel LSTM network models the temporal dependencies between adjacent frames, capturing the contextual temporal information to perceive the dynamic feature changes of the surgical phases. Finally, the surgical phase is determined by combining the prediction scores from both channels.
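
The abstract does not give the balancing rule or the fusion rule, so the following is only a minimal sketch of one way the two steps could look. It assumes per-frame phase labels, a per-phase stride that grows with the phase's length (so long phases are undersampled more heavily), a random offset redrawn on each call, and a weighted average of the two channels' scores. The names phase_segments, balanced_clip_starts, fuse_scores, clip_len, clips_per_phase and weight are hypothetical and do not come from the paper.

```python
import random
from collections import defaultdict


def phase_segments(labels):
    """Split per-frame labels into (phase, start, end) runs; end is exclusive."""
    segments = []
    start = 0
    for i in range(1, len(labels) + 1):
        if i == len(labels) or labels[i] != labels[start]:
            segments.append((labels[start], start, i))
            start = i
    return segments


def balanced_clip_starts(labels, clip_len, clips_per_phase, rng):
    """Choose clip start frames so each phase contributes a similar clip count.

    Assumption: the undersampling rate is expressed as a per-phase stride.
    Long phases get a large stride, short phases a small one.
    """
    candidates = defaultdict(list)
    for phase, start, end in phase_segments(labels):
        # Keep only clips that lie entirely inside a single phase.
        candidates[phase].extend(range(start, end - clip_len + 1))

    chosen = {}
    for phase, starts in candidates.items():
        stride = max(1, len(starts) // clips_per_phase)
        offset = rng.randrange(stride)  # redrawn per call, e.g. once per epoch
        chosen[phase] = starts[offset::stride][:clips_per_phase]
    return chosen


def fuse_scores(scores_a, scores_b, weight=0.5):
    """Combine two per-phase score vectors by weighted averaging (an assumption).

    Returns the index of the highest fused score and the fused vector.
    """
    fused = [weight * a + (1 - weight) * b for a, b in zip(scores_a, scores_b)]
    best = max(range(len(fused)), key=fused.__getitem__)
    return best, fused


if __name__ == "__main__":
    # Toy video: phase 0 is five times longer than phase 1.
    labels = [0] * 100 + [1] * 20 + [2] * 50
    clips = balanced_clip_starts(labels, clip_len=8, clips_per_phase=5,
                                 rng=random.Random(0))
    for phase, starts in sorted(clips.items()):
        print(phase, starts)
    print(fuse_scores([0.2, 0.7, 0.1], [0.1, 0.3, 0.6]))
```

In this sketch each phase ends up with the same number of clips regardless of its duration, which is the effect the balancing strategy is described as aiming for. The feature extractor and the two LSTM channels are deliberately left out; the score vectors passed to fuse_scores stand in for their outputs.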

Results: Experimental evaluation on the public dataset Cholec80 demonstrates that the proposed model outperforms traditional single-channel LSTM models. Moreover, compared to the model without the dynamic data balancing strategy, the F1-scores for all surgical phases have been improved.
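
Because the results are reported as F1-scores per surgical phase, a short sketch of that metric may help. It uses the standard definition F1 = 2TP / (2TP + FP + FN), computed one phase at a time over per-frame predictions. The function name per_phase_f1 and the toy labels are illustrative assumptions, not data from the paper.

```python
def per_phase_f1(y_true, y_pred, num_phases):
    """Return one F1-score per phase, treating each phase as one-vs-rest."""
    scores = []
    for phase in range(num_phases):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == phase and p == phase)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != phase and p == phase)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == phase and p != phase)
        denom = 2 * tp + fp + fn
        # A phase that never occurs and is never predicted gets 0.0 here.
        scores.append(2 * tp / denom if denom else 0.0)
    return scores


if __name__ == "__main__":
    # Toy per-frame labels: phase 0 dominates, phases 1 and 2 are rare.
    y_true = [0, 0, 0, 0, 1, 1, 2]
    y_pred = [0, 0, 0, 1, 1, 0, 2]
    print(per_phase_f1(y_true, y_pred, num_phases=3))
```

Reporting the score per phase rather than a single accuracy figure keeps rare phases visible: a model can reach high overall accuracy on imbalanced data while scoring poorly on the short phases.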

Conclusion: The experimental results validate the effectiveness of this strategy in extracting temporal feature information, alleviating the data class imbalance issue, and enhancing the overall detection performance of the model.

Source journal
International Journal of Computer Assisted Radiology and Surgery
Categories: ENGINEERING, BIOMEDICAL; RADIOLOGY, NUCLEAR MEDICINE & MEDICAL IMAGING
CiteScore: 5.90
Self-citation rate: 6.70%
Articles published: 243
Review time: 6-12 weeks
About the journal: The International Journal for Computer Assisted Radiology and Surgery (IJCARS) is a peer-reviewed journal that provides a platform for closing the gap between medical and technical disciplines, and encourages interdisciplinary research and development activities in an international environment.