"Generalizable stereo depth estimation with masked image modelling"
Samyakh Tukra, Haozheng Xu, Chi Xu, Stamatia Giannarou
Healthcare Technology Letters, vol. 11, no. 2-3, pp. 108-116. Published 23 December 2023. DOI: 10.1049/htl2.12067. Open-access PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1049/htl2.12067

Abstract: Generalizable and accurate stereo depth estimation is vital for 3D reconstruction, especially in surgery. Supervised learning methods achieve the best performance; however, the limited ground-truth data available for surgical scenes restricts their generalizability. Self-supervised methods need no ground truth but suffer from scale ambiguity and incorrect disparity predictions caused by the inconsistency of the photometric loss. This work proposes a two-phase training procedure that is generalizable and retains the high performance of supervised methods. It entails: (1) self-supervised representation learning of left and right views via masked image modelling (MIM) to learn generalizable semantic stereo features; (2) using the MIM pre-trained model to learn a robust depth representation via supervised learning for disparity estimation on synthetic data only. To improve the stereo representations learnt via MIM, perceptual loss terms are introduced that explicitly encourage the learning of higher scene-level features. Qualitative and quantitative evaluation on surgical and natural scenes shows that the approach achieves sub-millimetre accuracy and the lowest errors, respectively, setting a new state of the art despite not being trained on surgical or natural scene data for disparity estimation.
"ASSIST-U: A system for segmentation and image style transfer for ureteroscopy"
Daiwei Lu, Yifan Wu, Ayberk Acar, Xing Yao, Jie Ying Wu, Nicholas Kavoussi, Ipek Oguz
Healthcare Technology Letters, vol. 11, no. 2-3, pp. 40-47. Published 18 December 2023. DOI: 10.1049/htl2.12065. Open-access PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1049/htl2.12065

Abstract: Kidney stones require surgical removal when they grow too large to be broken up externally or to pass on their own. Upper tract urothelial carcinoma is also sometimes treated endoscopically in a similar procedure. These surgeries are difficult, particularly for trainees, who often miss tumours, stones or stone fragments, requiring re-operation. Furthermore, despite the high prevalence of ureteroscopy, there are no patient-specific simulators to facilitate training and no standardized visualization tools. Here, the ASSIST-U system is proposed to create realistic ureteroscopy images and videos solely from preoperative computed tomography (CT) images to address these unmet needs. A 3D UNet model is trained to automatically segment CT images and construct 3D surfaces. These surfaces are then skeletonized for rendering. Finally, a style transfer model is trained using contrastive unpaired translation (CUT) to synthesize realistic ureteroscopy images. Cross-validation of the CT segmentation model achieved a Dice score of 0.853 ± 0.084. CUT style transfer produced visually plausible images; the kernel inception distance to real ureteroscopy images was reduced from 0.198 (rendered) to 0.089 (synthesized). The entire pipeline from CT to synthesized ureteroscopy is also qualitatively demonstrated. The proposed ASSIST-U system shows promise for aiding surgeons in the visualization of kidney ureteroscopy.
{"title":"Scale-preserving shape reconstruction from monocular endoscope image sequences by supervised depth learning","authors":"Takeshi Masuda, Ryusuke Sagawa, Ryo Furukawa, Hiroshi Kawasaki","doi":"10.1049/htl2.12064","DOIUrl":"10.1049/htl2.12064","url":null,"abstract":"<p>Reconstructing 3D shapes from images are becoming popular, but such methods usually estimate relative depth maps with ambiguous scales. A method for reconstructing a scale-preserving 3D shape from monocular endoscope image sequences through training an absolute depth prediction network is proposed. First, a dataset of synchronized sequences of RGB images and depth maps is created using an endoscope simulator. Then, a supervised depth prediction network is trained that estimates a depth map from a RGB image minimizing the loss compared to the ground-truth depth map. The predicted depth map sequence is aligned to reconstruct a 3D shape. Finally, the proposed method is applied to a real endoscope image sequence.</p>","PeriodicalId":37474,"journal":{"name":"Healthcare Technology Letters","volume":"11 2-3","pages":"76-84"},"PeriodicalIF":2.1,"publicationDate":"2023-12-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/htl2.12064","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138997035","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
"Intraoperative gaze guidance with mixed reality"
Ayberk Acar, Jumanh Atoum, Amy Reed, Yizhou Li, Nicholas Kavoussi, Jie Ying Wu
Healthcare Technology Letters, vol. 11, no. 2-3, pp. 85-92. Published 13 December 2023. DOI: 10.1049/htl2.12061. Open-access PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1049/htl2.12061

Abstract: Efficient communication and collaboration in the operating room are essential for successful and safe surgery. While many technologies are improving various aspects of surgery, communication between attending surgeons, residents, and surgical teams is still limited to verbal interactions that are prone to misunderstandings. Novel modes of communication can increase speed and accuracy and transform operating rooms. A mixed reality (MR) based gaze-sharing application on the Microsoft HoloLens 2 headset is presented that can help expert surgeons indicate specific regions, communicate with less verbal effort, and guide novices throughout an operation. The utility of the application is tested in a user study of endoscopic kidney stone localization completed by urology experts and novice surgeons. Improvement is observed in the NASA task load index surveys (up to 25.23%), in the success rate of the task (6.98% increase in localized stone percentage), and in gaze analyses (up to 31.99%). The proposed application shows promise for both operating room use and surgical training tasks.
"Towards navigation in endoscopic kidney surgery based on preoperative imaging"
Ayberk Acar, Daiwei Lu, Yifan Wu, Ipek Oguz, Nicholas Kavoussi, Jie Ying Wu
Healthcare Technology Letters, vol. 11, no. 2-3, pp. 67-75. Published 13 December 2023. DOI: 10.1049/htl2.12059. Open-access PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1049/htl2.12059

Abstract: Endoscopic renal surgeries have high re-operation rates, particularly for lower-volume surgeons. Due to the limited field and depth of view of current endoscopes, mentally mapping preoperative computed tomography (CT) images of patient anatomy to the surgical field is challenging. The inability to completely navigate the intrarenal collecting system leads to missed kidney stones and tumours, subsequently raising recurrence rates. A guidance system is proposed that estimates the endoscope position within the CT to reduce re-operation rates. A structure-from-motion algorithm is used to reconstruct the kidney collecting system from the endoscope videos. In addition, the kidney collecting system is segmented from CT scans using a 3D U-Net to create a 3D model. The two collecting-system representations can then be registered to provide information on the relative endoscope position. Correct reconstruction and localization of intrarenal anatomy and endoscope position are demonstrated. Furthermore, a 3D map supported by the RGB endoscope images is created to reduce the burden of mental mapping during surgery. The proposed reconstruction pipeline has been validated for guidance. It can reduce the mental burden on surgeons and is a step towards the long-term goal of reducing re-operation rates in kidney stone surgery.
"Movement examination of the lumbar spine using a developed wearable motion sensor"
Reza Abbasi-Kesbi, Mohammad Fathi, Seyed Zaniyar Sajadi
Healthcare Technology Letters, vol. 10, no. 6, pp. 122-132. Published 9 December 2023. DOI: 10.1049/htl2.12063. Open-access PDF: https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/htl2.12063

Abstract: A system for monitoring spinal movements based on wearable motion sensors is proposed. A hardware system is first developed that measures the linear acceleration, angular velocity, and magnetic field of the spine. The data obtained from these sensors are then combined in a proposed complementary filter, and the angular variations are estimated. Compared with an accurate reference, the root mean squared error of the estimated angles is less than 1.61° for the three angles $\phi_r$, $\theta_r$, and $\psi_r$, showing that the system can accurately estimate the angular variation of the spine. The system is then mounted on the lumbar spine of several volunteers, the angles obtained from the patients' spines are compared with those of healthy volunteers, and the patients' spinal performance improves over time. The results show that the system can be very effective for patients who suffer from back problems and can support their recovery process.
{"title":"Real-time surgical tool detection with multi-scale positional encoding and contrastive learning","authors":"Gerardo Loza, Pietro Valdastri, Sharib Ali","doi":"10.1049/htl2.12060","DOIUrl":"10.1049/htl2.12060","url":null,"abstract":"<p>Real-time detection of surgical tools in laparoscopic data plays a vital role in understanding surgical procedures, evaluating the performance of trainees, facilitating learning, and ultimately supporting the autonomy of robotic systems. Existing detection methods for surgical data need to improve processing speed and high prediction accuracy. Most methods rely on anchors or region proposals, limiting their adaptability to variations in tool appearance and leading to sub-optimal detection results. Moreover, using non-anchor-based detectors to alleviate this problem has been partially explored without remarkable results. An anchor-free architecture based on a transformer that allows real-time tool detection is introduced. The proposal is to utilize multi-scale features within the feature extraction layer and at the transformer-based detection architecture through positional encoding that can refine and capture context-aware and structural information of different-sized tools. Furthermore, a supervised contrastive loss is introduced to optimize representations of object embeddings, resulting in improved feed-forward network performances for classifying localized bounding boxes. The strategy demonstrates superiority to state-of-the-art (SOTA) methods. Compared to the most accurate existing SOTA (DSSS) method, the approach has an improvement of nearly 4% on mAP<span></span><math>\u0000 <semantics>\u0000 <msub>\u0000 <mrow></mrow>\u0000 <mn>50</mn>\u0000 </msub>\u0000 <annotation>$_{50}$</annotation>\u0000 </semantics></math> and a reduction in the inference time by 113%. It also showed a 7% higher mAP<span></span><math>\u0000 <semantics>\u0000 <msub>\u0000 <mrow></mrow>\u0000 <mn>50</mn>\u0000 </msub>\u0000 <annotation>$_{50}$</annotation>\u0000 </semantics></math> than the baseline model.</p>","PeriodicalId":37474,"journal":{"name":"Healthcare Technology Letters","volume":"11 2-3","pages":"48-58"},"PeriodicalIF":2.1,"publicationDate":"2023-12-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/htl2.12060","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138593167","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
"Development of an augmented reality guidance system for head and neck cancer resection"
Guansen Tong, Jiayi Xu, Michael Pfister, Jumanh Atoum, Kavita Prasad, Alexis Miller, Michael Topf, Jie Ying Wu
Healthcare Technology Letters, vol. 11, no. 2-3, pp. 93-100. Published 7 December 2023. DOI: 10.1049/htl2.12062. Open-access PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1049/htl2.12062

Abstract: The use of head-mounted augmented reality (AR) in surgery has grown rapidly in recent years. AR aids intraoperative surgical navigation by overlaying three-dimensional (3D) holographic reconstructions of medical data. However, performing AR surgeries on complex areas such as the head and neck region poses challenges in terms of accuracy and speed. This study explores the feasibility of an AR guidance system for resection of positive tumour margins in a cadaveric specimen. The authors present an intraoperative solution that enables surgeons to upload and visualize holographic reconstructions of resected cadaver tissues. The solution uses a 3D scanner to capture detailed scans of the resected tissue, which are then uploaded into the authors' software. The software converts the scans of resected tissues into specimen holograms viewable through a head-mounted AR display. By re-aligning these holograms with the cadaver using gestures or voice commands, surgeons can navigate the head and neck tumour site. This workflow can run concurrently with frozen section analysis. On average, the authors achieve an uploading time of 2.98 min, a visualization time of 1.05 min, and a re-alignment time of 4.39 min, compared to the 20 to 30 min typical for frozen section analysis. The mean re-alignment error is 3.1 mm. The authors' software provides a foundation for new research and product development in using AR to navigate complex 3D anatomy in surgery.
{"title":"Proof-of-concept of a robotic-driven photogrammetric scanner for intra-operative knee cartilage repair","authors":"Álvaro Bertelsen, Amaia Iribar-Zabala, Ekiñe Otegi-Alvaro, Rafael Benito, Karen López-Linares, Iván Macía","doi":"10.1049/htl2.12054","DOIUrl":"10.1049/htl2.12054","url":null,"abstract":"<p>This work presents a proof-of-concept of a robotic-driven intra-operative scanner designed for knee cartilage lesion repair, part of a system for direct in vivo bioprinting. The proposed system is based on a photogrammetric pipeline, which reconstructs the cartilage and lesion surfaces from sets of photographs acquired by a robotic-handled endoscope, and produces 3D grafts for further printing path planning. A validation on a synthetic phantom is presented, showing that, despite the cartilage smooth and featureless surface, the current prototype can accurately reconstruct osteochondral lesions and their surroundings with mean error values of 0.199 ± 0.096 mm but with noticeable concentration on areas with poor lighting or low photographic coverage. The system can also accurately generate grafts for bioprinting, although with a slight tendency to underestimate the actual lesion sizes, producing grafts with coverage errors of −12.2 ± 3.7, −7.9 ± 4.9, and −15.2 ± 3.4% for the medio-lateral, antero-posterior, and craneo-caudal directions, respectively. Improvements in lighting and acquisition for enhancing reconstruction accuracy are planned as future work, as well as integration into a complete bioprinting pipeline and validation with ex vivo phantoms.</p>","PeriodicalId":37474,"journal":{"name":"Healthcare Technology Letters","volume":"11 2-3","pages":"59-66"},"PeriodicalIF":2.1,"publicationDate":"2023-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/htl2.12054","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138598457","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
"Design and application of a novel telemedicine system jointly driven by multinetwork integration and remote control: Practical experience from PLAGH, China"
Ruiqing Wang, Jie Zhang, Shilin He, Huayuan Guo, Tao Li, Qin Zhong, Jun Ma, Jie Xu, Kunlun He
Healthcare Technology Letters, vol. 10, no. 6, pp. 113-121. Published 5 December 2023. DOI: 10.1049/htl2.12057. Open-access PDF: https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/htl2.12057

Abstract: In China, several problems are common in telemedicine systems, such as poor network stability and difficult interconnection. A new telemedicine system jointly driven by multinetwork integration and remote control has been designed to address these problems. A multilink aggregation algorithm and an overlay network for the telemedicine system (ONTMS) were developed to improve network stability, and a non-intervention remote control method was designed for Internet of Things (IoT) devices and systems. The authors monitored the network parameters and distributed questionnaires to participants to evaluate the telemedicine system and services. Under a detection bandwidth of 8 Mbps, aggregating Unicom 4G, Telecom 4G, and China Mobile 4G gave the best parameters, with an uplink bandwidth, delay, and packet loss ratio (PLR) of 7.93 Mbps, 58.80 ms, and 0.06%, respectively. These parameters were significantly superior to those of China Mobile 4G, the best single network (p < 0.001). Through the ONTMS, the mean round-trip transport delay from Beijing to Sanya was 76 ms, and the PLR was 0 for the vast majority of the time. A total of 1988 participants, including 1920 patients and 68 doctors, completed the questionnaires. More than 97% of participants felt that the audio and video transmission and remote control were fluent and convenient, and 96% of patients rated the telemedicine services 4 or 5. The system has shown robust network performance and excellent interaction ability, and it satisfied the needs of patients and doctors.