A multimodal digital twin for autonomous micro-drilling in scientific exploration.

IF 2.3 | Zone 3, Medicine | JCR Q3 | ENGINEERING, BIOMEDICAL

International Journal of Computer Assisted Radiology and Surgery | Pub Date: 2025-10-01 | Epub Date: 2025-06-26 | DOI: 10.1007/s11548-025-03465-3

Saul Alexis Heredia Perez, Tze Lun Lok, Enduo Zhao, Kanako Harada

Pages: 1987-1997 | PMC PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12518470/pdf/

Citations: 0

Abstract

A multimodal digital twin for autonomous micro-drilling in scientific exploration.

Purpose: To support research on autonomous robotic micro-drilling for cranial window creation in mice, a multimodal digital twin (DT) is developed to generate realistic synthetic images and drilling sounds. The realism of the DT is evaluated using data from an eggshell drilling scenario, demonstrating its potential for training AI models with multimodal synthetic data.

Methods: The asynchronous multi-body framework (AMBF) simulator for volumetric drilling with haptic feedback is combined with the Isaac Sim simulator for photorealistic rendering. A deep audio generator (DAG) model is presented and its realism is evaluated on real drilling sounds. A convolutional neural network (CNN) trained on synthetic images is used to assess visual realism by detecting drilling areas in real eggshell images. Finally, the accuracy of the DT is evaluated by experiments on a real eggshell.

Results: The DAG model outperformed pitch modulation methods, achieving lower Frechet audio distance (FAD) and Frechet inception distance (FID) scores, demonstrating a closer resemblance to real drilling sounds. The CNN trained on synthetic images achieved a mean average precision (mAP) of 70.2 when tested on real drilling images. The DT had an alignment error of 0.22 ± 0.03 mm.

Conclusion: A multimodal DT has been developed to simulate the creation of the cranial window on an eggshell model and its realism has been evaluated. The results indicate a high degree of realism in both the synthetic audio and images and submillimeter accuracy.
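
The Results above report Fréchet audio distance (FAD) and Fréchet inception distance (FID). Both are the Fréchet distance between two Gaussians fitted to embeddings of real and synthetic samples. The following is a minimal sketch of that distance only, not the authors' evaluation code. It assumes the embeddings have already been extracted by some pretrained model (which model the paper used is not stated in the abstract), and the function name `frechet_distance` is hypothetical.

```python
import numpy as np

def frechet_distance(real_emb, synth_emb):
    """Frechet distance between Gaussians fitted to two embedding sets.

    Each input is an array of shape (n_samples, n_features); how the
    embeddings are produced is outside this sketch (assumption).
    """
    mu_r, mu_s = real_emb.mean(axis=0), synth_emb.mean(axis=0)
    cov_r = np.cov(real_emb, rowvar=False)
    cov_s = np.cov(synth_emb, rowvar=False)
    # tr((cov_r cov_s)^(1/2)) equals the sum of the square roots of the
    # eigenvalues of cov_r @ cov_s, which are real and non-negative for
    # covariance matrices; clip small negative values from round-off.
    eigvals = np.linalg.eigvals(cov_r @ cov_s)
    tr_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    diff = mu_r - mu_s
    return float(diff @ diff + np.trace(cov_r) + np.trace(cov_s) - 2.0 * tr_sqrt)
```

A lower value means the synthetic embeddings are statistically closer to the real ones, which is the sense in which the abstract reports the DAG model as closer to real drilling sounds than pitch modulation.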

Source journal

International Journal of Computer Assisted Radiology and Surgery
Categories: ENGINEERING, BIOMEDICAL; RADIOLOGY, NUCLEAR MEDICINE & MEDICAL IMAGING

CiteScore: 5.90
Self-citation rate: 6.70%
Articles published: 243
Review time: 6-12 weeks

About the journal: The International Journal for Computer Assisted Radiology and Surgery (IJCARS) is a peer-reviewed journal that provides a platform for closing the gap between medical and technical disciplines, and encourages interdisciplinary research and development activities in an international environment.