A multimodal digital twin for autonomous micro-drilling in scientific exploration
Authors: Saul Alexis Heredia Perez, Tze Lun Lok, Enduo Zhao, Kanako Harada
Journal: International Journal of Computer Assisted Radiology and Surgery, pp. 1987-1997
Published: 2025-10-01 (Epub 2025-06-26)
DOI: https://doi.org/10.1007/s11548-025-03465-3
Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12518470/pdf/
Citations: 0
Abstract
Purpose: To support research on autonomous robotic micro-drilling for cranial window creation in mice, a multimodal digital twin (DT) is developed to generate realistic synthetic images and drilling sounds. The realism of the DT is evaluated using data from an eggshell drilling scenario, demonstrating its potential for training AI models with multimodal synthetic data.
Methods: The asynchronous multi-body framework (AMBF) simulator for volumetric drilling with haptic feedback is combined with the Isaac Sim simulator for photorealistic rendering. A deep audio generator (DAG) model is presented and its realism is evaluated on real drilling sounds. A convolutional neural network (CNN) trained on synthetic images is used to assess visual realism by detecting drilling areas in real eggshell images. Finally, the accuracy of the DT is evaluated by experiments on a real eggshell.
Results: The DAG model outperformed pitch modulation methods, achieving lower Fréchet audio distance (FAD) and Fréchet inception distance (FID) scores, demonstrating a closer resemblance to real drilling sounds. The CNN trained on synthetic images achieved a mean average precision (mAP) of 70.2 when tested on real drilling images. The DT had an alignment error of 0.22 ± 0.03 mm.
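As background on the Fréchet-based scores: FAD and FID each fit one Gaussian to embeddings of real samples and another to embeddings of synthetic samples, then report d² = ||μ_r − μ_s||² + Tr(Σ_r + Σ_s − 2(Σ_r Σ_s)^(1/2)), where lower values mean the two distributions are closer. The sketch below is illustrative only and is not the authors' evaluation code. It assumes diagonal covariances, a simplification under which the trace term reduces to a sum of squared differences of standard deviations, and it takes embeddings as plain lists of floats. The function names are hypothetical.

```python
from math import sqrt
from statistics import fmean, variance


def gaussian_stats(embeddings):
    """Per-dimension mean and sample variance of equal-length vectors."""
    columns = list(zip(*embeddings))
    means = [fmean(col) for col in columns]
    variances = [variance(col) for col in columns]  # needs >= 2 samples
    return means, variances


def frechet_distance_diag(real, synthetic):
    """Squared Frechet distance between two diagonal-covariance Gaussians.

    ASSUMPTION: covariances are diagonal. The full metric uses the
    complete covariance matrices and a matrix square root.
    """
    mu_r, var_r = gaussian_stats(real)
    mu_s, var_s = gaussian_stats(synthetic)
    # Squared distance between the two mean vectors.
    mean_term = sum((a - b) ** 2 for a, b in zip(mu_r, mu_s))
    # Diagonal case of Tr(S_r + S_s - 2 (S_r S_s)^(1/2)).
    cov_term = sum((sqrt(a) - sqrt(b)) ** 2 for a, b in zip(var_r, var_s))
    return mean_term + cov_term
```

For example, `frechet_distance_diag([[0.0, 0.0], [2.0, 2.0]], [[3.0, 0.0], [5.0, 2.0]])` evaluates to 9.0, because the two sets have identical spread and differ only by a shift of 3 in the first dimension.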
Conclusion: A multimodal DT has been developed to simulate the creation of the cranial window on an eggshell model and its realism has been evaluated. The results indicate a high degree of realism in both the synthetic audio and images and submillimeter accuracy.
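The mAP figure in the Results is the mean of per-class average precision (AP) values, where AP summarizes how well ranked detections recover the ground-truth regions. The abstract does not state the region format, the overlap threshold, or the AP variant used, so the following is only a minimal sketch under stated assumptions: regions are axis-aligned boxes `(x1, y1, x2, y2)`, a detection counts as correct at IoU ≥ 0.5, and AP is the non-interpolated form (detection benchmarks often use interpolated variants instead). All names are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def average_precision(detections, ground_truth, iou_threshold=0.5):
    """Non-interpolated AP for one class.

    detections: list of (score, box); ground_truth: list of boxes.
    ASSUMPTION: box regions and a 0.5 IoU threshold, neither taken
    from the paper.
    """
    if not ground_truth:
        return 0.0
    matched = set()
    hits = []
    # Visit detections from most to least confident.
    for _, box in sorted(detections, key=lambda d: d[0], reverse=True):
        best_iou, best_idx = 0.0, None
        for idx, gt in enumerate(ground_truth):
            if idx in matched:
                continue  # each ground-truth box is matched at most once
            overlap = iou(box, gt)
            if overlap > best_iou:
                best_iou, best_idx = overlap, idx
        is_hit = best_idx is not None and best_iou >= iou_threshold
        if is_hit:
            matched.add(best_idx)
        hits.append(is_hit)
    # Sum precision at each true positive, normalised by ground-truth count.
    true_positives, total = 0, 0.0
    for rank, hit in enumerate(hits, start=1):
        if hit:
            true_positives += 1
            total += true_positives / rank
    return total / len(ground_truth)
```

With two ground-truth boxes and three ranked detections of which the first and third are correct, this gives (1/1 + 2/3) / 2 ≈ 0.833. Averaging such values over classes yields mAP.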
About the journal:
The International Journal for Computer Assisted Radiology and Surgery (IJCARS) is a peer-reviewed journal that provides a platform for closing the gap between medical and technical disciplines, and encourages interdisciplinary research and development activities in an international environment.