Nav2Scene: Navigation-driven fine-tuning for robot-friendly scene generation

IF 2.2 · CAS Zone 4, Computer Science · JCR Q2, COMPUTER SCIENCE, SOFTWARE ENGINEERING

Graphical Models · Pub Date: 2025-08-17 · DOI: 10.1016/j.gmod.2025.101287

Bowei Jiang, Tongyuan Bai, Peng Zheng, Tieru Wu, Rui Ma

Graphical Models, Volume 141, Article 101287. Journal article; not open access. Publisher page: https://www.sciencedirect.com/science/article/pii/S1524070325000347

Citations: 0

Abstract

The integration of embodied intelligence into indoor scene synthesis holds significant potential for future interior design applications. Nevertheless, prevailing methods for indoor scene synthesis predominantly follow data-driven learning paradigms. Although such approaches achieve photorealistic 3D renderings, current frameworks neglect the agent-centric functional metrics that are essential for optimizing navigational topology and task-oriented interactivity in embodied AI systems such as service robots or autonomous domestic assistants. For example, poorly arranged furniture may prevent robots from interacting effectively with the environment, and this issue cannot be fully resolved merely by introducing prior constraints. To fill this gap, we propose Nav2Scene, a novel plug-and-play fine-tuning mechanism that can be deployed on existing scene generators to make generated scenes more suitable for efficient robot navigation. Specifically, we first introduce the path planning score (PPS), which is defined from the results of a path planning algorithm and can be used to evaluate how suitable a given scene is for robot navigation. Then, we pre-compute the PPS of 3D scenes from existing datasets and train a ScoreNet to efficiently predict the PPS of generated scenes. Finally, the predicted PPS is used to guide the fine-tuning of existing scene generators so that they produce indoor scenes with higher PPS, indicating improved suitability for robot navigation. We conduct experiments on the 3D-FRONT dataset for several tasks, including scene generation, completion, and re-arrangement. The results demonstrate that, by incorporating our Nav2Scene mechanism, the fine-tuned scene generators can produce scenes with improved navigation compatibility for home robots while maintaining superior or comparable performance in terms of scene quality and diversity.
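The abstract says only that PPS is derived from the output of a path planning algorithm; it does not give the formula. As a rough illustration of that idea, the sketch below scores a 2D occupancy grid by how directly free cells can reach each other. Everything in it is an assumption made for illustration and is not the authors' definition: the grid encoding (`.` for free, `#` for occupied), the breadth-first planner, and the hypothetical function `toy_path_score`.

```python
from collections import deque
from itertools import combinations


def shortest_path_length(grid, start, goal):
    """Breadth-first search on a 4-connected grid; returns step count or None."""
    rows, cols = len(grid), len(grid[0])
    queue = deque([(start, 0)])
    seen = {start}
    while queue:
        (r, c), dist = queue.popleft()
        if (r, c) == goal:
            return dist
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and grid[nr][nc] == "." and (nr, nc) not in seen:
                seen.add((nr, nc))
                queue.append(((nr, nc), dist + 1))
    return None  # goal is unreachable from start


def toy_path_score(grid):
    """Hypothetical score in [0, 1], NOT the paper's PPS.

    Averages, over all pairs of free cells, the ratio of Manhattan distance
    to planned path length. Unreachable pairs contribute 0.
    """
    free = [(r, c) for r, row in enumerate(grid)
            for c, ch in enumerate(row) if ch == "."]
    pairs = list(combinations(free, 2))
    if not pairs:
        return 0.0
    total = 0.0
    for a, b in pairs:
        length = shortest_path_length(grid, a, b)
        if length is not None:
            manhattan = abs(a[0] - b[0]) + abs(a[1] - b[1])
            total += manhattan / length  # 1.0 when the path has no detour
    return total / len(pairs)


open_room = ["...", "...", "..."]
split_room = ["..", "##", ".."]   # a wall separates the two halves
print(toy_path_score(open_room), toy_path_score(split_room))
```

Under these assumptions an empty room scores 1.0, and a layout whose furniture forces detours or cuts off regions scores lower. A learned predictor such as the ScoreNet described above would presumably replace this exhaustive computation with a fast estimate, but its architecture and training details are not stated in the abstract.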

Source journal

Graphical Models (Engineering & Technology: Computer Science, Software Engineering)

CiteScore

3.60

Self-citation rate

5.90%

Articles published

Review time

47 days

About the journal: Graphical Models is recognized internationally as a highly rated, top-tier journal and is focused on the creation, geometric processing, animation, and visualization of graphical models and on their applications in engineering, science, culture, and entertainment. GMOD provides its readers with thoroughly reviewed and carefully selected papers that disseminate exciting innovations, that teach rigorous theoretical foundations, that propose robust and efficient solutions, or that describe ambitious systems or applications in a variety of topics. We invite papers in five categories: research (contributions of novel theoretical or practical approaches or solutions), survey (opinionated views of the state of the art and challenges in a specific topic), system (the architecture and implementation details of an innovative architecture for a complete system that supports model/animation design, acquisition, analysis, visualization, ...), application (description of a novel application of known techniques and evaluation of its impact), or lecture (an elegant and inspiring perspective on previously published results that clarifies them and teaches them in a new way). GMOD offers its authors an accelerated review, feedback from experts in the field, immediate online publication of accepted papers, no restriction on color and length (when justified by the content) in the online version, and a broad promotion of published papers. A prestigious group of editors selected from among the premier international researchers in their fields oversees the review process.