3D indoor scene assessment via layout plausibility
Xinyan Yang, Fei Hu, Shaofei Liu, Long Ye, Ye Wang, Guanghua Zhu, Jiyin Li
Displays, Volume 87, Article 102964. Published 2025-01-10. DOI: 10.1016/j.displa.2025.102964
https://www.sciencedirect.com/science/article/pii/S0141938225000010
Citations: 0
Abstract
As the amount of 3D scene data grows, plausibility quality assessment methods are urgently needed. Existing 3D scene assessment methods usually focus on visual rather than semantic reasonability, and the amount and category coverage of open-source 3D indoor scene data are still inadequate for training fully supervised assessment methods. In this paper, we build a minority-category 3D indoor scene assessment dataset, 3D-SPAD-MI, to extend the previous majority-category 3D-SPAD dataset. We also expand the application scope and improve the performance of the previous 3D scene plausibility assessment network (3D-SPAN) through a multimodal model (3D-SPAN-M) and few-shot learning (3D-SPAN-F). 3D-SPAN-M considers both vision and semantics in 3D indoor scenes by fusing image and scene-graph features. 3D-SPAN-F introduces multi-task meta-learning with prototypical networks into 3D-SPAN so that it can evaluate a wider range of 3D indoor scene categories. Comparison and ablation experiments verify the performance improvement and generalization of our method.
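The abstract names two mechanisms: fusing image and scene-graph features (3D-SPAN-M) and prototypical-network few-shot classification (3D-SPAN-F). The sketch below is a minimal, hypothetical illustration of those two ideas, not the authors' released code; all module names, feature dimensions, and the concatenation-based fusion strategy are assumptions.

```python
# Hypothetical sketch of (1) image + scene-graph feature fusion and
# (2) prototypical-network few-shot classification. Dimensions and the
# fusion design are assumed, not taken from the paper.

import torch
import torch.nn as nn


class MultimodalFusion(nn.Module):
    """Concatenate an image embedding with a pooled scene-graph embedding."""

    def __init__(self, img_dim=512, graph_dim=256, fused_dim=256):
        super().__init__()
        self.proj = nn.Sequential(
            nn.Linear(img_dim + graph_dim, fused_dim),
            nn.ReLU(),
            nn.Linear(fused_dim, fused_dim),
        )

    def forward(self, img_feat, graph_node_feats):
        # img_feat: (B, img_dim); graph_node_feats: (B, N_nodes, graph_dim)
        graph_feat = graph_node_feats.mean(dim=1)  # simple mean pooling over nodes
        return self.proj(torch.cat([img_feat, graph_feat], dim=-1))


def prototypical_logits(support, support_labels, query, n_classes):
    """Score queries by (negative) distance to per-class support prototypes."""
    protos = torch.stack(
        [support[support_labels == c].mean(dim=0) for c in range(n_classes)]
    )                                    # (n_classes, D)
    return -torch.cdist(query, protos)   # closer prototype -> higher logit


if __name__ == "__main__":
    fusion = MultimodalFusion()
    # Toy episode: 2 plausibility classes, 5 support and 3 query scenes each.
    img = torch.randn(16, 512)
    graph = torch.randn(16, 12, 256)
    feats = fusion(img, graph)
    support, query = feats[:10], feats[10:]
    support_labels = torch.tensor([0] * 5 + [1] * 5)
    logits = prototypical_logits(support, support_labels, query, n_classes=2)
    print(logits.argmax(dim=1))
```

In a meta-learning setup such as the one the abstract describes, episodes like the toy one above would be sampled per task and the fusion network trained so that prototype distances separate plausible from implausible layouts.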
Journal Introduction
Displays is the international journal covering the research and development of display technology, the effective presentation and perception of information, and applications and systems including the display-human interface.
Technical papers on practical developments in display technology provide an effective channel to promote greater understanding and cross-fertilization across the diverse disciplines of the Displays community. Original research papers solving ergonomics issues at the display-human interface advance the effective presentation of information. Tutorial papers covering fundamentals, intended for display technology and human factors engineers new to the field, will also occasionally be featured.