Bridging Design and Development with Automated Declarative UI Code Generation
Ting Zhou, Yanjie Zhao, Xinyi Hou, Xiaoyu Sun, Kai Chen, Haoyu Wang
arXiv - CS - Software Engineering, 2024-09-18, arXiv:2409.11667
Abstract
Declarative UI frameworks have gained widespread adoption in mobile app
development, offering benefits such as improved code readability and easier
maintenance. Despite these advantages, the process of translating UI designs
into functional code remains challenging and time-consuming. Recent
advancements in multimodal large language models (MLLMs) have shown promise in
directly generating mobile app code from user interface (UI) designs. However,
the direct application of MLLMs to this task is limited by challenges in
accurately recognizing UI components and comprehensively capturing interaction
logic. To address these challenges, we propose DeclarUI, an automated approach
that combines computer vision (CV), MLLMs, and iterative compiler-driven
optimization to generate and refine declarative UI code from designs. DeclarUI
enhances visual fidelity, functional completeness, and code quality through
precise component segmentation, Page Transition Graphs (PTGs) for modeling
complex inter-page relationships, and iterative optimization. In our
evaluation, DeclarUI outperforms baselines on React Native, a widely adopted
declarative UI framework, achieving a 96.8% PTG coverage rate and a 98%
compilation success rate. Notably, DeclarUI demonstrates significant
improvements over state-of-the-art MLLMs, with a 123% increase in PTG coverage
rate, up to 55% enhancement in visual similarity scores, and a 29% boost in
compilation success rate. We further demonstrate DeclarUI's generalizability
by successfully applying it to the Flutter and ArkUI frameworks.
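
To make the notion of a Page Transition Graph and its coverage rate more concrete, the following is a minimal Python sketch. It is illustrative only and is not taken from DeclarUI: the `Transition` type, the `ptg_coverage` function, the page and component names, and the definition of coverage as the share of expected transitions found in the generated app are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transition:
    """One edge of a hypothetical Page Transition Graph."""
    source: str   # page where the interaction starts
    trigger: str  # UI component that fires the navigation
    target: str   # page shown afterwards


def ptg_coverage(expected: set[Transition], implemented: set[Transition]) -> float:
    """Fraction of expected transitions that also appear in the implemented set.

    Assumed definition for illustration; the paper may define coverage differently.
    """
    if not expected:
        return 1.0  # nothing to cover
    return len(expected & implemented) / len(expected)


# Transitions a designer intends (hypothetical four-edge example app).
expected = {
    Transition("Login", "login_button", "Home"),
    Transition("Home", "settings_icon", "Settings"),
    Transition("Home", "item_card", "Detail"),
    Transition("Detail", "back_button", "Home"),
}

# Transitions actually found in some generated code (one edge is missing).
implemented = {
    Transition("Login", "login_button", "Home"),
    Transition("Home", "item_card", "Detail"),
    Transition("Detail", "back_button", "Home"),
}

coverage = ptg_coverage(expected, implemented)
print(f"PTG coverage: {coverage:.0%}")  # prints "PTG coverage: 75%"
```

Modeling each transition as a hashable (source, trigger, target) triple keeps the comparison a plain set intersection; a real pipeline would additionally need a way to extract the implemented transitions from the generated navigation code, which this sketch does not attempt.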