{"title":"SketchTriplet:自监督场景化草图-文本-图像三重生成","authors":"Zhenbei Wu;Qiang Wang;Jie Yang","doi":"10.1109/JIOT.2024.3523382","DOIUrl":null,"url":null,"abstract":"Touchscreen Internet of Things (IoT) devices, such as smartphones and tablets, have been seamlessly integrated into our daily lives. Drawing sketches on the touch screen is an extremely convenient mode of interaction, and when combined with generative artificial intelligence (AI), it makes the customized data generation and digital twin based on IoT devices more straightforward. However, the scarcity of free-hand sketch data makes the construction of data-driven generative AI models a thorny issue. Despite the emergence of some large-scale sketch datasets, these datasets primarily consist of sketches at the single-object level. There continues to be a lack of large-scale paired datasets for scene sketches. In this article, we propose a self-supervised method for scene sketch generation that does not rely on any existing scene sketch, enabling the transformation of single-object sketches into scene sketches. To accomplish this, we introduce a method for vector sketch captioning and sketch semantic expansion. Additionally, we design a sketch generation network that incorporates a fusion of multimodal perceptual constraints, suitable for application in zero-shot image-to-sketch downstream task, demonstrating state-of-the-art performance through experimental validation. Finally, leveraging our proposed sketch-to-sketch generation method, we contribute a large-scale dataset centered around scene sketches, comprising highly semantically consistent “text-sketch–image” triplets. 
Our research can provide critical technical support for AI content creation and customized data generation based on IoT devices.","PeriodicalId":54347,"journal":{"name":"IEEE Internet of Things Journal","volume":"12 9","pages":"13021-13032"},"PeriodicalIF":8.9000,"publicationDate":"2025-01-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"SketchTriplet: Self-Supervised Scenarized Sketch–Text–Image Triplet Generation\",\"authors\":\"Zhenbei Wu;Qiang Wang;Jie Yang\",\"doi\":\"10.1109/JIOT.2024.3523382\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Touchscreen Internet of Things (IoT) devices, such as smartphones and tablets, have been seamlessly integrated into our daily lives. Drawing sketches on the touch screen is an extremely convenient mode of interaction, and when combined with generative artificial intelligence (AI), it makes the customized data generation and digital twin based on IoT devices more straightforward. However, the scarcity of free-hand sketch data makes the construction of data-driven generative AI models a thorny issue. Despite the emergence of some large-scale sketch datasets, these datasets primarily consist of sketches at the single-object level. There continues to be a lack of large-scale paired datasets for scene sketches. In this article, we propose a self-supervised method for scene sketch generation that does not rely on any existing scene sketch, enabling the transformation of single-object sketches into scene sketches. To accomplish this, we introduce a method for vector sketch captioning and sketch semantic expansion. Additionally, we design a sketch generation network that incorporates a fusion of multimodal perceptual constraints, suitable for application in zero-shot image-to-sketch downstream task, demonstrating state-of-the-art performance through experimental validation. 
Finally, leveraging our proposed sketch-to-sketch generation method, we contribute a large-scale dataset centered around scene sketches, comprising highly semantically consistent “text-sketch–image” triplets. Our research can provide critical technical support for AI content creation and customized data generation based on IoT devices.\",\"PeriodicalId\":54347,\"journal\":{\"name\":\"IEEE Internet of Things Journal\",\"volume\":\"12 9\",\"pages\":\"13021-13032\"},\"PeriodicalIF\":8.9000,\"publicationDate\":\"2025-01-10\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Internet of Things Journal\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10836756/\",\"RegionNum\":1,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, INFORMATION SYSTEMS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Internet of Things Journal","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10836756/","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
Journal Introduction:
The IEEE Internet of Things (IoT) Journal publishes articles and review articles covering various aspects of IoT, including IoT system architecture, IoT enabling technologies, IoT communication and networking protocols (such as network coding), and IoT services and applications. Topics encompass IoT's impact on sensor technologies, big data management, and future Internet design for applications such as smart cities and smart homes. Fields of interest include: IoT architecture, such as things-centric, data-centric, and service-oriented IoT architecture; IoT enabling technologies and systematic integration, such as sensor technologies, big sensor data management, and future Internet design for IoT; IoT services, applications, and test-beds, such as IoT service middleware, IoT application programming interfaces (APIs), IoT application design, and IoT trials/experiments; and IoT standardization activities and technology development in different standards development organizations (SDOs), such as IEEE, IETF, ITU, 3GPP, and ETSI.