LGRL：用于即时 FG-SBIR 的局部-全局表征学习

IF 7.5 3区计算机科学 Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS

IEEE Transactions on Big Data Pub Date : 2024-01-19 DOI:10.1109/TBDATA.2024.3356393

Dawei Dai;Yingge Liu;Yutang Li;Shiyu Fu;Shuyin Xia;Guoyin Wang

{"title":"LGRL：用于即时 FG-SBIR 的局部-全局表征学习","authors":"Dawei Dai;Yingge Liu;Yutang Li;Shiyu Fu;Shuyin Xia;Guoyin Wang","doi":"10.1109/TBDATA.2024.3356393","DOIUrl":null,"url":null,"abstract":"On-the-fly Fine-grained sketch-based image retrieval (On-the-fly FG-SBIR) framework aim to break the barriers that sketch drawing requires excellent skills and is time-consuming. Considering such problems, a partial sketch with fewer strokes contains only the little local information, and the drawing process may show great difference among users, resulting in poor performance at the early retrieval. In this study, we developed a local-global representation learning (LGRL) method, in which we learn the representations for both the local and global regions of the partial sketch and its target photos. Specifically, we first designed a triplet network to learn the joint embedding space shared between the local and global regions of the entire sketch and its corresponding region of the photo. Then, we divided each partial sketch in the sketch-drawing episode into several local regions; Another learnable module following the triplet network was designed to learn the representations for the local regions of the partial sketch. Finally, by combining both the local and global regions of the sketches and photos, the final distance was determined. In the experiments, our method outperformed state-of-the-art baseline methods in terms of early retrieval efficiency on two publicly sketch-retrieval datasets and the practice test.","PeriodicalId":13106,"journal":{"name":"IEEE Transactions on Big Data","volume":"10 4","pages":"543-555"},"PeriodicalIF":7.5000,"publicationDate":"2024-01-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"LGRL: Local-Global Representation Learning for On-the-Fly FG-SBIR\",\"authors\":\"Dawei Dai;Yingge Liu;Yutang Li;Shiyu Fu;Shuyin Xia;Guoyin Wang\",\"doi\":\"10.1109/TBDATA.2024.3356393\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"On-the-fly Fine-grained sketch-based image retrieval (On-the-fly FG-SBIR) framework aim to break the barriers that sketch drawing requires excellent skills and is time-consuming. Considering such problems, a partial sketch with fewer strokes contains only the little local information, and the drawing process may show great difference among users, resulting in poor performance at the early retrieval. In this study, we developed a local-global representation learning (LGRL) method, in which we learn the representations for both the local and global regions of the partial sketch and its target photos. Specifically, we first designed a triplet network to learn the joint embedding space shared between the local and global regions of the entire sketch and its corresponding region of the photo. Then, we divided each partial sketch in the sketch-drawing episode into several local regions; Another learnable module following the triplet network was designed to learn the representations for the local regions of the partial sketch. Finally, by combining both the local and global regions of the sketches and photos, the final distance was determined. In the experiments, our method outperformed state-of-the-art baseline methods in terms of early retrieval efficiency on two publicly sketch-retrieval datasets and the practice test.\",\"PeriodicalId\":13106,\"journal\":{\"name\":\"IEEE Transactions on Big Data\",\"volume\":\"10 4\",\"pages\":\"543-555\"},\"PeriodicalIF\":7.5000,\"publicationDate\":\"2024-01-19\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Transactions on Big Data\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10409584/\",\"RegionNum\":3,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, INFORMATION SYSTEMS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Big Data","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10409584/","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}

引用次数: 0

摘要

基于即时细粒度草图的图像检索（On-the-fly Fine-grained Sketch-based Image Retrieval，简称On-the-fly FG-SBIR）框架旨在打破草图绘制需要高超技巧和耗费时间的障碍。考虑到这些问题，笔画较少的局部草图仅包含很少的局部信息，而且用户之间的绘制过程可能存在很大差异，导致早期检索性能不佳。在本研究中，我们开发了一种局部-全局表示学习（LGRL）方法，即学习局部草图及其目标照片的局部区域和全局区域的表示。具体来说，我们首先设计了一个三元组网络来学习整个草图的局部和全局区域与照片的相应区域之间共享的联合嵌入空间。然后，我们将草图绘制过程中的每个局部草图划分为若干局部区域；继三重网络之后，我们又设计了另一个可学习模块，用于学习局部草图局部区域的表征。最后，结合草图和照片的局部区域和全局区域，确定最终距离。在实验中，在两个公开的草图检索数据集和实践测试中，我们的方法在早期检索效率方面优于最先进的基线方法。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

LGRL: Local-Global Representation Learning for On-the-Fly FG-SBIR

On-the-fly Fine-grained sketch-based image retrieval (On-the-fly FG-SBIR) framework aim to break the barriers that sketch drawing requires excellent skills and is time-consuming. Considering such problems, a partial sketch with fewer strokes contains only the little local information, and the drawing process may show great difference among users, resulting in poor performance at the early retrieval. In this study, we developed a local-global representation learning (LGRL) method, in which we learn the representations for both the local and global regions of the partial sketch and its target photos. Specifically, we first designed a triplet network to learn the joint embedding space shared between the local and global regions of the entire sketch and its corresponding region of the photo. Then, we divided each partial sketch in the sketch-drawing episode into several local regions; Another learnable module following the triplet network was designed to learn the representations for the local regions of the partial sketch. Finally, by combining both the local and global regions of the sketches and photos, the final distance was determined. In the experiments, our method outperformed state-of-the-art baseline methods in terms of early retrieval efficiency on two publicly sketch-retrieval datasets and the practice test.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

IEEE Transactions on Big Data Multiple-

CiteScore

11.80

自引率

2.80%

发文量

114

期刊介绍： The IEEE Transactions on Big Data publishes peer-reviewed articles focusing on big data. These articles present innovative research ideas and application results across disciplines, including novel theories, algorithms, and applications. Research areas cover a wide range, such as big data analytics, visualization, curation, management, semantics, infrastructure, standards, performance analysis, intelligence extraction, scientific discovery, security, privacy, and legal issues specific to big data. The journal also prioritizes applications of big data in fields generating massive datasets.