The Visual Computer: Latest Articles, Page 7

Image super-resolution method based on the interactive fusion of transformer and CNN features

The Visual Computer Pub Date : 2023-11-03 DOI: 10.1007/s00371-023-03138-9

Jianxin Wang, Yongsong Zou, Osama Alfarraj, Pradip Kumar Sharma, Wael Said, Jin Wang

Citations: 0

DCBFusion: an infrared and visible image fusion method through detail enhancement, contrast reserve and brightness balance

The Visual Computer Pub Date : 2023-11-02 DOI: 10.1007/s00371-023-03134-z

Shenghui Sun, Kechen Song, Yi Man, Hongwen Dong, Yunhui Yan

Citations: 0

Human body construction based on combination of parametric and nonparametric reconstruction methods

The Visual Computer Pub Date : 2023-11-01 DOI: 10.1007/s00371-023-03122-3

Xihang Li, Guiqin Li, Tiancai Li, Peter Mitrouchev

Citations: 0

A deep learning approach for anomaly detection in large-scale Hajj crowds

The Visual Computer Pub Date : 2023-11-01 DOI: 10.1007/s00371-023-03124-1

Amnah Aldayri, Waleed Albattah

Citations: 0

Multiple instance learning-based two-stage metric learning network for whole slide image classification

The Visual Computer Pub Date : 2023-11-01 DOI: 10.1007/s00371-023-03131-2

Xiaoyu Li, Bei Yang, Tiandong Chen, Zheng Gao, Huijie Li

Citations: 0

Annotate and retrieve in vivo images using hybrid self-organizing map

The Visual Computer Pub Date : 2023-10-31 DOI: 10.1007/s00371-023-03126-z

Parminder Kaur, Avleen Malhi, Husanbir Pannu

Abstract: Multimodal retrieval has gained much attention lately due to its effectiveness over uni-modal retrieval. For instance, visual features often under-constrain the description of an image in content-based retrieval; however, another modality, such as collateral text, can be introduced to abridge the semantic gap and make the retrieval process more efficient. This article proposes the application of cross-modal fusion and retrieval on real in vivo gastrointestinal images and linguistic cues, as the visual features alone are insufficient for image description and to assist gastroenterologists. So, a cross-modal information retrieval approach has been proposed to retrieve related images given text and vice versa while handling the heterogeneity gap issue among the modalities. The technique comprises two stages: (1) individual modality feature learning; and (2) fusion of two trained networks. In the first stage, two self-organizing maps (SOMs) are trained separately using images and texts, which are clustered in the respective SOMs based on their similarity. In the second (fusion) stage, the trained SOMs are integrated using an associative network to enable cross-modal retrieval. The underlying learning techniques of the associative network include Hebbian learning and Oja learning (Improved Hebbian learning). The introduced framework can annotate images with keywords and illustrate keywords with images, and it can also be extended to incorporate more diverse modalities. Extensive experimentation has been performed on real gastrointestinal images obtained from a known gastroenterologist that have collateral keywords with each image. The obtained results proved the efficacy of the algorithm and its significance in aiding gastroenterologists in quick and pertinent decision making.
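The abstract above names Hebbian learning and Oja learning as the update rules of the associative network but does not give their form. As a rough illustration only, the sketch below contrasts the textbook single-neuron versions of the two rules on synthetic 2-D data; it is not the authors' network, and the function names, learning rate, and data are all assumptions made for this example. Plain Hebbian updates let the weight norm grow without bound, while Oja's extra decay term keeps the weight vector near unit length and aligned with the dominant direction of the data.

```python
import math
import random


def hebbian_step(w, x, lr):
    """Plain Hebbian update: w <- w + lr * y * x, with y = w . x."""
    y = sum(wi * xi for wi, xi in zip(w, x))
    return [wi + lr * y * xi for wi, xi in zip(w, x)]


def oja_step(w, x, lr):
    """Oja's update: w <- w + lr * y * (x - y * w), with y = w . x."""
    y = sum(wi * xi for wi, xi in zip(w, x))
    return [wi + lr * y * (xi - y * wi) for wi, xi in zip(w, x)]


def norm(w):
    """Euclidean length of a weight vector."""
    return math.sqrt(sum(wi * wi for wi in w))


def make_samples(n, seed=0):
    """Synthetic zero-mean 2-D data stretched along the (1, 1) direction.

    Hypothetical stand-in for real feature vectors.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        a = rng.gauss(0.0, 2.0)  # large spread along (1, 1)
        b = rng.gauss(0.0, 0.3)  # small spread along (1, -1)
        samples.append([(a + b) / math.sqrt(2), (a - b) / math.sqrt(2)])
    return samples


def train(step, samples, lr=0.005, w0=(0.5, 0.1)):
    """Apply one update rule to every sample in turn."""
    w = list(w0)
    for x in samples:
        w = step(w, x, lr)
    return w


samples = make_samples(2000)
w_oja = train(oja_step, samples)
w_hebb = train(hebbian_step, samples)

print("Oja weight norm:    ", norm(w_oja))
print("Hebbian weight norm:", norm(w_hebb))
```

In a cross-modal setting such as the one described, a rule of this kind would presumably be applied to the connections between units of the two trained maps, but that wiring is not specified in the abstract and is not modeled here.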

Citations: 0

Understanding of multiple bending-sloping arched scenes based on angle projections

The Visual Computer Pub Date : 2023-10-30 DOI: 10.1007/s00371-023-03133-0

Luping Wang, Hui Wei

Citations: 0

Making paper labels smart for augmented wine recognition

The Visual Computer Pub Date : 2023-10-27 DOI: 10.1007/s00371-023-03119-y

Alessia Angeli, Lorenzo Stacchio, Lorenzo Donatiello, Alessandro Giacchè, Gustavo Marfia

Abstract: An invisible layer of knowledge is progressively growing with the emergence of situated visualizations and reality-based information retrieval systems. In essence, digital content will overlap with real-world entities, eventually providing insights into the surrounding environment and useful information for the user. The implementation of such a vision may appear close, but many subtle details separate us from its fulfillment. This kind of implementation, as the overlap between rendered virtual annotations and the camera’s real-world view, requires different computer vision paradigms for object recognition and tracking which often require high computing power and large-scale datasets of images. Nevertheless, these resources are not always available, and in some specific domains, the lack of an appropriate reference dataset could be disruptive for a considered task. In this particular scenario, we here consider the problem of wine recognition to support an augmented reading of their labels. In fact, images of wine bottle labels may not be available as wineries periodically change their designs, product information regulations may vary, and specific bottles may be rare, making the label recognition process hard or even impossible. In this work, we present augmented wine recognition, an augmented reality system that exploits optical character recognition paradigms to interpret and exploit the text within a wine label, without requiring any reference image. Our experiments show that such a framework can overcome the limitations posed by image retrieval-based systems while exhibiting a comparable performance.

Citations: 0

A rotation robust shape transformer for cartoon character recognition

The Visual Computer Pub Date : 2023-10-27 DOI: 10.1007/s00371-023-03123-2

Qi Jia, Xinyu Chen, Yi Wang, Xin Fan, Haibin Ling, Longin Jan Latecki

Citations: 0

A nightshade crop leaf disease detection using enhance-nightshade-CNN for ground truth data

The Visual Computer Pub Date : 2023-10-27 DOI: 10.1007/s00371-023-03127-y

Barkha M. Joshi, Hetal Bhavsar

Citations: 0