State of the Art on Deep Learning-enhanced Rendering Methods
Qi Wang, Zhihua Zhong, Yuchi Huo, Hujun Bao, Rui Wang
Machine Intelligence Research. DOI: https://doi.org/10.1007/s11633-022-1400-x. Published 2023-11-09.

Transmission Line Insulator Defect Detection Based on Swin Transformer and Context
Yu Xi, Ke Zhou, Ling-Wen Meng, Bo Chen, Hao-Min Chen, Jing-Yi Zhang
Machine Intelligence Research. DOI: https://doi.org/10.1007/s11633-022-1355-y. Published 2023-09-15.
Abstract: Insulators are important components of power transmission lines; once one fails, it may cause a large-scale blackout and other hidden dangers. Because insulator images are large and their backgrounds complex, detecting small defect objects is challenging. We build on the two-stage Faster R-CNN (region-based convolutional neural network) detector. First, we use a hierarchical Swin Transformer with shifted windows, instead of ResNet, as the feature extraction network to obtain more discriminative features, and then design a deformable receptive field block that encodes global and local context to capture key clues for detecting objects in complex backgrounds. Finally, we propose a filling data augmentation method to address the shortage of defect samples, adding more insulator defect images with varied backgrounds to the training set to improve the robustness of the model. As a result, recall increases from 89.5% to 92.1% and average precision increases from 81.0% to 87.1%. To further demonstrate the superiority of the proposed algorithm, we also test the model on the public Pascal visual object classes (VOC) dataset, where it likewise yields outstanding results.

{"title":"YOLO-CORE: Contour Regression for Efficient Instance Segmentation","authors":"Haoliang Liu, Wei Xiong, Yu Zhang","doi":"10.1007/s11633-022-1379-3","DOIUrl":"https://doi.org/10.1007/s11633-022-1379-3","url":null,"abstract":"Instance segmentation has drawn mounting attention due to its significant utility. However, high computational costs have been widely acknowledged in this domain, as the instance mask is generally achieved by pixel-level labeling. In this paper, we present a conceptually efficient contour regression network based on the you only look once (YOLO) architecture named YOLO-CORE for instance segmentation. The mask of the instance is efficiently acquired by explicit and direct contour regression using our designed multi-order constraint consisting of a polar distance loss and a sector loss. Our proposed YOLO-CORE yields impressive segmentation performance in terms of both accuracy and speed. It achieves 57.9% AP@0.5 with 47 FPS (frames per second) on the semantic boundaries dataset (SBD) and 51.1% AP@0.5 with 46 FPS on the COCO dataset. The superior performance achieved by our method with explicit contour regression suggests a new technique line in the YOLO-based image understanding field. Moreover, our instance segmentation design can be flexibly integrated into existing deep detectors with negligible computation cost (65.86 BFLOPs (billion float operations per second) to 66.15 BFLOPs with the YOLOv3 detector).","PeriodicalId":29727,"journal":{"name":"Machine Intelligence Research","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-09-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135396713","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Practical Blind Image Denoising via Swin-Conv-UNet and Data Synthesis","authors":"Zhang, Kai, Li, Yawei, Liang, Jingyun, Cao, Jiezhang, Zhang, Yulun, Tang, Hao, Timofte, Radu, Van Gool, Luc","doi":"10.1007/s11633-023-1466-0","DOIUrl":"https://doi.org/10.1007/s11633-023-1466-0","url":null,"abstract":"While recent years have witnessed a dramatic upsurge of exploiting deep neural networks toward solving image denoising, existing methods mostly rely on simple noise assumptions, such as additive white Gaussian noise (AWGN), JPEG compression noise and camera sensor noise, and a general-purpose blind denoising method for real images remains unsolved. In this paper, we attempt to solve this problem from the perspective of network architecture design and training data synthesis. Specifically, for the network architecture design, we propose a swin-conv block to incorporate the local modeling ability of residual convolutional layer and non-local modeling ability of swin transformer block, and then plug it as the main building block into the widely-used image-to-image translation UNet architecture. For the training data synthesis, we design a practical noise degradation model which takes into consideration different kinds of noise (including Gaussian, Poisson, speckle, JPEG compression, and processed camera sensor noises) and resizing, and also involves a random shuffle strategy and a double degradation strategy. Extensive experiments on AGWN removal and real image denoising demonstrate that the new network architecture design achieves state-of-the-art performance and the new degradation model can help to significantly improve the practicability. We believe our work can provide useful insights into current denoising research.","PeriodicalId":29727,"journal":{"name":"Machine Intelligence Research","volume":"175 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-09-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135353684","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"DepthFormer: Exploiting Long-range Correlation and Local Information for Accurate Monocular Depth Estimation","authors":"Zhenyu Li, Zehui Chen, Xianming Liu, Junjun Jiang","doi":"10.1007/s11633-023-1458-0","DOIUrl":"https://doi.org/10.1007/s11633-023-1458-0","url":null,"abstract":"Abstract This paper aims to address the problem of supervised monocular depth estimation. We start with a meticulous pilot study to demonstrate that the long-range correlation is essential for accurate depth estimation. Moreover, the Transformer and convolution are good at long-range and close-range depth estimation, respectively. Therefore, we propose to adopt a parallel encoder architecture consisting of a Transformer branch and a convolution branch. The former can model global context with the effective attention mechanism and the latter aims to preserve the local information as the Transformer lacks the spatial inductive bias in modeling such contents. However, independent branches lead to a shortage of connections between features. To bridge this gap, we design a hierarchical aggregation and heterogeneous interaction module to enhance the Transformer features and model the affinity between the heterogeneous features in a set-to-set translation manner. Due to the unbearable memory cost introduced by the global attention on high-resolution feature maps, we adopt the deformable scheme to reduce the complexity. Extensive experiments on the KITTI, NYU, and SUN RGB-D datasets demonstrate that our proposed model, termed DepthFormer, surpasses state-of-the-art monocular depth estimation methods with prominent margins. The effectiveness of each proposed module is elaborately evaluated through meticulous and intensive ablation studies.","PeriodicalId":29727,"journal":{"name":"Machine Intelligence Research","volume":"88 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-09-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134990420","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}