Pattern Analysis and Applications: Latest Articles

A stacked convolutional neural network framework with multi-scale attention mechanism for text-independent voiceprint recognition
IF 3.9 | Zone 4 | Computer Science
Pattern Analysis and Applications Pub Date : 2024-04-27 DOI: 10.1007/s10044-024-01278-9
V. Karthikeyan, S. Suja Priyadharsini
Abstract: Short-utterance speaker identification is a difficult area of study in natural language processing (NLP). Most cutting-edge experimental approaches for speech processing make use of convolutional neural networks (CNNs) and deep neural networks and analyse data in a unidirectional stream of time. In the past, CNN-based approaches to speaker identification often used very dense or wide layers, leading to a large number of parameters and significant computational expense. In this article, we present a novel multi-scale attention-focused one-dimensional convolutional neural network (MSA-CNN) for speaker recognition that combines L1 and L2 norms. The multi-scale convolutional training architecture was developed to autonomously extract multi-scale characteristics of raw audio data by employing a variety of filter banks. A novel attention mechanism was built so that the multi-scale system emphasises important speaker features in varying settings. Finally, these components were combined and applied to the proposed multi-layered convolutional neural network framework to identify the speakers' labels. The proposed network model was tested on a number of standard voice databases and a real-time recorded corpus. The experimental findings demonstrate that our methodology outperformed a baseline CNN scheme (without an attention mechanism) as well as conventional speaker identification techniques involving feature engineering, achieving an accuracy of 97.94% across numerous databases and distortion constraints.
Citations: 0
Cotton crop classification using satellite images with score level fusion based hybrid model
IF 3.9 | Zone 4 | Computer Science
Pattern Analysis and Applications Pub Date : 2024-04-16 DOI: 10.1007/s10044-024-01257-0
Amandeep Kaur, Geetanjali Singla, Manjinder Singh, Amit Mittal, Ruchi Mittal, Varun Malik
Abstract: Accurate cotton imagery is a significant component of surveilling cotton development and controlling it precisely. Researchers and production managers need a suitable technique for charting the distribution of cotton at the county or field level. The classification of cotton remote sensing models at the county level has significant implications for precision farming, land management, and government decision-making. This work aims to develop a novel cotton crop classification model using satellite images based on soil behaviour. It includes the phases of preprocessing, segmentation, feature extraction, and classification. Preprocessing is carried out by Gaussian filtering to improve the quality of the input image. A Modified Deep Joint Segmentation method is then employed for segmentation. Features such as the wide dynamic range vegetation index, simple ratio, green chlorophyll index, transformed vegetation index, and green leaf area index are extracted for classifying the input. A hybrid of an Improved CNN (ICNN) and a Bidirectional Gated Recurrent Unit (Bi-GRU) is used for classification, with the outputs combined by improved score-level fusion. The proposed hybrid optimization model, the Battle Royale assisted Butterfly optimization algorithm (BRABOA), is used to adjust the hidden neuron count of both the ICNN and Bi-GRU classifiers to improve accuracy. Finally, the efficiency of the proposed model is evaluated against other schemes using a variety of metrics. The proposed HC + BRABOA method obtains a maximum accuracy of 0.95 over conventional methods at a learning percentage of 90% for classifying cotton crops using satellite images.
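Three of the spectral features named in the abstract above have widely used textbook definitions, sketched below in Python. The abstract does not state which formulas or band choices the authors used, so the function names, the band arguments, and the default weighting `alpha=0.2` are assumptions made for illustration only.

```python
def simple_ratio(nir, red):
    """Simple ratio (SR): near-infrared reflectance divided by red reflectance."""
    return nir / red


def green_chlorophyll_index(nir, green):
    """Green chlorophyll index (GCI): NIR / green - 1."""
    return nir / green - 1.0


def wdrvi(nir, red, alpha=0.2):
    """Wide dynamic range vegetation index.

    alpha down-weights the NIR band; 0.2 is a commonly quoted value and is
    an assumption here, not a setting taken from the paper.
    """
    return (alpha * nir - red) / (alpha * nir + red)


if __name__ == "__main__":
    # Hypothetical reflectance values for a single pixel.
    nir, red, green = 0.5, 0.1, 0.2
    print(simple_ratio(nir, red))
    print(green_chlorophyll_index(nir, green))
    print(wdrvi(nir, red))
```

In practice these would be evaluated per pixel over whole image bands, with a guard against zero reflectance in the denominator.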
Citations: 0
Spatio-temporal trajectory data modeling for fishing gear classification
IF 3.9 | Zone 4 | Computer Science
Pattern Analysis and Applications Pub Date : 2024-04-15 DOI: 10.1007/s10044-024-01263-2
Juan Manuel Rodriguez-Albala, Alejandro Peña, Pietro Melzi, Aythami Morales, Ruben Tolosana, Julian Fierrez, Ruben Vera-Rodriguez, Javier Ortega-Garcia
Abstract: International organizations urge the protection of our oceans and their ecosystems because of their immeasurable importance to humankind. Since illegal fishing activities, commonly known as IUU fishing, cause irreparable damage to these ecosystems, concerned organizations are pushing to detect and combat IUU fishing practices. The automatic identification system makes it possible to locate the position and trajectory of fishing vessels. In this study we address the task of detecting vessels' fishing gear from the trajectory behavior defined by GPS position data, a useful task for preventing the proliferation of IUU fishing practices. We present a new database of trajectories spanning 7 different fishing gears and analyze them as a time sequence analysis problem. We leverage feature extraction techniques from the online signature verification domain to model vessel trajectories and extract relevant information in the form of both local and global feature sets. We show how, based on these feature sets, the kinematics of vessels using different fishing gears can be effectively classified with common supervised learning algorithms, with accuracies up to 90%. Furthermore, motivated by the concerns raised by several organizations about the adverse impact of bottom trawling on marine biodiversity, we present a binary classification experiment in which we were able to distinguish this kind of fishing gear with an accuracy of 99%. We also illustrate in an ablation study the relevance of factors such as data availability and the sampling period for fishing gear classification. Compared to existing works, we highlight these factors, especially the importance of using sampling periods on the order of minutes instead of hours.
Citations: 0
High-recall calibration monitoring for stereo cameras
IF 3.9 | Zone 4 | Computer Science
Pattern Analysis and Applications Pub Date : 2024-04-13 DOI: 10.1007/s10044-024-01264-1
Jaroslav Moravec, Radim Šára
Abstract: Cameras are the prevalent sensors used for perception in autonomous robotic systems, but their initial calibration may degrade over time due to dynamic factors. This may lead to failure of downstream tasks such as simultaneous localization and mapping (SLAM) or object recognition. Hence, a computationally lightweight process that detects decalibration is of interest. We describe a modification of StOCaMo, an online calibration monitoring procedure for a stereoscopic system. The method uses robust kernel correlation based on epipolar constraints; it validates extrinsic calibration parameters on a single frame with no temporal tracking. In this paper, we present a modified StOCaMo with an improved recall rate on small decalibrations through a confirmation technique based on resampled variance. With fixed parameters learned on a realistic synthetic dataset from CARLA, StOCaMo and its proposed modification were tested on multiple sequences from two real-world datasets: KITTI and EuRoC MAV. The modification improved the recall of StOCaMo by 25% (to 91% and 82%, respectively) and the accuracy by 12% (to 94.7% and 87.5%, respectively), while labeling at most one-third of the input data as uninformative. The upgraded method achieved a Spearman rank correlation of 0.78 between the StOCaMo V-index and downstream SLAM error.
Citations: 0
The limitations of differentiable architecture search
IF 3.9 | Zone 4 | Computer Science
Pattern Analysis and Applications Pub Date : 2024-04-12 DOI: 10.1007/s10044-024-01260-5
Lacharme Guillaume, Cardot Hubert, Lente Christophe, Monmarche Nicolas
Abstract: In this paper, we provide a detailed explanation of the limitations of differentiable architecture search (DARTS). Algorithms based on the DARTS paradigm tend to converge towards degenerate solutions. A degenerate solution corresponds to an architecture with a shallow graph containing mainly skip connections. We have identified 6 sources of error that could explain this phenomenon. Some of these errors can only be partially eliminated. Therefore, we propose an innovative solution to remove degenerate solutions from the search space. We demonstrate the validity of our approach through experiments conducted on the CIFAR10 and CIFAR100 databases. Our code is available at the following link: https://scm.univ-tours.fr/projetspublics/lifat/darts_ibpria_sparcity
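For readers unfamiliar with the DARTS paradigm mentioned in this abstract, the sketch below shows its core idea in plain Python: each edge computes a softmax-weighted mixture of candidate operations, and the final architecture keeps the operation with the largest architecture parameter. It is a toy illustration, not the authors' code; the candidate operations and the names `CANDIDATE_OPS`, `mixed_op`, and `discretize` are hypothetical, and real DARTS works on tensors and treats the zero operation specially when discretizing.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


# Toy candidate operations on a scalar feature (hypothetical stand-ins).
CANDIDATE_OPS = {
    "skip_connect": lambda x: x,
    "scale_2x": lambda x: 2.0 * x,  # stand-in for a parametric op such as a convolution
    "zero": lambda x: 0.0,
}


def mixed_op(x, alphas):
    """Continuous relaxation: softmax(alpha)-weighted sum of all candidate ops."""
    names = list(CANDIDATE_OPS)
    weights = softmax([alphas[name] for name in names])
    return sum(w * CANDIDATE_OPS[name](x) for w, name in zip(weights, names))


def discretize(alphas):
    """Keep the single operation with the largest architecture parameter."""
    return max(alphas, key=alphas.get)


if __name__ == "__main__":
    alphas = {"skip_connect": 2.0, "scale_2x": 0.5, "zero": -1.0}
    print(mixed_op(3.0, alphas))
    print(discretize(alphas))
```

When the parameters of skip connections grow fastest during the search, `discretize` selects them on most edges, which is the kind of shallow, skip-dominated architecture the abstract calls degenerate.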
Citations: 0
Focalize K-NN: an imputation algorithm for time series datasets
IF 3.9 | Zone 4 | Computer Science
Pattern Analysis and Applications Pub Date : 2024-04-07 DOI: 10.1007/s10044-024-01262-3
Ana Almeida, Susana Brás, Susana Sargento, Filipe Cabral Pinto
Abstract: The effective use of time series data is crucial in business decision-making. Temporal data reveals trends and patterns, enabling decision-makers to make informed decisions and prevent potential problems. However, missing values in time series data can interfere with the analysis and lead to inaccurate conclusions. Our work therefore proposes Focalize K-NN, a method that leverages time series properties to impute missing data. This approach shows the benefits of exploiting correlated features and temporal lags to improve the performance of the traditional K-NN imputer. A similar approach could be employed in other methods. We tested this approach with two datasets and various parameter and feature combinations, and observed that it is beneficial in scenarios with disjoint missing patterns. Our findings demonstrate the effectiveness of Focalize K-NN for imputing missing values in time series data. The benefits of our method are most noticeable when there is a high percentage of missing data. However, as the amount of missing data increases, so does the error.
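The general idea of combining a correlated feature with temporal lags in a K-NN imputer can be sketched as follows. This is not the authors' Focalize K-NN algorithm, whose details the abstract does not give; it is a minimal illustration in which the functions `lagged_features` and `knn_impute`, the choice of lags, and the plain averaging of neighbours are all assumptions.

```python
import math


def lagged_features(series, lags):
    """One feature vector per time step: the series value at each lag.

    Indices before the start of the series are clamped to index 0.
    """
    return [[series[max(t - lag, 0)] for lag in lags] for t in range(len(series))]


def knn_impute(target, correlated, k=2, lags=(0, 1)):
    """Fill None entries of `target` from its k nearest observed time steps.

    Nearness is Euclidean distance between lagged values of a fully observed
    `correlated` series. Assumes at least one observed value in `target`.
    """
    feats = lagged_features(correlated, lags)
    observed = [t for t, value in enumerate(target) if value is not None]
    filled = list(target)
    for t, value in enumerate(target):
        if value is not None:
            continue
        nearest = sorted(observed, key=lambda s: math.dist(feats[t], feats[s]))[:k]
        filled[t] = sum(target[s] for s in nearest) / len(nearest)
    return filled


if __name__ == "__main__":
    correlated = [1.0, 2.0, 3.0, 4.0, 5.0]    # hypothetical fully observed sensor
    target = [10.0, 20.0, None, 40.0, 50.0]   # series with one missing value
    print(knn_impute(target, correlated))
```

Because neighbours are chosen from the correlated series rather than from the target itself, the sketch still works when the target has long gaps, as long as the correlated series is observed there.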
Citations: 0
Spatial–temporal attention with graph and general neural network-based sign language recognition
IF 3.9 | Zone 4 | Computer Science
Pattern Analysis and Applications Pub Date : 2024-04-04 DOI: 10.1007/s10044-024-01229-4
Abu Saleh Musa Miah, Md. Al Mehedi Hasan, Yuichi Okuyama, Yoichi Tomioka, Jungpil Shin
Abstract: Automatic sign language recognition (SLR) is a vital part of human–computer interaction and computer vision, facilitating the conversion of hand signs used by individuals with significant hearing and speech impairments into equivalent text or voice. Researchers have recently used hand skeleton joint information instead of image pixels because of problems with illumination and complex backgrounds. However, besides hand information, body motion and facial gestures play an essential role in expressing sign language emotion. A few researchers have also worked on SLR systems using multi-gesture datasets, but their accuracy and time complexity are not sufficient. In light of these limitations, we introduce a spatial and temporal attention model combined with a general neural network designed for the SLR system. The main idea of our architecture is first to construct a fully connected graph to project the skeleton information. We employ self-attention mechanisms to extract insights from node and edge features across spatial and temporal domains. Our architecture splits into three branches: a graph-based spatial branch, a graph-based temporal branch, and a general neural network branch, which together contribute to the final feature integration. Specifically, the spatial branch discerns spatial dependencies, while the temporal branch amplifies temporal dependencies embedded within the sequential hand skeleton data. Further, the general neural network branch enhances the architecture's generalization capabilities, bolstering its robustness. In our evaluation, we used the Mexican Sign Language (MSL) and Pakistani Sign Language (PSL) datasets and the American Sign Language Large Video Dataset (ASLLVD), which comprises 3D joint coordinates for the face, body, and hands, and conducted experiments on individual gestures and their combinations. Our model achieved an accuracy of 99.96% on the MSL dataset, 92.00% on PSL, and 26.00% on ASLLVD, which includes more than 2700 classes. These results, coupled with the model's computational efficiency, underscore its advantage over contemporary methods in the field.
Citations: 0
Tiny polyp detection from endoscopic video frames using vision transformers
IF 3.9 | Zone 4 | Computer Science
Pattern Analysis and Applications Pub Date : 2024-04-04 DOI: 10.1007/s10044-024-01254-3
Entong Liu, Bishi He, Darong Zhu, Yuanjiao Chen, Zhe Xu
Abstract: Deep learning techniques can be effective in helping doctors diagnose gastrointestinal polyps. Currently, polyp detection on video frame sequences containing a large amount of spurious noise suffers from low recall and mean average precision. Moreover, the mean average precision is also low when the polyp target in the video frame has large-scale variability. Therefore, we propose a method for tiny polyp detection from endoscopic video frames using Vision Transformers, named TPolyp. The proposed method uses a cross-stage Swin Transformer as a multi-scale feature extractor to extract deep feature representations of data samples, improves the bidirectional sampling feature pyramid, and integrates the prediction heads of multiple channel self-attention mechanisms. Compared with convolutional neural networks, this approach focuses more on the feature information relevant to tiny object detection and retains relatively deeper semantic information. It additionally improves feature expression and discriminability without increasing the computational complexity. Experimental results show that TPolyp improves detection accuracy by 7%, recall by 7.3%, and average accuracy by 7.5% compared to the YOLOv5 model, and detects tiny objects better in scenarios with blurry artifacts.
Citations: 0
CABF-YOLO: a precise and efficient deep learning method for defect detection on strip steel surface
IF 3.9 | Zone 4 | Computer Science
Pattern Analysis and Applications Pub Date : 2024-04-03 DOI: 10.1007/s10044-024-01252-5
Abstract: Deep learning algorithms have gained widespread usage in defect detection systems. However, existing methods are not satisfactory for large-scale application to surface defect detection on strip steel. In this paper, we propose a precise and efficient detection model for strip steel surface defects, named CABF-YOLO, based on YOLOX. Firstly, we introduce the Triplet Convolutional Coordinate Attention (TCCA) module in the backbone of YOLOX. By factorizing the pooling operation, the TCCA module can accurately capture cross-channel features to identify the location information of defects. Secondly, we design a novel Bidirectional Fusion (BF) strategy in the neck of YOLOX. The BF strategy enhances the fusion of low-level and high-level semantic information to obtain fine-grained information. Lastly, the original bounding box loss function is replaced by the EIoU loss function. In the EIoU loss function, the penalty term is redefined to consider the overlap area, central point, and side length of the required regressions, accelerating convergence and improving localization accuracy. On the benchmark NEU-DET and GC10-DET datasets, the experimental results show that CABF-YOLO achieves superior performance compared with other models and satisfies the real-time detection requirement of industrial production.
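The three penalty ingredients the abstract lists for the EIoU loss (overlap area, central point, side length) can be made concrete with a small single-box sketch. It follows the commonly published EIoU formulation and is not taken from the CABF-YOLO code; the `(x1, y1, x2, y2)` box layout, the function name `eiou_loss`, and the `eps` stabiliser are assumptions.

```python
def eiou_loss(pred, target, eps=1e-9):
    """EIoU loss for two axis-aligned boxes given as (x1, y1, x2, y2)."""
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target

    # Overlap term: intersection over union.
    inter_w = max(0.0, min(px2, tx2) - max(px1, tx1))
    inter_h = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = inter_w * inter_h
    pw, ph = px2 - px1, py2 - py1
    tw, th = tx2 - tx1, ty2 - ty1
    union = pw * ph + tw * th - inter
    iou = inter / (union + eps)

    # Width and height of the smallest box enclosing both boxes.
    cw = max(px2, tx2) - min(px1, tx1)
    ch = max(py2, ty2) - min(py1, ty1)

    # Central-point term: squared centre distance over squared enclosing diagonal.
    dx = (px1 + px2) / 2.0 - (tx1 + tx2) / 2.0
    dy = (py1 + py2) / 2.0 - (ty1 + ty2) / 2.0
    center_term = (dx * dx + dy * dy) / (cw * cw + ch * ch + eps)

    # Side-length terms: squared width and height differences, each normalised
    # by the matching side of the enclosing box.
    width_term = (pw - tw) ** 2 / (cw * cw + eps)
    height_term = (ph - th) ** 2 / (ch * ch + eps)

    return 1.0 - iou + center_term + width_term + height_term


if __name__ == "__main__":
    print(eiou_loss((0.0, 0.0, 2.0, 2.0), (1.0, 1.0, 3.0, 3.0)))
```

Unlike a plain IoU loss, the centre and side-length terms stay informative when the boxes do not overlap, which is the usual argument for faster convergence.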
Citations: 0
A weakly supervised end-to-end framework for semantic segmentation of cancerous area in whole slide image
IF 3.9 | Zone 4 | Computer Science
Pattern Analysis and Applications Pub Date : 2024-04-02 DOI: 10.1007/s10044-024-01251-6
Yanbo Feng, Adel Hafiane, Hélène Laurent
Abstract: The segmentation of pathological images is an indispensable part of cancer diagnosis and grading, providing doctors with the location and quantitative analysis of pathologically altered tissue. However, a pathological whole slide image (WSI) generally has gigapixel size and large region-level objects to be segmented. Extracting patches from the WSI can address the limitation of computer memory, but the integrity of the target is affected as a result. Moreover, supervised learning methods require manually annotated labels for training, which is laborious and time-consuming. Thus, we studied a novel weakly supervised learning (WSL)-based end-to-end framework for semantic segmentation of cancerous areas in WSIs. The proposed framework is based on block-level segmentation by a convolutional neural network (CNN), where the CNN is required to integrate a global average pooling layer and a single fully connected layer, forming a WSL-CNN. Class activation maps and a dense conditional random field (DenseCRF) are adapted to realize pixel-level segmentation of the cancerous area in each patch, which is incorporated into the classification process of the WSL-CNN. The hierarchical double use of DenseCRF effectively improves the precision of semantic segmentation. A region-based annotation method and a flexible method of constructing the training dataset are proposed to reduce the annotation workload. Experiments show that block-level segmentation by CNNs performs better than pixel-level segmentation by fully convolutional networks; ResNet50 is the best, achieving an F1 score of 0.87426, a Jaccard score of 0.78079, a recall of 0.94251 and a precision of 0.82182. The proposed framework can effectively refine the block-level prediction into semantic segmentation without pixel-level labels. The precision of all tested CNNs improved in the experiments, with WSL-ResNet50 achieving an F1 score of 0.90630, a Jaccard score of 0.83230, a recall of 0.92051 and a precision of 0.89789. We propose a complete end-to-end framework, including the specific structure of the neural network, the construction of the training dataset, the prediction method using the neural network, and the post-processing. CNN-like architectures can be widely transplanted into this framework to realize semantic segmentation, which to a certain extent solves the problem of insufficient labels for large-scale medical images.
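The four scores quoted in this abstract (F1, Jaccard, recall, precision) are standard overlap metrics for binary masks. The sketch below computes them for one flattened mask pair; the function name `segmentation_metrics` and the toy masks are hypothetical, and the abstract does not say how the authors aggregated scores across images, so their reported values need not satisfy the single-mask identities that hold here.

```python
def segmentation_metrics(pred, truth):
    """Precision, recall, F1 and Jaccard for two flat binary masks (0/1 values)."""
    tp = sum(1 for p, t in zip(pred, truth) if p and t)
    fp = sum(1 for p, t in zip(pred, truth) if p and not t)
    fn = sum(1 for p, t in zip(pred, truth) if not p and t)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * tp / (2 * tp + fp + fn) if tp + fp + fn else 0.0
    jaccard = tp / (tp + fp + fn) if tp + fp + fn else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "jaccard": jaccard}


if __name__ == "__main__":
    predicted = [1, 1, 1, 0, 0, 0]  # hypothetical flattened prediction
    reference = [1, 1, 0, 1, 0, 0]  # hypothetical flattened ground truth
    print(segmentation_metrics(predicted, reference))
```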
Citations: 0