Journal of Imaging: Latest Articles

Comparison of Visual and Quantra Software Mammographic Density Assessment According to BI-RADS® in 2D and 3D Images.
IF 2.7
Journal of Imaging Pub Date : 2024-09-23 DOI: 10.3390/jimaging10090238
Francesca Morciano, Cristina Marcazzan, Rossella Rella, Oscar Tommasini, Marco Conti, Paolo Belli, Andrea Spagnolo, Andrea Quaglia, Stefano Tambalo, Andreea Georgiana Trisca, Claudia Rossati, Francesca Fornasa, Giovanna Romanucci
Abstract: Mammographic density (MD) assessment is subject to inter- and intra-observer variability. An automated method, such as Quantra software, could be a useful tool for an objective and reproducible MD assessment. Our purpose was to evaluate the performance of Quantra software in assessing MD, according to BI-RADS® Atlas Fifth Edition recommendations, verifying the degree of agreement with the gold standard, given by the consensus of two breast radiologists. A total of 5009 screening examinations were evaluated by two radiologists and analysed by Quantra software to assess MD. The agreement between the three assigned values was expressed as intraclass correlation coefficients (ICCs). The agreement between the software and the two readers (R1 and R2) was moderate, with ICC values of 0.725 and 0.713, respectively. Better agreement was demonstrated between the software's assessment and the average score of the two radiologists, with an index of 0.793, which reflects a good correlation. Quantra software appears to be a promising tool for supporting radiologists in MD assessment and could soon be part of a personalised screening protocol. However, some fine-tuning is needed to improve its accuracy, reduce its tendency to overestimate, and ensure it excludes high-density structures from its assessment.

Open access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11433353/pdf/
Citations: 0
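The agreement metric used in the study above, the intraclass correlation coefficient, can be sketched in plain NumPy. This is a generic ICC(2,1) (two-way random effects, single rater, absolute agreement) for illustration only; the paper does not specify which ICC form its evaluation used.

```python
import numpy as np

def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, single-rater, absolute agreement.

    ratings: (n_subjects, k_raters) array of scores (e.g. BI-RADS density
    categories mapped to 1..4).
    """
    ratings = np.asarray(ratings, dtype=float)
    n, k = ratings.shape
    grand = ratings.mean()
    row_means = ratings.mean(axis=1)   # per-subject means
    col_means = ratings.mean(axis=0)   # per-rater means

    # Mean squares from the two-way ANOVA decomposition.
    ms_rows = k * np.sum((row_means - grand) ** 2) / (n - 1)
    ms_cols = n * np.sum((col_means - grand) ** 2) / (k - 1)
    sse = np.sum(
        (ratings - row_means[:, None] - col_means[None, :] + grand) ** 2
    )
    ms_err = sse / ((n - 1) * (k - 1))

    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )
```

Perfect agreement between raters yields 1.0, and disagreement pulls the value toward 0, matching the 0.7-0.8 "moderate to good" interpretation quoted in the abstract.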
Efficient End-to-End Convolutional Architecture for Point-of-Gaze Estimation.
IF 2.7
Journal of Imaging Pub Date : 2024-09-23 DOI: 10.3390/jimaging10090237
Casian Miron, George Ciubotariu, Alexandru Păsărică, Radu Timofte
Abstract: Point-of-gaze estimation is part of a larger set of tasks aimed at improving user experience, providing business insights, or facilitating interactions with different devices. There has been growing interest in this task, particularly due to the need for upgrades to e-meeting platforms during the pandemic, when on-site activities were no longer possible for educational institutions, corporations, and other organizations. Current research advancements focus on more complex methodologies for data collection and task implementation, creating a gap that we intend to address with our contributions. Thus, we introduce a methodology for data acquisition that shows promise due to its nonrestrictive and straightforward nature, notably increasing the yield of collected data without compromising diversity or quality. Additionally, we present a novel and efficient convolutional neural network specifically tailored for calibration-free point-of-gaze estimation that outperforms current state-of-the-art methods on the MPIIFaceGaze dataset by a substantial margin and sets a strong baseline on our own data.

Open access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11433013/pdf/
Citations: 0
Method for Augmenting Side-Scan Sonar Seafloor Sediment Image Dataset Based on BCEL1-CBAM-INGAN.
IF 2.7
Journal of Imaging Pub Date : 2024-09-20 DOI: 10.3390/jimaging10090233
Haixing Xia, Yang Cui, Shaohua Jin, Gang Bian, Wei Zhang, Chengyang Peng
Abstract: In this paper, a method for augmenting samples of side-scan sonar seafloor sediment images based on CBAM-BCEL1-INGAN is proposed, aiming to address the difficulty of acquiring and labelling datasets, as well as the insufficient diversity and quantity of data samples. First, a Convolutional Block Attention Module (CBAM) is integrated into the residual blocks of the INGAN generator to enhance the learning of specific attributes and improve the quality of the generated images. Second, a BCEL1 loss function (combining binary cross-entropy and L1 loss) is introduced into the discriminator, enabling it to attend to both global image consistency and finer distinctions for better generation results. Finally, augmented samples are fed into an AlexNet classifier to verify their authenticity. Experimental results demonstrate the method's strong performance in generating images of coarse sand, gravel, and bedrock, as evidenced by significant improvements in the Fréchet Inception Distance (FID) and Inception Score (IS). The introduction of the CBAM and the BCEL1 loss function notably enhances the quality and detail of the generated images. Moreover, classification experiments with the AlexNet classifier show an increase in the recognition rate from 90.5% using only INGAN-generated images of bedrock to 97.3% using images augmented with our method, a 6.8-percentage-point improvement. Additionally, the classification accuracy for the bedrock class improves by 5.2% when images enhanced by the presented method are added to the training set, 2.7% higher than with simple augmentation. This validates the effectiveness of the method for generating seafloor sediment images, partially alleviating the scarcity of side-scan sonar seafloor sediment image data.

Open access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11433333/pdf/
Citations: 0
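The BCEL1 loss described above is a weighted sum of binary cross-entropy (global real/fake consistency) and L1 distance (finer per-pixel distinctions). A minimal NumPy sketch of the combination; the relative weight `l1_weight` is an assumed hyperparameter, and the paper's actual discriminator of course operates on image tensors inside the GAN training loop.

```python
import numpy as np

def bce_l1_loss(pred, target, l1_weight=1.0, eps=1e-7):
    """BCEL1-style loss: binary cross-entropy plus a weighted L1 term.

    pred:   predicted probabilities in [0, 1]
    target: ground-truth labels in {0, 1} (or soft targets)
    """
    pred = np.clip(pred, eps, 1 - eps)  # avoid log(0)
    bce = -np.mean(
        target * np.log(pred) + (1 - target) * np.log(1 - pred)
    )
    l1 = np.mean(np.abs(pred - target))
    return bce + l1_weight * l1
```

A perfect prediction drives both terms toward zero, while an uncertain prediction (e.g. 0.5 against a target of 1) is penalised by both components.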
A Multi-Task Model for Pulmonary Nodule Segmentation and Classification.
IF 2.7
Journal of Imaging Pub Date : 2024-09-20 DOI: 10.3390/jimaging10090234
Tiequn Tang, Rongfu Zhang
Abstract: In the computer-aided diagnosis of lung cancer, the automatic segmentation of pulmonary nodules and the classification of benign and malignant tumors are two fundamental tasks. However, deep learning models, typically designed for a single task, often overlook the potential benefit of task correlations for improving performance. We therefore propose a multi-task network (MT-Net) that integrates a shared backbone architecture and a prediction distillation structure for the simultaneous segmentation and classification of pulmonary nodules. The model comprises a coarse segmentation subnetwork (Coarse Seg-net), a cooperative classification subnetwork (Class-net), and a cooperative segmentation subnetwork (Fine Seg-net). Coarse Seg-net and Fine Seg-net share an identical structure, with Coarse Seg-net providing prior location information to the subsequent Fine Seg-net and Class-net, thereby boosting both segmentation and classification performance. We analysed the model quantitatively and qualitatively on the public LIDC-IDRI dataset. The model achieves a Dice similarity coefficient (DI) of 83.2% for pulmonary nodule segmentation and an accuracy (ACC) of 91.9% for benign and malignant nodule classification, competitive with other state-of-the-art methods. The experimental results demonstrate that a unified model that leverages the potential correlation between tasks can improve the performance of both pulmonary nodule segmentation and classification.

Open access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11433280/pdf/
Citations: 0
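The Dice similarity coefficient (DI) reported above is the standard overlap measure 2|A ∩ B| / (|A| + |B|) between predicted and ground-truth masks; a minimal NumPy version for binary masks:

```python
import numpy as np

def dice_coefficient(pred_mask, true_mask, eps=1e-7):
    """Dice similarity coefficient between two binary masks.

    Returns 1.0 for identical masks, 0.0 for disjoint ones; eps guards
    against division by zero when both masks are empty.
    """
    pred = np.asarray(pred_mask, dtype=bool)
    true = np.asarray(true_mask, dtype=bool)
    intersection = np.logical_and(pred, true).sum()
    return (2.0 * intersection + eps) / (pred.sum() + true.sum() + eps)
```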
Convolutional Neural Network-Machine Learning Model: Hybrid Model for Meningioma Tumour and Healthy Brain Classification.
IF 2.7
Journal of Imaging Pub Date : 2024-09-20 DOI: 10.3390/jimaging10090235
Simona Moldovanu, Gigi Tăbăcaru, Marian Barbu
Abstract: This paper presents a hybrid study of convolutional neural networks (CNNs), machine learning (ML), and transfer learning (TL) in the context of brain magnetic resonance imaging (MRI). The anatomy of the brain is very complex, and a tumour can form in any part of it. MRI generates cross-sectional images in which radiologists can detect abnormalities. When a tumour is very small, it can be undetectable to the human visual system, necessitating alternative analysis with AI tools. As is widely known, CNNs explore the structure of an image and provide features at the SoftMax fully connected (SFC) layer, from which the classification into the input classes is established. Two comparison studies for the classification of meningioma tumours and healthy brains are presented: (i) classifying MRI images using an original CNN and two pre-trained CNNs, DenseNet169 and EfficientNetV2B0; (ii) determining which CNN-ML combination yields the most accurate classification when SoftMax is replaced with one of three ML models: Random Forest (RF), K-Nearest Neighbors (KNN), or Support Vector Machine (SVM). In binary classification of tumours versus healthy brains, the EfficientNetB0-SVM combination achieves an accuracy of 99.5% on the test dataset. The results were generalised, and overfitting was prevented by using the bagging ensemble method.

Open access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11433632/pdf/
Citations: 0
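The second study above swaps the network's SoftMax layer for a classical ML classifier applied to CNN features. A toy illustration of that hand-off, using a minimal k-nearest-neighbours vote on synthetic feature vectors; in practice the features would come from the penultimate layer of the trained CNN, and a library implementation (RF, KNN, or SVM) would be used instead of this sketch.

```python
import numpy as np

def knn_predict(train_feats, train_labels, query_feats, k=3):
    """Classify feature vectors by majority vote among the k nearest
    training features (Euclidean distance), standing in for SoftMax."""
    train_feats = np.asarray(train_feats, dtype=float)
    train_labels = np.asarray(train_labels)
    preds = []
    for q in np.atleast_2d(query_feats):
        dists = np.linalg.norm(train_feats - q, axis=1)
        nearest = train_labels[np.argsort(dists)[:k]]
        values, counts = np.unique(nearest, return_counts=True)
        preds.append(values[np.argmax(counts)])
    return np.array(preds)
```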
Historical Blurry Video-Based Face Recognition.
IF 2.7
Journal of Imaging Pub Date : 2024-09-20 DOI: 10.3390/jimaging10090236
Lujun Zhai, Suxia Cui, Yonghui Wang, Song Wang, Jun Zhou, Greg Wilsbacher
Abstract: Face recognition is a widely used computer vision task that plays an increasingly important role in user authentication systems, security systems, and consumer electronics. Models for most current applications are based on high-definition digital cameras. In this paper, we focus on digital images derived from historical motion picture films, which often have poorer resolution than modern digital imagery, making face detection more challenging. To approach this problem, we first propose a trunk-branch concatenated multi-task cascaded convolutional neural network (TB-MTCNN), which efficiently extracts facial features from blurry historical films by combining trunk and branch networks and employing kernels of various sizes to enrich the multi-scale receptive field. Next, we build a deep neural network-integrated object-tracking algorithm to compensate for failed recognition over one or more video frames. The framework combines simple online and real-time tracking with deep data association (Deep SORT) and TB-MTCNN with a residual neural network (ResNet) model. Finally, a state-of-the-art image restoration method is employed to reduce the effect of noise and blurriness. The experimental results show that the proposed joint face recognition and tracking network can significantly reduce missed recognitions in historical motion picture film frames.

Open access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11433217/pdf/
Citations: 0
Enhancing Deep Learning Model Explainability in Brain Tumor Datasets Using Post-Heuristic Approaches.
IF 2.7
Journal of Imaging Pub Date : 2024-09-18 DOI: 10.3390/jimaging10090232
Konstantinos Pasvantis, Eftychios Protopapadakis
Abstract: The application of deep learning models to medical diagnosis has shown considerable efficacy in recent years. Nevertheless, a notable limitation is the inherent lack of explainability in the decision-making process. This study addresses that constraint by enhancing interpretability robustness. The primary focus is on refining the explanations generated by the LIME library and its image explainer through post-processing mechanisms based on scenario-specific rules. Multiple experiments were conducted on publicly accessible brain tumor detection datasets. The proposed post-heuristic approach demonstrates significant advancements, yielding more robust and concrete results in the context of medical diagnosis.

Open access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11433079/pdf/
Citations: 0
Three-Dimensional Reconstruction of Indoor Scenes Based on Implicit Neural Representation.
IF 2.7
Journal of Imaging Pub Date : 2024-09-16 DOI: 10.3390/jimaging10090231
Zhaoji Lin, Yutao Huang, Li Yao
Abstract: Reconstructing 3D indoor scenes from 2D images has always been an important task in computer vision and graphics. For indoor scenes, traditional 3D reconstruction methods suffer from missing surface details, poor reconstruction of large planar textures and unevenly lit areas, and many incorrectly reconstructed floating-debris artifacts in the resulting models. This paper proposes a 3D reconstruction method for indoor scenes that combines neural radiance fields (NeRFs) and signed distance function (SDF) implicit representations. The volume density of the NeRF provides geometric information for the SDF field, and the learning of geometric shapes and surfaces is strengthened by adding an adaptive normal prior to the optimization. The method not only preserves the high-quality geometric information of the NeRF but also uses the SDF to generate an explicit mesh with a smooth surface, significantly improving the reconstruction quality of large planar textures and unevenly lit areas in indoor scenes. In addition, a new regularization term constrains the weight distribution toward an ideal unimodal compact form, thereby alleviating uneven density distributions and removing floating debris from the final model. Experiments show that the method's 3D reconstruction results on the ScanNet, Hypersim, and Replica datasets outperform the state-of-the-art methods.

Open access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11433400/pdf/
Citations: 0
The Role of Cardiovascular Imaging in the Diagnosis of Athlete's Heart: Navigating the Shades of Grey.
IF 2.7
Journal of Imaging Pub Date : 2024-09-14 DOI: 10.3390/jimaging10090230
Nima Baba Ali, Sogol Attaripour Esfahani, Isabel G Scalia, Juan M Farina, Milagros Pereyra, Timothy Barry, Steven J Lester, Said Alsidawi, David E Steidley, Chadi Ayoub, Stefano Palermi, Reza Arsanjani
Abstract: Athlete's heart (AH) represents the heart's remarkable ability to adapt structurally and functionally to prolonged and intensive athletic training. Characterized by increased left ventricular (LV) wall thickness, enlarged cardiac chambers, and augmented cardiac mass, AH typically maintains or enhances systolic and diastolic function. Despite the positive health implications, these adaptations can obscure the difference between benign physiological changes and early manifestations of cardiac pathologies such as dilated cardiomyopathy (DCM), hypertrophic cardiomyopathy (HCM), and arrhythmogenic cardiomyopathy (ACM). This article reviews the imaging characteristics of AH across various modalities, emphasizing echocardiography, cardiac magnetic resonance (CMR), and cardiac computed tomography as the primary tools for evaluating cardiac function and distinguishing physiological adaptations from pathological conditions. The findings highlight the need for precise diagnostic criteria and advanced imaging techniques to ensure accurate differentiation, preventing misdiagnosis and its associated risks, such as sudden cardiac death (SCD). Understanding these adaptations and employing the appropriate imaging methods are crucial for the effective management and health optimization of athletes.

Open access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11433181/pdf/
Citations: 0
Leveraging Perspective Transformation for Enhanced Pothole Detection in Autonomous Vehicles.
IF 2.7
Journal of Imaging Pub Date : 2024-09-14 DOI: 10.3390/jimaging10090227
Abdalmalek Abu-Raddaha, Zaid A El-Shair, Samir Rawashdeh
Abstract: Road conditions, often degraded by insufficient maintenance or adverse weather, contribute significantly to accidents, exacerbated by the limited human reaction time to sudden hazards like potholes. Early detection of distant potholes is crucial for timely corrective actions, such as reducing speed or avoiding obstacles, to mitigate vehicle damage and accidents. This paper introduces a novel approach that utilizes perspective transformation to enhance pothole detection at different distances, focusing particularly on distant potholes. Perspective transformation improves the visibility and clarity of potholes by virtually bringing them closer and enlarging their features, which is particularly beneficial given the fixed-size input requirement of object detection networks, typically much smaller than the raw image resolutions captured by cameras. Our method automatically identifies the region of interest (ROI), the road area, and calculates the corner points used to generate a perspective transformation matrix. This matrix is applied to all images and their corresponding bounding-box labels, enhancing the representation of potholes in the dataset. The approach significantly boosts detection performance with YOLOv5-small, achieving a 43% improvement in average precision (AP) at intersection-over-union thresholds of 0.5 to 0.95 for single-class evaluation, and improvements of 34%, 63%, and 194% for near, medium, and far potholes, respectively, after categorizing them by distance. To the best of our knowledge, this is the first work to employ perspective transformation specifically for enhancing the detection of distant potholes.

Open access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11432791/pdf/
Citations: 0
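The core operation above, a perspective (homography) transform mapping the four detected ROI corners of the road onto a rectangle, can be sketched in NumPy by solving the standard eight-unknown linear system, the same computation OpenCV's `getPerspectiveTransform` performs. The corner coordinates in the test are made-up values standing in for the ROI corners the paper detects automatically.

```python
import numpy as np

def perspective_matrix(src, dst):
    """Solve for the 3x3 perspective transform H mapping four source
    points to four destination points, with H[2, 2] fixed to 1."""
    A, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # Each correspondence contributes two rows of the linear system.
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.extend([u, v])
    h = np.linalg.solve(np.array(A, dtype=float), np.array(b, dtype=float))
    return np.append(h, 1.0).reshape(3, 3)

def warp_point(H, pt):
    """Apply H to a single (x, y) point via homogeneous coordinates."""
    x, y, w = H @ np.array([pt[0], pt[1], 1.0])
    return x / w, y / w
```

Warping the trapezoidal road region onto a full rectangle is what "virtually brings distant potholes closer": the same matrix is applied to the images and to the bounding-box corner coordinates so the labels stay aligned.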