A Modified Hierarchical Vision Transformer Model for Poultry Disease Detection

IF 2 | Zone 4, Computer Science | JCR Q3: COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

IET Image Processing Pub Date : 2025-05-25 DOI:10.1049/ipr2.70115

Michael Agbo Tettey Soli, Dacosta Agyei, Waliyyullah Umar Bandawu, Leonard Mensah Boante, Justice Kwame Appati


Citations: 0

Abstract

Poultry production, which is critical to global food security, faces challenges from diseases such as Newcastle disease, salmonellosis, and coccidiosis, which cause economic losses and raise public health concerns. Current detection methods, such as manual inspection and PCR-based procedures, are time-consuming and costly, which limits their scalability. Convolutional neural networks (CNNs) such as ResNet50 and VGG16 have shown promise for automating disease identification, but they struggle with generalization and with capturing fine-grained local and global information. In this study, we propose a deep learning solution based on a hierarchical vision transformer (HViT) model to detect poultry diseases from fecal images. We compare the performance of our HViT model with traditional CNNs (ResNet50, VGG16), lightweight architectures (MobileNetV3_Large_100, XceptionNet), and a standard vision transformer (ViT-B/16). The experimental results demonstrate that our HViT model outperforms the other models, achieving an average validation accuracy of 90.90% with a validation loss of 0.2647. The HViT's ability to balance local and global feature recognition highlights its potential as a scalable solution for real-time poultry disease detection. These findings underscore the significance of hierarchical attention in addressing complex image analysis tasks, with implications for broader applications in agriculture and medical imaging.
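The two reported figures, validation accuracy and validation loss, can be illustrated with a minimal sketch. It assumes the loss is the mean cross-entropy over a softmax output, which is the usual choice for image classification but is not stated in the abstract. The class list, the function names `softmax` and `validation_metrics`, and the toy logits are hypothetical and are not taken from the paper.

```python
import math

# Hypothetical class order; the abstract names these three diseases,
# but the paper's actual label set and ordering are not given here.
CLASSES = ["newcastle", "salmonella", "coccidiosis"]


def softmax(logits):
    """Convert raw model scores into probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def validation_metrics(all_logits, labels):
    """Return (accuracy, mean cross-entropy loss) over a validation set.

    all_logits: one list of per-class scores per image.
    labels: the true class index for each image.
    """
    correct = 0
    total_loss = 0.0
    for logits, label in zip(all_logits, labels):
        probs = softmax(logits)
        predicted = max(range(len(probs)), key=probs.__getitem__)
        correct += int(predicted == label)
        # Cross-entropy for one sample: negative log-probability of the true class.
        total_loss += -math.log(probs[label])
    count = len(labels)
    return correct / count, total_loss / count


# Toy scores for four images (made-up numbers, not experimental data).
toy_logits = [
    [2.0, 0.5, 0.1],
    [0.2, 1.5, 0.3],
    [0.1, 0.2, 0.3],
    [1.0, 0.9, 0.2],
]
toy_labels = [0, 1, 2, 1]

accuracy, loss = validation_metrics(toy_logits, toy_labels)
print(f"accuracy={accuracy:.2%} loss={loss:.4f}")
```

On the toy data, three of the four predictions match their labels, so the accuracy is 75%. An "average" validation accuracy such as the one reported would then be the mean of this quantity over several epochs, folds, or runs; which of these the authors averaged over is not specified in the abstract.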


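Source journal

IET Image Processing (Engineering and Technology: Engineering, Electrical and Electronic)

CiteScore: 5.40
Self-citation rate: 8.70%
Articles published: 282
Review time: 6 months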

Journal description: The IET Image Processing journal encompasses research areas related to the generation, processing and communication of visual information. The focus of the journal is the coverage of the latest research results in image and video processing, including image generation and display, enhancement and restoration, segmentation, colour and texture analysis, coding and communication, implementations and architectures as well as innovative applications. Principal topics include:
Generation and Display - Imaging sensors and acquisition systems, illumination, sampling and scanning, quantization, colour reproduction, image rendering, display and printing systems, evaluation of image quality.
Processing and Analysis - Image enhancement, restoration, segmentation, registration, multispectral, colour and texture processing, multiresolution processing and wavelets, morphological operations, stereoscopic and 3-D processing, motion detection and estimation, video and image sequence processing.
Implementations and Architectures - Image and video processing hardware and software, design and construction, architectures and software, neural, adaptive, and fuzzy processing.
Coding and Transmission - Image and video compression and coding, compression standards, noise modelling, visual information networks, streamed video.
Retrieval and Multimedia - Storage of images and video, database design, image retrieval, video annotation and editing, mixed media incorporating visual information, multimedia systems and applications, image and video watermarking, steganography.
Applications - Innovative application of image and video processing technologies to any field, including life sciences, earth sciences, astronomy, document processing and security.