Array Pub Date: 2024-04-26 DOI: 10.1016/j.array.2024.100345
Al Amin Biswas
{"title":"A comprehensive review of explainable AI for disease diagnosis","authors":"Al Amin Biswas","doi":"10.1016/j.array.2024.100345","DOIUrl":"https://doi.org/10.1016/j.array.2024.100345","url":null,"abstract":"<div><p>Nowadays, artificial intelligence (AI) has been utilized in several domains of the healthcare sector. Despite its effectiveness in healthcare settings, its massive adoption remains limited due to the transparency issue, which is considered a significant obstacle. To achieve the trust of end users, it is necessary to explain the AI models' output. Therefore, explainable AI (XAI) has become apparent as a potential solution by providing transparent explanations of the AI models' output. In this review paper, the primary aim is to review articles that are mainly related to machine learning (ML) or deep learning (DL) based human disease diagnoses, and the model's decision-making process is explained by XAI techniques. To do that, two journal databases (Scopus and the IEEE Xplore Digital Library) were thoroughly searched using a few predetermined relevant keywords. The PRISMA guidelines have been followed to determine the papers for the final analysis, where studies that did not meet the requirements were eliminated. Finally, 90 Q1 journal articles are selected for in-depth analysis, covering several XAI techniques. Then, the summarization of the several findings has been presented, and appropriate responses to the proposed research questions have been outlined. 
In addition, several challenges related to XAI in the case of human disease diagnosis and future research directions in this sector are presented.</p></div>","PeriodicalId":8417,"journal":{"name":"Array","volume":"22 ","pages":"Article 100345"},"PeriodicalIF":0.0,"publicationDate":"2024-04-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2590005624000110/pdfft?md5=e1abc0e28d1ca274ca3562e4e862960b&pid=1-s2.0-S2590005624000110-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140816446","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Array Pub Date: 2024-04-23 DOI: 10.1016/j.array.2024.100346
Salman Fazle Rabby, Muhammad Abdullah Arafat, Taufiq Hasan
{"title":"BT-Net: An end-to-end multi-task architecture for brain tumor classification, segmentation, and localization from MRI images","authors":"Salman Fazle Rabby , Muhammad Abdullah Arafat , Taufiq Hasan","doi":"10.1016/j.array.2024.100346","DOIUrl":"10.1016/j.array.2024.100346","url":null,"abstract":"<div><p>Brain tumors are severe medical conditions that can prove fatal if not detected and treated early. Radiologists often use MRI and CT scan imaging to diagnose brain tumors early. However, a shortage of skilled radiologists to analyze medical images can be problematic in low-resource healthcare settings. To overcome this issue, deep learning-based automatic analysis of medical images can be an effective tool for assistive diagnosis. Conventional methods generally focus on developing specialized algorithms to address a single aspect, such as segmentation, classification, or localization of brain tumors. In this work, we propose a novel multi-task network, modified from the conventional VGG16 and concatenated with a U-Net variant, that can simultaneously perform segmentation, classification, and localization using the same architecture. We trained the classification branch using the <em>Brain Tumor MRI Dataset</em>, and the segmentation branch using the <em>Brain Tumor Segmentation</em> dataset. The integration of our method’s output can aid in simultaneous classification, segmentation, and localization of four types of brain tumors in MRI scans. The proposed multi-task framework achieved 97% accuracy in classification and a dice similarity score of 0.86 for segmentation. In addition, the method shows higher computational efficiency compared to existing methods. 
Our method can be a promising tool for assistive diagnosis in low-resource healthcare settings where skilled radiologists are scarce.</p></div>","PeriodicalId":8417,"journal":{"name":"Array","volume":"22 ","pages":"Article 100346"},"PeriodicalIF":0.0,"publicationDate":"2024-04-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2590005624000122/pdfft?md5=36c2c4383abffb72e6a44ae52a4e5a0c&pid=1-s2.0-S2590005624000122-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140769030","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Array Pub Date: 2024-04-17 DOI: 10.1016/j.array.2024.100344
Hong Fang, Dahao Liang, Weiyu Xiang
{"title":"Single-Stage Extensive Semantic Fusion for multi-modal sarcasm detection","authors":"Hong Fang , Dahao Liang , Weiyu Xiang","doi":"10.1016/j.array.2024.100344","DOIUrl":"https://doi.org/10.1016/j.array.2024.100344","url":null,"abstract":"<div><p>With the rise of social media and online interactions, there is a growing need for analytical models capable of understanding the nuanced, multi-modal communication inherent in platforms, especially for detecting sarcasm. Existing research employs multi-stage models along with extensive semantic information extraction and single-modal encoders. These models often struggle to align and fuse multi-modal representations efficiently. Addressing these shortcomings, we introduce the Single-Stage Extensive Semantic Fusion (SSESF) model, designed to concurrently process multi-modal inputs in a unified framework, which performs encoding and fusing in the same architecture with shared parameters. A projection mechanism is employed to overcome the challenges posed by the diversity of inputs and the integration of a wide range of semantic information. Additionally, we design a multi-objective optimization that enhances the model’s ability to learn latent semantic nuances with supervised contrastive learning. The unified framework emphasizes the interaction and integration of multi-modal data, while multi-objective optimization preserves the complexity of semantic nuances for sarcasm detection. Experimental results on a public multi-modal sarcasm dataset demonstrate the superiority of our model, achieving state-of-the-art performance. 
The findings highlight the model’s capability to integrate extensive semantic information, demonstrating its effectiveness in the simultaneous interpretation and fusion of multi-modal data for sarcasm detection.</p></div>","PeriodicalId":8417,"journal":{"name":"Array","volume":"22 ","pages":"Article 100344"},"PeriodicalIF":0.0,"publicationDate":"2024-04-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2590005624000109/pdfft?md5=5136c2ac1ad918984ba24754918dce68&pid=1-s2.0-S2590005624000109-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140619309","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Array Pub Date: 2024-04-09 DOI: 10.1016/j.array.2024.100343
Qingsong Huang, Junqing Fan, Haoran Xu, Wei Han, Xiaohui Huang, Yunliang Chen
{"title":"AFENet: Attention-guided feature enhancement network and a benchmark for low-altitude UAV sewage outfall detection","authors":"Qingsong Huang , Junqing Fan , Haoran Xu , Wei Han , Xiaohui Huang , Yunliang Chen","doi":"10.1016/j.array.2024.100343","DOIUrl":"https://doi.org/10.1016/j.array.2024.100343","url":null,"abstract":"<div><p>Inspecting sewage outfalls into rivers is important for the precise management of the ecological environment because outfalls are the last gate through which pollutants enter the river. Unmanned Aerial Vehicles (UAVs) are maneuverable, capture high-resolution images, and have become an important means of inspecting sewage outfalls. Although UAVs are widely used in daily sewage outfall inspections, these inspections still rely on manual interpretation, and a corresponding low-altitude sewage outfall image dataset is lacking. Meanwhile, because of the sparse spatial distribution of sewage outfalls, problems such as scarce labeled sample data, complex background types, and weak objects are also prominent. In order to promote the inspection of sewage outfalls, this paper proposes a low-altitude sewage outfall object detection dataset, namely UAV-SOD, and an attention-guided feature enhancement network, namely AFENet. The UAV-SOD dataset features high resolution, complex backgrounds, and diverse objects. Some of the outfall objects are limited by multi-scale, single-colored, and weak feature responses, leading to low detection accuracy. To localize these objects effectively, AFENet first uses the global context block (GCB) to jointly explore valuable global and local information, and then the region of interest (RoI) attention module (RAM) is used to explore the relationships between RoI features. 
Experimental results show that the proposed method achieves better detection performance on the proposed UAV-SOD dataset than representative state-of-the-art two-stage object detection methods.</p></div>","PeriodicalId":8417,"journal":{"name":"Array","volume":"22 ","pages":"Article 100343"},"PeriodicalIF":0.0,"publicationDate":"2024-04-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2590005624000092/pdfft?md5=c8639340099f7cc1f4ba21449477dc2a&pid=1-s2.0-S2590005624000092-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140551183","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Small group pedestrian crossing behaviour prediction using temporal angular 2D skeletal pose","authors":"Hanugra Aulia Sidharta , Berlian Al Kindhi , Eko Mulyanto Yuniarno , Mauridhi Hery Purnomo","doi":"10.1016/j.array.2024.100341","DOIUrl":"10.1016/j.array.2024.100341","url":null,"abstract":"<div><p>Pedestrians are classified as Vulnerable Road Users (VRUs) because they lack protective equipment, which can make an accident fatal for them. Accidents can happen whenever a pedestrian is on the road, especially when crossing it. To ensure pedestrian safety, it is necessary to understand and predict pedestrian behaviour when crossing the road. We propose pedestrian intention prediction using a 2D pose estimation approach with temporal angle as a feature. Based on visual observation of the Joint Attention in Autonomous Driving (JAAD) dataset, we found that pedestrians tend to walk together in small groups while waiting to cross, and that these groups disband on the opposite side of the road. Thus, we propose to perform prediction on small groups of pedestrians; based on pedestrian statistical data, we define a small group as consisting of 4 pedestrians. Another problem is that 2D pose estimation processes each pedestrian index individually, which makes pedestrian indices ambiguous across consecutive frames. We propose Multi Input Single Output (MISO), which can process multiple pedestrians together and uses a summation layer at the end of the model to resolve the ambiguous pedestrian index problem without tracking each pedestrian. 
The performance of our proposed model achieves model accuracy of 0.9306 with prediction performance of 0.8317.</p></div>","PeriodicalId":8417,"journal":{"name":"Array","volume":"22 ","pages":"Article 100341"},"PeriodicalIF":0.0,"publicationDate":"2024-03-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2590005624000079/pdfft?md5=255bf8dee6ebbdca068e698762cee29a&pid=1-s2.0-S2590005624000079-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140091770","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Enhancing object detection in low-resolution images via frequency domain learning","authors":"Shuaiqiang Gao , Yunliang Chen , Ningning Cui , Wenjian Qin","doi":"10.1016/j.array.2024.100342","DOIUrl":"https://doi.org/10.1016/j.array.2024.100342","url":null,"abstract":"<div><p>To meet the requirements of navigation devices in terms of weight, power consumption, and size, it is necessary to capture low-resolution images or transmit low-resolution images to a server for object detection. However, due to the lack of details and frequency information, even state-of-the-art detection methods face challenges in accurately identifying objects. To tackle this issue, we introduce a novel upsampling method termed multi-wave representation upsampling, accompanied by a training strategy aimed at reinstating high-frequency details and augmenting the precision of object detection. Finally, we conduct empirical experiments showing that compared to alternative methodologies, our proposed approach yields images exhibiting minimal disparities in frequency compared to high-resolution counterparts. 
Additionally, it exhibits superior performance across objects of varying scales, while simultaneously demonstrating reduced parameter count and enhanced computational efficiency.</p></div>","PeriodicalId":8417,"journal":{"name":"Array","volume":"22 ","pages":"Article 100342"},"PeriodicalIF":0.0,"publicationDate":"2024-03-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2590005624000080/pdfft?md5=5c4a2e90b7f870b58f73cec79a3a6c25&pid=1-s2.0-S2590005624000080-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140122445","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Array Pub Date: 2024-03-03 DOI: 10.1016/j.array.2024.100340
Yuzheng Liu, Jianxun Zhang, Lei Shi, Mingxiang Huang, Linyu Lin, Lingfeng Zhu, Xianglu Lin, Chuanlei Zhang
{"title":"Detection method of the seat belt for workers at height based on UAV image and YOLO algorithm","authors":"Yuzheng Liu , Jianxun Zhang , Lei Shi , Mingxiang Huang , Linyu Lin , Lingfeng Zhu , Xianglu Lin , Chuanlei Zhang","doi":"10.1016/j.array.2024.100340","DOIUrl":"https://doi.org/10.1016/j.array.2024.100340","url":null,"abstract":"<div><p>In the domain of outdoor construction within the power industry, working at significant heights is common, requiring stringent safety measures. Workers are mandated to wear hard hats and secure themselves with seat belts to prevent potential falls, ensuring their safety and reducing the risk of injuries. Detecting seat belt usage holds immense significance in safety inspections within the power industry. This study introduces a detection method for the seat belts of workers at height based on UAV images and the YOLO algorithm. The YOLOv5 approach involves integrating CSPNet into the Darknet53 backbone, incorporating the Focus layer into CSP-Darknet53, replacing the SPPF block in the SPP model, and implementing the CSPNet strategy in the PANet model. Experimental results demonstrate that the YOLOv5 algorithm achieves an elevated average accuracy of 99.2%, surpassing benchmarks set by FastRcnn, SSD, YOLOX-m, and YOLOv7. It also demonstrates superior adaptability in scenarios involving smaller objects, validated using a UAV-collected dataset of seat belt images. 
These findings confirm the algorithm's compliance with performance criteria for seat belt detection at power construction sites, making a significant contribution to enhancing safety measures within the power industry's construction practices.</p></div>","PeriodicalId":8417,"journal":{"name":"Array","volume":"22 ","pages":"Article 100340"},"PeriodicalIF":0.0,"publicationDate":"2024-03-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2590005624000067/pdfft?md5=50dec4f4bfbf478e832b65943e75f531&pid=1-s2.0-S2590005624000067-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140042676","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Array Pub Date: 2024-03-01 DOI: 10.1016/j.array.2024.100339
Arif Mahmud, Afjal Hossan Sarower, Amir Sohel, Md Assaduzzaman, Touhid Bhuiyan
{"title":"Adoption of ChatGPT by university students for academic purposes: Partial least square, artificial neural network, deep neural network and classification algorithms approach","authors":"Arif Mahmud, Afjal Hossan Sarower, Amir Sohel, Md Assaduzzaman, Touhid Bhuiyan","doi":"10.1016/j.array.2024.100339","DOIUrl":"https://doi.org/10.1016/j.array.2024.100339","url":null,"abstract":"<div><p>Given the limited extent of study conducted on the application of ChatGPT in the realm of education, this domain still needs to be explored. Consequently, the primary objective of this study is to evaluate the impact of factors within the extended value-based adoption model (VAM) and to delineate the individual contributions of these factors toward shaping the attitudes of university students regarding the utilization of ChatGPT for instructional purposes. This investigation incorporates dimensions such as social influence, self-efficacy, and personal innovativeness to augment the VAM. This augmentation aims to identify components where a hybrid approach, integrating partial least squares (PLS), artificial neural networks (ANN), deep neural networks (DNN), and classification algorithms, is employed to accurately discern both linear and nonlinear correlations. The data for this study were obtained through an online survey administered to university students, and a purposive sample technique was employed to select 369 valid responses. Following the initial data preparation, the assessment process comprised three successive stages: PLS, ANN, DNN and classification algorithms analysis. Intention is influenced by attitude, which is predicted by perceived usefulness, perceived enjoyment, social influence, self-efficacy, and personal innovativeness. Moreover, personal innovativeness has the maximum contribution to attitude followed by self-efficacy, enjoyment, usefulness, social influence, technicality, and cost. 
These findings will support the creation and prioritization of student-centered educational services. Additionally, this study can contribute to creating an efficient learning management system to enhance students' academic performance and professional efficiency.</p></div>","PeriodicalId":8417,"journal":{"name":"Array","volume":"21 ","pages":"Article 100339"},"PeriodicalIF":0.0,"publicationDate":"2024-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2590005624000055/pdfft?md5=349b8d60b9358f4b9c5452ad78d09c0d&pid=1-s2.0-S2590005624000055-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140042372","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Array Pub Date: 2024-02-22 DOI: 10.1016/j.array.2024.100336
Abdillah Abdillah, Ida Widianingsih, Rd Ahmad Buchari, Heru Nurasa
{"title":"Big data security & individual (psychological) resilience: A review of social media risks and lessons learned from Indonesia","authors":"Abdillah Abdillah , Ida Widianingsih , Rd Ahmad Buchari , Heru Nurasa","doi":"10.1016/j.array.2024.100336","DOIUrl":"https://doi.org/10.1016/j.array.2024.100336","url":null,"abstract":"<div><p>This research aims to reduce social media security risks and develop best practices to help governments address social media security risks more effectively. This research begins by reviewing the different discussions in the literature about social media security risks and mitigation techniques. Based on the extensive review, several key insights were identified and summarized to help organizations address social media security risks more effectively. Many national governments around the world do not have effective social media security policies and are unsure how to develop effective social media security strategies to mitigate social media security risks. This research provides guidance to national governments on mitigating potential social media security risks. This study incorporates ongoing debates in the literature and provides guidance on how to reduce social media security and technological risks. Practical insights are identified and summarized from the extensive literature. 
More discussions and studies are needed on strategies and practical insights to reduce social media risk for the Indonesian government.</p></div>","PeriodicalId":8417,"journal":{"name":"Array","volume":"21 ","pages":"Article 100336"},"PeriodicalIF":0.0,"publicationDate":"2024-02-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S259000562400002X/pdfft?md5=ba831e3d2d41e5a91bcf0ce7cc29aec7&pid=1-s2.0-S259000562400002X-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139936136","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Array Pub Date: 2024-02-22 DOI: 10.1016/j.array.2024.100338
Peng Zhu, Gang Wang, Jingheng He, Yueli Dong, Yu Chang
{"title":"An encrypted traffic identification method based on multi-scale feature fusion","authors":"Peng Zhu , Gang Wang , Jingheng He , Yueli Dong , Yu Chang","doi":"10.1016/j.array.2024.100338","DOIUrl":"https://doi.org/10.1016/j.array.2024.100338","url":null,"abstract":"<div><p>As data privacy issues become more and more sensitive, increasing numbers of websites usually encrypt traffic when transmitting it. This method can largely protect privacy, but it also brings a huge challenge. Aiming at the problem that encrypted traffic classification makes it difficult to obtain a global optimal solution, this paper proposes an encrypted traffic identification model called the ET-BERT and 1D-CNN fusion network (BCFNet), based on multi-scale feature fusion. This method combines feature learning with classification tasks, unified into an end-to-end model. The local features of encrypted traffic extracted based on the improved Inception one-dimensional convolutional neural network structure are fused with the global features extracted by the ET-BERT model. The one-dimensional convolutional neural network is more suitable for the encrypted traffic of a one-dimensional sequence than the commonly used two-dimensional convolutional neural network. The proposed model can learn the nonlinear relationship between the input data and the expected label and obtain the global optimal solution with a greater probability. This paper verifies the ISCX VPN-nonVPN dataset and compares the results of the BCFNet model with the other five baseline models on accuracy, precision, recall, and F1 indicators. The experimental results demonstrate that the BCFNet model has a greater overall effect than the other five models. 
<em>Its accuracy can reach 98.88%.</em></p></div>","PeriodicalId":8417,"journal":{"name":"Array","volume":"21 ","pages":"Article 100338"},"PeriodicalIF":0.0,"publicationDate":"2024-02-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2590005624000043/pdfft?md5=9bdc10d2ece62e4a288fe5d295082936&pid=1-s2.0-S2590005624000043-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139985551","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}