{"title":"Bacterial image analysis using multi-task deep learning approaches for clinical microscopy","authors":"Shuang Yee Chin, Jian Dong, Khairunnisa Hasikin, Romano Ngui, Khin Wee Lai, Pauline Shan Qing Yeoh, Xiang Wu","doi":"10.7717/peerj-cs.2180","DOIUrl":"https://doi.org/10.7717/peerj-cs.2180","url":null,"abstract":"Background Bacterial image analysis plays a vital role in various fields, providing valuable information and insights for studying bacterial structural biology, diagnosing and treating infectious diseases caused by pathogenic bacteria, discovering and developing drugs that can combat bacterial infections, etc. As a result, it has prompted efforts to automate bacterial image analysis tasks. By automating analysis tasks and leveraging more advanced computational techniques, such as deep learning (DL) algorithms, bacterial image analysis can contribute to rapid, more accurate, efficient, reliable, and standardised analysis, leading to enhanced understanding, diagnosis, and control of bacterial-related phenomena. Methods Three object detection networks of DL algorithms, namely SSD-MobileNetV2, EfficientDet, and YOLOv4, were developed to automatically detect Escherichia coli (E. coli) bacteria from microscopic images. The multi-task DL framework is developed to classify the bacteria according to their respective growth stages, which include rod-shaped cells, dividing cells, and microcolonies. Data preprocessing steps were carried out before training the object detection models, including image augmentation, image annotation, and data splitting. The performance of the DL techniques is evaluated using the quantitative assessment method based on mean average precision (mAP), precision, recall, and F1-score. The performance metrics of the models were compared and analysed. The best DL model was then selected to perform multi-task object detections in identifying rod-shaped cells, dividing cells, and microcolonies. 
Results The output of the test images generated from the three proposed DL models displayed high detection accuracy, with YOLOv4 achieving the highest confidence score range of detection and being able to create different coloured bounding boxes for different growth stages of E. coli bacteria. In terms of statistical analysis, among the three proposed models, YOLOv4 demonstrates superior performance, achieving the highest mAP of 98% with the highest precision, recall, and F1-score of 86%, 97%, and 91%, respectively. Conclusions This study has demonstrated the effectiveness, potential, and applicability of DL approaches in multi-task bacterial image analysis, focusing on automating the detection and classification of bacteria from microscopic images. The proposed models can output images with bounding boxes surrounding each detected E. coli bacteria, labelled with their growth stage and confidence level of detection. All proposed object detection models have achieved promising results, with YOLOv4 outperforming the other models.","PeriodicalId":54224,"journal":{"name":"PeerJ Computer Science","volume":null,"pages":null},"PeriodicalIF":3.8,"publicationDate":"2024-08-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141936310","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Detecting image manipulation with ELA-CNN integration: a powerful framework for authenticity verification","authors":"Ahmad M. Nagm, Mona M. Moussa, Rasha Shoitan, Ahmed Ali, Mohamed Mashhour, Ahmed S. Salama, Hamada I. AbdulWakel","doi":"10.7717/peerj-cs.2205","DOIUrl":"https://doi.org/10.7717/peerj-cs.2205","url":null,"abstract":"The exponential progress of image editing software has contributed to a rapid rise in the production of fake images. Consequently, various techniques and approaches have been developed to detect manipulated images. These methods aim to discern between genuine and altered images, effectively combating the proliferation of deceptive visual content. However, additional advancements are necessary to enhance their accuracy and precision. Therefore, this research proposes an image forgery detection algorithm that integrates error level analysis (ELA) and a convolutional neural network (CNN) to detect manipulation. The system primarily focuses on detecting copy-move and splicing forgeries in images. The input image is fed to the ELA algorithm to identify regions within the image that have different compression levels. Afterward, the created ELA images are used as input to train the proposed CNN model. The CNN model is constructed from two consecutive convolution layers, followed by one max pooling layer and two dense layers. Two dropout layers are inserted between the layers to improve model generalization. The experiments are conducted on the CASIA 2 dataset, and the simulation results show that the proposed algorithm achieves a training accuracy of 99.05%, testing accuracy of 94.14%, precision of 94.1%, and recall of 94.07%. 
Notably, it outperforms state-of-the-art techniques in both accuracy and precision.","PeriodicalId":54224,"journal":{"name":"PeerJ Computer Science","volume":null,"pages":null},"PeriodicalIF":3.8,"publicationDate":"2024-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141936167","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Forecasting CO2 emissions of fuel vehicles for an ecological world using ensemble learning, machine learning, and deep learning models","authors":"Fatih Gurcan","doi":"10.7717/peerj-cs.2234","DOIUrl":"https://doi.org/10.7717/peerj-cs.2234","url":null,"abstract":"Background The continuous increase in carbon dioxide (CO2) emissions from fuel vehicles generates a greenhouse effect in the atmosphere, which has a negative impact on global warming and climate change and raises serious concerns about environmental sustainability. Therefore, research on estimating and reducing vehicle CO2 emissions is crucial in promoting environmental sustainability and reducing greenhouse gas emissions in the atmosphere. Methods This study performed a comparative regression analysis using 18 different regression algorithms based on machine learning, ensemble learning, and deep learning paradigms to evaluate and predict CO2 emissions from fuel vehicles. The performance of each algorithm was evaluated using metrics including R2, Adjusted R2, root mean square error (RMSE), and runtime. Results The findings revealed that ensemble learning methods have higher prediction accuracy and lower error rates. Ensemble learning algorithms that included Extreme Gradient Boosting (XGB), Random Forest, and Light Gradient-Boosting Machine (LGBM) demonstrated high R2 and low RMSE values. As a result, these ensemble learning-based algorithms were discovered to be the most effective methods of predicting CO2 emissions. Although deep learning models with complex structures, such as the convolutional neural network (CNN), deep neural network (DNN) and gated recurrent unit (GRU), achieved high R2 values, it was discovered that they take longer to train and require more computational resources. 
The methodology and findings of our research provide a number of important implications for the different stakeholders striving for environmental sustainability and an ecological world.","PeriodicalId":54224,"journal":{"name":"PeerJ Computer Science","volume":null,"pages":null},"PeriodicalIF":3.8,"publicationDate":"2024-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141936232","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Natural language processing with transformers: a review","authors":"Georgiana Tucudean, Marian Bucos, Bogdan Dragulescu, Catalin Daniel Caleanu","doi":"10.7717/peerj-cs.2222","DOIUrl":"https://doi.org/10.7717/peerj-cs.2222","url":null,"abstract":"Natural language processing (NLP) tasks can be addressed with several deep learning architectures, and many different approaches have proven to be efficient. This study aims to briefly summarize the use cases for NLP tasks along with the main architectures. This research presents transformer-based solutions for NLP tasks such as Bidirectional Encoder Representations from Transformers (BERT), and Generative Pre-Training (GPT) architectures. To achieve that, we conducted a step-by-step process in the review strategy: identify the recent studies that include Transformers, apply filters to extract the most consistent studies, identify and define inclusion and exclusion criteria, assess the strategy proposed in each study, and finally discuss the methods and architectures presented in the resulting articles. These steps facilitated the systematic summarization and comparative analysis of NLP applications based on Transformer architectures. The primary focus is the current state of the NLP domain, particularly regarding its applications, language models, and data set types. 
The results provide insights into the challenges encountered in this research domain.","PeriodicalId":54224,"journal":{"name":"PeerJ Computer Science","volume":null,"pages":null},"PeriodicalIF":3.8,"publicationDate":"2024-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141936235","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"ProcGCN: detecting malicious process in memory based on DGCNN","authors":"Heyu Zhang, Binglong Li, Shilong Yu, Chaowen Chang, Jinhui Li, Bohao Yang","doi":"10.7717/peerj-cs.2193","DOIUrl":"https://doi.org/10.7717/peerj-cs.2193","url":null,"abstract":"The combination of memory forensics and deep learning for malware detection has made some progress, but most existing methods convert process dumps to images for classification, which is still based on process byte feature classification. After the malware is loaded into memory, the original byte features will change. Compared with byte features, function call features can represent the behaviors of malware more robustly. Therefore, this article proposes the ProcGCN model, a deep learning model based on DGCNN (Deep Graph Convolutional Neural Network), to detect malicious processes in memory images. First, the process dump is extracted from the whole system memory image; then, the Function Call Graph (FCG) of the process is extracted, and feature vectors for the function nodes in the FCG are generated based on the bag-of-words model; finally, the FCG is input to the ProcGCN model for classification and detection. Using a public dataset for experiments, the ProcGCN model achieved an accuracy of 98.44% and an F1 score of 0.9828. 
It shows a better result than the existing deep learning methods based on static features, and its detection speed is faster, which demonstrates the effectiveness of the method based on function call features and graph representation learning in memory forensics.","PeriodicalId":54224,"journal":{"name":"PeerJ Computer Science","volume":null,"pages":null},"PeriodicalIF":3.8,"publicationDate":"2024-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141936233","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"SentinelFusion based machine learning comprehensive approach for enhanced computer forensics","authors":"Umar Islam, Abeer Abdullah Alsadhan, Hathal Salamah Alwageed, Abdullah A. Al-Atawi, Gulzar Mehmood, Manel Ayadi, Shrooq Alsenan","doi":"10.7717/peerj-cs.2183","DOIUrl":"https://doi.org/10.7717/peerj-cs.2183","url":null,"abstract":"In the rapidly evolving landscape of modern technology, the convergence of blockchain innovation and machine learning advancements presents unparalleled opportunities to enhance computer forensics. This study introduces SentinelFusion, an ensemble-based machine learning framework designed to bolster secrecy, privacy, and data integrity within blockchain systems. By integrating cutting-edge blockchain security properties with the predictive capabilities of machine learning, SentinelFusion aims to improve the detection and prevention of security breaches and data tampering. Utilizing a comprehensive blockchain-based dataset of various criminal activities, the framework leverages multiple machine learning models, including support vector machines, K-nearest neighbors, naive Bayes, logistic regression, and decision trees, alongside the novel SentinelFusion ensemble model. Extensive evaluation metrics such as accuracy, precision, recall, and F1 score are used to assess model performance. The results demonstrate that SentinelFusion outperforms individual models, achieving an accuracy, precision, recall, and F1 score of 0.99. 
This study’s findings underscore the potential of combining blockchain technology and machine learning to advance computer forensics, providing valuable insights for practitioners and researchers in the field.","PeriodicalId":54224,"journal":{"name":"PeerJ Computer Science","volume":null,"pages":null},"PeriodicalIF":3.8,"publicationDate":"2024-08-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141936234","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Early detection of abiotic stress in plants through SNARE proteins using hybrid feature fusion model","authors":"Bhargavi T., Sumathi D.","doi":"10.7717/peerj-cs.2149","DOIUrl":"https://doi.org/10.7717/peerj-cs.2149","url":null,"abstract":"Agriculture is the main source of livelihood for most of the population across the globe. Plants are often considered life savers for humanity, having evolved complex adaptations to cope with adverse environmental conditions. Protecting agricultural produce from devastating conditions such as stress is essential for the sustainable development of the nation. Plants respond to various environmental stressors such as drought, salinity, heat, cold, etc. Abiotic stress can significantly impact crop yield and development posing a major threat to agriculture. SNARE proteins play a major role in pathological processes as they are vital proteins in the life sciences. These proteins act as key players in stress responses. Feature extraction is essential for visualizing the underlying structure of the SNARE proteins in analyzing the root cause of abiotic stress in plants. To address this issue, we developed a hybrid model to capture the hidden structures of the SNAREs. A feature fusion technique has been devised by combining the potential strengths of convolutional neural networks (CNN) with a high dimensional radial basis function (RBF) network. Additionally, we employ a bi-directional long short-term memory (Bi-LSTM) network to classify the presence of SNARE proteins. Our feature fusion model successfully identified abiotic stress in plants with an accuracy of 74.6%. 
When compared with various existing frameworks, our model demonstrates superior classification results.","PeriodicalId":54224,"journal":{"name":"PeerJ Computer Science","volume":null,"pages":null},"PeriodicalIF":3.8,"publicationDate":"2024-08-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141936174","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Maize plant height automatic reading of measurement scale based on improved YOLOv5 lightweight model","authors":"Jiachao Li, Ya’nan Zhou, He Zhang, Dayu Pan, Ying Gu, Bin Luo","doi":"10.7717/peerj-cs.2207","DOIUrl":"https://doi.org/10.7717/peerj-cs.2207","url":null,"abstract":"Background Plant height is a significant indicator of maize phenotypic morphology, and is closely related to crop growth, biomass, and lodging resistance. Obtaining the maize plant height accurately is of great significance for cultivating high-yielding maize varieties. Traditional measurement methods are labor-intensive and not conducive to data recording and storage. Therefore, it is essential to implement the automated reading of maize plant height from measurement scales using object detection algorithms. Method This study proposed a lightweight detection model based on the improved YOLOv5. The MobileNetV3 network replaced the YOLOv5 backbone network, and the Normalization-based Attention Module (NAM) was introduced into the neck network. The CIoU loss function was replaced with the EIoU loss function. Finally, a combined algorithm was used to achieve the automatic reading of maize plant height from measurement scales. Results The improved model achieved an average precision of 98.6%, a computational complexity of 1.2 GFLOPs, and occupied 1.8 MB of memory. The detection frame rate on the computer was 54.1 fps. Through comparisons with models such as YOLOv5s, YOLOv7 and YOLOv8s, it was evident that the comprehensive performance of the improved model in this study was superior. 
Finally, a comparison of 160 plant height readings produced by the algorithm on the test set with manual readings demonstrated that the relative error between the algorithm’s results and manual readings was within 0.2 cm, meeting the requirements of automatic reading of maize height measuring scale.","PeriodicalId":54224,"journal":{"name":"PeerJ Computer Science","volume":null,"pages":null},"PeriodicalIF":3.8,"publicationDate":"2024-08-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141936236","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Hybrid quantum search with genetic algorithm optimization","authors":"Sebastian Mihai Ardelean, Mihai Udrescu","doi":"10.7717/peerj-cs.2210","DOIUrl":"https://doi.org/10.7717/peerj-cs.2210","url":null,"abstract":"Quantum genetic algorithms (QGA) integrate genetic programming and quantum computing to address search and optimization problems. The standard strategy of the hybrid QGA approach is to add quantum resources to classical genetic algorithms (GA), thus improving their efficacy (i.e., quantum optimization of a classical algorithm). However, the extent of such improvements is still unclear. Conversely, Reduced Quantum Genetic Algorithm (RQGA) is a fully quantum algorithm that reduces the GA search for the best fitness in a population of potential solutions to running Grover’s algorithm. Unfortunately, RQGA finds the best fitness value and its corresponding chromosome (i.e., the solution or one of the solutions of the problem) in exponential runtime, O(2^(n/2)), where n is the number of qubits in the individuals’ quantum register. This article introduces a novel QGA optimization strategy, namely a classical optimization of a fully quantum algorithm, to address the RQGA complexity problem. Accordingly, we control the complexity of the RQGA algorithm by selecting a limited number of qubits in the individuals’ register and fixing the remaining ones as classical values of ‘0’ and ‘1’ with a genetic algorithm. We also improve the performance of RQGA by discarding unfit solutions and bounding the search only in the area of valid individuals. 
As a result, our Hybrid Quantum Algorithm with Genetic Optimization (HQAGO) solves search problems in O(2^((n−k)/2)) oracle queries, where k is the number of fixed classical bits in the individuals’ register.","PeriodicalId":54224,"journal":{"name":"PeerJ Computer Science","volume":null,"pages":null},"PeriodicalIF":3.8,"publicationDate":"2024-08-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141936306","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
PeerJ Computer Science. Pub Date: 2024-08-02. eCollection Date: 2024-01-01. DOI: 10.7717/peerj-cs.2230
{"title":"Cardiotoxicity detection tool for breast cancer chemotherapy: a retrospective study.","authors":"Ahmad Alenezi, Fergus McKiddie, Mintu Nath, Ali Mayya, Andy Welch","doi":"10.7717/peerj-cs.2230","DOIUrl":"10.7717/peerj-cs.2230","url":null,"abstract":"Background: Patients with breast cancer receiving biological therapy and/or chemotherapy undergo multiple radionuclide angiography (RNA) or multigated acquisition (MUGA) scans to assess cardiotoxicity. The association between RNA imaging parameters and left ventricular (LV) ejection fraction (LVEF) remains unclear. Objectives: This study aimed to extract and evaluate the association of several novel imaging biomarkers to detect changes in LVEF in patients with breast cancer undergoing chemotherapy. Methods: We developed and optimized a novel set of MATLAB routines called the \"RNA Toolbox\" to extract parameters from RNA images. The code was optimized using various statistical tests (e.g., ANOVA, Bland-Altman, and intraclass correlation tests). We quantitatively analyzed the images to determine the association between these parameters using regression models and receiver operating characteristic (ROC) curves. Results: The code was reproducible and showed good agreement with validated clinical software for the parameters extracted from both packages. The regression model and ROC results were statistically significant in predicting LVEF (R^2 = 0.40, P < 0.001) (AUC = 0.78). 
Some time-based, shape-based, and count-based parameters were significantly associated with post-chemotherapy LVEF (β = 0.09, P < 0.001), LVEF of phase image (β = 4, P = 0.030), approximate entropy (ApEn) (β = 11.6, P = 0.001), ApEn (diastolic and systolic) (β = 39, P = 0.002) and LV systole size (β = 0.03, P = 0.010). Conclusions: Despite the limited sample size, we observed evidence of associations between several parameters and LVEF. We believe that these parameters will be more beneficial than the current methods for patients undergoing cardiotoxic chemotherapy. Moreover, this approach can aid physicians in evaluating subclinical cardiac changes during chemotherapy, and in understanding the potential benefits of cardioprotective drugs.","PeriodicalId":54224,"journal":{"name":"PeerJ Computer Science","volume":null,"pages":null},"PeriodicalIF":3.5,"publicationDate":"2024-08-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11323080/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141983917","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}