{"title":"A survey on checkpointing strategies: Should we always checkpoint à la Young/Daly?","authors":"","doi":"10.1016/j.future.2024.07.022","DOIUrl":"10.1016/j.future.2024.07.022","url":null,"abstract":"<div><p>The Young/Daly formula provides an approximation of the optimal checkpointing period for a parallel application executing on a supercomputing platform. It was originally designed to handle fail-stop errors for preemptible tightly-coupled applications, but has been extended to other application and resilience frameworks. We provide some background and survey various scenarios to assess the usefulness and limitations of the formula, both for preemptible applications and workflow applications represented as a graph of tasks. We also discuss scenarios with uncertainties, and extend the study to silent errors. We exhibit cases where the optimal period is of a different order than that dictated by the Young/Daly formula, and finally we explain how checkpointing can be further combined with replication.</p></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":null,"pages":null},"PeriodicalIF":6.2,"publicationDate":"2024-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141853730","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"TriStack enables accurate identification of antimicrobial and anti-inflammatory peptides by combining machine learning and deep learning approaches","authors":"","doi":"10.1016/j.future.2024.07.024","DOIUrl":"10.1016/j.future.2024.07.024","url":null,"abstract":"<div><p>The identification of antimicrobial peptides (AMPs) and anti-inflammatory peptides (AIPs) is crucial for drug design and disease treatment. However, it remains a computational challenge to accurately identify these peptides due to insufficient information encoding the peptide sequences. In this study, we propose TriStack, a powerful and interpretable model for accurate identification of AMPs and AIPs by stacking a machine learning-based module using a multi-layer residual network. It first extracts three types of function-related features from peptide sequences to comprehensively characterize the composition, distribution, and physicochemical properties of residues. Furthermore, these features are fused and fed into a two-module stacked model. The first module provides the preliminary predictions based on three machine learning methods, while the second module refines these predictions further via a multi-layer residual network. After training and testing, TriStack outperforms all the compared leading methods for both AMPs and AIPs predictions. 
TriStack is expected to contribute to antimicrobial and anti-inflammatory drug design based on peptide sequences.</p></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":null,"pages":null},"PeriodicalIF":6.2,"publicationDate":"2024-07-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141850860","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Energy-aware virtual machine placement based on a holistic thermal model for cloud data centers","authors":"","doi":"10.1016/j.future.2024.07.020","DOIUrl":"10.1016/j.future.2024.07.020","url":null,"abstract":"<div><p>As energy-intensive infrastructures, data centers (DCs) have become a pressing challenge for managers due to their significant energy consumption and carbon emissions. Information technology (IT) and cooling systems contribute the most to energy consumption. Energy-aware virtual machine (VM) scheduling methods have been widely demonstrated to reduce energy consumption and operating costs in DCs. However, as realistic DCs exhibit complex power and thermodynamic behaviors, existing works cannot provide efficient measures to optimize computing and cooling power consumption simultaneously. To overcome this challenge, we construct a holistic thermal model (including CPU and server inlet thermal models) to accurately represent the non-uniform, dynamic thermal environment. Subsequently, this work proposes a thermal model-based energy-aware VM placement method (TEVP) to minimize the holistic energy consumption of the DCs, considering resource and thermal constraints. We develop a novel hybrid swarm intelligence algorithm (DE-ERPSO) combining differential evolution (DE) and particle swarm optimization with an elite re-selection mechanism (ERPSO) to explore more energy-efficient VM placement schemes. Extensive experiments are conducted on an extended CloudSim to validate the performance of the proposed TEVP using real-world workload traces (PlanetLab and Azure). 
Results show that TEVP saves over 5.6% of the total energy consumption compared with the advanced baselines while maintaining low thermal violations.</p></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":null,"pages":null},"PeriodicalIF":6.2,"publicationDate":"2024-07-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141844817","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"SHIELD: A Secure Heuristic Integrated Environment for Load Distribution in Rural-AI","authors":"","doi":"10.1016/j.future.2024.07.026","DOIUrl":"10.1016/j.future.2024.07.026","url":null,"abstract":"<div><p>The increasing adoption of edge computing in rural areas is leading to a substantial rise in data generation, necessitating the development of advanced load balancing algorithms. This is particularly important in applications that utilise existing, though limited, computational and data communication infrastructures. Furthermore, rural communities have growing concerns regarding the privacy, security, and ownership of the data produced within their agricultural fields. Load distribution in rural edge devices can enhance agricultural practices by improving resource usage, decision-making, and addressing network connectivity challenges. Managing resource utilisation in this way also improves economic investments made in managing and deploying edge devices in rural environments. In this work, we propose SHIELD, a security-aware load balancing framework, primarily designed for edge-based systems in rural areas. For handling environments with limited connectivity, SHIELD efficiently manages tasks and computational resources by categorising them into restricted, public and private, shared respectively. It also allocates tasks considering key performance factors such as completion time, resource utilisation, failure rate, and security. The framework is evaluated on a weed detection scenario in precision agriculture, using three federated learning (FL) variants (local model training, global model aggregation, and model prediction) with the ResNet-50 model trained on the DeepWeeds image classification dataset. The proposed framework also integrates encryption and task replication techniques for data confidentiality, integrity, and availability. 
Experimental results show that SHIELD demonstrates an average of 23% (using Parsl), 29% (using OpenWhisk) improvement in failure rate and 18 s (Parsl), 13 s (OpenWhisk) average improvement in makespan compared to other task allocation approaches, such as secure variants of random, round robin, and least loaded.</p></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":null,"pages":null},"PeriodicalIF":6.2,"publicationDate":"2024-07-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141838637","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Identification and model construction of survival-associated proteins for pancreatic cancer based on deep learning","authors":"","doi":"10.1016/j.future.2024.07.023","DOIUrl":"10.1016/j.future.2024.07.023","url":null,"abstract":"<div><p>Pancreatic cancer (PC) is a malignancy typified by its insidious onset, rapid progression, limited resectability, poor treatment response, and exceedingly dismal prognosis. The transition from precursor lesions to infiltrating malignant tumors in pancreatic cancer is concomitant with the accrual of genetic mutations. The elucidation of proteins linked to the prognosis of pancreatic cancer holds paramount importance in the realm of pancreatic cancer management. Herein, we introduce DeepPCSA, a model tailored for the screening of target proteins and the prediction of survival time, employing both conventional methodologies and deep learning techniques for patient survival analysis. This framework leverages the LASSOCOX regression approach on differentially expressed genes (DEGs) to discern pivotal genes, succeeded by the development of a prognostic model employing convolutional and fully connected layers to prognosticate patient survival duration. Furthermore, we account for covariates such as age, gender, and other pertinent factors for independent prognostic scrutiny, affirming the autonomy of our model as a prognostic determinant. Employing unsupervised clustering methodologies, we identified two molecular subtypes characterized by disparate biological processes. 
Ultimately, via drug sensitivity analysis, we delineated the interrelation between survival duration and pharmaceuticals, substantiating the necessity for tailored drug interventions catering to patients of varying risk strata.</p></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":null,"pages":null},"PeriodicalIF":6.2,"publicationDate":"2024-07-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141852637","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Agile Optimization Framework: A framework for tensor operator optimization in neural network","authors":"","doi":"10.1016/j.future.2024.07.019","DOIUrl":"10.1016/j.future.2024.07.019","url":null,"abstract":"<div><p>In recent years, with the gradual slowing of Moore’s Law and the development of deep learning, the demand for hardware performance when executing deep learning-based applications has increased significantly. In this case, deep learning compilers have been proven to maximize hardware performance while keeping computational power constant, especially the end-to-end compiler Tensor Virtual Machine (TVM). TVM optimizes tensors by finding excellent parallel computing schemes, thereby achieving the goal of improving the performance of neural network inference. However, there is still untapped potential in current optimization methods: existing optimization methods based on TVM, such as Genetic Algorithms Tuner (GA-Tuner), have failed to achieve a balance between optimization performance and optimization time. The intolerable duration of optimization detracts from TVM’s usability, rendering it challenging to extend into the scientific community. This paper introduces a novel deep learning compilation optimization framework based on TVM called Agile Optimization Framework (AOF), which incorporates a tuner based on the latest Beluga Whale Optimization Algorithm (BWO). The BWO is adept at tackling complex problems characterized by numerous local optima, making it particularly suitable for hardware compilation optimization scenarios. We further propose an Evolving Epsilon Strategy (EES), a search strategy that adaptively adjusts the balance between exploration and exploitation, thereby enhancing the effectiveness of the algorithm. Additionally, we developed a supervised Tuning Accelerator (TA) aimed at reducing the time required for optimization and enhancing efficiency. 
Comparative experiments demonstrate that AOF achieves 11.36%–66.20% improvement in performance and 30.30%–54.60% reduction in optimization time, significantly outperforming the control group.</p></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":null,"pages":null},"PeriodicalIF":6.2,"publicationDate":"2024-07-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141704936","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Integrating fully homomorphic encryption to enhance the security of blockchain applications","authors":"","doi":"10.1016/j.future.2024.07.015","DOIUrl":"10.1016/j.future.2024.07.015","url":null,"abstract":"<div><p>Blockchain has been widely used for secure transactions among untrusted parties, but the current design of blockchain does not provide sufficient privacy and security for the data on the chain, limiting its application in sensitive information scenarios. To address this problem, we propose integrating fully homomorphic encryption (FHE) to enhance the security of blockchain applications, which can extend the application scope of blockchain and improve the privacy and security of blockchain by the features of FHE. Our scheme classifies FHE into those supporting polynomial and non-polynomial operations and introduces the concept of ciphertext computation conversion into Ethereum, enabling conversion between different ciphertext computation types. Moreover, we analyse the security and correctness to explain the feasibility and availability of the scheme. We carry out comparative experiments using different open-source libraries for fully homomorphic encryption and the time performance evaluation of the ciphertext computation conversion under different thread counts. 
The experiment results demonstrate the efficiency and usability of our scheme.</p></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":null,"pages":null},"PeriodicalIF":6.2,"publicationDate":"2024-07-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141705652","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"TransGINmer: Identifying viral sequences from metagenomes with self-attention and Graph Isomorphism Network","authors":"","doi":"10.1016/j.future.2024.07.025","DOIUrl":"10.1016/j.future.2024.07.025","url":null,"abstract":"<div><p>Viruses, abundant across diverse environments, play pivotal roles in microbial ecosystems and impact human health. Traditional virus studies are limited by their reliance on culture cultivation, which has been mitigated by metagenomics. It obtains nucleotide sequences of all microorganisms from environmental samples through next-generation sequencing technology. This advancement prompts the need for efficient viral identification methods. To identify viruses accurately and quickly, we propose TransGINmer, a novel deep learning model to identify viral sequences directly from metagenomes. It encodes sequences by a k-mer frequency embedding model, constructs graphs from significant codon token correlations, and classifies them using graph isomorphism neural networks. In comparative tests against the state-of-the-art methods DeepVirFinder, VirSorter2, and PhaMer on the testing dataset, the Amazon River dataset, the Sharon dataset, and the CAMI Strain dataset, TransGINmer demonstrates superior accuracy, sensitivity, specificity, and AUC values, showcasing its potential as a robust tool for viral identification from metagenomes. 
TransGINmer is freely available on GitHub (https://github.com/xizhilangcc/TransGINmer).</p></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":null,"pages":null},"PeriodicalIF":6.2,"publicationDate":"2024-07-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141712009","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"FRESH: Fault-tolerant Real-time Scheduler for Heterogeneous multiprocessor platforms","authors":"","doi":"10.1016/j.future.2024.07.008","DOIUrl":"10.1016/j.future.2024.07.008","url":null,"abstract":"<div><p>Real-time embedded systems are designed to execute precise functions within strict time constraints, utilizing microcontrollers, memory, and input/output devices. The critical component of these systems is the scheduler, responsible for efficient resource allocation and job scheduling based on priority and available resources. Multiprocessor platforms have been adopted to enhance performance, scalability, redundancy, and flexibility, employing diverse scheduling approaches. Fault tolerance is crucial in safety-sensitive systems that operate in real time, as it offers advantages by dynamically adapting to temporary faults, thereby ensuring system reliability and meeting performance requirements without sacrificing resource efficiency. Additionally, reducing dynamic energy consumption plays a vital role in improving battery life and reliability and adhering to power constraints in applications. Existing fault-tolerant schemes primarily focus on homogeneous multiprocessor systems or dual-type heterogeneous systems. This work introduces a novel heuristic scheduler named FRESH, which effectively addresses energy management and fault tolerance challenges in systems with various processor types. 
To validate the proposed approach, we conduct experiments using benchmark programs, which show that FRESH is able to create a high number of secondary copies of the jobs to mitigate transient faults while also significantly reducing energy consumption.</p></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":null,"pages":null},"PeriodicalIF":6.2,"publicationDate":"2024-07-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141688703","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}