Future Generation Computer Systems - The International Journal of eScience: Latest Publications

DTRE: A model for predicting drug-target interactions of endometrial cancer based on heterogeneous graph
IF 6.2 · CAS Q2 · Computer Science
Future Generation Computer Systems - The International Journal of eScience · Pub Date: 2024-07-11 · DOI: 10.1016/j.future.2024.07.012
{"title":"DTRE: A model for predicting drug-target interactions of endometrial cancer based on heterogeneous graph","authors":"","doi":"10.1016/j.future.2024.07.012","DOIUrl":"10.1016/j.future.2024.07.012","url":null,"abstract":"<div><p>Endometrial cancer is one of the most common gynecological malignancies affecting women worldwide, posing a serious threat to women’s health. Moreover, the identification of drug-target interactions (DTIs) is typically a time-consuming and costly critical step in drug discovery. In order to identify potential DTIs to enhance targeted therapy for endometrial cancer, we propose a deep learning model named DTRE (Drug-Target Relationship Enhanced) based on a heterogeneous graph to predict DTIs, which utilizes the relationships between drugs and targets to effectively capture their interactions. In the heterogeneous graph, nodes represent drugs and targets, and edges represent their interactions, then the representations of drugs and targets are learned through graph convolutional network, graph attention network and attention mechanism. Experimental results on the dataset proposed in this paper show that the AUC and AUPR of DTRE achieve 0.870 and 0.872 respectively, significantly outperforming comparative models and indicating that DTRE can effectively predict DTIs when applied to large-scale data. Additionally, DTRE also predicts the potential DTIs for endometrial cancer, providing new insights into targeted therapy for it.</p></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":null,"pages":null},"PeriodicalIF":6.2,"publicationDate":"2024-07-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S0167739X24003753/pdfft?md5=ec2c21087b0dc076540861fc166a4572&pid=1-s2.0-S0167739X24003753-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141695987","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
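The DTRE abstract above describes learning drug and target representations on a heterogeneous interaction graph via graph convolution, graph attention, and an attention mechanism. The sketch below shows what such an encoder can look like in PyTorch Geometric; the layer sizes, the self-attention step, and the dot-product scoring head are assumptions for illustration, not the authors' architecture.

```python
import torch
import torch.nn as nn
from torch_geometric.nn import GCNConv, GATConv

class DTIEncoder(nn.Module):
    """Hypothetical DTI encoder: GCN + GAT over a drug-target graph.

    Drugs and targets are nodes of one graph; `edge_index` holds known
    interactions. Generic sketch, not the DTRE architecture.
    """
    def __init__(self, in_dim, hid_dim=128):
        super().__init__()
        self.gcn = GCNConv(in_dim, hid_dim)
        self.gat = GATConv(hid_dim, hid_dim, heads=4, concat=False)
        self.attn = nn.MultiheadAttention(hid_dim, num_heads=4, batch_first=True)

    def forward(self, x, edge_index):
        h = torch.relu(self.gcn(x, edge_index))   # local neighborhood aggregation
        h = torch.relu(self.gat(h, edge_index))   # attention-weighted aggregation
        # Self-attention over node embeddings to re-weight features.
        h_attn, _ = self.attn(h.unsqueeze(0), h.unsqueeze(0), h.unsqueeze(0))
        return h_attn.squeeze(0)

def score_pairs(h, drug_idx, target_idx):
    """Score drug-target pairs with a dot product (assumed decoder)."""
    return torch.sigmoid((h[drug_idx] * h[target_idx]).sum(dim=-1))
```

A binary cross-entropy loss over known interaction pairs would be a natural training objective for such a scorer.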
A two-stage budget-feasible mechanism for mobile crowdsensing based on maximum user revenue routing
IF 6.2 · CAS Q2 · Computer Science
Future Generation Computer Systems - The International Journal of eScience · Pub Date: 2024-07-10 · DOI: 10.1016/j.future.2024.06.059
{"title":"A two-stage budget-feasible mechanism for mobile crowdsensing based on maximum user revenue routing","authors":"","doi":"10.1016/j.future.2024.06.059","DOIUrl":"10.1016/j.future.2024.06.059","url":null,"abstract":"<div><p>Through user participation, mobile crowdsensing (MCS) services overcome the problem of the excessive costs of relying solely on the active deployment of sensors and of achieving large-scale and low-cost applications of the Internet of Things, which is a research hotspot. However, current research on MCS issues adopts the perspective of service providers and does not consider user strategies, so the corresponding models cannot accurately reflect the complete status of the system. Therefore, this paper decomposes the MCS problem into a two-stage game process. By doing so, the strategies of both users and service providers can be considered, thus maximizing the interest for both parties. In the first stage, users determine the optimal route based on information released by the service provider. In the second stage, the service provider determines the winning users and the corresponding payment plan based on the route and bid information submitted by all users. Specifically, we express the user’s optimal route decision-making problem as a traveling salesman problem with time windows and node number constraints. Accordingly, we design the F-MAX-RR algorithm based on an evolutionary algorithm. We show that this algorithm can achieve an approximation ratio of <span><math><mrow><mo>(</mo><mn>1</mn><mo>−</mo><mn>1</mn><mo>/</mo><mi>e</mi><mo>)</mo></mrow></math></span>, with the expected number of iterations being <span><math><mrow><mn>8</mn><mi>e</mi><msup><mrow><mi>L</mi></mrow><mrow><mn>2</mn></mrow></msup><mrow><mo>(</mo><mi>L</mi><mo>+</mo><mn>1</mn><mo>)</mo></mrow><mi>M</mi></mrow></math></span>. In the second stage, to maximize the total utility of the system, we transform the problem into an integer programming model with a budget constraint, which satisfies submodular characteristics. We design the S-MAX-TUM mechanism based on monotonic allocation and critical price theory to solve the problem of winning user decision-making and pricing. We demonstrate the economic characteristics of the mechanism, including truthfulness, individual rationality, and budget feasibility. The experimental results indicate the effective performance of the designed mechanisms.</p></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":null,"pages":null},"PeriodicalIF":6.2,"publicationDate":"2024-07-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141639446","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
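The second stage described above selects winning users under a budget for a submodular total-utility function. The snippet below is a generic, hypothetical illustration of greedy budget-feasible selection (cost-effectiveness ordering with a proportional-share stopping rule); it conveys the flavor of such mechanisms but is not the S-MAX-TUM mechanism, and it omits the critical-price payment computation.

```python
def greedy_budget_feasible(bids, value, budget):
    """Greedy budget-feasible selection sketch for a monotone submodular value().

    bids   : dict user_id -> claimed cost (bid), assumed positive
    value  : callable mapping a set of user_ids to a non-negative utility,
             assumed monotone submodular
    budget : total payment budget B

    Hypothetical proportional-share rule: repeatedly take the most
    cost-effective remaining user and stop once a bid exceeds its
    proportional share of the budget. Illustrative only; no critical
    payments are computed here.
    """
    winners, remaining = set(), set(bids)
    while remaining:
        best = max(remaining,
                   key=lambda u: (value(winners | {u}) - value(winners)) / bids[u])
        gain = value(winners | {best}) - value(winners)
        if gain <= 0:
            break
        # Proportional-share stopping rule keeps the mechanism budget feasible.
        if bids[best] > budget * gain / value(winners | {best}):
            break
        winners.add(best)
        remaining.remove(best)
    return winners

# Example with an additive (hence submodular) value function.
vals = {"u1": 5.0, "u2": 3.0, "u3": 4.0}
bids = {"u1": 2.0, "u2": 2.5, "u3": 1.0}
print(greedy_budget_feasible(bids, lambda S: sum(vals[u] for u in S), budget=4.0))
```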
Efficient and scalable covariate drift detection in machine learning systems with serverless computing
IF 6.2 · CAS Q2 · Computer Science
Future Generation Computer Systems - The International Journal of eScience · Pub Date: 2024-07-10 · DOI: 10.1016/j.future.2024.07.010
{"title":"Efficient and scalable covariate drift detection in machine learning systems with serverless computing","authors":"","doi":"10.1016/j.future.2024.07.010","DOIUrl":"10.1016/j.future.2024.07.010","url":null,"abstract":"<div><p>As machine learning models are increasingly deployed in production, robust monitoring and detection of concept and covariate drift become critical. This paper addresses the gap in the widespread adoption of drift detection techniques by proposing a serverless-based approach for batch covariate drift detection in ML systems. Leveraging the open-source OSCAR framework and the open-source Frouros drift detection library, we develop a set of services that enable parallel execution of two key components: the ML inference pipeline and the batch covariate drift detection pipeline. To this end, our proposal takes advantage of the elasticity and efficiency of serverless computing for ML pipelines, including scalability, cost-effectiveness, and seamless integration with existing infrastructure. We evaluate this approach through an edge ML use case, showcasing its operation on a simulated batch covariate drift scenario. Our research highlights the importance of integrating drift detection as a fundamental requirement in developing robust and trustworthy AI systems and encourages the adoption of these techniques in ML deployment pipelines. In this way, organizations can proactively identify and mitigate the adverse effects of covariate drift while capitalizing on the benefits offered by serverless computing.</p></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":null,"pages":null},"PeriodicalIF":6.2,"publicationDate":"2024-07-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S0167739X24003716/pdfft?md5=703be1c6b28f907c0f5552d2b8f50e16&pid=1-s2.0-S0167739X24003716-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141639447","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
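The paper wraps batch covariate drift detection in serverless services built on OSCAR and Frouros. The snippet below is a generic, library-agnostic illustration of what a batch covariate drift check can look like (a per-feature two-sample Kolmogorov-Smirnov test with a Bonferroni correction); it is not the paper's pipeline or the Frouros API.

```python
import numpy as np
from scipy.stats import ks_2samp

def detect_covariate_drift(reference, current, alpha=0.01):
    """Batch covariate drift check: two-sample KS test per feature.

    reference, current : 2-D arrays (samples x features)
    alpha              : significance level before Bonferroni correction

    Generic illustration only; the paper packages this kind of check into
    serverless functions rather than a local script.
    """
    n_features = reference.shape[1]
    corrected_alpha = alpha / n_features          # Bonferroni correction
    drifted = []
    for j in range(n_features):
        stat, p_value = ks_2samp(reference[:, j], current[:, j])
        if p_value < corrected_alpha:
            drifted.append((j, stat, p_value))
    return drifted  # list of (feature index, KS statistic, p-value)
```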
DeFuseDTI: Interpretable drug target interaction prediction model with dual-branch encoder and multiview fusion
IF 6.2 · CAS Q2 · Computer Science
Future Generation Computer Systems - The International Journal of eScience · Pub Date: 2024-07-09 · DOI: 10.1016/j.future.2024.07.014
{"title":"DeFuseDTI: Interpretable drug target interaction prediction model with dual-branch encoder and multiview fusion","authors":"","doi":"10.1016/j.future.2024.07.014","DOIUrl":"10.1016/j.future.2024.07.014","url":null,"abstract":"<div><p>Predicting the interaction between drugs and targets is a crucial step in drug development, and computer-based deep learning approaches have the potential to significantly reduce costs. Existing models using a single encoder often suffer from insufficient cross-modal feature extraction, with most models tending to overly focus on extracting locally aggregated information, thereby diluting the detailed features of each target residue and drug atom. Additionally, the lack of effective interaction fusion between drug and target lead to prediction results lacking reliable interpretability, posing a more urgent issue. To address these challenges, we propose a dual-branch encoder model, DeFuseDTI, which includes base encoder and detail encoder to extract locally aggregated features and detailed features of each target residue and drug atom. The detail encoder (utilizing Invertible Neural Networks for targets and graph transformers for drugs) can capture furtherly the features of each atom and residue, providing rich and precise features for model interpretability. For better achieve interactive learning of drug and target features, the Multiview Fusion Attention learning module was introduced to integrate multiview features and generate a unified representations for decoding prediction results. Based on the module's unique attention mechanism, drug-target importance matrices can be obtained, which offer interpretable analysis at the molecular level. Experimental results and analyses demonstrate that DeFuseDTI outperforms several state-of-the-art models on four public datasets. Its significant interpretability holds promise for providing accurate and scientifically meaningful guidance for biochemical experiments at the molecular level.</p></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":null,"pages":null},"PeriodicalIF":6.2,"publicationDate":"2024-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141710771","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
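The abstract describes a Multiview Fusion Attention module whose attention weights yield drug-target importance matrices. The following is a hypothetical, minimal sketch of cross-attention between drug-atom and target-residue embeddings that returns both a fused representation and an atom-residue attention matrix; dimensions and pooling choices are assumptions, not the DeFuseDTI design.

```python
import torch
import torch.nn as nn

class CrossAttentionFusion(nn.Module):
    """Hypothetical multiview fusion: cross-attention between drug atoms
    and target residues, returning a fused vector and an importance matrix.
    Generic sketch, not the DeFuseDTI module."""
    def __init__(self, dim=128):
        super().__init__()
        self.scale = dim ** -0.5
        self.out = nn.Linear(2 * dim, dim)

    def forward(self, drug_atoms, target_residues):
        # drug_atoms: (n_atoms, dim), target_residues: (n_residues, dim)
        attn = torch.softmax(drug_atoms @ target_residues.T * self.scale, dim=-1)
        drug_ctx = attn @ target_residues        # target context for each atom
        fused = self.out(torch.cat([drug_atoms.mean(0),
                                    drug_ctx.mean(0)], dim=-1))
        return fused, attn                       # attn ~ atom-residue importance
```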
Portability and scalability evaluation of large-scale statistical modeling and prediction software through HPC-ready containers
IF 6.2 · CAS Q2 · Computer Science
Future Generation Computer Systems - The International Journal of eScience · Pub Date: 2024-07-09 · DOI: 10.1016/j.future.2024.06.057
{"title":"Portability and scalability evaluation of large-scale statistical modeling and prediction software through HPC-ready containers","authors":"","doi":"10.1016/j.future.2024.06.057","DOIUrl":"10.1016/j.future.2024.06.057","url":null,"abstract":"<div><p>HPC-based applications often have complex workflows with many software dependencies that hinder their portability on contemporary HPC architectures. In addition, these applications often require extraordinary efforts to deploy and execute at performance potential on new HPC systems, while the users expert in these applications generally have less expertise in HPC and related technologies. This paper provides a dynamic solution that facilitates containerization for transferring HPC software onto diverse parallel systems. The study relies on the HPC Workflow as a Service (HPCWaaS) paradigm proposed by the EuroHPC eFlows4HPC project. It offers to deploy workflows through containers tailored for any of a number of specific HPC systems. Traditional container image creation tools rely on OS system packages compiled for generic architecture families (x86_64, amd64, ppc64, …) and specific MPI or GPU runtime library versions. The containerization solution proposed in this paper leverages HPC Builders such as Spack or Easybuild and multi-platform builders such as buildx to create a service for automating the creation of container images for the software specific to each hardware architecture, aiming to sustain the overall performance of the software. We assess the efficiency of our proposed solution for porting the geostatistics ExaGeoStat software on various parallel systems while preserving the computational performance. The results show that the performance of the generated images is comparable with the native execution of the software on the same architectures. On the distributed-memory system, the containerized version can scale up to 256 nodes without impacting performance.</p></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":null,"pages":null},"PeriodicalIF":6.2,"publicationDate":"2024-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141732295","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
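The containerization service described above combines HPC builders (Spack, EasyBuild) with multi-platform builders such as buildx. As a rough illustration only, the hypothetical Python helper below shells out to `docker buildx build` to produce an image for several architectures; the actual HPCWaaS service, its image recipes, and its registry handling are not shown.

```python
import subprocess

def build_multiarch_image(tag, context=".",
                          platforms=("linux/amd64", "linux/arm64")):
    """Hypothetical helper: invoke `docker buildx build` for several platforms.

    Assumes Docker with the buildx plugin is installed and a builder with
    multi-platform support is active; `--push` sends the image to a registry.
    """
    subprocess.run(
        ["docker", "buildx", "build",
         "--platform", ",".join(platforms),
         "-t", tag, "--push", context],
        check=True,
    )

# Example (hypothetical tag): build_multiarch_image("registry.example.org/app:latest")
```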
On the choice of physical constraints in artificial neural networks for predicting flow fields
IF 6.2 · CAS Q2 · Computer Science
Future Generation Computer Systems - The International Journal of eScience · Pub Date: 2024-07-09 · DOI: 10.1016/j.future.2024.07.009
{"title":"On the choice of physical constraints in artificial neural networks for predicting flow fields","authors":"","doi":"10.1016/j.future.2024.07.009","DOIUrl":"10.1016/j.future.2024.07.009","url":null,"abstract":"<div><p>The application of Artificial Neural Networks (ANNs) has been extensively investigated for fluid dynamic problems. A specific form of ANNs are Physics-Informed Neural Networks (PINNs). They incorporate physical laws in the training and have increasingly been explored in the last few years. In this work, the prediction accuracy of PINNs is compared with that of conventional Deep Neural Networks (DNNs). The accuracy of a DNN depends on the amount of data provided for training. The change in prediction accuracy of PINNs and DNNs is assessed using a varying amount of training data. To ensure the correctness of the training data, they are obtained from analytical and numerical solutions of classical problems in fluid mechanics. The objective of this work is to quantify the fraction of training data relative to the maximum number of data points available in the computational domain, such that the accuracy gained with PINNs justifies the increased computational cost. Furthermore, the effects of the location of sampling points in the computational domain and noise in training data are analyzed. In the considered problems, it is found that PINNs outperform DNNs when the sampling points are positioned in the Regions of Interest. PINNs for predicting potential flow around a Rankine oval have shown a better robustness against noise in training data compared to DNNs. Both models show higher prediction accuracy when sampling points are randomly positioned in the flow domain as compared to a prescribed distribution of sampling points. The findings reveal new insights on the strategies to massively improve the prediction capabilities of PINNs with respect to DNNs.</p></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":null,"pages":null},"PeriodicalIF":6.2,"publicationDate":"2024-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S0167739X24003728/pdfft?md5=06cd01a8ffd48b55efbf83d9e5e961dc&pid=1-s2.0-S0167739X24003728-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141710309","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
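A PINN differs from a plain DNN in that a physics residual enters the loss alongside the data term. The toy sketch below illustrates this for the simple ODE du/dx + u = 0; the flow problems in the paper (e.g., potential flow around a Rankine oval) use different governing equations and sampling strategies, so this is only a schematic of how a physical constraint is imposed.

```python
import torch
import torch.nn as nn

# Generic PINN sketch: a small MLP u_theta(x) trained with a data loss plus a
# physics residual for the toy ODE du/dx + u = 0 (solution u = exp(-x)).
net = nn.Sequential(nn.Linear(1, 64), nn.Tanh(), nn.Linear(64, 64), nn.Tanh(),
                    nn.Linear(64, 1))
optimizer = torch.optim.Adam(net.parameters(), lr=1e-3)

x_data = torch.rand(32, 1)                       # sparse "training data" locations
u_data = torch.exp(-x_data)                      # analytical solution as labels
x_phys = torch.rand(256, 1, requires_grad=True)  # collocation (sampling) points

for step in range(2000):
    optimizer.zero_grad()
    # Data term: fit the available samples.
    loss_data = ((net(x_data) - u_data) ** 2).mean()
    # Physics term: residual of du/dx + u = 0 at the collocation points.
    u = net(x_phys)
    du_dx = torch.autograd.grad(u, x_phys, torch.ones_like(u), create_graph=True)[0]
    loss_phys = ((du_dx + u) ** 2).mean()
    (loss_data + loss_phys).backward()
    optimizer.step()
```

A conventional DNN baseline would drop `loss_phys` and train on `loss_data` alone.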
ST-TrajGAN: A synthetic trajectory generation algorithm for privacy preservation
IF 6.2 · CAS Q2 · Computer Science
Future Generation Computer Systems - The International Journal of eScience · Pub Date: 2024-07-09 · DOI: 10.1016/j.future.2024.07.011
{"title":"ST-TrajGAN: A synthetic trajectory generation algorithm for privacy preservation","authors":"","doi":"10.1016/j.future.2024.07.011","DOIUrl":"10.1016/j.future.2024.07.011","url":null,"abstract":"<div><p>The rapid growth of large-scale trajectory data poses privacy risks for location-based services (LBS), primarily through centralized storage and processing of data, as well as insecure data transmission channels (such as the Internet and wireless networks), which can lead to unauthorized access or manipulation of users' location information by attackers. To enhance trajectory privacy protection while improving the trajectory utility, this paper proposes an efficient and secure deep learning model Semantic and Transformer-based Trajectory Generative Adversarial Networks (ST-TrajGAN) for trajectory data generation and publication. First, this article introduces a semantic trajectory encoding model for preprocessing trajectory points. Through this model, trajectory points can be transformed into vector representations with semantic information. Next, by learning the spatio-temporal and semantic features of real trajectory data, a deep learning model is used to generate synthetic trajectories with more uncertainty and practicality. Furthermore, a novel TrajLoss loss metric function was crafted to gauge the trajectory similarity loss within the trained deep learning model. Ultimately, the efficacy of the generated synthetic trajectories and the model's utility are assessed through Trajectory-User Linking (TUL) and Trajectory Sharing Percentage (TSP) values on three authentic Location-Based Services (LBS) datasets. Numerous experiments have shown that our method outperforms other methods in terms of privacy protection effectiveness and utility.</p></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":null,"pages":null},"PeriodicalIF":6.2,"publicationDate":"2024-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141697844","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
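At its core, the approach trains a generator against a discriminator on real trajectories. The sketch below shows only that generic adversarial setup for fixed-length trajectories; ST-TrajGAN's semantic encoding, Transformer-based components, and TrajLoss term are not reproduced here.

```python
import torch
import torch.nn as nn

# Minimal GAN sketch for fixed-length synthetic trajectories (T points of
# lat/lon), included only to illustrate the adversarial setup the abstract
# describes; all sizes are assumptions.
T, Z = 24, 32                                    # trajectory length, noise size
G = nn.Sequential(nn.Linear(Z, 128), nn.ReLU(), nn.Linear(128, T * 2), nn.Tanh())
D = nn.Sequential(nn.Linear(T * 2, 128), nn.ReLU(), nn.Linear(128, 1))
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

def train_step(real_traj):                       # real_traj: (batch, T*2) in [-1, 1]
    z = torch.randn(real_traj.size(0), Z)
    fake = G(z)
    # Discriminator: distinguish real from generated trajectories.
    d_loss = bce(D(real_traj), torch.ones(real_traj.size(0), 1)) + \
             bce(D(fake.detach()), torch.zeros(real_traj.size(0), 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()
    # Generator: fool the discriminator.
    g_loss = bce(D(fake), torch.ones(real_traj.size(0), 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
    return d_loss.item(), g_loss.item()
```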
3DSGIMD: An accurate and interpretable molecular property prediction method using 3D spatial graph focusing network and structure-based feature fusion
IF 6.2 · CAS Q2 · Computer Science
Future Generation Computer Systems - The International Journal of eScience · Pub Date: 2024-07-08 · DOI: 10.1016/j.future.2024.07.004
{"title":"3DSGIMD: An accurate and interpretable molecular property prediction method using 3D spatial graph focusing network and structure-based feature fusion","authors":"","doi":"10.1016/j.future.2024.07.004","DOIUrl":"10.1016/j.future.2024.07.004","url":null,"abstract":"<div><p>A comprehensive representation of molecular structure is essential for establishing accurate and reliable molecular property prediction models. However, fully extracting and learning intrinsic molecular structure information, especially spatial structure features, remains a challenging task, leading that many molecular property prediction models still have no enough accuracy for the real application. In this study, we developed an innovative and interpretable deep learning method, termed 3DSGIMD, which predicted the molecular properties by integrating and learning the spatial structure and substructure information of molecules at multiple levels, and generated the focusing weights by aggregating spatial and adjacency information of molecules to improve understanding of prediction results. We evaluated the model on 10 public datasets and 14 cell-based phenotypic screening datasets. Extensive experimental results indicated that 3DSGIMD achieved superior or comparable predictive performance compared with some existing models, and the individually designed components contributed significantly to the advanced performance of the model. In addition, we also provided insight into the interpretability of our model via visualizing the focusing weights and perturbation analysis, and the results showed that 3DSGIMD can pinpoint crucial local structures and bits of molecular descriptors associated with the predicted properties. In summary, 3DSGIMD is a competitive molecular property prediction method that holds the potential to aid drug design and optimization.</p></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":null,"pages":null},"PeriodicalIF":6.2,"publicationDate":"2024-07-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141639445","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
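A 3D spatial graph network first needs a spatial neighborhood structure over atoms. The snippet below is a generic illustration of building such a graph from 3D coordinates with a distance cutoff; the cutoff value is an assumption, and 3DSGIMD's focusing network and feature fusion are not modeled.

```python
import numpy as np

def build_spatial_graph(coords, cutoff=4.0):
    """Build a 3D spatial graph from atom coordinates with a distance cutoff.

    coords : (n_atoms, 3) array of 3D positions
    cutoff : radius (same units as coords, e.g. angstroms) within which two
             atoms are connected; the value here is illustrative

    Returns an edge list and the corresponding distances, i.e. the kind of
    spatial neighborhood a 3D graph network can aggregate over.
    """
    diff = coords[:, None, :] - coords[None, :, :]     # pairwise displacements
    dist = np.linalg.norm(diff, axis=-1)               # pairwise distances
    src, dst = np.where((dist < cutoff) & (dist > 0))  # exclude self-loops
    return np.stack([src, dst]), dist[src, dst]
```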
DBPBFT: A hierarchical PBFT consensus algorithm with dual blockchain for IoT
IF 6.2 · CAS Q2 · Computer Science
Future Generation Computer Systems - The International Journal of eScience · Pub Date: 2024-07-08 · DOI: 10.1016/j.future.2024.07.007
{"title":"DBPBFT: A hierarchical PBFT consensus algorithm with dual blockchain for IoT","authors":"","doi":"10.1016/j.future.2024.07.007","DOIUrl":"10.1016/j.future.2024.07.007","url":null,"abstract":"<div><p>The Internet of Things (IoT) is composed of smart devices connected to a network that can send and receive large amounts of data with other devices, generating a lot of data for processing and analysis. Due to the fact that every transaction in blockchain is recorded, placed in a data block, and added to an immutable and secure data chain, blockchain is becoming one of the most promising solutions for enhancing IoT security issues. As more devices become intelligent, the scale of IoT systems, including residential IoT and industrial IoT, is on the rise. Consequently, the issue of resource consumption, stemming from the escalating system communication overhead, is becoming more pronounced. In order to improve the efficiency of the consensus process for residential IoT and reduce the overhead caused by the consensus process, this paper proposes a hierarchical PBFT consensus algorithm With Dual Blockchain for IoT (DBPBFT). Compared to industrial IoT, DBPBFT is more suitable for residential IoT with small scope and clear data classification. DBPBFT separates the responsibilities of dual chains, improving system scalability while also enhancing blockchain security. A chain is divided into several small groups, each responsible for a type of data, reducing system overhead and communication overhead. To avoid unnecessary view-change as much as possible, before consensus begins, each group will select the current view primary node based on reputation values. The simulation results show that the DBPBFT algorithm is superior to traditional algorithms. In terms of reducing communication overhead, compared with EPBFT and DPNPBFT, DBPBFT has increased by 73.8% and 53.1%, respectively. In terms of consensus efficiency, DBPBFT has improved by 34% compared to DPNPBFT.</p></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":null,"pages":null},"PeriodicalIF":6.2,"publicationDate":"2024-07-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141691441","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
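DBPBFT lets each group pick the primary node of the current view from reputation values to avoid unnecessary view changes. The function below is a hypothetical illustration of such a rule (rotate among the highest-reputation nodes as views advance); the paper's actual reputation model and update rules are not specified here.

```python
def select_primary(nodes, view):
    """Reputation-based primary selection for one group, per view.

    nodes : dict node_id -> reputation score (higher is more trusted)
    view  : current view number

    Hypothetical rule: rank nodes by reputation and rotate among the
    top-ranked half as the view number advances, so low-reputation nodes
    are skipped and unnecessary view changes become less likely.
    """
    ranked = sorted(nodes, key=lambda n: nodes[n], reverse=True)
    candidates = ranked[:max(1, len(ranked) // 2)]   # keep the upper half
    return candidates[view % len(candidates)]

# Example: node "c" has the highest reputation, so it leads view 0.
primary = select_primary({"a": 0.4, "b": 0.7, "c": 0.9, "d": 0.2}, view=0)
```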
PowerTrain: Fast, generalizable time and power prediction models to optimize DNN training on accelerated edges
IF 6.2 · CAS Q2 · Computer Science
Future Generation Computer Systems - The International Journal of eScience · Pub Date: 2024-07-06 · DOI: 10.1016/j.future.2024.07.001
{"title":"PowerTrain: Fast, generalizable time and power prediction models to optimize DNN training on accelerated edges","authors":"","doi":"10.1016/j.future.2024.07.001","DOIUrl":"10.1016/j.future.2024.07.001","url":null,"abstract":"<div><p>Accelerated edge devices, like Nvidia’s Jetson with 1000+ CUDA cores, are increasingly used for DNN training and federated learning, rather than just for inferencing workloads. A unique feature of these compact devices is their fine-grained control over CPU, GPU, memory frequencies, and active CPU cores, which can limit their power envelope in a constrained setting while throttling the compute performance. Given this vast 10k+ parameter space, selecting a power mode for dynamically arriving training workloads to exploit power–performance trade-offs requires costly profiling for each new workload, or is done <em>ad hoc</em>. We propose <em>PowerTrain</em>, a transfer-learning approach to accurately predict the power and time that will be consumed when we train a given DNN workload (model + dataset) using any specified power mode (CPU/GPU/memory frequencies, core-count). It requires a one-time offline profiling of 1000s of power modes for a reference DNN workload on a single Jetson device (Orin AGX) to build Neural Network (NN) based prediction models for time and power. These NN models are subsequently transferred (retrained) for a new DNN workload, or even a different Jetson device, with minimal additional profiling of just 50 power modes to make accurate time and power predictions. These are then used to rapidly construct the Pareto front and select the optimal power mode for the new workload, e.g., to minimize training time while meeting a power limit. PowerTrain’s predictions are robust to new workloads, exhibiting a low MAPE of <span><math><mrow><mo>&lt;</mo><mn>6</mn><mtext>%</mtext></mrow></math></span> for power and <span><math><mrow><mo>&lt;</mo><mn>15</mn><mtext>%</mtext></mrow></math></span> for time on six new training workloads (MobileNet, YOLO, BERT, LSTM, etc.) for up to 4400 power modes, when transferred from a ResNet reference workload on Orin AGX. It is also resilient when transferred to two entirely new Jetson devices (Xavier AGX and Jetson Orin Nano) with prediction errors of <span><math><mrow><mo>&lt;</mo><mn>14</mn><mo>.</mo><mn>5</mn><mtext>%</mtext></mrow></math></span> and <span><math><mrow><mo>&lt;</mo><mn>11</mn><mtext>%</mtext></mrow></math></span>. These outperform baseline predictions by more than 10% and baseline optimizations by up to 45% on time and 88% on power.</p></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":null,"pages":null},"PeriodicalIF":6.2,"publicationDate":"2024-07-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141691287","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
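Once time and power predictions are available for the candidate power modes, PowerTrain constructs a Pareto front and picks an optimal mode, e.g., the fastest one under a power limit. The sketch below shows only that selection step over predicted (time, power) pairs; the NN predictors and their transfer-learning step are assumed to exist elsewhere and are not modeled.

```python
import numpy as np

def pareto_front(times, powers):
    """Return indices of power modes on the (time, power) Pareto front.

    A mode is Pareto-optimal if no other mode is at least as good in both
    time and power and strictly better in one. Generic selection sketch;
    the time/power values would come from the transferred predictors.
    """
    pts = np.stack([np.asarray(times), np.asarray(powers)], axis=1)
    keep = []
    for i, p in enumerate(pts):
        dominated = np.any(np.all(pts <= p, axis=1) & np.any(pts < p, axis=1))
        if not dominated:
            keep.append(i)
    return keep

def best_mode_under_cap(times, powers, power_cap):
    """Pick the fastest Pareto-optimal power mode within a power limit."""
    front = [i for i in pareto_front(times, powers) if powers[i] <= power_cap]
    return min(front, key=lambda i: times[i]) if front else None
```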