Kemal Ebcioğlu, Batuhan Bulut, Atakan Doğan, Gürhan Küçük, İsmail San
{"title":"1,024-FPGA DES Supercomputer on the AWS Cloud","authors":"Kemal Ebcioğlu, Batuhan Bulut, Atakan Doğan, Gürhan Küçük, İsmail San","doi":"10.1002/cpe.70051","DOIUrl":"https://doi.org/10.1002/cpe.70051","url":null,"abstract":"<div>\u0000 \u0000 <p>We present a 1,024-FPGA DES supercomputer accelerator that is automatically compiled from a single-threaded sequential DES key search application by means of our High-Level Synthesis compiler. Our 1,024-FPGA supercomputer is deployed on several Amazon Web Services (AWS) EC2 F1 instance platforms from different AWS regions. Consequently, it can be considered the first multi-chip application-specific supercomputer that is scattered across multiple geographically distributed data centers around the world. Furthermore, invoking our 1,024-FPGA DES supercomputer is functionally identical to invoking the single-threaded sequential DES application the supercomputer accelerator is compiled from. Our 1,024-FPGA supercomputer achieves 3.016E+12 keys/sec, which is 5,286,000 times the 5.706E+5 keys/sec achieved by an AWS EC2 m5.8xlarge Xeon x86 machine executing the original sequential application.</p>\u0000 </div>","PeriodicalId":55214,"journal":{"name":"Concurrency and Computation-Practice & Experience","volume":"37 6-8","pages":""},"PeriodicalIF":1.5,"publicationDate":"2025-03-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143689056","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Quantum Computer Architecture for Quantum Error Correction With Distributing Process to Multiple Temperature Layers","authors":"Ryuji Ukai, Chihiro Yoshimura, Hiroyuki Mizuno","doi":"10.1002/cpe.8351","DOIUrl":"https://doi.org/10.1002/cpe.8351","url":null,"abstract":"<div>\u0000 \u0000 <p>Quantum computers are capable of performing large-scale calculations in a shorter time than conventional classical computers. Because quantum computers are realized in microscopic physical systems, unintended changes in the quantum state due to interaction with the environment are unavoidable, and they lead to errors in computation. Therefore, quantum error correction is needed to detect and correct errors that have occurred. In this paper, we propose a quantum computer architecture for quantum error correction that takes into account that the components of a quantum computer with quantum dots in silicon are divided into multiple temperature layers inside and outside the dilution refrigerator. Based on the required performance and possible processing capacity, each component was assigned to one of the temperature layers: the chip with qubits and the chip for generation of precise analog signals to control qubits are placed on the 100 mK and 4 K stages inside the dilution refrigerator, respectively, while real-time digital processing is performed outside the dilution refrigerator. We then experimentally demonstrate the digital control sequence for quantum error correction combined with a simulator that simulates quantum states based on control commands from the digital processing system. The simulator enables a proof-of-principle experiment of the system architecture independent of the development of the chips. The real-time processing, including determination of the feed-forward operation and transmission of feed-forward operation commands, is carried out by a field-programmable gate array (FPGA) outside the dilution refrigerator within 0.01 ms for bit-flip or phase-flip error corrections. 
This is a sufficiently short time compared to the assumed relaxation time, which is the approximate time that the quantum state can be preserved, meaning that our proposed architecture is applicable to quantum error correction.</p>\u0000 </div>","PeriodicalId":55214,"journal":{"name":"Concurrency and Computation-Practice & Experience","volume":"37 6-8","pages":""},"PeriodicalIF":1.5,"publicationDate":"2025-03-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143646325","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Efficient Hybrid Recommender System for e-Learning Based on Cloud Content in Educational Web Services","authors":"Baoqing Tai, Xianxian Yang, Ju Chong, Lei Chen","doi":"10.1002/cpe.70059","DOIUrl":"https://doi.org/10.1002/cpe.70059","url":null,"abstract":"<div>\u0000 \u0000 <p>In this article, we present a novel method for multimodal learning using Siamese networks to recommend appropriate educational content on e-learning platforms. One of the main challenges in current recommendation systems is their inability to effectively personalize content based on the unique needs and preferences of individual learners. Existing methods often struggle to capture long-term dependencies and intricate patterns in user behavior, leading to irrelevant or inadequate content suggestions. To address this, our approach utilizes two residual Siamese networks based on Long Short-Term Memory (LSTM) and Recurrent Convolutional Neural Networks (RCNN). This hybrid model effectively captures both sequential and contextual information, leveraging LSTM's strength in handling long-term dependencies and RCNN's capability to extract local features through convolutional operations. By analyzing complex patterns within the data, our method significantly enhances recommendation accuracy, considering both temporal sequences and contextual relationships. The Siamese network encodes user and item data into a high-dimensional feature space, positioning similar users and items closer together. The residual connections allow the model to capture both low-level and high-level features, leading to richer representations. Extensive experiments on real-world e-learning datasets demonstrate the superiority of our method over traditional recommendation techniques, evaluated through metrics such as precision, recall, and accuracy. 
The results show that our approach not only improves recommendation accuracy but also enhances the diversity and relevance of suggested content, offering more personalized learning experiences that cater to the individual needs and preferences of learners.</p>\u0000 </div>","PeriodicalId":55214,"journal":{"name":"Concurrency and Computation-Practice & Experience","volume":"37 6-8","pages":""},"PeriodicalIF":1.5,"publicationDate":"2025-03-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143689058","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"GPU Acceleration of the GWO Optimization Algorithm: Application to the Solution of Large Nonlinear Equation Systems","authors":"Bruno Silva, Luiz Guerreiro Lopes","doi":"10.1002/cpe.70043","DOIUrl":"https://doi.org/10.1002/cpe.70043","url":null,"abstract":"<div>\u0000 \u0000 <p>Large-scale optimization problems present formidable challenges in various scientific and engineering domains. To address these challenges, population-based computational intelligence algorithms have emerged as potent tools capable of being parallelized. Among these algorithms, the gray wolf optimizer (GWO) stands out for its ability to simulate the hierarchical structure and hunting behaviors of gray wolves in the wild and has been used successfully to solve several hard optimization problems. However, the study of its applicability for solving nonlinear equation systems (NESs), which is arguably one of the most difficult classes of numerical problems, poses significant challenges in terms of computational efficiency and scalability. To address this gap, this article introduces a novel GPU-based parallel implementation of the GWO algorithm aimed at addressing the particular challenges of optimizing large-scale NESs by employing the substantial parallel processing capabilities of GPUs. The GPU-based version of GWO was developed using the Julia programming language, and its performance was evaluated with two GPUs of professional grade: the NVIDIA Tesla V100 SXM2 with 32 GB VRAM and the NVIDIA A100 PCIe with 80 GB VRAM. The testing involved a series of complex, scalable NESs with dimensions ranging from 500 to 4000. 
The results obtained demonstrate average speedups ranging from 154.9× to 250.2× for the V100 GPU and 204.0× to 923.9× for the A100. These results highlight the effectiveness of the proposed GPU-based acceleration technique in reducing computation times.</p>\u0000 </div>","PeriodicalId":55214,"journal":{"name":"Concurrency and Computation-Practice & Experience","volume":"37 6-8","pages":""},"PeriodicalIF":1.5,"publicationDate":"2025-03-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143689054","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On Reducing Stretch in Spanning Trees","authors":"Sinchan Sengupta, Sathya Peri","doi":"10.1002/cpe.70019","DOIUrl":"https://doi.org/10.1002/cpe.70019","url":null,"abstract":"<p>A parameter crucial for preserving the underlying shortest path information in spanning tree construction is called <i>stretch</i>. It is the ratio of the distance between a pair of nodes in the spanning tree to their shortest distance in the graph. In this paper, we present a distributed heuristic <i>LSTree</i> that constructs a <i>Minimum Average Stretch Spanning Tree</i> of an <i>n</i>-node undirected and unweighted graph in 𝒪(<i>n</i>) rounds of the CONGEST model, assuming the nodes know the size of the network. We stress that the LSTree protocol is the first use of <i>Betweenness Centrality</i> in constructing low-stretch trees. The heuristic outperforms the current benchmark algorithm of Alon et al. and other spanning tree construction techniques when tested against synthetic and real-world graph inputs. This paper concludes by giving a distributed edge addition technique for building an overlay while reducing the maximum stretch in the spanning tree generated by LSTree. The overlay is a relaxation in the topological requirement, albeit equivalent in functionality to the network backbone. 
Hence, in this way, the paper considers a holistic view towards building low-stretch spanning trees: reducing both average stretch and max stretch in a single approach.</p>","PeriodicalId":55214,"journal":{"name":"Concurrency and Computation-Practice & Experience","volume":"37 6-8","pages":""},"PeriodicalIF":1.5,"publicationDate":"2025-03-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/cpe.70019","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143632785","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Quadratic Estimation for Discrete Time System With State Equality Constraints and Composite Disturbances","authors":"Lei Tan, Xinmin Song","doi":"10.1002/cpe.70047","DOIUrl":"https://doi.org/10.1002/cpe.70047","url":null,"abstract":"<div>\u0000 \u0000 <p>This article investigates the state estimation issue for discrete-time systems subject to composite disturbances (CD) and state equality constraints, in which the CD include unknown inputs and non-Gaussian noise. To enhance the estimation performance under the influence of CD, a quadratic estimator incorporating state equality constraints is proposed. Initially, the Kronecker algebra technique is used to compute the second-order Kronecker powers of the raw vectors and a quadratic dynamical system is formed by combining the original vectors with their corresponding second-order Kronecker powers. Additionally, a specific condition is imposed to suppress the interference caused by unknown inputs. On this basis, a quadratic state unconstrained estimator (QSUE) is developed based on the minimum variance unbiased criterion. Furthermore, a quadratic state constrained estimator (QSCE) is designed by applying the projection technique to the QSUE. Finally, a simulation example demonstrates that the QSCE achieves better estimation performance than the QSUE and exhibits greater adaptability to CD.</p>\u0000 </div>","PeriodicalId":55214,"journal":{"name":"Concurrency and Computation-Practice & Experience","volume":"37 6-8","pages":""},"PeriodicalIF":1.5,"publicationDate":"2025-03-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143632786","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Enhanced Gradient-Based Optimizer Algorithm With Multi-Strategy for Feature Selection","authors":"Tianbao Liu, Yang Li, Xiwen Qin","doi":"10.1002/cpe.70034","DOIUrl":"https://doi.org/10.1002/cpe.70034","url":null,"abstract":"<div>\u0000 \u0000 <p>Feature selection is an effective tool for processing data. It is employed to eliminate redundant or irrelevant features and select optimal feature subsets to improve the performance of learning models. The gradient-based optimizer (GBO), which comprises the gradient search rule (GSR) and the local escaping operation (LEO), has received extensive attention for solving various optimization problems. However, when addressing complex optimization and feature selection problems, GBO exhibits deficiencies in balancing global exploration and exploitation, and tends to converge to local optima. This article presents a modified version of GBO, named FWZGBO, for solving feature selection problems. Firstly, inspired by the iterative method and its theory, we propose an enhanced strategy for significantly improving the search capability of GSR. This strategy utilizes an optimal fourth-order iterative method to perform the corresponding function of the second-order Newton's method. Secondly, we suggest an enhanced refraction learning approach with Gaussian distribution to help the algorithm escape from local optima and enhance population diversity. Thirdly, this work devises a new adaptive weight based on the cosine strategy in both GSR and LEO to attain a harmonious balance between exploration and exploitation. To validate the performance of the FWZGBO algorithm, 28 benchmark functions and 20 well-known datasets are tested and compared with 14 optimization algorithms. The experimental results show that FWZGBO is significantly superior in solving global optimization and feature selection problems. 
Meanwhile, the effectiveness of the FWZGBO algorithm is validated using the Friedman test with the corresponding post-hoc test.</p>\u0000 </div>","PeriodicalId":55214,"journal":{"name":"Concurrency and Computation-Practice & Experience","volume":"37 6-8","pages":""},"PeriodicalIF":1.5,"publicationDate":"2025-03-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143602780","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Dynamic Algorithms for Approximate Steiner Trees","authors":"Hemraj Raikwar, Harshil Sadharakiya, Sushanta Karmakar","doi":"10.1002/cpe.70040","DOIUrl":"https://doi.org/10.1002/cpe.70040","url":null,"abstract":"<div>\u0000 \u0000 <p>This study investigates the dynamic Steiner tree problem. The objective of the Steiner tree problem is to compute a minimum-weight tree connecting a set of designated vertices called terminals in a connected weighted graph with positive real edge weights. A dynamic graph is one in which the set of edges, the set of vertices, or both can change over time. Here, we focus on dynamic graphs where edges can change over time. The work begins by establishing a lower bound on the update time required to maintain an MST-heuristic-based (2 − ϵ)-approximate Steiner tree (where ϵ is a small fraction) in a general graph undergoing edge insertions or deletions. Subsequently, we propose two dynamic algorithms: a fully dynamic algorithm to maintain an approximate Steiner tree in planar graphs and an incremental algorithm to maintain an approximate Steiner tree in general graphs. We focus on edge-weighted connected graphs. The graph undergoes dynamic updates where edges with specific weights can be either inserted or deleted. The goal is to efficiently compute a Steiner tree of the updated graph, guaranteeing a solution quality (Steiner tree cost) within a good factor of the optimal Steiner tree. 
In the fully dynamic case, our analysis demonstrates that the presented algorithm maintains an approximation factor of (2 + ϵ). The worst-case update time for processing a series of <i>k</i> updates is Õ(|S|²√(n…</p></div>","PeriodicalId":55214,"journal":{"name":"Concurrency and Computation-Practice & Experience","volume":"37 6-8","pages":""},"PeriodicalIF":1.5,"publicationDate":"2025-03-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143602702","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Adaptive Dynamic Service Placement Approach for Edge-Enabled Vehicular Networks Based on SAC and RF","authors":"Yuan Zeng, Hengzhou Ye, Gaoxing Li","doi":"10.1002/cpe.70041","DOIUrl":"https://doi.org/10.1002/cpe.70041","url":null,"abstract":"<div>\u0000 \u0000 <p>Edge computing offers crucial computational and storage support to vehicles by providing various services within the framework of the Internet of Vehicles in intelligent transportation systems. Service placement (SP) becomes particularly challenging when edge resources are limited and vehicles exhibit high mobility. Many current dynamic placement methods rely on real-time placement, often leading to increased costs, instability, and frequent changes. This paper proposes SACRF-SP, an adaptive dynamic service placement algorithm based on Soft Actor-Critic (SAC) and Random Forest (RF), for dynamic urban traffic scenarios. The algorithm uses the SAC method to identify optimal placement nodes and integrates an RF model to predict service request trends. A decision network is constructed to assess the necessity of redeployment. Extensive simulation experiments demonstrate that SACRF-SP significantly reduces latency, resource usage, and the frequency of redeployment.</p>\u0000 </div>","PeriodicalId":55214,"journal":{"name":"Concurrency and Computation-Practice & Experience","volume":"37 6-8","pages":""},"PeriodicalIF":1.5,"publicationDate":"2025-03-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143602720","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Robust Lattice-Based Post-Quantum Three-Party Key Exchange Scheme for Mobile Devices","authors":"Akanksha Singh, Harish Chandra, Saurabh Rana","doi":"10.1002/cpe.70036","DOIUrl":"https://doi.org/10.1002/cpe.70036","url":null,"abstract":"<div>\u0000 \u0000 <p>In this paper, we introduce a lattice-based authenticated three-party key agreement scheme for mobile devices with the aim of achieving both post-quantum security and efficiency. Our scheme is inspired by a previously developed authenticated key exchange protocol. We revisit a recently proposed communication-efficient three-party password-authenticated key exchange scheme, show that it is not fully correct, and demonstrate that it is vulnerable to attacks on user anonymity and to impersonation attacks. We provide an enhanced scheme that is both effective and resistant to these attacks. We also demonstrate its security in the Random Oracle Model (ROM). A comparative analysis covering performance, security evaluation, energy consumption, and packet loss rate is also provided, demonstrating the suitability of the proposed design.</p>\u0000 </div>","PeriodicalId":55214,"journal":{"name":"Concurrency and Computation-Practice & Experience","volume":"37 6-8","pages":""},"PeriodicalIF":1.5,"publicationDate":"2025-03-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143602781","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}