{"title":"HTLL: Latency-Aware Scalable Blocking Mutex","authors":"Ziqu Yu;Jinyu Gu;Zijian Wu;Nian Liu;Jian Guo","doi":"10.1109/TPDS.2025.3526859","DOIUrl":"https://doi.org/10.1109/TPDS.2025.3526859","url":null,"abstract":"This article finds that existing mutex locks suffer from throughput collapses or latency collapses, or both, in the oversubscribed scenarios where applications create more threads than the CPU core number, e.g., database applications like mysql use per thread per connection. We make an in-depth performance analysis on existing locks and then identify three design rules for the lock primitive to achieve scalable performance in oversubscribed scenarios. First, to achieve ideal throughput, the lock design should keep adequate number of active competitors. Second, the active competitors should be arranged carefully to avoid the lock-holder preemption problem. Third, to meet latency requirements, the lock design should track the latency of each competitor and reorder the competitors according to the latency requirement. We propose a new lock library called HTLL that satisfies these rules and achieves both high throughput and low latency even when the cores are oversubscribed. HTLL only requires minimal human effort (e.g., add several lines of code) to annotate the latency requirement. Evaluation results show that HTLL achieves scalable performance in the oversubscribed scenarios. Specifically, for the real-world database, LMDB, HTLL can reduce the tail latency by up to 97% with only an average 5% degradation in throughput, compared with state-of-the-art alternatives such as Malthusian, CST, and Mutexee locks; In comparison to the widely used pthread_mutex_lock, it can increase the throughput by up to 22% and decrease the latency by up to 80%. Meanwhile, for the under-subscribed scenarios, it also shows comparable performance than state-of-the-art blocking locks.","PeriodicalId":13257,"journal":{"name":"IEEE Transactions on Parallel and Distributed Systems","volume":"36 3","pages":"471-486"},"PeriodicalIF":5.6,"publicationDate":"2025-01-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143105822","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"SMDP-Based Dynamic Batching for Improving Responsiveness and Energy Efficiency of Batch Services","authors":"Yaodan Xu;Sheng Zhou;Zhisheng Niu","doi":"10.1109/TPDS.2025.3526283","DOIUrl":"https://doi.org/10.1109/TPDS.2025.3526283","url":null,"abstract":"For servers incorporating parallel computing resources, batching is a pivotal technique for providing efficient and economical services at scale. Parallel computing resources exhibit heightened computational and energy efficiency when operating with larger batch sizes. However, in the realm of online services, the adoption of a larger batch size may lead to longer response times. This paper aims to provide a dynamic batching scheme that delicately balances latency and efficiency. The system is modeled as a batch service queue with size-dependent service times. Then, the design of dynamic batching is formulated as a semi-Markov decision process (SMDP) problem, with the objective of minimizing the weighted sum of average response time and average power consumption. A method is proposed to derive an approximate optimal SMDP solution, representing the chosen dynamic batching policy. By introducing an abstract cost to reflect the impact of “tail” states, the space complexity and the time complexity of the procedure can decrease by 63.5% and 98%, respectively. Numerical results showcase the superiority of SMDP-based batching policies across various parameter setups. Additionally, the proposed scheme exhibits noteworthy flexibility in balancing power consumption and latency.","PeriodicalId":13257,"journal":{"name":"IEEE Transactions on Parallel and Distributed Systems","volume":"36 4","pages":"659-674"},"PeriodicalIF":5.6,"publicationDate":"2025-01-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143535504","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Paralfetch: Fast Application Launch on Personal Computing/Communication Devices","authors":"Junhee Ryu;Dongeun Lee;Kang G. Shin;Kyungtae Kang","doi":"10.1109/TPDS.2024.3525337","DOIUrl":"https://doi.org/10.1109/TPDS.2024.3525337","url":null,"abstract":"<monospace>Paralfetch</monospace> speeds up application launches on personal computing/communication devices, by means of: 1) accurate collection of launch-related disk read requests, 2) pre-scheduling of these requests to improve I/O throughput during prefetching, and 3) overlapping application execution with disk prefetching for hiding disk access time from the execution of the application. We implemented <monospace>Paralfetch</monospace> under Linux kernels on a desktop/laptop PC, a Raspberry Pi 3 board, and an Android smartphone. Tests with popular applications show that <monospace>Paralfetch</monospace> significantly reduces application launch times on flash-based drives and hard disk drives, and it outperforms <italic>GSoC Prefetch</i> Lichota et al. 2007 and <monospace>FAST</monospace> Joo et al. 2011, which are representative application prefetchers available for Linux-based systems.","PeriodicalId":13257,"journal":{"name":"IEEE Transactions on Parallel and Distributed Systems","volume":"36 4","pages":"616-632"},"PeriodicalIF":5.6,"publicationDate":"2025-01-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143521388","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"HpT: Hybrid Acceleration of Spatio-Temporal Attention Model Training on Heterogeneous Manycore Architectures","authors":"Saiman Dahal;Pratyush Dhingra;Krishu Kumar Thapa;Partha Pratim Pande;Ananth Kalyanaraman","doi":"10.1109/TPDS.2024.3522781","DOIUrl":"https://doi.org/10.1109/TPDS.2024.3522781","url":null,"abstract":"Transformer models have become widely popular in numerous applications, and especially for building foundation large language models (LLMs). Recently, there has been a surge in the exploration of transformer-based architectures in non-LLM applications. In particular, the self-attention mechanism within the transformer architecture offers a way to exploit any hidden relations within data, making it widely applicable for a variety of spatio-temporal tasks in scientific computing domains (e.g., weather, traffic, agriculture). Most of these efforts have primarily focused on accelerating the inference phase. However, the computational resources required to train these attention-based models for scientific applications remain a significant challenge to address. Emerging non-volatile memory (NVM)-based processing-in-memory (PIM) architectures can achieve higher performance and better energy efficiency than their GPU-based counterparts. However, the frequent weight updates during training would necessitate write operations to NVM cells, posing a significant barrier for considering stand-alone NVM-based PIM architectures. In this paper, we present <monospace>HpT</monospace>, a new hybrid approach to accelerate the training of attention-based models for scientific applications. Our approach is hybrid at two different layers: at the software layer, our approach dynamically switches from a full-parameter training mode to a lower-parameter training mode by incorporating intrinsic dimensionality; and at the hardware layer, our approach harnesses the combined power of GPUs, resistive random-access memory (ReRAM)-based PIM devices, and systolic arrays. This software-hardware co-design approach is aimed at adaptively reducing both runtime and energy costs during the training phase, without compromising on quality. Experiments on four concrete real-world scientific applications demonstrate that our hybrid approach is able to significantly reduce training time (up to <inline-formula><tex-math>$11.9times$</tex-math></inline-formula>) and energy consumption (up to <inline-formula><tex-math>$12.05times$</tex-math></inline-formula>), compared to the corresponding full-parameter training executing on only GPUs. Our approach serves as an example for accelerating the training of attention-based models on heterogeneous platforms including ReRAMs.","PeriodicalId":13257,"journal":{"name":"IEEE Transactions on Parallel and Distributed Systems","volume":"36 3","pages":"407-421"},"PeriodicalIF":5.6,"publicationDate":"2025-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142992862","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Sparrow: Expediting Smart Contract Execution for Blockchain Sharding via Inter-Shard Caching","authors":"Junyuan Liang;Peiyuan Yao;Wuhui Chen;Zicong Hong;Jianting Zhang;Ting Cai;Min Sun;Zibin Zheng","doi":"10.1109/TPDS.2024.3522016","DOIUrl":"https://doi.org/10.1109/TPDS.2024.3522016","url":null,"abstract":"Sharding is a promising solution to scale blockchain by separating the system into multiple shards to process transactions in parallel. However, due to state separation and shard isolation, it is still challenging to efficiently support smart contracts on a blockchain sharding system where smart contracts can interact with each other, involving states maintained by multiple shards. Specifically, existing sharding systems adopt a costly multi-step collaboration mechanism to execute smart contracts, resulting in long latency and low throughput. This article proposes <small>Sparrow</small>, a blockchain sharding protocol achieving one-step execution for smart contracts. To break shard isolation, inspired by non-local hotspot data caching in traditional databases, we propose a new idea of <i>inter-shard caching</i>, allowing a shard to prefetch and cache frequently accessed contract states of other shards. The miner can thus use the inter-shard cache to pre-execute a pending transaction, retrieve all its contract invocations, and commit it to multiple shards in one step. Particularly, we first propose a speculative dispersal cache synchronisation mechanism for efficient and secure cache synchronization across shards in Byzantine environments. Then, we propose a multi-branch exploration mechanism to solve the rollback problem during the optimistic one-step execution of contract invocations with dependencies. We also present a series of conflict resolution mechanisms to decrease the rollback caused by inherent transaction conflicts. We implement prototypes for <small>Sparrow</small> and existing sharding systems, and the evaluation shows that <small>Sparrow</small> improves the throughput by <inline-formula><tex-math>$2.44times$</tex-math></inline-formula> and reduces the transaction latency by 30% compared with the existing sharding systems.","PeriodicalId":13257,"journal":{"name":"IEEE Transactions on Parallel and Distributed Systems","volume":"36 3","pages":"377-390"},"PeriodicalIF":5.6,"publicationDate":"2024-12-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142992863","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Co-Designing Transformer Architectures for Distributed Inference With Low Communication","authors":"Jiangsu Du;Yuanxin Wei;Shengyuan Ye;Jiazhi Jiang;Xu Chen;Dan Huang;Yutong Lu","doi":"10.1109/TPDS.2024.3521582","DOIUrl":"https://doi.org/10.1109/TPDS.2024.3521582","url":null,"abstract":"Transformer models have shown significant success in a wide range of tasks. However, the massive resources required for its inference prevent deployment on a single device with relatively constrainted resources, thus leaving a high threshold of integrating their advancements. Observing scenarios such as smart home applications on edge devices and cloud deployment on commodity hardware, it is promising to distribute Transformer inference across multiple devices. Unfortunately, due to the tightly-coupled feature of Transformer model, existing model parallelism approaches necessitate frequent communication to resolve data dependencies, making them unacceptable for distributed inference, especially under relatively weak interconnection. In this paper, we propose DeTransformer, a communication-efficient distributed Transformer inference system. The key idea of DeTransformer involves the co-design of Transformer architecture to reduce the communication during distributed inference. In detail, DeTransformer is based on a novel block parallelism approach, which restructures the original Transformer layer with a single block to the decoupled layer with multiple sub-blocks. Thus, it can exploit model parallelism between sub-blocks. Next, DeTransformer contains an adaptive execution approach that strikes a trade-off among communication capability, computing power and memory budget over multiple devices. It incorporates a two-phase planning for execution, namely static planning and runtime planning. The static planning runs offline, containing a profiling procedure and a weight placement strategy before execution. The runtime planning dynamically determines the optimal parallel computing strategy from an expertly crafted search space based on real-time requests. Notably, this execution approach can adapt to heterogeneous devices by distributing workload based on devices’ computing capabilities. We conduct experiments for both auto-regressive and auto-encoder tasks of Transformer models. Experimental results show that DeTransformer can reduce distributed inference latency by up to 2.81× compared to the SOTA approach on 4 devices, while effectively maintaining task accuracy and a consistent model size.","PeriodicalId":13257,"journal":{"name":"IEEE Transactions on Parallel and Distributed Systems","volume":"36 4","pages":"717-730"},"PeriodicalIF":5.6,"publicationDate":"2024-12-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143535515","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"High Performance Householder QR Factorization on Emerging GPU Architectures Using Tensor Cores","authors":"Yuhan Leng;Gaoyuan Zou;Hansheng Wang;Panruo Wu;Shaoshuai Zhang","doi":"10.1109/TPDS.2024.3522776","DOIUrl":"https://doi.org/10.1109/TPDS.2024.3522776","url":null,"abstract":"Since 2017, NVIDIA GPUs have been equipped with specialized units known as Tensor Cores, which demonstrate remarkable efficiency in processing matrix multiplications (GEMMs). Beyond GEMMs, researchers have explored the potential applications of Tensor Cores in matrix factorization, such as QR factorization. However, the inside GEMMs in QR factorization are typically tall and skinny. Compared to compute-bound square GEMMs, these tall and skinny GEMMs are memory bound, leading to suboptimal performance on Tensor Cores. To solve this problem, we indicate the recursive QR factorization can convert the tall and skinny GEMMs to relatively square and large GEMMs, resulting in better performance on Tensor Cores. Besides, we extend the FP16 Tensor-Cores-based QR factorization to accommodate FP32 and FP64 on FP16 and INT8 Tensor Cores, respectively. Additionally, to address the issue of orthogonality loss in the preceding Tensor Cores-based QR factorization, we transition from the Gram-Schmidt to the Householder algorithm while preserving high performance. According to our experimental evaluation conducted on NVIDIA's A100 and GeForce RTX 3090 GPU, the precision levels of FP64, FP32, and FP16 are up to 6.22x, 8.67x, and 4.03x faster, respectively, than the current state-of-the-art implementations.","PeriodicalId":13257,"journal":{"name":"IEEE Transactions on Parallel and Distributed Systems","volume":"36 3","pages":"422-436"},"PeriodicalIF":5.6,"publicationDate":"2024-12-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143105824","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Integrated and Fungible Scheduling of Deep Learning Workloads Using Multi-Agent Reinforcement Learning","authors":"Jialun Li;Danyang Xiao;Diying Yang;Xuan Mo;Weigang Wu","doi":"10.1109/TPDS.2024.3522333","DOIUrl":"https://doi.org/10.1109/TPDS.2024.3522333","url":null,"abstract":"GPU clusters have been widely used to co-locate various deep learning (DL) workloads in a multi-tenant way. Although such resource sharing can significantly reduce training cost, resource contention and interference among co-located workloads make task scheduling very complex and challenging. To simplify the scheduling problem, existing algorithms usually divide the procedure of scheduling into two sub-tasks, i.e., task placement and resource allocation, and allocate resources according to pre-defined and fixed resource demands. However, such a paradigm significantly constrains the selection of potential scheduling solutions. In this article, we present MAIFS, a novel multi-agent reinforcement learning based scheduling algorithm that handles task placement and resource allocation integratedly, and allows fungible resource allocation based on resource sensitivity of DL workloads. The core of MAIFS lies in two mechanisms. The multi-agent attention mechanism is designed to learn and share inter-related resource state features observed from different agents, which enables agents to explore fungible resource allocation solutions. The dynamic coordination graph mechanism is designed for coordinating interactive task placement decisions of agents during integrated scheduling, so as to mitigate potential task conflicts. Simulated experiments using two large scale production DL workload traces and physical deployment experiments based on a Kubernetes based GPU cluster show that MAIFS can outperform state-of-the-art scheduling algorithms by up to 44% in terms of makespan and 46% in terms of job completion time (JCT).","PeriodicalId":13257,"journal":{"name":"IEEE Transactions on Parallel and Distributed Systems","volume":"36 3","pages":"391-406"},"PeriodicalIF":5.6,"publicationDate":"2024-12-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143105817","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"ViTeGNN: Towards Versatile Inference of Temporal Graph Neural Networks on FPGA","authors":"Hongkuan Zhou;Bingyi Zhang;Rajgopal Kannan;Carl Busart;Viktor K. Prasanna","doi":"10.1109/TPDS.2024.3521897","DOIUrl":"https://doi.org/10.1109/TPDS.2024.3521897","url":null,"abstract":"Temporal Graph Neural Networks (TGNNs) are powerful models to capture temporal, structural, and contextual information on temporal graphs, outperforming other methods in many high-impact downstream tasks. However, achieving high-performance TGNN inference in production environments is challenging because TGNN models suffer from high computation complexity and intrinsic temporal data dependency that hinders data parallelism. In addition, real-world TGNN applications have different latency and throughput requirements. This work presents ViTeGNN, a versatile TGNN inference solution for memory-based TGNNs on FPGAs. ViTeGNN performs algorithm-model-architecture co-design to meet the latency and throughput requirements of real-world TGNN applications. Besides the vanilla inference mode ViTeGNN-bal that updates embeddings for nodes interacting with others, we propose ViTeGNN-lat and ViTeGNN-thpt, optimized for latency and throughput. Our model optimizations include a lightweight method to compute attention scores and a related temporal neighbor pruning strategy to reduce computation and memory accesses. These are holistically coupled with key hardware optimizations that leverage the FPGA hardware. We propose a novel hardware module to execute the complex neighbor update process efficiently. To ensure similar accuracy vis-á-vis the original model, the simplified models are trained using the knowledge distillation technique. We propose a unified hardware design that supports all of these three inference modes without FPGA reconfiguration. Enabled by our flexible hardware architecture, we further propose ViTeGNN-auto, which automatically selects the best inference mode at runtime based on latency and throughput requirements, guided by our accurate performance model. We evaluate the performance of the proposed hardware accelerator on five real-world datasets. ViTeGNN-bal reduces the computation complexity by an average of 62% and memory accesses by an average of 36% with only 0.0042 accuracy loss. Compared with state-of-the-art implementations on CPU and GPU, our FPGA implementation achieves <inline-formula><tex-math>$53.9/26.0/16.1times$</tex-math></inline-formula> speedup and <inline-formula><tex-math>$8.2/4.0/2.5times$</tex-math></inline-formula> speedup for ViTeGNN-lat/-bal/-thpt, respectively.","PeriodicalId":13257,"journal":{"name":"IEEE Transactions on Parallel and Distributed Systems","volume":"36 3","pages":"502-519"},"PeriodicalIF":5.6,"publicationDate":"2024-12-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143105821","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Online Elastic Resource Provisioning With QoS Guarantee in Container-Based Cloud Computing","authors":"Shuaibing Lu;Ran Yan;Jie Wu;Jackson Yang;Xinyu Deng;Shen Wu;Zhi Cai;Juan Fang","doi":"10.1109/TPDS.2024.3522085","DOIUrl":"https://doi.org/10.1109/TPDS.2024.3522085","url":null,"abstract":"In cloud data centers, the exponential growth of data places increasing demands on computing, storage, and network resources, especially in multi-tenant environments. While this growth is crucial for ensuring Quality of Service (QoS), it also introduces challenges such as fluctuating resource requirements and static container configurations, which can lead to resource underutilization and high energy consumption. This article addresses online resource provisioning and efficient scheduling for multi-tenant environments, aiming to minimize energy consumption while balancing elasticity and QoS requirements. To address this, we propose a novel optimization framework that reformulates the resource provisioning problem into a more manageable form. By reducing the original multi-constraint optimization to a container placement problem, we apply the interior-point barrier method to simplify the optimization, integrating constraints directly into the objective function for efficient computation. We also introduce elasticity as a key parameter to balance energy consumption with autonomous resource scaling, ensuring that resource consolidation does not compromise system flexibility. The proposed Energy-Efficient and Elastic Resource Provisioning (EEP) framework comprises three main modules: a distributed resource management module that employs vertical partitioning and dynamic leader election for adaptive resource allocation; a prediction module using <inline-formula><tex-math>$omega$</tex-math></inline-formula>-step prediction for accurate resource demand forecasting; and an elastic scheduling module that dynamically adjusts to tenant scaling needs, optimizing resource allocation and minimizing energy consumption. Extensive experiments across diverse cloud scenarios demonstrate that the EEP framework significantly improves energy efficiency and resource utilization compared to established baselines, supporting sustainable cloud management practices.","PeriodicalId":13257,"journal":{"name":"IEEE Transactions on Parallel and Distributed Systems","volume":"36 3","pages":"361-376"},"PeriodicalIF":5.6,"publicationDate":"2024-12-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143105920","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}