{"title":"A Survey on the Scheduling of DL and LLM Training Jobs in GPU Clusters","authors":"Tianhao Fu;Zehua Yang;Zhisheng Ye;Chenxiang Ma;Yang Han;Yingwei Luo;Xiaolin Wang;Zhenlin Wang","doi":"10.23919/cje.2024.00.070","DOIUrl":"https://doi.org/10.23919/cje.2024.00.070","url":null,"abstract":"As deep learning (DL) technology rapidly advances in areas such as computer vision, natural language processing, and more recently, large language models (LLMs), the demand for computing resources has increasingly grown. In particular, scheduling deep learning training (DLT) jobs on graphics processing unit (GPU) clusters has become crucial for the effective utilization of computing resources and the acceleration of model training processes. However, resource management and scheduling in GPU clusters face challenges related to computing and communication, including job sharing, interference, elastic scheduling, heterogeneous resources, and fairness. This survey investigates the scheduling issues of DLT jobs in GPU clusters, focusing on scheduling optimizations at the job characteristic and cluster resource levels. We analyze the structure and training computing characteristics of traditional DL models and LLMs, as well as their requirements for iterative computation, communication, GPU sharing, and resource elasticity. In addition, we compare the main contributions of this survey with related reviews and discuss research directions, including scheduling based on job characteristics and optimization strategies for cluster resources. 
This survey aims to provide researchers and practitioners with a comprehensive understanding of DLT job scheduling in GPU clusters and to point out directions for future research.","PeriodicalId":50701,"journal":{"name":"Chinese Journal of Electronics","volume":"34 3","pages":"881-905"},"PeriodicalIF":1.6,"publicationDate":"2025-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11060018","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144519310","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Miniaturized Reconfigurable Dual-Band Bandstop Filter Utilizing a Novel Hybrid Resonator for Enhanced Stopband Suppression","authors":"Lin Gu;Yuandan Dong","doi":"10.23919/cje.2024.00.224","DOIUrl":"https://doi.org/10.23919/cje.2024.00.224","url":null,"abstract":"A novel varactor-loaded microstrip resonator with a hybrid structure of microstrip and parallel-coupled lines is proposed. Stopbands are constructed using the first and second modes of the resonator, respectively. The independent control of the two modes of this resonator was theoretically analyzed and validated. Parallel-coupled lines were introduced to enhance stopband attenuation, mitigating the impact of low-<tex>$Q$</tex> varactor diodes to some extent. Miniaturization is also achieved through the introduction of two sections of parallel-coupled lines. A 2nd-order dual-band tunable bandstop filter was designed, fabricated, and measured, with the measured results revealing high attenuation levels of 36.4 dB and 27.85 dB for the two stopbands, respectively, in addition to a compact size of <tex>$0.23\\lambda_{g}\\times 0.07\\lambda_{g}$</tex> (where <tex>$\\lambda_{g}$</tex> is the guided wavelength in the substrate at 2.2 GHz).","PeriodicalId":50701,"journal":{"name":"Chinese Journal of Electronics","volume":"34 3","pages":"766-773"},"PeriodicalIF":1.6,"publicationDate":"2025-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11060046","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144519311","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A High-Quality and Efficient Bus-Aware Global Router","authors":"Genggeng Liu;Ling Wei;Yantao Yu;Ning Xu","doi":"10.23919/cje.2023.00.061","DOIUrl":"https://doi.org/10.23919/cje.2023.00.061","url":null,"abstract":"As advanced technology nodes enter the nanometer era, the complexity of integrated circuit design is increasing, and the proportion of buses among the nets is also increasing. Bus routing has become a key factor affecting the performance of the chip. In addition, existing research does not distinguish between bus and non-bus in the complete global routing process, which directly leads to larger bus deviation and degraded chip performance. To solve these problems, we propose a high-quality and efficient bus-aware global router, which includes the following key strategies: by introducing the routing density graph, we propose a routing model that simultaneously considers the routability of non-bus and the deviation value of bus; a dynamic routing resource adjustment algorithm is proposed to optimize the bus deviation and wirelength simultaneously, which effectively reduces the bus deviation; a deviation-aware layer assignment algorithm is proposed to significantly reduce the bus deviation of the 3D routing solution; and a depth-first search (DFS)-based algorithm is proposed to obtain multiple routing solutions, from which the routing result with the lowest deviation is selected. 
Experimental results show that the proposed algorithms can effectively reduce bus deviation compared with the existing algorithms, so as to obtain high-quality 2D and 3D routing solutions considering bus deviation.","PeriodicalId":50701,"journal":{"name":"Chinese Journal of Electronics","volume":"34 2","pages":"444-456"},"PeriodicalIF":1.6,"publicationDate":"2025-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10982079","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143900590","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An 8–26 GHz Passive Mixer with Excellent Port Matching Utilizing Marchand Balun and Capacitor Compensation","authors":"Yi Zhang;Yuhang Zhuang;Hu Zhang;Lei Yang;Jing Wang;Changchun Zhang;Yufeng Guo","doi":"10.23919/cje.2023.00.178","DOIUrl":"https://doi.org/10.23919/cje.2023.00.178","url":null,"abstract":"In this study, a broadband monolithic microwave integrated circuit (MMIC) double-balanced mixer designed for operation within the frequency range of 8–26 GHz is presented. The design is implemented using a 0.15 μm GaAs process. Traditional Marchand baluns, when applied to wideband mixers, face challenges in simultaneously achieving broad bandwidth and good port matching characteristics. To address this issue, we employ a spiral Marchand balun with a compensation capacitor. This innovative approach not only maintains the mixer's wide bandwidth but also enhances the matching between the local oscillator (LO) and radio frequency (RF) ports. Additionally, it significantly simplifies the complexity of designing the matching circuit. The optimization principle of the compensation capacitor is elaborated in detail within this paper. Experimental results demonstrate that, with an LO power of 14 dBm, the conversion loss remains below 8.5 dB, while the voltage standing wave ratio (VSWR) of the LO and IF ports is less than 2 and the VSWR of the RF port is below 2.4. 
In comparison with existing literature, our designed mixer exhibits a broader bandwidth and lower loss.","PeriodicalId":50701,"journal":{"name":"Chinese Journal of Electronics","volume":"34 2","pages":"422-428"},"PeriodicalIF":1.6,"publicationDate":"2025-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10982081","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143900628","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Novel Subspace-Based GMM Clustering Ensemble Algorithm for High-Dimensional Data","authors":"Yulin He;Yingting He;Zhaowu Zhan;Fournier-Viger Philippe;Joshua Zhexue Huang","doi":"10.23919/cje.2023.00.153","DOIUrl":"https://doi.org/10.23919/cje.2023.00.153","url":null,"abstract":"The Gaussian mixture model (GMM) is a classical probabilistic representation model widely used in unsupervised learning. GMM performs poorly on high-dimensional data (HDD) due to the requirement of estimating a large number of parameters with relatively few observations. To address this, the paper proposes a novel subspace-based GMM clustering ensemble (SubGMM-CE) algorithm tailored for HDD. The proposed SubGMM-CE algorithm comprises three key components. A series of low-dimensional subspaces are dynamically determined, considering the optimal number of GMM components. The GMM-based clustering algorithm is applied to each subspace to obtain a series of heterogeneous GMM models. These GMM base clustering results are merged using the newly-designed relabeling strategy based on the average shared affiliation probability, generating the final clustering result for high-dimensional unlabeled data. An exhaustive experimental evaluation validates the feasibility, rationality, effectiveness, and robustness to noise of the SubGMM-CE algorithm. Results show that SubGMM-CE achieves higher stability and more accurate clustering results, outperforming nine state-of-the-art clustering algorithms in normalized mutual information, clustering accuracy, and adjusted rand index scores. 
This demonstrates the viability of the SubGMM-CE algorithm in addressing HDD clustering challenges.","PeriodicalId":50701,"journal":{"name":"Chinese Journal of Electronics","volume":"34 2","pages":"612-629"},"PeriodicalIF":1.6,"publicationDate":"2025-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10982075","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143900635","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multi-Use Learning Instance for Optimized Image Retrieval","authors":"Hao Wu;Junqi Guo;Rongfang Bie","doi":"10.23919/cje.2023.00.419","DOIUrl":"https://doi.org/10.23919/cje.2023.00.419","url":null,"abstract":"Dear Editor, Accurate retrieval of target images is increasingly important in the era of digital media and big data. Although many classic methods have been proposed, the overwhelming majority of them are still improvements built on machine learning strategies. In recent years, deep learning models (such as convolutional neural networks [1]–[3], restricted Boltzmann machines [4], [5], autoencoders [6]–[8], and sparse coding [9], [10]) have used more complicated networks to extract essential features more completely. Moreover, their overwhelming advantages in experimental results have allowed them to replace traditional machine learning methods within a short time. On the basis of classic models, many innovative models [11], [12] have been proposed, demonstrating better practical application value. Although deep learning models have admittedly brought revolutionary changes, their huge consumption of computing resources is a burden that cannot be underestimated. 
Even when some methods reduce the number of learning instances, this comes at the cost of reduced accuracy in most cases, and some models have the obvious limitation of being effective only for certain categories.","PeriodicalId":50701,"journal":{"name":"Chinese Journal of Electronics","volume":"34 3","pages":"1002-1005"},"PeriodicalIF":1.6,"publicationDate":"2025-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11060047","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144519339","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Meet-in-the-Middle Key Recovery Attacks on Rocca Using Differential and Integral Properties","authors":"Chan Song;Wenling Wu;Lei Zhang","doi":"10.23919/cje.2024.00.032","DOIUrl":"https://doi.org/10.23919/cje.2024.00.032","url":null,"abstract":"Rocca is an Advanced Encryption Standard (AES)-based authenticated encryption scheme proposed in 2021 for beyond the fifth/sixth generation systems. The latest version of Rocca injects the key into the initialization, which makes the key recovery attack on its original version no longer valid here. In this paper, we propose new key recovery attacks based on the idea of meet-in-the-middle. Benefiting from the design of the round function, we can treat each 128-bit block as a unit and then write the expressions of the internal states in terms of the initial state and the final state, respectively. Among them, we focus on the state blocks with relatively concise expressions, which have poor diffusion, and then explore their differential and integral properties. Next, in the key recovery attacks, we first guess a part of the key to calculate the specific values of state blocks at the middle matching positions, and then use the differential or integral properties on these blocks to validate the key guesses. Uniquely, in our integral cryptanalysis, we impose appropriate conditions to constrain the propagation of nonce, which corresponds to the weak keys. 
Consequently, we present meet-in-the-middle key recovery attacks on 9-round and 10-round Rocca, as well as a weak key recovery attack on 11-round Rocca based on integral properties, with four sets of weak keys with 2<sup>224</sup> keys each.","PeriodicalId":50701,"journal":{"name":"Chinese Journal of Electronics","volume":"34 3","pages":"828-838"},"PeriodicalIF":1.6,"publicationDate":"2025-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11060049","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144519341","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Cooperative Self-Learning: A Framework for Few-Shot Jamming Identification","authors":"Yuxin Shi;Xinjin Lu;Yifu Sun;Kang An;Yusheng Li","doi":"10.23919/cje.2023.00.229","DOIUrl":"https://doi.org/10.23919/cje.2023.00.229","url":null,"abstract":"Jamming identification is the key objective behind effective anti-jamming methods. Due to the requirement of low complexity and the limited number of labeled shots available for real jamming identification, it is highly challenging to identify jamming patterns with high accuracy. To this end, we first propose a general framework for cooperative jamming identification among multiple nodes. We further propose a novel fusion center (FC) aided self-learning scheme, which uses the guidance of the FC to improve the effectiveness of the identification. Simulation results show that the proposed cooperative jamming identification framework can significantly enhance the average accuracy with low complexity. It is also demonstrated that the proposed FC aided self-learning scheme achieves superior average accuracy compared with other identification schemes and is especially effective in scenarios with few labeled jamming shots.","PeriodicalId":50701,"journal":{"name":"Chinese Journal of Electronics","volume":"34 2","pages":"722-729"},"PeriodicalIF":1.6,"publicationDate":"2025-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10982100","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143900615","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Self-Decoupled Square Patch Antenna Arrays by Exciting and Using Mixed Electric/Magnetic Coupling Between Adjacent Radiators","authors":"Qianwen Liu;Lei Zhu;Wenjun Lu","doi":"10.23919/cje.2023.00.222","DOIUrl":"https://doi.org/10.23919/cje.2023.00.222","url":null,"abstract":"This article presents and develops a simple decoupling method for planar square patch antenna arrays by virtue of the mixed electric and magnetic coupling property. Since the resonant modes of TM<inf>10</inf> and TM<inf>01</inf> are a pair of degenerate modes in the square patch radiator which are intrinsically orthogonal, a superposed mode of them can be generated to possess consistent field distributions along all the four sides of the patch by adjusting the feeding position. By employing such a superposed mode, the mutual coupling between two horizontally adjacent patch elements becomes identical to that between two vertically adjacent ones, indicating that the complex 2-D decoupling problem in a large-scale patch antenna array can be effectively simplified to a 1-D one. Subsequently, metallic pins and a connecting strip are properly loaded in each square patch resonator, such that appropriate electric and magnetic coupling strengths can be readily achieved and thus the mutual coupling can be greatly reduced. A 1 × 2 antenna array with an edge-to-edge separation of 1 mm, which corresponds to 0.0117λ<inf>o</inf>, is first discussed, simulated, and fabricated. The measured results show that the isolation can be greatly improved from 4 dB to 17 dB across the entire passband. 
Finally, 1 × 3, 2 × 2, and 4 × 4 antenna array prototypes are constructed and studied to verify the expansibility and feasibility of the proposed decoupling method for both linear and 2-D antenna arrays.","PeriodicalId":50701,"journal":{"name":"Chinese Journal of Electronics","volume":"34 2","pages":"483-494"},"PeriodicalIF":1.6,"publicationDate":"2025-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10982048","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143900627","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multi-Level Queue Security in Switches: Adversarial Inference and Differential Privacy Protection in SDN","authors":"Xuewen Dong;Lingtao Xue;Tao Zhang;Zhichao You;Guangxia Li;Yulong Shen","doi":"10.23919/cje.2022.00.373","DOIUrl":"https://doi.org/10.23919/cje.2022.00.373","url":null,"abstract":"Network switches are critical elements in any network infrastructure for traffic forwarding and packet priority scheduling, which naturally become a target of network adversaries. Most attacks on switches focus on purposely forwarding packets to the wrong network nodes or generating flooding. However, potential privacy leakage in the multi-level priority queue of switches has not been considered. In this paper, we are the first to discuss the multi-level priority queue security and privacy protection problem in switches. Observing that packet leaving order from a queue is strongly correlated to its priority, we introduce a policy inference attack that exploits specific priority-mapping rules between different packet priorities and priority sub-queues in the multi-level queues. Next, based on the policy inference result and the built-in traffic shaping strategy, a capacity inference attack with the error probability decaying exponentially in the number of attacks is presented. In addition, we propose a differentially private priority scheduling mechanism to defend against the above attacks in OpenFlow switches. Theoretical analysis proves that our proposed mechanism can satisfy ε-differential privacy. 
Extensive evaluation results show that our mechanism defends well against inference attacks and achieves up to 2.7 times the priority processing efficiency of a random priority scheduling strategy.","PeriodicalId":50701,"journal":{"name":"Chinese Journal of Electronics","volume":"34 2","pages":"533-547"},"PeriodicalIF":1.6,"publicationDate":"2025-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10982054","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143900643","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}