Maria Alvarez Roa;Catalina Stan;Sebastian Verschoor;Idelfonso Tafur Monroy;Simon Rommel
{"title":"Decentralized key distribution versus on-demand relaying for QKD networks","authors":"Maria Alvarez Roa;Catalina Stan;Sebastian Verschoor;Idelfonso Tafur Monroy;Simon Rommel","doi":"10.1364/JOCN.547793","DOIUrl":"https://doi.org/10.1364/JOCN.547793","url":null,"abstract":"Quantum key distribution (QKD) allows the distribution of secret keys for quantum-secure communication between two distant parties, vital in the quantum computing era in order to protect against quantum-enabled attackers. However, overcoming rate-distance limits in QKD and the establishment of quantum key distribution networks necessitate key relaying over trusted nodes. This process may be resource-intensive, consuming a substantial share of the scarce QKD key material to establish end-to-end secret keys. Hence, an efficient scheme for key relaying and the establishment of end-to-end key pools is essential for practical and extended quantum-secured networking. In this paper, we propose and compare two protocols for managing, storing, and distributing secret key material in QKD networks, addressing challenges such as the success rate of key requests, key consumption, and overhead resulting from relaying. We present an innovative, fully decentralized key distribution strategy as an alternative to the traditional hop-by-hop relaying via trusted nodes, where three experiments are considered to evaluate performance metrics under varying key demand. Our results show that the decentralized pre-flooding approach achieves higher success rates as application demands increase. 
This analysis highlights the strengths of each approach in enhancing QKD network performance, offering valuable insights for developing robust key distribution strategies in different scenarios.","PeriodicalId":50103,"journal":{"name":"Journal of Optical Communications and Networking","volume":"17 8","pages":"732-742"},"PeriodicalIF":4.0,"publicationDate":"2025-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144704948","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Che-Yu Liu;Xiaoliang Chen;Roberto Proietti;Zuqing Zhu;S. J. Ben Yoo
{"title":"Deep reinforcement learning-aided multi-step job scheduling in optical data center networks","authors":"Che-Yu Liu;Xiaoliang Chen;Roberto Proietti;Zuqing Zhu;S. J. Ben Yoo","doi":"10.1364/JOCN.562531","DOIUrl":"https://doi.org/10.1364/JOCN.562531","url":null,"abstract":"Orchestrating job scheduling and topology reconfiguration in optical data center networks (ODCNs) is essential for meeting the intensive communication demand of novel applications, such as distributed machine learning (ML) workloads. However, this task involves joint optimization of multi-dimensional resources that can barely be effectively addressed by simple rule-based policies. In this paper, we leverage the powerful state representation and self-learning capabilities from deep reinforcement learning (DRL) and propose a multi-step job schedule algorithm for ODCNs. Our design decomposes a job request into an ordered sequence of virtual machines (VMs) and the related bandwidth demand in between, and then makes a DRL agent learn how to place the VMs sequentially. To do so, we feed the agent with the global bandwidth and IT resource utilization state embedded with the previous VM allocation decisions in each step and reward the agent with both team and individual incentives. The team reward encourages the agent to jointly optimize the VM placement in multiple steps to pursue successful provisioning of the job request, while the individual reward favors advantageous local placement decisions, i.e., to prevent effective policies being overwhelmed by a few subpar decisions. We also introduce a penalty on reconfiguration to balance between performance gains and reconfiguration overheads. 
Simulation results under various ODCN configurations and job loads show our proposal outperforms the existing heuristic solutions and reduces the job-blocking probability and reconfiguration frequency by at least 7.35× and 4.59×, respectively.","PeriodicalId":50103,"journal":{"name":"Journal of Optical Communications and Networking","volume":"17 9","pages":"D96-D105"},"PeriodicalIF":4.0,"publicationDate":"2025-07-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144695545","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Jan De Neve;Ziyue Zhang;Wouter Tavernier;Didier Colle;Mario Pickavet
{"title":"Edge coloring bipartite multigraphs for dynamically configuring optical switches","authors":"Jan De Neve;Ziyue Zhang;Wouter Tavernier;Didier Colle;Mario Pickavet","doi":"10.1364/JOCN.559454","DOIUrl":"https://doi.org/10.1364/JOCN.559454","url":null,"abstract":"Multi-chip graphics processing units (GPUs) interconnected by a photonic network-on-wafer are a promising technology to further increase the performance of GPUs. The network control algorithm managing dynamic bandwidth allocation (DBA) in this network needs to execute very frequently so that resources can be optimally used. This algorithm relies on edge coloring bipartite multigraphs to translate inter-chip bandwidth demands into updated routing tables for the GPU chips and optical switches in the network. In this work, we design fast edge coloring algorithms, both approximate and exact, for bipartite multigraphs. These algorithms are tailored to the high edge multiplicities of the multigraphs in this research. The runtimes are optimized by using efficient data structures and introducing pre- and post-processing. These new algorithms are up to 20× faster than the state-of-the-art baseline algorithm. 
New simulations show that, with such low reconfiguration periods, DBA has the potential to double the performance of a high-traffic GPU workload compared to a static network with the same bandwidth.","PeriodicalId":50103,"journal":{"name":"Journal of Optical Communications and Networking","volume":"17 8","pages":"720-731"},"PeriodicalIF":4.0,"publicationDate":"2025-07-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144687661","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Explainable AI-assisted low-latency haptic feedback prediction for human-to-machine applications over passive optical networks","authors":"Yuxiao Wang;Sourav Mondal;Ye Pu;Elaine Wong","doi":"10.1364/JOCN.560757","DOIUrl":"https://doi.org/10.1364/JOCN.560757","url":null,"abstract":"Human-to-machine applications, such as robotic teleoperation, require ultra-low latency for real-time interactions. In passive optical networks (PONs), edge AI servers at the optical line terminal can predict haptic feedback in advance based on control signals, thereby enhancing the immersive experience. To further reduce latency while preserving predictive performance, this paper proposes an eXplainable AI-assisted low-latency haptic feedback prediction framework, using XAI for feature selection to reduce inference time. In a 50G-PON network, the framework achieves the lowest round-trip delay and packet delay variation among evaluated approaches. Extensive simulations show a 64.9% reduction in inference time, 15.5% in round-trip delay, and 15.1% in delay variation under a typical traffic load of 0.5, demonstrating its effectiveness for next-generation AI-assisted optical networks.","PeriodicalId":50103,"journal":{"name":"Journal of Optical Communications and Networking","volume":"17 9","pages":"D83-D95"},"PeriodicalIF":4.0,"publicationDate":"2025-07-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144680824","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Saturn: a chiplet-based optical network architecture for breaking the memory wall","authors":"Lijing Zhu;Huaxi Gu;Kun Wang;Guangming Zhang","doi":"10.1364/JOCN.559347","DOIUrl":"https://doi.org/10.1364/JOCN.559347","url":null,"abstract":"Given the increasingly computing-intensive and data-intensive workloads of high-performance computing applications, the need for more cores and larger storage capacity is expanding. While computational power is rapidly increasing, data movement capability among cores and memory modules has not stepped forward substantially. Low energy efficiency and parallelism of data movement have become a bottleneck. Optical interconnects with better bandwidth and power performance are a promising method. In addition, chiplet technology significantly amplifies the benefits of optical interconnects. However, existing optical networks do not take the modularity and flexible assembly of chiplets into account, nor do they take advantage of new fabrication and packaging. In this paper, we propose Saturn, an optical interconnection network architecture, including two parts: a core-to-memory network (CTMN) and a core-to-core network. In the CTMN, the integration of optical broadband micro-ring technology and co-designed wavelength assignment enables memory access to be completed in a single hop, providing highly parallel bandwidth. The serpentine layout employed in the CTMN eliminates waveguide crossings, which in turn substantially reduces the insertion loss and energy consumption. 
Analytical simulations have validated the effectiveness and efficiency of Saturn, showing that it can improve memory access throughput performance while achieving energy reduction compared with a traditional network.","PeriodicalId":50103,"journal":{"name":"Journal of Optical Communications and Networking","volume":"17 8","pages":"713-719"},"PeriodicalIF":4.0,"publicationDate":"2025-07-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144671222","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Interpretable optical network fault detection and localization with multi-task graph prototype learning","authors":"Xiaokang Chen;Xiaoliang Chen;Zuqing Zhu","doi":"10.1364/JOCN.562633","DOIUrl":"https://doi.org/10.1364/JOCN.562633","url":null,"abstract":"The recent advances in machine learning (ML) have promoted data-driven automated fault management in optical networks. However, existing ML-aided fault management approaches mainly rely on black-box models that lack intrinsic interpretability to secure their trustworthiness in mission-critical operation scenarios. In this paper, we propose an interpretable optical network fault detection and localization design leveraging multi-task graph prototype learning (MT-GPL). MT-GPL models an optical network and the optical performance monitoring data collected in it as graph-structured data and makes use of graph neural networks to learn graph embeddings that capture both topological correlations (for fault localization) and fault discriminative patterns (for root cause analysis). MT-GPL interprets its reasoning by (i) introducing a prototype layer that learns physics-aligned prototypes indicative of each fault class using the Monte Carlo tree search method and (ii) performing predictions based on the similarities between the embedding of an input graph and the learned prototypes. To enhance the scalability and interpretability of MT-GPL, we develop a multi-task architecture that performs concurrent fault localization and reasoning with node-level and device-level prototype learning and fault predictions. 
Performance evaluations show that our proposal achieves >6.5% higher prediction accuracy than the multi-layer perceptron model, while the visualizations of its reasoning processes verify the validity of its interpretability.","PeriodicalId":50103,"journal":{"name":"Journal of Optical Communications and Networking","volume":"17 9","pages":"D73-D82"},"PeriodicalIF":4.0,"publicationDate":"2025-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144657451","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Post-disaster cloud-service restoration through datacenter-carrier cooperation","authors":"Subhadeep Sahoo;Sugang Xu;Sifat Ferdousi;Yusuke Hirota;Massimo Tornatore;Yoshinari Awaji;Biswanath Mukherjee","doi":"10.1364/JOCN.561579","DOIUrl":"https://doi.org/10.1364/JOCN.561579","url":null,"abstract":"In network-cloud ecosystems, large-scale failures affecting network carrier and datacenter (DC) infrastructures can severely disrupt cloud services. Post-disaster cloud service restoration requires cooperation among carriers and DC providers (DCPs) to minimize downtime. Such cooperation is challenging due to proprietary and regulatory policies, which limit access to confidential information (detailed topology, resource availability, etc.). Accordingly, we introduce a third-party entity, a provider-neutral exchange, which enables cooperation by sharing abstracted information. We formulate an optimization problem for DCP–carrier cooperation to maximize service restoration while minimizing restoration time and cost. We propose a scalable heuristic, demonstrating significant improvement in restoration efficiency with different topologies and failure scenarios.","PeriodicalId":50103,"journal":{"name":"Journal of Optical Communications and Networking","volume":"17 8","pages":"700-712"},"PeriodicalIF":4.0,"publicationDate":"2025-07-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144646449","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multi-entity cooperation platform facilitating network-cloud recovery","authors":"Sugang Xu;Subhadeep Sahoo;Sifat Ferdousi;Masaki Shiraiwa;Yusuke Hirota;Massimo Tornatore;Yoshinari Awaji;Biswanath Mukherjee","doi":"10.1364/JOCN.560240","DOIUrl":"https://doi.org/10.1364/JOCN.560240","url":null,"abstract":"Cooperation among telecom carriers and datacenter providers (DCPs) is essential to ensure the resiliency of network-cloud ecosystems. To enable efficient cooperative recovery in case of traffic congestion or network failures, we introduce a novel, to our knowledge, multi-entity cooperation platform (MCP) for implementing cooperative recovery planning. The MCP is built over distributed ledger technology (DLT), which ensures decentralized and tamper-proof information exchange among stakeholders to achieve open and fair cooperation. We experimentally demonstrate a proof-of-concept DLT-based MCP on a testbed. We showcase a DCP–carrier cooperative planning process and the corresponding recovery in the data-plane, showing the possibility of multi-entity cooperation for quick recovery of network-cloud ecosystems.","PeriodicalId":50103,"journal":{"name":"Journal of Optical Communications and Networking","volume":"17 9","pages":"D53-D72"},"PeriodicalIF":4.0,"publicationDate":"2025-07-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144641027","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Bang Yang;Jianwei Tang;Huiyang Yu;Yaguang Hao;Shuang Gao;Linsheng Fan;Yong Yao;Junpeng Liang;Jinlong Wei;Yanfu Yang
{"title":"Low-complexity SOP-based vibration broadband sensing and efficient recognition for stable IM/DD optical interconnects in data centers","authors":"Bang Yang;Jianwei Tang;Huiyang Yu;Yaguang Hao;Shuang Gao;Linsheng Fan;Yong Yao;Junpeng Liang;Jinlong Wei;Yanfu Yang","doi":"10.1364/JOCN.559810","DOIUrl":"https://doi.org/10.1364/JOCN.559810","url":null,"abstract":"With the rapid advancement of artificial intelligence (AI) technologies, the stability of optical interconnects in data centers has become increasingly important. Vibration sensing integrated in optical interconnect systems is conducive to identifying external disturbances in optical interconnects and achieving intelligent operation and maintenance. This paper proposes an easy-integration vibration-sensing scheme based on the state of polarization (SOP) of the fiber link. This scheme combines photonic technology with low-complexity digital signal processing (DSP) to detect link vibrations, ensuring full compatibility with intensity-modulation direct-detection (IM/DD) optical interconnect systems while minimizing additional complexity. Experiments show that our proposed scheme effectively detects SOP variations across a wide frequency range (0.5 Hz to 159 kHz). Based on the sensing system, a recognition scheme leveraging the Gramian angular field analysis and convolutional neural network (CNN) is proposed to recognize four types of vibration events simulated by a robotic arm, achieving a classification accuracy of 98%. Furthermore, experimental results confirm that the sensing system can detect SOP variations even under conditions of extremely low received optical power (ROP), where the communication system becomes inoperative. 
The proposed scheme enables robust event detection with minimal hardware overhead, which is suitable for real-world deployment in pluggable optical modules.","PeriodicalId":50103,"journal":{"name":"Journal of Optical Communications and Networking","volume":"17 8","pages":"692-699"},"PeriodicalIF":4.0,"publicationDate":"2025-07-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144634781","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Performance investigation on disaggregated artificial intelligence data centers beyond rack scale with optical switching","authors":"Fulong Yan;Hanting Huang;Yanxian Bi;Peizhao Li;Zhiwen Xue;Chao Li;Zichen Liu;Zhixue He;Zihao Li;QiDong Cao","doi":"10.1364/JOCN.559613","DOIUrl":"https://doi.org/10.1364/JOCN.559613","url":null,"abstract":"Accompanying the ever-increasing scale of big data applications, artificial intelligence data centers are facing the issue of resource fragments, resulting in a low network resource utilization ratio. Disaggregating the network resources is an efficient solution to improve the network resource utilization ratio by allocating the required amount of resources. In this paper, we focus on the problem of CPU and GPU resource disaggregation for an artificial intelligence data center. We carry out investigations for data center disaggregation exploiting optical switching. The results show that PCIe over optical (PO) guarantees 3 µs latency with 62 m of fiber. Compared with the PCIe with Ethernet switch (PE) solution, the PO scheme saves 48.3% completion time for the backprop application. Moreover, we compare the cost and power consumption of a data center architecture that scales out as the square of the port count of an optical packet switch (OPSquare) employing a PO scheme with respect to variants of OPSquare and Leaf-Spine under different network scales and interface bandwidths. 
Results show that the optical network architecture with the PO scheme saves 31.6% in cost and 12% in power consumption, respectively, compared with the Leaf-Spine with Ethernet solution.","PeriodicalId":50103,"journal":{"name":"Journal of Optical Communications and Networking","volume":"17 9","pages":"D43-D52"},"PeriodicalIF":4.0,"publicationDate":"2025-07-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144623872","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}