{"title":"Clustreams","authors":"Roy Friedman, Or Goaz, Ori Rottenstreich","doi":"10.1145/3482898.3483356","DOIUrl":null,"url":null,"abstract":"Clustering is a basic machine learning task. In this task, a stream of input items needs to be grouped into clusters, such that all items classified into the same cluster are closer to each other than to items classified into other clusters. Each cluster is centered around a centroid point, which is either given as a parameter or must be learned during the process in the case of unsupervised online learning. This work studies the ability to perform clustering, e.g., for classifying network traffic, in programmable switches. Conducting such classification by the switches through which the traffic flows is potentially the most efficient approach. To that end, we develop Clustreams, a novel in-network clustering system designed to handle clustering in the data path. At the core of Clustreams is a novel clustering algorithm that relies heavily on TCAM (Ternary Content Addressable Memory) match-action capabilities. This algorithm is realized for the Nvidia Spectrum-3 switch, and is limited to classification when the centroid points are known a priori. The work includes accuracy measurements for the algorithms, as well as run-time performance measurements and analysis of the clustering algorithm on a Spectrum-3 switch. As shown in the measurements, Clustreams obtains very high accuracy without any noticeable run-time impact on the switch's performance.","PeriodicalId":161157,"journal":{"name":"Proceedings of the ACM SIGCOMM Symposium on SDN Research (SOSR)","volume":"63 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-10-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"7","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the ACM SIGCOMM Symposium on SDN Research (SOSR)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3482898.3483356","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Clustering is a basic machine learning task. In this task, a stream of input items needs to be grouped into clusters, such that all items classified into the same cluster are closer to each other than to items classified into other clusters. Each cluster is centered around a centroid point, which is either given as a parameter or must be learned during the process in the case of unsupervised online learning. This work studies the ability to perform clustering, e.g., for classifying network traffic, in programmable switches. Conducting such classification by the switches through which the traffic flows is potentially the most efficient approach. To that end, we develop Clustreams, a novel in-network clustering system designed to handle clustering in the data path. At the core of Clustreams is a novel clustering algorithm that relies heavily on TCAM (Ternary Content Addressable Memory) match-action capabilities. This algorithm is realized for the Nvidia Spectrum-3 switch, and is limited to classification when the centroid points are known a priori. The work includes accuracy measurements for the algorithms, as well as run-time performance measurements and analysis of the clustering algorithm on a Spectrum-3 switch. As shown in the measurements, Clustreams obtains very high accuracy without any noticeable run-time impact on the switch's performance.
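
To make the task concrete, the following is a minimal software sketch of classification against centroids that are known in advance: each incoming item is assigned to the cluster whose centroid is nearest. It is not the Clustreams algorithm, which the abstract describes as relying on TCAM match-action rules in the switch data path. The squared Euclidean distance, the two-dimensional features, and the names `squared_distance`, `classify`, `centroids`, and `stream` are all assumptions made for illustration.

```python
from typing import Sequence

Point = Sequence[float]


def squared_distance(a: Point, b: Point) -> float:
    """Squared Euclidean distance; the metric is an assumption, not taken from the paper."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def classify(item: Point, centroids: Sequence[Point]) -> int:
    """Return the index of the centroid closest to `item`.

    Ties are resolved in favor of the lowest index.
    """
    if not centroids:
        raise ValueError("at least one centroid is required")
    return min(
        range(len(centroids)),
        key=lambda i: squared_distance(item, centroids[i]),
    )


# Hypothetical centroids, fixed in advance (the "known a priori" setting).
centroids = [(0.0, 0.0), (10.0, 10.0), (0.0, 10.0)]

# Hypothetical stream of items, e.g. two scaled per-packet features.
stream = [(1.0, 2.0), (9.0, 8.5), (1.5, 9.0)]

labels = [classify(item, centroids) for item in stream]
print(labels)  # [0, 1, 2]
```

A general-purpose CPU can evaluate these distances directly per item. A match-action pipeline would instead have to express the same nearest-centroid decision as table lookups, which is the part this sketch does not attempt to model.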