Performance Impact of NVMe-Over-TCP on HDFS Workloads
Authors: Nikita Sharma, Ruihao Li, Qinzhe Wu, L. John
Published in: 2022 IEEE/ACM 15th International Conference on Utility and Cloud Computing (UCC), 2022-12-01
DOI: 10.1109/UCC56403.2022.00059 (https://doi.org/10.1109/UCC56403.2022.00059)
Storage is one of the most important components in datacenters. As data volumes rise and service scale grows, some workloads, such as databases, demand increasing amounts of storage. While a single server can host only a limited number of disks, distributed file systems (e.g., the Hadoop Distributed File System, referred to as HDFS) enable access to disks mounted on other servers in the cluster, satisfying the storage requirements. On the other hand, NVMe-over-Fabrics protocols (e.g., NVMe-over-TCP) have been released as a device-level solution that provides access to remote NVMe disks. Therefore, applications built on top of HDFS have at least two ways to use the storage resources distributed across a datacenter. A concern is whether NVMe-over-TCP will hurt performance. The evaluation in this paper reveals that the performance degradation NVMe-over-TCP causes on HDFS-based workloads is limited, suggesting that NVMe-over-TCP is a performant and economical solution in datacenter design for workloads that need a lot of storage (such as database applications).