{"title":"A Novel Big Data Processing Approach to Feature Extraction for Electrical Discharge Machining based on Container Technology","authors":"Denata Rizky Alimadji, Min-Hsiung Hung, Yu-Chuan Lin, Benny Suryajaya, Chao-Chun Chen","doi":"10.1109/SNPD51163.2021.9704989","journal":{"name":"2021 IEEE/ACIS 22nd International Conference on Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing (SNPD)"},"publicationDate":"2021-11-24","publicationTypes":"Journal Article","platform":"Semanticscholar","ListUrlMain":"https://doi.org/10.1109/SNPD51163.2021.9704989"}
A Novel Big Data Processing Approach to Feature Extraction for Electrical Discharge Machining based on Container Technology
Electrical discharge machining (EDM) is a process that removes metal from conductive materials using electrical sparks. Monitoring the EDM process with virtual metrology (VM) requires the voltage and current signals of the machine tool's electrode. Because of the nature of EDM, the sensors installed on the machine tool sample these signals at a high rate and generate a vast amount of data in a short time, which raises a big-data processing problem. Our previous work proposed an efficient approach called BEDPS to process EDM big data in a Hadoop distributed cluster. This paper presents a novel big-data processing approach to feature extraction for EDM that uses container technology (i.e., Docker and Kubernetes). We re-implement some of the Spark algorithms of BEDPS in Python (originally written in Scala) and then run the refined BEDPS in containers in a Kubernetes cluster. Test results show that the refined BEDPS developed in this study reduces the execution time by almost half compared to the original Scala version (9.6577 minutes vs. 19.2735 minutes). Python in Spark is also shown to perform similarly to Scala, although in some cases Python falls short, for example when parallel processing is done with a Python parallel-processing library. The results also show that a Kubernetes cluster is a promising alternative to Hadoop for processing big data. At the same time, it brings advantages to big-data processing applications, such as easy deployment, robust operation, load balancing, self-healing, failover, and horizontal auto-scaling for containerized applications.
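The "almost half" figure can be checked directly from the two reported execution times. The short Python sketch below is only an arithmetic illustration: the function and constant names are hypothetical and not taken from the paper, and the only inputs are the two timings quoted in the abstract.

```python
# Execution times reported in the abstract, in minutes.
ORIGINAL_SCALA_MIN = 19.2735  # original Scala version of BEDPS
REFINED_BEDPS_MIN = 9.6577    # refined BEDPS developed in the study


def relative_reduction(before: float, after: float) -> float:
    """Return the fraction of the original run time that was saved."""
    return (before - after) / before


def speedup(before: float, after: float) -> float:
    """Return how many times faster the new run is than the old one."""
    return before / after


if __name__ == "__main__":
    saved = relative_reduction(ORIGINAL_SCALA_MIN, REFINED_BEDPS_MIN)
    factor = speedup(ORIGINAL_SCALA_MIN, REFINED_BEDPS_MIN)
    print(f"Execution time reduced by {saved:.1%}")
    print(f"Speedup factor: {factor:.3f}x")
```

With the reported numbers, the reduction comes out at roughly 49.9% and the speedup at just under 2x, which is consistent with the claim that execution time is cut by almost half.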