Data Quality Enhancement with String Length Distribution

World Academy of Science, Engineering and Technology, International Journal of Computer and Information Engineering. Pub Date: 2017-02-01. DOI: 10.5281/ZENODO.1129633

Q. Xiu, H. Hota, Yohsuke Ishii, T. Oda


Citation count: 0

Abstract

The volume of manufacturing data that can be collected is increasing rapidly. At the same time, mega recalls (large-scale product recalls) are becoming a serious social problem. Under these circumstances, there is a growing need to prevent mega recalls through defect analysis of manufacturing data, such as root cause analysis and anomaly detection. However, classifying the strings in manufacturing data with traditional methods takes too long to meet the requirements of quick defect analysis. We therefore present the String Length Distribution Classification (SLDC) method, which classifies strings correctly in a short time. The method learns character features, especially the string length distribution, from the Product IDs and Machine IDs in the BOM (bill of materials) and the asset list. Applying the method to strings in actual manufacturing data, we verified that string classification time can be reduced by 80%. From this result, we estimate that the requirement of quick defect analysis can be fulfilled.
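The abstract does not give the details of SLDC, so the following is only a minimal sketch of the general idea of classifying strings by their length distribution, not the authors' implementation. It assumes a smoothed per-class length histogram learned from reference IDs and a log-likelihood comparison at classification time. The function names, the `max_len` and `alpha` parameters, and the sample IDs are all hypothetical.

```python
import math
from collections import Counter


def learn_length_distribution(samples, max_len=64, alpha=1.0):
    """Return a smoothed P(length) table for lengths 0..max_len.

    Strings longer than max_len share the last bin. alpha is an
    additive (Laplace) smoothing constant, an assumption of this sketch.
    """
    counts = Counter(min(len(s), max_len) for s in samples)
    total = len(samples) + alpha * (max_len + 1)
    return [(counts[n] + alpha) / total for n in range(max_len + 1)]


def classify_column(values, models, max_len=64):
    """Pick the class whose length distribution best explains the values."""
    def score(dist):
        # Sum of log-probabilities of each observed string length.
        return sum(math.log(dist[min(len(v), max_len)]) for v in values)

    return max(models, key=lambda name: score(models[name]))


# Hypothetical reference data standing in for a BOM and an asset list.
reference = {
    "product_id": ["PRD-000123", "PRD-004567", "PRD-089012"],
    "machine_id": ["M01", "M17", "M203"],
}
models = {name: learn_length_distribution(ids) for name, ids in reference.items()}

# Classify an unlabeled column of strings by length alone.
label = classify_column(["PRD-111111", "PRD-222222"], models)
```

Because only string lengths are inspected, each value costs a single `len` call and a table lookup. That is one plausible reason a length-based classifier could be faster than content-based matching, though the abstract does not state where the reported 80% reduction comes from.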
