Tanmay Sanjay Hukkeri, G. Shobha, Shubham Milind Phal, Jyothi Shetty, R. YatishH., Naweed Mohammed
Proceedings of the 3rd International Conference on Software Engineering and Information Management, published 2020-01-12. DOI: 10.1145/3378936.3378978 (https://doi.org/10.1145/3378936.3378978)
Massively Scalable Image Processing on the HPCC Systems Big Data Platform
Image data is abundant in everyday life. From messages to insurance claims to judicial systems, it plays a pivotal role in several critical Big Data applications. Some of these applications, such as automatic license plate recognition (ALPR), use CCTV cameras to capture snapshots of traffic from real-time video, generating vast amounts of image data on a daily basis. This brings with it the herculean task of processing these images to extract the essential information as efficiently as possible. Processing images sequentially can be very time-consuming because of the sheer number of images and the intensive computation each one requires. Distributed image processing addresses this problem by splitting the computation across multiple nodes. This paper presents a novel framework for distributed image processing via OpenCV on the HPCC Systems distributed node architecture, a set of high-performance computing clusters. When tested on the Indian License Plates Dataset, the proposed approach was found to be 85 percent accurate. Additionally, a 30 percent decrease in computation time was observed when it was executed on a multi-node setup, with no impact on accuracy.
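The idea of splitting per-image computation across nodes can be illustrated with a minimal Python sketch. This is not the paper's framework: it does not use HPCC Systems or OpenCV, worker threads stand in for cluster nodes, and the function names, the bright-pixel counting task, and the synthetic images are all hypothetical placeholders chosen for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(items, n_parts):
    """Split items into n_parts round-robin slices (some may be empty)."""
    return [items[i::n_parts] for i in range(n_parts)]


def count_bright_pixels(image, threshold=128):
    """Placeholder per-image task: count pixels at or above threshold."""
    return sum(1 for row in image for pixel in row if pixel >= threshold)


def process_partition(part):
    """Work done independently on one (simulated) node."""
    return [(name, count_bright_pixels(image)) for name, image in part]


def process_distributed(named_images, n_nodes):
    """Fan the partitions out to workers and merge the per-node results."""
    parts = partition(named_images, n_nodes)
    with ThreadPoolExecutor(max_workers=n_nodes) as pool:
        per_node = list(pool.map(process_partition, parts))
    return dict(pair for node_result in per_node for pair in node_result)


# Synthetic 2x2 grayscale "images" (hypothetical data, not from the paper).
images = [
    ("img0", [[0, 200], [255, 10]]),
    ("img1", [[130, 130], [130, 130]]),
    ("img2", [[0, 0], [0, 0]]),
]

sequential = dict(process_partition(images))
distributed = process_distributed(images, n_nodes=2)
assert sequential == distributed  # splitting the work does not change results
```

Because each image is processed independently, the merged result matches the sequential one, which is consistent with the reported absence of any accuracy impact. For scale, a 30 percent decrease in computation time corresponds to a speedup of 1/0.7, or roughly 1.43 times.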