Filtering the big data based on volume, variety and velocity by using Kalman filter recursive approach

2017 IEEE 3rd International Conference on Engineering Technologies and Social Sciences (ICETSS). Pub Date: 2017-08-01. DOI: 10.1109/ICETSS.2017.8324195

Fatima Riaz, Muhammad Alam, Attra Ali


Citations: 9

Abstract



The term Big Data has been known for the past seven decades, but the technology shift of the current era has drawn considerable attention to it from researchers in mathematics, computing, telecommunication, information technology, and data warehousing and mining. This generation lives in an age of technology in which data plays a vital role, and Big Data in particular has many success stories; at the same time, it is becoming the biggest threat to network service providers, the telecom industry, and homeland security. Every network-connected device, such as a smartphone, laptop, or desktop, adds data to the Big Data pool through the applications it runs. Platforms such as Instagram, Facebook, WhatsApp, Apple, Google, Google+, Twitter, and Flickr are some of the well-known tools that contribute redundant data. The question arises: is it mandatory to store, and especially to process, all of this data, whether useful or redundant? This paper focuses on filtering useful data from redundant data using three parameters: velocity, variety, and volume. In the proposed architecture, Memcache DB (for velocity), Voldemort layers (for variety), and MapReduce (for volume) are linked with Hadoop to obtain filtered data. A Kalman filter recursive approach is used to inject the data back into the Hadoop Distributed File System (HDFS), reducing the processing cost of subsequent iterations.
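The abstract does not state which quantity the Kalman filter tracks or what state model it uses, so the following is only a minimal sketch of the generic recursive predict/correct step, assuming a one-dimensional random-walk model. The names `ScalarKalman` and `smooth`, and the parameters `process_var` and `measurement_var`, are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass


@dataclass
class ScalarKalman:
    """Generic 1-D Kalman filter (random-walk state model).

    Illustrative only: the paper's actual state model is not given
    in the abstract, so every parameter here is an assumption.
    """
    estimate: float         # current state estimate
    variance: float         # variance of the current estimate
    process_var: float      # assumed drift of the true value per step (Q)
    measurement_var: float  # assumed noise variance of each observation (R)

    def update(self, measurement: float) -> float:
        # Predict: the state is assumed unchanged, uncertainty grows by Q.
        prior_var = self.variance + self.process_var
        # Correct: weight the new measurement by the Kalman gain.
        gain = prior_var / (prior_var + self.measurement_var)
        self.estimate += gain * (measurement - self.estimate)
        self.variance = (1.0 - gain) * prior_var
        return self.estimate


def smooth(measurements, initial=0.0, initial_var=1.0, q=1e-4, r=0.25):
    """Run the filter over a sequence and return the running estimates."""
    kf = ScalarKalman(initial, initial_var, q, r)
    return [kf.update(z) for z in measurements]


if __name__ == "__main__":
    # Hypothetical noisy per-batch measurements of some tracked quantity.
    print(smooth([9.8, 10.3, 9.9, 10.1, 10.0]))
```

The point of the recursion is that only the pair (`estimate`, `variance`) has to be carried from one iteration to the next. Each new measurement is folded in without revisiting earlier ones, which is one plausible reading of how a recursive filter could lower the cost of later processing passes.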

