Recommending Environmental Big Data Using Semantically Guided Machine Learning

R. Dutta, Ahsan Morshed, J. Aryal
Published in: Large Scale and Big Data
DOI: https://doi.org/10.1201/b17112-16
Citations: 1

Abstract

In information technology, Big Data is a collection of data sets so large and complex that it becomes difficult to process using on-hand database management tools or traditional data-processing applications. The trend toward larger data sets is due to the additional information derivable from analysis of a single large set of related data, as compared with separate smaller sets with the same total amount of data. Scientists regularly encounter limitations due to large data sets in many areas, including meteorology, genetics, complex physics simulations, and environmental research. Wireless technology-based automated data gathering from large environmental sensor networks has increased the quantity of sensor data available for analysis and sensor informatics. Next-generation environmental monitoring, natural resource management, and agricultural decision support systems are becoming heavily dependent on very large-scale multiple sensor network deployments, massive-scale accumulation, harmonization, and web-based integration and interpretation of Big Data. As more data become available, the complexity of the data has also increased, so regular maintenance of large-scale sensor networks is becoming a difficult challenge. Uncertainty factors in environmental monitoring processes are more evident than before because of the technological transparency achieved by the most recent advanced communication technologies. Other challenges include capture, storage, search, sharing, analysis, and visualization. Data availability from a particular environmental sensor web is often very limited, and data quality is consequently very poor. This practical limitation could be due to the difficult geographical location of the sensor node or sensor station, extreme environmental conditions, communication network failure, or technical failure of the sensor node. Data uncertainty from a sensor network makes the network unreliable and inefficient. This inefficiency leads to the failure of natural resource management systems such as agricultural water resource management, weather forecasting, crop management (including irrigation scheduling), and natural resource-based crop business model systems. The ultimate challenge in environmental forecasting and decision support systems is to overcome data uncertainty and make the derived output more accurate. There is a clear need to capture and integrate environmental knowledge from various independent sources, including sensor networks, individual sensory systems, large-scale environmental simulation models, and historical environmental data for each of the independent sources. A single data source is not sufficient to produce an efficient decision support system. There is therefore an urgent requirement for on-demand complementary knowledge integration, in which different sources of environmental sensor data can be used to complement each other automatically.