Recommending Environmental Big Data Using Semantically Guided Machine Learning
R. Dutta, Ahsan Morshed, J. Aryal
Large Scale and Big Data. DOI: 10.1201/b17112-16 (https://doi.org/10.1201/b17112-16)
In information technology, Big Data is a collection of data sets so large and complex
that it becomes difficult to process using on-hand database management tools
or traditional data-processing applications. The trend to larger data sets is
due to the additional information derivable from analysis of a single large set of
related data, as compared with separate smaller sets with the same total amount of
data. Scientists regularly encounter limitations due to large data sets
in many areas, including meteorology, genetics, complex physics simulations, and
environmental research. Wireless technology-based automated data gathering
from large environmental sensor networks has increased the quantity of sensor
data available for analysis and sensor informatics. Next-generation environmental
monitoring, natural resource management, and agricultural decision support systems
are becoming heavily dependent on very-large-scale deployments of multiple sensor
networks and on the massive-scale accumulation, harmonization, web-based
integration, and interpretation of Big Data. As the amount of available data has
grown, so has its complexity; hence regular maintenance of large-scale sensor
networks is becoming a difficult challenge. Uncertainty factors in the environmental monitoring
processes are more evident than before because of the technological transparency
achieved by the most recent advanced communication technologies.
Other challenges include capture, storage, search, sharing, analysis, and visualization.
Data availability from a particular environmental sensor web is often very
limited and data quality is consequently very poor. This practical limitation could be
due to the difficult geographical location of the sensor node or sensor station, extreme
environmental conditions, communication network failure, or technical failure
of the sensor node. Data uncertainty from a sensor network makes the network
unreliable and inefficient. This inefficiency leads to the failure of natural resource management
systems such as agricultural water resource management, weather forecasting,
crop management including irrigation scheduling, and natural resource-based
crop business model systems. The ultimate challenge in environmental forecasting
and decision support systems is to overcome the data uncertainty and make the
derived output more accurate. It is evident that there is a need to capture and integrate
environmental knowledge from various independent sources, including sensor
networks, individual sensory systems, large-scale environmental simulation models,
and historical environmental data for each of the independent
sources. A single data source is not sufficient to produce an efficient decision
support system. There is therefore an urgent requirement for on-demand complementary
knowledge integration, in which different sources of environmental sensor data can
be used to complement each other automatically.
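
The idea of sources complementing each other can be illustrated with a deliberately simple sketch. It is not the semantically guided machine learning approach of the chapter, which the abstract does not detail; it swaps in a plain priority-based fallback, in which each time step takes its value from the first source that has one. The function name `complement`, the source names, and the readings are all hypothetical and chosen only for illustration, and the series are assumed to be already aligned on the same time steps.

```python
from typing import Dict, List, Optional, Sequence, Tuple

Reading = Optional[float]  # None marks a missing or failed reading


def complement(sources: Dict[str, Sequence[Reading]],
               priority: Sequence[str]) -> List[Tuple[Reading, Optional[str]]]:
    """Merge aligned time series by priority-based fallback.

    For each time step, take the value from the first source in
    `priority` that has one, and record which source supplied it.
    """
    lengths = {len(sources[name]) for name in priority}
    if len(lengths) != 1:
        raise ValueError("all sources must cover the same time steps")
    n = lengths.pop()

    merged: List[Tuple[Reading, Optional[str]]] = []
    for t in range(n):
        value, origin = None, None
        for name in priority:
            reading = sources[name][t]
            if reading is not None:
                value, origin = reading, name
                break
        # (None, None) is kept when every source is missing this step.
        merged.append((value, origin))
    return merged


# Made-up readings for illustration only (e.g. hourly air temperature).
sources = {
    "sensor_node": [21.4, None, None, 22.0],
    "nearby_station": [21.0, 21.3, None, 21.8],
    "simulation_model": [20.5, 20.9, 21.2, 21.6],
}
merged = complement(
    sources, ["sensor_node", "nearby_station", "simulation_model"]
)
for step, (value, origin) in enumerate(merged):
    print(step, value, origin)
```

Keeping the origin of each value alongside the value itself means a downstream decision support system could weight or flag readings that came from a fallback source rather than the primary sensor.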