Mining of heterogeneous time series information for predicting chlorophyll accumulation in oceans
Authors: Atharva Ramgirkar, Vadiraj Rao, Janhavi Talhar, Tusar Kanti Mishra, Swathi Jamjala Narayanan, Shashank Mouli Satapathy, Boominathan Perumal
Journal: Sustainable Computing: Informatics and Systems, volume 42, Article 100980
Publication date: 2024-02-27
DOI: 10.1016/j.suscom.2024.100980
URL: https://www.sciencedirect.com/science/article/pii/S2210537924000258
Harmful algal blooms cause environmental harm, financial losses, and disease epidemics. Because algal blooms cannot be eradicated, the best option is to forecast their growth and regulate it. Machine learning algorithms can be used to forecast their presence and to classify the threat that each concentration level presents. In this work, a dataset collected from the Santa Monica region of the US is analyzed and processed to predict algae concentration. Three machine learning models, multiple linear regression, Regression Gradient Boosting Decision Tree (RGBDT), and Hidden Markov Model (HMM), are applied to predict the chlorophyll (Chl-a) content, which serves as a proxy for the presence of algae in the water. The results show that, for prediction, the multiple linear regression model outperforms the RGBDT algorithm. For modeling chlorophyll with the HMM, the parameter bbp555.00_sd performs best among aot443.00_sd, kd490.00_sd, poc_sd, and pic_sd. The multiple linear regression model gave an adjusted R-squared of 0.94; the parameter pic_sd had the lowest VIF (variance inflation factor) value of 1.78, followed by aot and bbp, both with VIF < 5 (2.28 and 4.95 respectively). The output of the HMM-based model is the probability of the presence of chlorophyll given the presence of each variable individually. Among the variables, bbp has the highest probability, 0.405, implying roughly a 40% chance of chlorophyll in the presence of bbp.
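The two regression diagnostics quoted in the abstract, adjusted R-squared and VIF, can be illustrated with a minimal sketch. This is not the authors' code or data: the arrays below are synthetic stand-ins, the column names pic, aot, and bbp are borrowed from the abstract only as labels, and the helper functions are hypothetical. It assumes the standard definitions, adjusted R² = 1 - (1 - R²)(n - 1)/(n - p - 1) and VIF_j = 1/(1 - R_j²), where R_j² comes from regressing predictor j on the remaining predictors.

```python
import numpy as np


def r_squared(X, y):
    """R^2 of an ordinary least squares fit of y on X (intercept added)."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return 1.0 - ss_res / ss_tot


def adjusted_r_squared(X, y):
    """Adjusted R^2, penalising the number of predictors p for n samples."""
    n, p = X.shape
    return 1.0 - (1.0 - r_squared(X, y)) * (n - 1) / (n - p - 1)


def vif(X):
    """VIF_j = 1 / (1 - R_j^2), regressing column j on all other columns."""
    values = []
    for j in range(X.shape[1]):
        others = np.delete(X, j, axis=1)
        values.append(1.0 / (1.0 - r_squared(others, X[:, j])))
    return np.array(values)


# Synthetic stand-in data (assumption: NOT the Santa Monica dataset).
rng = np.random.default_rng(0)
n = 200
pic = rng.normal(size=n)
aot = rng.normal(size=n)
bbp = 0.8 * aot + 0.6 * rng.normal(size=n)  # deliberately correlated with aot
X = np.column_stack([pic, aot, bbp])
y = 1.5 * pic - 0.7 * aot + 2.0 * bbp + 0.3 * rng.normal(size=n)

adj_r2 = adjusted_r_squared(X, y)
vifs = vif(X)
print("adjusted R^2:", round(adj_r2, 3))
for name, value in zip(["pic", "aot", "bbp"], vifs):
    print(f"VIF[{name}] = {value:.2f}")
```

In this sketch the independent column gets a VIF near 1, while the two correlated columns get larger values. This matches the usual reading of the VIF < 5 threshold mentioned in the abstract as indicating tolerable multicollinearity.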
About the journal:
Sustainable computing is a rapidly expanding research area spanning computer science and engineering, electrical engineering, and other engineering disciplines. The aim of Sustainable Computing: Informatics and Systems (SUSCOM) is to publish research findings related to energy-aware and thermal-aware management of computing resources. Equally important is a spectrum of related research issues, such as applications of computing that can have ecological and societal impacts. SUSCOM publishes original and timely research papers and survey articles on power, energy, temperature, and environment-related topics of current importance to readers. SUSCOM has an editorial board comprising prominent researchers from around the world and selects competitively evaluated, peer-reviewed papers.