Chibuike Chiedozie Ibebuchi, O. Obarein, Itohan-Osa Abu
{"title":"自动编码器人工神经网络和主成分分析在全球气温数据模式提取和空间区域化中的应用","authors":"Chibuike Chiedozie Ibebuchi, O. Obarein, Itohan-Osa Abu","doi":"10.1088/2632-2153/ad1c34","DOIUrl":null,"url":null,"abstract":"\n Spatial regionalization is instrumental in simplifying the spatial complexity of the climate system. To identify regions of significant climate variability, pattern extraction is often required prior to spatial regionalization with a clustering algorithm. In this study, the autoencoder artificial neural network (AE) was applied to extract the inherent patterns of global temperature data (from 1901 to 2021). Subsequently, Fuzzy C-means clustering was applied to the extracted patterns to classify the global temperature regions. Our analysis involved comparing AE-based and principal component analysis (PCA)-based clustering results to assess consistency. We determined the number of clusters by examining the average percentage decrease in Fuzzy Partition Coefficient and its 95% confidence interval, seeking a balance between obtaining a high Fuzzy Partition Coefficient and avoiding over-segmentation. This approach suggested that for a more general model, four clusters is reasonable. The Adjusted Rand Index between the AE-based and PCA-based clusters is 0.75, indicating that the AE-based and PCA-based clusters have considerable overlap. The observed difference between the AE-based clusters and PCA-based clusters is suggested to be associated with AE’s capability to learn and extract complex non-linear patterns, and this attribute, for example, enabled the clustering algorithm to accurately detect the Himalayas region as the “third pole” with similar temperature characteristics as the polar regions. Finally, when the analysis period is divided into two (1901-1960 and 1961-2021), the Adjusted Rand Index between the two clusters is 0.96 which suggests that historical climate change has not significantly affected the defined temperature regions over the two periods. In essence, this study indicates both AE's potential to enhance our understanding of climate variability and reveals the stability of the historical temperature regions.","PeriodicalId":503691,"journal":{"name":"Machine Learning: Science and Technology","volume":"19 26","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-01-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Application of Autoencoders Artificial Neural Network and Principal Component Analysis for Pattern Extraction and Spatial Regionalization of Global Temperature Data\",\"authors\":\"Chibuike Chiedozie Ibebuchi, O. Obarein, Itohan-Osa Abu\",\"doi\":\"10.1088/2632-2153/ad1c34\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"\\n Spatial regionalization is instrumental in simplifying the spatial complexity of the climate system. To identify regions of significant climate variability, pattern extraction is often required prior to spatial regionalization with a clustering algorithm. In this study, the autoencoder artificial neural network (AE) was applied to extract the inherent patterns of global temperature data (from 1901 to 2021). Subsequently, Fuzzy C-means clustering was applied to the extracted patterns to classify the global temperature regions. Our analysis involved comparing AE-based and principal component analysis (PCA)-based clustering results to assess consistency. We determined the number of clusters by examining the average percentage decrease in Fuzzy Partition Coefficient and its 95% confidence interval, seeking a balance between obtaining a high Fuzzy Partition Coefficient and avoiding over-segmentation. This approach suggested that for a more general model, four clusters is reasonable. The Adjusted Rand Index between the AE-based and PCA-based clusters is 0.75, indicating that the AE-based and PCA-based clusters have considerable overlap. The observed difference between the AE-based clusters and PCA-based clusters is suggested to be associated with AE’s capability to learn and extract complex non-linear patterns, and this attribute, for example, enabled the clustering algorithm to accurately detect the Himalayas region as the “third pole” with similar temperature characteristics as the polar regions. Finally, when the analysis period is divided into two (1901-1960 and 1961-2021), the Adjusted Rand Index between the two clusters is 0.96 which suggests that historical climate change has not significantly affected the defined temperature regions over the two periods. In essence, this study indicates both AE's potential to enhance our understanding of climate variability and reveals the stability of the historical temperature regions.\",\"PeriodicalId\":503691,\"journal\":{\"name\":\"Machine Learning: Science and Technology\",\"volume\":\"19 26\",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-01-08\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Machine Learning: Science and Technology\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1088/2632-2153/ad1c34\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Machine Learning: Science and Technology","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1088/2632-2153/ad1c34","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Application of Autoencoders Artificial Neural Network and Principal Component Analysis for Pattern Extraction and Spatial Regionalization of Global Temperature Data
Spatial regionalization is instrumental in simplifying the spatial complexity of the climate system. To identify regions of significant climate variability, pattern extraction is often required prior to spatial regionalization with a clustering algorithm. In this study, the autoencoder artificial neural network (AE) was applied to extract the inherent patterns of global temperature data (from 1901 to 2021). Subsequently, Fuzzy C-means clustering was applied to the extracted patterns to classify the global temperature regions. Our analysis involved comparing AE-based and principal component analysis (PCA)-based clustering results to assess consistency. We determined the number of clusters by examining the average percentage decrease in Fuzzy Partition Coefficient and its 95% confidence interval, seeking a balance between obtaining a high Fuzzy Partition Coefficient and avoiding over-segmentation. This approach suggested that for a more general model, four clusters is reasonable. The Adjusted Rand Index between the AE-based and PCA-based clusters is 0.75, indicating that the AE-based and PCA-based clusters have considerable overlap. The observed difference between the AE-based clusters and PCA-based clusters is suggested to be associated with AE’s capability to learn and extract complex non-linear patterns, and this attribute, for example, enabled the clustering algorithm to accurately detect the Himalayas region as the “third pole” with similar temperature characteristics as the polar regions. Finally, when the analysis period is divided into two (1901-1960 and 1961-2021), the Adjusted Rand Index between the two clusters is 0.96 which suggests that historical climate change has not significantly affected the defined temperature regions over the two periods. In essence, this study indicates both AE's potential to enhance our understanding of climate variability and reveals the stability of the historical temperature regions.