Irradiance dataset in the south of Colombia from 2013 to 2023 in 5-minutes intervals

Impact factor: 1.4 · JCR Q3 · Multidisciplinary Sciences
John Barco-Jiménez, Daniel Rosero, Andrés Zambrano, Francisco Eraso-Checa, Miller Ruales, José Camilo Eraso
Data in Brief, vol. 63, Article 112063. Published 2025-09-13. Journal Article.
DOI: 10.1016/j.dib.2025.112063
https://www.sciencedirect.com/science/article/pii/S2352340925007851
Citations: 0

Abstract

This article presents an extensive irradiance dataset collected in San Juan de Pasto, located in southern Colombia, using a Davis Vantage PRO 2 meteorological station. The dataset spans 11 years, covering the period from 2013 to 2023, with measurements taken at 5-minute intervals, resulting in approximately 603,495 irradiance records, each accompanied by a corresponding timestamp.
The construction of the dataset required a rigorous preprocessing stage. This stage included the removal of erroneous values (NaN) and outliers, the identification of missing entries, and the correction of inconsistencies in the date records. Missing values were addressed through gap-filling procedures based on averaged data, complemented by visual inspections using graphical representations. The cleaned dataset was exported after ensuring data integrity, accuracy, and consistency, which are essential for reliable analysis and subsequent modeling.
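The preprocessing steps described above can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the column names `timestamp` and `irradiance`, the 1500 W/m² outlier ceiling, and the choice to fill gaps with the mean of the same time-of-day slot are all assumptions made for the example, since the abstract only states that gaps were filled "based on averaged data".

```python
import numpy as np
import pandas as pd


def clean_irradiance(df, max_wm2=1500.0):
    """Hypothetical cleaning pass; column names and threshold are assumptions."""
    out = df.copy()
    # Correct inconsistent date records: unparseable timestamps become NaT
    # and are dropped, duplicates are removed, and rows are sorted in time.
    out["timestamp"] = pd.to_datetime(out["timestamp"], errors="coerce")
    out = out.dropna(subset=["timestamp"])
    out = out.drop_duplicates(subset="timestamp")
    out = out.set_index("timestamp").sort_index()
    # Treat negative or implausibly large readings as outliers (assumed rule).
    bad = (out["irradiance"] < 0) | (out["irradiance"] > max_wm2)
    out.loc[bad, "irradiance"] = np.nan
    # Reindex onto a regular 5-minute grid so missing entries show up as NaN.
    grid = pd.date_range(out.index.min(), out.index.max(), freq="5min")
    out = out.reindex(grid)
    # Gap-fill each NaN with the mean of its time-of-day slot (assumed method).
    slot_mean = out.groupby(out.index.time)["irradiance"].transform("mean")
    out["irradiance"] = out["irradiance"].fillna(slot_mean)
    return out
```

Slots that have no valid reading anywhere in the series stay NaN under this scheme, which keeps the sketch from inventing values; a real workflow would pair it with the visual inspection the abstract mentions.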
This dataset is valuable for building training datasets used as input for artificial intelligence models to perform short-, medium-, and long-term irradiance forecasting. For instance, Barco-Jiménez et al. (2021) utilized a portion of this dataset to develop multitemporal irradiance predictions. These predictive models can be applied in various domains, including energy management, grid optimization, and solar energy production planning. Furthermore, the dataset supports statistical analyses that provide insights for appropriately sizing photovoltaic systems through indicators such as Hours of Peak Sunlight (HPS), maximum and minimum irradiance values, average daily and monthly irradiance, and seasonal trends. These indicators play a fundamental role in the optimization of photovoltaic system performance, contributing to cost reduction and enhancing energy efficiency across rural, residential, and commercial applications.
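As a rough illustration of the sizing indicators mentioned above, the sketch below computes daily HPS (commonly called peak sun hours) and monthly mean irradiance from a 5-minute series. It assumes the series is in W/m² with a timestamp index, and it uses the conventional definition of peak sun hours as daily irradiation in kWh/m² relative to a 1000 W/m² reference; the function names are hypothetical and the abstract does not state how the authors compute these indicators.

```python
import pandas as pd


def peak_sun_hours(irradiance, step_minutes=5):
    """Daily peak sun hours from a W/m^2 series indexed by timestamp."""
    # Each sample stands for step_minutes of sunlight: convert to Wh/m^2.
    energy_wh = irradiance * (step_minutes / 60.0)
    # Sum per calendar day, then divide by the 1000 W/m^2 reference.
    return energy_wh.resample("D").sum() / 1000.0


def monthly_mean_irradiance(irradiance):
    """Mean irradiance per calendar month, labeled by month start."""
    return irradiance.resample("MS").mean()
```

For example, a day with a constant 1000 W/m² over six hours gives six peak sun hours, which is the figure a designer would multiply by array capacity to estimate daily energy yield.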
This dataset supports photovoltaic system design and studies on solar energy variability and climate patterns in the region. Analysis of irradiance fluctuations over time provides insights into the influence of atmospheric conditions on solar energy availability. This information is essential for enhancing the reliability of solar power systems and effectively integrating renewable energy sources into existing power grids. The dataset can also be used in educational settings to teach data analysis techniques and renewable energy concepts, providing students and researchers with a practical resource for hands-on learning.
Source journal: Data in Brief (Multidisciplinary Sciences)
CiteScore: 3.10
Self-citation rate: 0.00%
Articles published: 996
Review time: 70 days
About the journal: Data in Brief provides a way for researchers to easily share and reuse each other's datasets by publishing data articles that:
- Thoroughly describe your data, facilitating reproducibility.
- Make your data, which is often buried in supplementary material, easier to find.
- Increase traffic towards associated research articles and data, leading to more citations.
- Open up doors for new collaborations.
Because you never know what data will be useful to someone else, Data in Brief welcomes submissions that describe data from all research areas.