Multi-property prediction of waterlogged archaeological wood based on Wasserstein GAN-augmented tree models

IF 3.3 · Zone 2 · Multidisciplinary journals · ARCHAEOLOGY
Tiantian Liu, Xiangna Han, Yafang Yin, Guanglan Xi, Zhiguo Zhang, Jian Sun, Gang Chen, Lintong Zhang, Liuyang Han
Journal of Cultural Heritage, volume 76, pages 86-98. Published 2025-09-20. Not open access.
DOI: 10.1016/j.culher.2025.09.005
Full text: https://www.sciencedirect.com/science/article/pii/S1296207425002018
Citations: 0

Abstract

Multi-property prediction of waterlogged archaeological wood based on Wasserstein GAN-augmented tree models
To address the challenges of non-destructive evaluation and limited sample availability for waterlogged archaeological wood (WAW), this study developed a predictive model for physico-mechanical properties using near-infrared (NIR) spectroscopy. Furthermore, we proposed a data augmentation framework based on the Wasserstein Generative Adversarial Network with Gradient Penalty (WGAN-GP) to extend the NIR spectral data of WAW and four associated physico-mechanical parameters: maximum water content (MWC), basic density (BD), modulus of rupture (MOR), and fracture strain (FS). Tree-based ensemble learning models (LGBM and the Multi-Scale Derivative Enhanced Gradient Boosting Machine, MSDE-GBM) were built using the data generated by WGAN-GP, and the effect of the extended dataset size on model performance was systematically investigated. The results showed significant correlations among the four physico-mechanical parameters of WAW, validating the feasibility of a multi-target generation mechanism that simultaneously synthesizes spectral data corresponding to MWC, BD, MOR, and FS. Analysis of the generated data revealed that the WGAN-GP-generated spectra exhibited significant noise during the initial training epochs; however, the morphology and smoothness of the synthetic spectra progressively approximated the real data as training proceeded, improving both diversity and authenticity. Further experiments identified optimal training epochs for different augmented dataset sizes: 4000 epochs for datasets expanded to 300 and 900 samples, and 6000 epochs for the 600-sample dataset. Subsequent modeling using data generated at these optimal epochs confirmed that WGAN-GP-augmented datasets significantly improved the performance of LGBM and MSDE-GBM in predicting MWC and BD. Compared to the original dataset, the optimal models achieved RMSE reductions of 47.9% (LGBM) and 59.9% (MSDE-GBM) for MWC, and 29.2% (LGBM) and 13.3% (MSDE-GBM) for BD. In contrast, the lower prediction accuracy for MOR and FS (R² < 0.7) highlighted the complex mapping relationships between micro-scale mechanical parameters (tested via thermomechanical analysis, TMA) and NIR spectral data. This study pioneers the simultaneous prediction of multiple WAW performance parameters, providing a novel paradigm for small-sample regression modeling in heritage conservation. The generated data were successfully applied to assess the degradation of wooden components from the Southern Song Dynasty “Nanhai I” shipwreck and the Qing Dynasty “Zhiyuan” shipwreck, providing critical data-driven support for scientific conservation strategies of waterlogged archaeological artifacts.
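The abstract names WGAN-GP as the augmentation method but does not describe the network or its settings. As a rough illustration of the "gradient penalty" part only, the sketch below computes the standard WGAN-GP penalty term, lam * mean((||grad f(x_hat)|| - 1)^2), where x_hat is a random interpolation between a real and a generated sample. Everything concrete here is an assumption for illustration, not the authors' implementation: the tiny one-hidden-layer tanh critic, the names `critic_gradient` and `gradient_penalty`, the toy two-element sample vectors, and the coefficient lam = 10.0 (a commonly used default).

```python
import math
import random


def critic_gradient(x, W, b, v):
    """Gradient w.r.t. x of a toy critic f(x) = sum_j v[j] * tanh(W[j] . x + b[j]).

    The critic architecture is hypothetical; it is small enough that the
    input gradient can be written out analytically.
    """
    grad = [0.0] * len(x)
    for w_row, b_j, v_j in zip(W, b, v):
        h = math.tanh(sum(w * x_i for w, x_i in zip(w_row, x)) + b_j)
        scale = v_j * (1.0 - h * h)  # chain rule: d tanh(z)/dz = 1 - tanh(z)^2
        for i, w in enumerate(w_row):
            grad[i] += scale * w
    return grad


def gradient_penalty(real, fake, W, b, v, lam=10.0, rng=None):
    """WGAN-GP penalty: lam * mean((||grad_x f(x_hat)||_2 - 1)^2).

    x_hat is a random point on the line between a real and a generated sample.
    lam=10.0 is an assumed default, not a value taken from the abstract.
    """
    rng = rng or random.Random(0)
    total = 0.0
    for x_real, x_fake in zip(real, fake):
        eps = rng.random()  # interpolation weight in [0, 1)
        x_hat = [eps * r + (1.0 - eps) * f for r, f in zip(x_real, x_fake)]
        g = critic_gradient(x_hat, W, b, v)
        norm = math.sqrt(sum(g_i * g_i for g_i in g))
        total += (norm - 1.0) ** 2
    return lam * total / len(real)


# Toy usage with made-up numbers: two "real" and two "generated" vectors.
real_batch = [[0.2, 0.5], [0.1, 0.4]]
fake_batch = [[0.9, 0.1], [0.3, 0.8]]
W_toy, b_toy, v_toy = [[1.0, -0.5], [0.3, 0.7]], [0.0, 0.1], [0.8, -0.6]
print(gradient_penalty(real_batch, fake_batch, W_toy, b_toy, v_toy))
```

In a real training loop this term would be added to the critic loss and the gradient would come from automatic differentiation rather than a hand-written formula. For a multi-target setup like the one the abstract describes, each sample vector would presumably hold a spectrum together with its property values, though the abstract does not specify the layout.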
Source journal
Journal of Cultural Heritage
Category: Multidisciplinary journals / Materials Science, Multidisciplinary
CiteScore: 6.80
Self-citation rate: 9.70%
Articles published: 166
Review time: 52 days
About the journal: The Journal of Cultural Heritage publishes original papers which comprise previously unpublished data and present innovative methods concerning all aspects of science and technology of cultural heritage as well as interpretation and theoretical issues related to preservation.