Nan Xiao, Xuekai Qi, Bin Wang, He Huang, Rui Wen
Title: Rapid discrimination modeling of common wine and food residues in archaeology based on machine learning and infrared spectroscopy
Journal: Journal of Cultural Heritage, Volume 73, Pages 195-205 (published 2025-04-01)
DOI: 10.1016/j.culher.2025.03.005
URL: https://www.sciencedirect.com/science/article/pii/S1296207425000482
Citation count: 0
Abstract
The study of wine and food residues in archaeology offers crucial insights into ancient diets and brewing techniques. Traditional detection methods, however, are often complex and time-consuming. Bridging the gap between excavation sites and laboratories is vital for enhancing real-time analysis and artifact preservation. This paper presents a non-targeted spectral fingerprinting method that integrates simulated experiments, Fourier transform infrared spectroscopy (FTIR), and machine learning algorithms for the rapid identification of food and wine residues in archaeological excavations. Infrared spectral data were collected from 23 modern food and liquor samples subjected to simulated aging. A comprehensive preprocessing protocol was developed, including smoothing, baseline correction, and normalization, to reduce unwanted variability and enhance data quality. Eight spectral preprocessing methods were assessed, including standard normal variate (SNV), multiplicative scatter correction (MSC), and various derivative techniques. The final model, which employed SNV preprocessing, demonstrated superior prediction accuracy and robustness. Six common machine learning algorithms were used for modeling and comparison: linear discriminant analysis (LDA), decision tree classification (DTC), support vector machine (SVM), random forest (RF), k-nearest neighbor (KNN), and backpropagation neural network (BPNN). Results indicated that the RF, KNN, and BPNN models were particularly effective, achieving prediction accuracies near 100%. In external validation with real archaeological samples and with samples simulated to be aged for nearly ten years, the BPNN model achieved a confidence estimate of 99% for the archaeological wine residue samples, while the other models provided confidence estimates above 70%. However, because prolonged aging causes a significant loss of characteristic substances, the current model has difficulty distinguishing specific wine or food types. Future research should focus on improving model portability for on-site screening and expanding the database of simulated aged residues through multi-platform collaboration.
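The two scatter-correction methods named in the abstract, SNV and MSC, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give their formulas or software, so the function names `snv` and `msc`, the use of the sample standard deviation, the choice of the mean spectrum as the MSC reference, and the toy absorbance values are all assumptions made for illustration.

```python
from statistics import mean, stdev


def snv(spectrum):
    """Standard normal variate: center one spectrum and scale it to unit std."""
    mu = mean(spectrum)
    sigma = stdev(spectrum)  # assumption: sample standard deviation (n - 1)
    if sigma == 0:
        raise ValueError("flat spectrum: SNV is undefined")
    return [(a - mu) / sigma for a in spectrum]


def msc(spectra):
    """Multiplicative scatter correction against the mean spectrum.

    Each spectrum is regressed on the reference (s ~ intercept + slope * ref)
    and then corrected as (s - intercept) / slope.
    """
    n = len(spectra)
    ref = [sum(col) / n for col in zip(*spectra)]  # assumption: mean spectrum as reference
    ref_mean = mean(ref)
    ref_ss = sum((r - ref_mean) ** 2 for r in ref)
    corrected = []
    for s in spectra:
        s_mean = mean(s)
        slope = sum((r - ref_mean) * (a - s_mean) for r, a in zip(ref, s)) / ref_ss
        intercept = s_mean - slope * ref_mean
        corrected.append([(a - intercept) / slope for a in s])
    return corrected


# Hypothetical absorbance values: the second spectrum is the first one with a
# multiplicative (x2) and additive (+0.5) scatter-like distortion applied.
base = [0.10, 0.40, 0.90, 0.30, 0.20]
scaled = [2 * a + 0.5 for a in base]

snv_base, snv_scaled = snv(base), snv(scaled)
msc_base, msc_scaled = msc([base, scaled])
```

In this toy case both methods map the distorted spectrum back onto the undistorted one. That is the kind of unwanted variability the preprocessing step is meant to remove before classification. SNV works on each spectrum independently, whereas MSC depends on the reference computed from the whole set.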
About the journal:
The Journal of Cultural Heritage publishes original papers which comprise previously unpublished data and present innovative methods concerning all aspects of science and technology of cultural heritage as well as interpretation and theoretical issues related to preservation.