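Reassessing deep learning (and meta-learning) computer vision as an efficient method to determine taphonomic agency in bone surface modifications.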

IF 1.3 Q3 BIOCHEMICAL RESEARCH METHODS

Biology Methods and Protocols Pub Date : 2025-07-12 eCollection Date: 2025-01-01 DOI:10.1093/biomethods/bpaf057

Manuel Domínguez-Rodrigo, Gabriel Cifuentes-Alcobendas, Marina Vegara-Riquelme, Enrique Baquedano


Citations: 0

Abstract



Reassessing deep learning (and meta-learning) computer vision as an efficient method to determine taphonomic agency in bone surface modifications.

Taphonomic research aims to reconstruct the processes affecting the preservation and modification of paleobiological entities. Recent critiques of the reliability of deep learning (DL) for taphonomic analysis of bone surface modifications (BSMs), such as that presented by Courtenay et al. based on a selection of earlier published studies, have raised concerns about the efficacy of the method. Their critique, however, overlooked fundamental principles regarding the use of small and unbalanced datasets in DL. By reducing the size of the training and validation sets (resulting in a training set only 20% larger than the testing set, and some class validation sets of fewer than 10 images), these authors may inadvertently have generated underfit models in their attempt to replicate and test the original studies. Moreover, coding errors during image preprocessing resulted in fundamentally biased models, which fail to effectively evaluate and replicate the reliability of the original studies. In this study, we do not aim to directly refute their critique, but instead use it as an opportunity to reassess the efficiency and resolution of DL in taphonomic research. We revisited the original DL models applied to three targeted datasets, replicating them as new baseline models for comparison against optimized models designed to address potential biases. Specifically, we accounted for issues stemming from poor-quality image datasets and possible overfitting on validation sets. To ensure the robustness of our findings, we implemented additional methods, including enhanced image data augmentation, k-fold cross-validation of the original training-validation sets, and a few-shot learning approach using both supervised learning and model-agnostic meta-learning. The latter methods facilitated the unbiased use of separate training, validation, and testing sets.
The results across all approaches were consistent, with outcomes comparable, if not almost identical, to those of the original baseline models. As a final validation step, we used images of recently generated BSMs as testing sets for the baseline models. The results also remained virtually invariant. This reinforces the conclusion that the original models were not subject to methodological overfitting and highlights their nuanced efficacy in differentiating BSMs. However, it is important to recognize that these models represent pilot studies, constrained by the limitations of the original datasets in terms of image quality and sample size. Future work leveraging larger datasets with higher-quality images has the potential to enhance model generalization, thereby improving the applicability and reliability of DL approaches in taphonomic research.
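
The abstract mentions k-fold cross-validation on small, unbalanced image sets. As a rough illustration of the general idea only (this is not the authors' code, and the class names and counts below are hypothetical), a stratified split can be sketched in plain Python so that each image is used for validation exactly once and every fold keeps roughly the same class proportions:

```python
import random
from collections import defaultdict


def stratified_k_fold(labels, k=5, seed=0):
    """Yield (train_idx, val_idx) pairs with similar class proportions per fold."""
    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    # Shuffle within each class, then deal indices round-robin across folds
    # so that every fold receives a share of every class.
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)

    # Each fold serves once as the validation set; the rest form the training set.
    for i in range(k):
        val_idx = sorted(folds[i])
        train_idx = sorted(
            idx for j, fold in enumerate(folds) if j != i for idx in fold
        )
        yield train_idx, val_idx


# Hypothetical, deliberately unbalanced label list (not the paper's data).
labels = ["cut_mark"] * 60 + ["tooth_mark"] * 25 + ["trampling"] * 15

for fold, (train_idx, val_idx) in enumerate(stratified_k_fold(labels, k=5)):
    val_counts = defaultdict(int)
    for idx in val_idx:
        val_counts[labels[idx]] += 1
    print(f"fold {fold}: train={len(train_idx)} val={len(val_idx)} {dict(val_counts)}")
```

In practice one would train a fresh model on each `train_idx` and report the mean and spread of the validation metric across folds. A separate held-out test set, as the abstract describes for the few-shot experiments, would stay outside this loop.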


Source journal

Biology Methods and Protocols Agricultural and Biological Sciences-Agricultural and Biological Sciences (all)

CiteScore

3.80

Self-citation rate

2.80%

Articles published

Review time

19 weeks