Statystyka w archeologii, czyli dlaczego nie trzeba bać się liczb (Statistics in archaeology, or why there is no need to fear numbers)

Wojciech Rajpold
Materiały i Sprawozdania Rzeszowskiego Ośrodka Archeologicznego. DOI: 10.15584/misroa.2021.42.6 (https://doi.org/10.15584/misroa.2021.42.6)

Abstract

Archaeology, although it belongs to the humanities, draws on other fields of science, especially the natural sciences. Physics, chemistry and biology are widely used, for example in establishing chronology (C14 dating, dendrochronology) or in studying the chemical composition of artefacts (e.g. Raman spectroscopy). It is therefore not surprising that mathematics is also part of the archaeologist's arsenal of research methods. The volume of archaeological material added to collections each year is impressive, but it also poses a huge challenge, not least that of processing such a large number of sources. Artefacts obtained during excavations are mass finds and countable, so we can count, measure and weigh them. This is where statistics, the branch of mathematics that organizes large sets of numbers, comes to our aid. Statistical analyses appear in many works by Polish researchers, which demonstrate the richness, diversity and usefulness of this field in archaeological studies. The first issue to address is the type of data that statistics examines. There are two types: a) quantitative (measurable) data, e.g. the weight or length of an artefact (continuous data) or the number of coils on a pin head (discrete data); b) nominal (non-measurable) data, where the variable receives a numerical label; these can be binary or multi-category. This group also includes ordinal (rankable) data, which order the material according to the intensity of a phenomenon. The type of data in turn determines the measurement scale to be used. In archaeology, the so-called ratio scale, on which we can measure quantities such as weight and height, allows all statistical methods to be applied. One can also use an ordinal scale for rankable data, where the intensity of a specific feature is graded.
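The distinction between data types and scales drawn above can be illustrated with a minimal sketch. It assumes a small, entirely invented set of pins; the field names (`weight_g`, `coils`, `material`, `wear`) and all values are hypothetical and do not come from the article. The point is only that the scale of a variable limits which summary is meaningful: a mean for measurements on a ratio (quotient) scale, a median for ranks, a mode for nominal labels.

```python
from statistics import mean, median, mode

# Hypothetical pins; every value below is invented for illustration.
pins = [
    {"weight_g": 12.5, "coils": 3, "material": "bronze", "wear": 1},
    {"weight_g": 15.0, "coils": 4, "material": "iron", "wear": 2},
    {"weight_g": 9.5, "coils": 3, "material": "bronze", "wear": 0},
    {"weight_g": 11.0, "coils": 2, "material": "bronze", "wear": 1},
]


def column(name):
    """Collect one variable across all pins."""
    return [pin[name] for pin in pins]


summary = {
    # Quantitative, continuous (ratio scale): the mean is meaningful.
    "mean_weight_g": mean(column("weight_g")),
    # Quantitative, discrete (ratio scale): counts of coils.
    "median_coils": median(column("coils")),
    # Nominal: labels only, so report the most frequent category.
    "modal_material": mode(column("material")),
    # Ordinal: 0 = none, 1 = light, 2 = heavy; ranks support a median.
    "median_wear": median(column("wear")),
}

print(summary)
```

Averaging the `wear` codes would treat the gaps between ranks as equal, which an ordinal scale does not guarantee; that is why the sketch uses the median there.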
However, it is worth emphasizing that statistical data can, depending on the method, be transformed so that they are recorded in different ways and measured on different scales. The most important question in statistics is whether the differences between the analysed groups are significant. For this purpose the chi-squared (χ²), Kruskal-Wallis and Mann-Whitney U tests are used. The first and most common of these compares observed frequencies with expected frequencies. With its help one can check, for instance, whether the ceramics from a particular site are technologically homogeneous. The other two tests are used when data are expressed on measurable scales and serve to compare medians. They can be used, inter alia, to check whether differences in the thickness of vessels from given areas are significant. What is most commonly associated with statistics, however, are the various types of correlation, or interdependence (e.g. Pearson's correlation, Spearman's rank correlation, Kendall's tau). It should be remembered that not every correlation reflects a real relationship between the studied features, which is why caution is needed when reading the results of any statistical method. Methods that allow archaeologists to group data are particularly useful. The most popular are dendrite diagrams and correspondence tables. The cut-off point on the ROC curve and the associated odds ratio plot may also be important tools. Undoubtedly, the greatest strength of statistics is mathematical modelling, which allows the researcher to build, from the results of empirical research, a formula describing the analysed feature. There are many types of models, such as logistic models, linear models and decision trees. They are widely used in medicine, sociology and economics, among other fields, and are also applied in archaeological analyses.
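The comparison of observed with expected frequencies in the chi-squared test can be sketched as follows. This is a minimal illustration, not the author's procedure: the sherd counts, the two zones and the two temper types are invented, and the functions `chi_squared` and `p_value_one_dof` are hypothetical helpers written for this example. No continuity correction is applied, and the p-value shortcut holds only for one degree of freedom (a 2 x 2 table).

```python
import math


def chi_squared(table):
    """Return (statistic, degrees of freedom) for a contingency table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


def p_value_one_dof(statistic):
    """Upper-tail p-value of the chi-squared distribution; valid for 1 dof only."""
    return math.erfc(math.sqrt(statistic / 2.0))


# Hypothetical sherd counts: rows = two zones of a site, columns = two temper types.
sherds = [
    [30, 10],
    [20, 40],
]

statistic, dof = chi_squared(sherds)
p_value = p_value_one_dof(statistic)
print(f"chi2 = {statistic:.3f}, dof = {dof}, p = {p_value:.2e}")
```

With these invented counts the p-value is far below 0.05, so the assemblage would not be regarded as technologically homogeneous across the two zones. For larger tables, or for the Kruskal-Wallis and Mann-Whitney U tests, a statistical package would normally be used instead of hand-written code.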
The simplest type of model is the decision tree, which is useful for presenting data graphically together with the possible outcomes of the decisions taken. It also helps in selecting the features that have the greatest impact on the question under study. One example is determining the age of the deceased (divided into adults and children) from the number of remains and the size of the urn, where two steps were identified that led to the correct indication of age. Logit and discriminant models are much more complicated. They can be built for a variable with two categories, e.g. age divided into adults and children, as well as for variables with more categories, e.g. chronological phases. In essence, these models yield a formula which, from the examined features (for age, the number of remains and the size of the urn; for chronology, the thickness and colour of the walls), indicates with a certain probability which age group or chronological phase a given archaeological feature can be assigned to. Finally, the importance of statistical analyses for archaeology will grow steadily as new material continues to accumulate, all the more so because useful statistical software (e.g. Statistica, R and RStudio) is readily available. It must be remembered, however, that statistics is only a tool: an extremely helpful one that we can and should use, but with care.
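The kind of formula a logit model produces for the adult/child example can be sketched as follows. The coefficients, the 0.5 cut-off and the feature names (`remains_count`, `urn_height_cm`) are assumptions made up for this illustration; in practice they would be estimated from excavated burials of known age, and nothing here reproduces values from the article.

```python
import math

# Hypothetical coefficients; a real model would estimate these from data.
INTERCEPT = -5.0
COEF_REMAINS = 0.02      # per fragment of remains (assumed unit)
COEF_URN_HEIGHT = 0.15   # per centimetre of urn height (assumed unit)


def probability_adult(remains_count, urn_height_cm):
    """Logistic formula: probability that the burial belongs to an adult."""
    z = INTERCEPT + COEF_REMAINS * remains_count + COEF_URN_HEIGHT * urn_height_cm
    return 1.0 / (1.0 + math.exp(-z))


def classify(remains_count, urn_height_cm, cutoff=0.5):
    """Assign an age group by comparing the probability with a cut-off."""
    p = probability_adult(remains_count, urn_height_cm)
    return "adult" if p >= cutoff else "child"


# Two invented burials.
large_burial = probability_adult(200, 25)
small_burial = probability_adult(30, 12)
print(f"{large_burial:.3f} -> {classify(200, 25)}")
print(f"{small_burial:.3f} -> {classify(30, 12)}")
```

The output is a probability, not a certainty, which matches the abstract's wording that the model indicates the age group "with a certain probability". The cut-off of 0.5 is a convention; a value chosen from the ROC curve could be passed as `cutoff` instead.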