Categorical data analysis using discretization of continuous variables to investigate associations in marine ecosystems

IF 1.5; Zone 3, Environmental Science and Ecology; JCR Q4, Environmental Sciences
Environmetrics Pub Date : 2024-06-29 DOI:10.1002/env.2867
Hiroko Kato Solvang, Shinpei Imori, Martin Biuw, Ulf Lindstrøm, Tore Haug
Citation count: 0

Abstract

Understanding and predicting interactions among predators, prey, and their environment is fundamental to understanding food web structure, dynamics, and ecosystem function in both terrestrial and marine ecosystems. Estimating the conditional associations between species and their environments is therefore important for exploring connections or cooperative links in the ecosystem, which in turn can help to clarify directional relationships. This requires a relevant and practical statistical method that links presence/absence observations with biomass, abundance, and physical quantities recorded as continuous real values. Such data are sometimes sparse in oceanic space and too short to treat as time series. To meet this challenge, we provide an approach that applies categorical data analysis to presence/absence observations and real-number data. The real-number data, used as explanatory variables for the presence/absence response variable, are discretized by optimally detecting thresholds without any prior biological or ecological information. The discretized data express two levels, such as large/small or high/low, which gives experts a simple interpretation when investigating complicated associations in marine ecosystems. The approach is implemented within CATDAP, a statistical method developed by Sakamoto and Akaike in 1979. It consists of a two-step procedure for categorical data analysis: (1) finding the appropriate threshold to discretize the real-number data so that an independence test can be applied; and (2) identifying the best conditional probability model, based on a statistical information criterion, to investigate possible associations among the data. We perform a simulation study to validate the proposed approach and to examine its behavior when the observations include many zeros (zero-inflated data), which often occur in practical situations. Furthermore, the approach is applied to two datasets: (1) one collected during an international synoptic krill survey in the Scotia Sea west of the Antarctic Peninsula, used to investigate associations among krill, fin whale (Balaenoptera physalus), surface temperature, depth, slope in depth (flatter or steeper terrain), and temperature gradient (slope in temperature); and (2) one collected by ecosystem surveys conducted during August–September in 2014–2017, used to investigate associations among common minke whales, the predatory fish Atlantic cod, and their main prey groups (zooplankton, 0-group fish) in Arctic Ocean waters to the west and north of Svalbard, Norway. The R code summarizing our proposed numerical procedure is presented in S4S1.
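
The two-step procedure described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' R code and not the CATDAP program itself. It assumes a binary (0/1) presence/absence response, a two-level split of each continuous variable, and the commonly used AIC difference between a dependence model and the independence model for a contingency table. The function names `aic_difference`, `best_threshold`, and `best_subset` are hypothetical and chosen only for illustration.

```python
import math
from itertools import combinations, product


def aic_difference(table):
    """AIC of the dependence model minus AIC of the independence model.

    `table` is a list of rows of counts (response levels x explanatory cells).
    Negative values favor the dependence model.
    """
    n = sum(sum(row) for row in table)
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    gain = 0.0
    for i, row in enumerate(table):
        for j, n_ij in enumerate(row):
            if n_ij > 0:  # empty cells contribute nothing (0 * log 0 := 0)
                gain += n_ij * math.log(n_ij * n / (row_tot[i] * col_tot[j]))
    extra_params = (len(table) - 1) * (len(table[0]) - 1)
    return -2.0 * gain + 2.0 * extra_params


def best_threshold(x, y):
    """Step 1 (sketch): scan midpoints between sorted distinct x values and
    return the (threshold, aic) pair with the smallest AIC difference."""
    values = sorted(set(x))
    best = (None, float("inf"))
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2.0
        table = [[0, 0], [0, 0]]  # rows: y absent/present; cols: x low/high
        for xi, yi in zip(x, y):
            table[int(yi)][int(xi > t)] += 1
        aic = aic_difference(table)
        if aic < best[1]:
            best = (t, aic)
    return best


def best_subset(binary_vars, y):
    """Step 2 (sketch): among all subsets of discretized explanatory
    variables, return (names, aic) minimizing the AIC difference.

    `binary_vars` maps a variable name to a list of 0/1 levels.
    The empty subset stands for the independence model (difference 0).
    """
    best = ((), 0.0)
    names = sorted(binary_vars)
    for k in range(1, len(names) + 1):
        for subset in combinations(names, k):
            cells = list(product((0, 1), repeat=k))  # joint low/high cells
            index = {c: j for j, c in enumerate(cells)}
            table = [[0] * len(cells) for _ in range(2)]
            for row, yi in enumerate(y):
                key = tuple(binary_vars[v][row] for v in subset)
                table[int(yi)][index[key]] += 1
            aic = aic_difference(table)
            if aic < best[1]:
                best = (subset, aic)
    return best
```

In this sketch, `best_threshold` would be called once per continuous variable, and the resulting low/high indicators would then be passed to `best_subset`. Scanning every midpoint and every subset is a brute-force choice that suits the short, sparse datasets the abstract describes; it would need pruning for many variables.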

Source journal
Environmetrics (Environmental Sciences)
CiteScore: 2.90
Self-citation rate: 17.60%
Articles published: 67
Review time: 18-36 weeks
Journal description: Environmetrics, the official journal of The International Environmetrics Society (TIES), an Association of the International Statistical Institute, is devoted to the dissemination of high-quality quantitative research in the environmental sciences. The journal welcomes pertinent and innovative submissions from quantitative disciplines developing new statistical and mathematical techniques, methods, and theories that solve modern environmental problems. Articles must proffer substantive, new statistical or mathematical advances to answer important scientific questions in the environmental sciences, or must develop novel or enhanced statistical methodology with clear applications to environmental science. New methods should be illustrated with recent environmental data.