EMPLOYING COMPUTATIONAL LINGUISTIC TECHNOLOGIES AND OCULOGRAPHY TO DEVELOP DIAGNOSTIC TOOL FOR DETECTING AUTOAGGRESSIVE TENDENCIES IN YOUNG PEOPLE: A RIVETED GAZE INTO "GET RID OF THE SHACKLES OF THIS WORLD".

Region 4 (Medicine); JCR Q2 (Medicine)

Psychiatria Danubina, Pub Date: 2025-09-01

Anna Khomenko, Lala Kasimova, Evgeniy Sychugov, Marina Svyatogor, Anastasiya Komratova, Polina Domozhirova, Alina Aisina, Danil Trofimov, Kseniya Bikbaeva, Elena Sloeva, Daria Smirnova

Psychiatria Danubina, vol. 37, Suppl 1, pp. 213-223. Publication type: Journal Article.

Citations: 0

Abstract

Background: Early recognition of autoaggressive tendencies in young people is essential for diagnostic screening and reducing suicidality risks. This can be achieved through psycholinguistic approaches such as corpus analysis and eye-tracking studies. Corpus research helps to develop generalized speech patterns of those at risk of suicide, while oculographic methods examine perceptual cues linked to suicidal tendencies.

Methods: We formulated an algorithmic framework for constructing verbal, visual, and multimodal material to identify autoaggressive tendencies among youth. The stimulus material was created following the idiolect paradigm of forensic authorship attribution. The first stage involved analyzing corpus data, including materials from social networks and social media, the RuSentiment database, and a text collection from the Privolzhsky Research Medical University. Python's NLTK and spaCy libraries for automated text processing were used to extract corpus statistics, n-grams, keywords, and collocations for identifying linguistic markers of autoaggression. Keywords were statistically ranked using log-likelihood, T-score, and mutual information, while collocations were derived via T-score analysis. Sentiment analysis with the Dostoevsky Python library and stylistic indices (lexical diversity, readability) were also applied. The total analyzed material comprised more than 100 million tokens. We next integrated stimulus and filler materials into an eye-tracking application (developed by LLC Lad IT Group) that uses standard laptop video cameras. Oculographic data were used to quantify gaze delay differences via a percentage excess formula to pinpoint the most diagnostically relevant stimuli. In two iterations of the pilot experiment, 66 youths from the control group and 29 from the target group participated in the oculographic experiments.
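The association measures named in the Methods can be illustrated with a minimal sketch. The abstract does not give the authors' exact formulas or code, so the definitions below are the common corpus-linguistic forms (log-likelihood keyness from observed versus expected frequencies in a target and a reference corpus, and T-score and mutual information from observed versus expected bigram counts). All function names, parameters, and the `min_count` cutoff are hypothetical, and the sketch uses only the standard library rather than NLTK or spaCy.

```python
import math
from collections import Counter


def log_likelihood(a, b, c, d):
    """Keyness of a word seen a times in a target corpus of c tokens
    and b times in a reference corpus of d tokens (assumed standard form)."""
    e1 = c * (a + b) / (c + d)  # expected count in the target corpus
    e2 = d * (a + b) / (c + d)  # expected count in the reference corpus
    ll = 0.0
    if a > 0:
        ll += a * math.log(a / e1)
    if b > 0:
        ll += b * math.log(b / e2)
    return 2 * ll


def t_score(observed, f1, f2, n):
    """T-score of a word pair: (O - E) / sqrt(O), with E = f1 * f2 / n.
    Requires observed > 0."""
    expected = f1 * f2 / n
    return (observed - expected) / math.sqrt(observed)


def mutual_information(observed, f1, f2, n):
    """Pointwise mutual information of a word pair: log2(O / E)."""
    expected = f1 * f2 / n
    return math.log2(observed / expected)


def rank_collocations(tokens, min_count=2):
    """Rank adjacent word pairs by T-score, highest first.
    min_count is a hypothetical frequency cutoff, not taken from the paper."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n = len(tokens)
    scored = [
        (pair, t_score(count, unigrams[pair[0]], unigrams[pair[1]], n))
        for pair, count in bigrams.items()
        if count >= min_count
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)
```

A word that is equally frequent in both corpora gets a keyness of zero, while a word over-represented in the target corpus gets a positive value. T-score tends to favor frequent pairs and mutual information rare but exclusive ones, which is one plausible reason for reporting more than one measure.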

Results: In multimodal texts, most stimuli derived from corpus statistics were relevant, and all individuals in the target group showed a prolonged gaze delay; visual stimuli (pseudo-self-portraits, anime/game characters) elicited 26-36% longer gaze delay in the target group. Analysis of verbal stimuli revealed prolonged gaze fixations on self-referential pronouns (12-25%) and metaphorical death expressions, whereas direct terms such as "suicide" elicited gaze avoidance (-11.9 to -129% deviation). We then developed a system of weighted coefficients for an automated diagnostic model. The algorithm showed 72% accuracy in identifying autoaggression, presenting a promising tool for early diagnostic screening of this phenomenon.
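The percentage excess comparison and the weighted coefficients can be sketched as follows. The abstract states neither the formula nor the weights, so this is an assumed form only: excess is taken as the relative difference between the target-group and control-group mean gaze times, and the score is a plain weighted sum. With non-negative gaze times this form cannot fall below -100%, so the authors' formula may differ, for instance in the quantity used for normalisation. The function names, the `threshold` value, and the weights in the usage note are hypothetical.

```python
from statistics import mean


def percentage_excess(target_times, control_times):
    """Relative difference of mean gaze time, target vs. control, in percent.
    Assumed form: (mean_target - mean_control) / mean_control * 100."""
    control_mean = mean(control_times)
    return (mean(target_times) - control_mean) / control_mean * 100.0


def select_stimuli(excess_by_stimulus, threshold=10.0):
    """Keep stimuli whose absolute excess reaches a hypothetical threshold,
    so that both prolonged gaze and gaze avoidance are retained."""
    return {
        name: excess
        for name, excess in excess_by_stimulus.items()
        if abs(excess) >= threshold
    }


def weighted_score(participant_excess, weights):
    """Weighted sum of one participant's per-stimulus excess values.
    Stimuli missing for the participant contribute zero."""
    return sum(
        weights[name] * participant_excess.get(name, 0.0) for name in weights
    )
```

For example, `weighted_score({"a": 30.0, "b": -12.0}, {"a": 0.5, "b": -1.0})` gives 27.0. The negative weight on the hypothetical stimulus `b` lets gaze avoidance raise the score in the same direction as prolonged gaze.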

Conclusions: The present methodology focuses on creating and employing a novel selective dataset consisting of visual, linguistic, and multimodal text stimuli integrated into the oculographic examination protocol. The oculographic detection of eye movement perceptual cues in response to exposure to the stimuli dataset may identify objective markers for evidence-based diagnostics of mental disorders (e.g., depression) and fundamental psychopathological phenomena (e.g., suicidality), including at-risk states (e.g., autoaggression). Furthermore, this approach may contribute to the enhancement of suicide prevention programs, particularly targeted interventions for the vulnerable population of young people who experience autoaggressive tendencies (i.e., self-aggression).

Source journal

Psychiatria Danubina (Medicine: Psychiatry)

CiteScore: 3.00

Self-citation rate: 0.00%

Publication count: 288

Review time: 4-8 weeks

About the journal: Psychiatria Danubina is a peer-reviewed open access journal of the Psychiatric Danubian Association that aims to publish original scientific contributions in psychiatry, psychological medicine, and related sciences (neurosciences, biological, psychological, and social sciences, as well as philosophy of science and medical ethics, and the history, organization, and economics of mental health services).