Leveraging Out-of-the-Box Retrieval Models to Improve Mental Health Support

Theo Rummer-Downing, Julie Weeds
Published in: Proceedings of the International Conference on Health Informatics and Medical Application Technology, volume 17 1, pages 64-73
Publication date: 2023-01-01
DOI: 10.5220/0011634300003414 (https://doi.org/10.5220/0011634300003414)
Citations: 0

Abstract

This work compares the performance of several information retrieval (IR) models in retrieving mental health documents that are relevant to forum post queries from a fully moderated online mental health service. Three different architectures are assessed: a sparse lexical model, BM25, is used as a baseline, alongside two neural SBERT-based architectures - the bi-encoder and the cross-encoder. We highlight the credibility of using pretrained language models (PLMs) out-of-the-box, without an additional fine-tuning stage, to achieve high retrieval quality across a limited set of resources. Error analysis of the ranking results suggested that PLMs make errors on documents which contain so-called red herrings - words which are semantically related but irrelevant to the query - whereas human judgements were found to suffer when queries are vague and present no clear information need. Further, we show that bias towards an author's writing style within a PLM affects retrieval quality and, therefore, can affect the success of mental health support if left unaddressed.
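To make the sparse lexical baseline named in the abstract concrete, the following is a minimal sketch of BM25 ranking in plain Python. It is not the authors' implementation: the whitespace tokenizer, the parameter values `k1=1.5` and `b=0.75`, the smoothed IDF variant, and the toy corpus and query are all assumptions chosen for illustration. The neural bi-encoder and cross-encoder are not shown, since they require pretrained model weights.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (a simplification for illustration)."""
    return text.lower().split()


def bm25_rank(query, documents, k1=1.5, b=0.75):
    """Return (doc_index, score) pairs sorted by descending BM25 score.

    k1 and b are commonly used defaults, assumed here; the source does
    not state which values were used.
    """
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    avg_len = sum(len(toks) for toks in tokenized) / n_docs

    # Document frequency: number of documents containing each term.
    doc_freq = Counter()
    for toks in tokenized:
        doc_freq.update(set(toks))

    query_terms = set(tokenize(query))
    scores = []
    for idx, toks in enumerate(tokenized):
        term_freq = Counter(toks)
        score = 0.0
        for term in query_terms:
            tf = term_freq.get(term, 0)
            if tf == 0:
                continue
            df = doc_freq[term]
            # Smoothed IDF that stays non-negative for common terms.
            idf = math.log((n_docs - df + 0.5) / (df + 0.5) + 1.0)
            # Term-frequency saturation with document-length normalisation.
            norm = tf + k1 * (1.0 - b + b * len(toks) / avg_len)
            score += idf * tf * (k1 + 1.0) / norm
        scores.append((idx, score))
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy corpus and query, invented for this sketch.
documents = [
    "coping with exam stress and anxiety",
    "healthy sleep routines for teenagers",
    "talking to a friend about anxiety",
]
ranking = bm25_rank("exam anxiety", documents)
print(ranking)
```

Because BM25 scores only exact term overlap, the second document receives a score of zero here even though it may still be useful to someone anxious about exams; neural encoders of the kind the abstract compares against are meant to close that lexical gap.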