Dr. Can See: Towards a Multi-modal Disease Diagnosis Virtual Assistant

Abhisek Tiwari, Manisimha Manthena, S. Saha, P. Bhattacharyya, Minakshi Dhar, Sarbajeet Tiwari
Proceedings of the 31st ACM International Conference on Information & Knowledge Management, published 2022-10-17
DOI: 10.1145/3511808.3557296 (https://doi.org/10.1145/3511808.3557296)
Citations: 2

Abstract

Artificial Intelligence-based clinical decision support is in growing demand in both the research and industry communities. One such application is automatic disease diagnosis, which aims to assist clinicians in conducting symptom investigation and disease diagnosis. When we consult doctors, we often report and describe our health conditions with visual aids. Moreover, many people are unfamiliar with certain symptoms and medical terms, such as mouth ulcer and skin growth. A visual form of symptom reporting is therefore necessary. Motivated by the efficacy of visual symptom reporting, we propose and build a novel end-to-end Multi-modal Disease Diagnosis Virtual Assistant (MDD-VA) using reinforcement learning. In conversation, users' responses are heavily influenced by the ongoing dialogue context, and multi-modal responses are no exception. We therefore also propose and incorporate a Context-aware Symptom Image Identification module that leverages discourse context in addition to the symptom image to identify symptoms effectively. Furthermore, we curate, for the first time, a multi-modal conversational medical dialogue corpus in English annotated with intent, symptoms, and visual information. The proposed MDD-VA outperforms multiple uni-modal baselines in both automatic and human evaluation, which firmly establishes the critical role of symptom information provided by visuals. The dataset and code are available at https://github.com/NLP-RL/DrCanSee