Topical Cluster Discovery in Semistructured Healthcare Data

2018 IEEE/WIC/ACM International Conference on Web Intelligence (WI). Pub Date: 2018-12-01. DOI: 10.1109/WI.2018.00014

G. Costa, R. Ortale

Citations: 0

Abstract

We propose an approach to clustering XML-based corpora of healthcare documents by latent topic similarity. The approach has two steps. First, the latent topic distribution of each input healthcare document is inferred through collapsed Gibbs sampling and parameter estimation under an XML topic model. Second, the inferred distributions are grouped using established clustering techniques.
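The two-step process can be illustrated with a minimal sketch. The abstract does not specify the XML topic model, so the code below substitutes plain LDA (which ignores XML structure) for the first step, and uses k-means as one possible "established clustering technique" for the second. The function names, hyperparameters (`alpha`, `beta`), and toy corpus are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np

def lda_collapsed_gibbs(docs, n_topics, vocab_size, alpha=0.1, beta=0.01,
                        n_iter=200, seed=0):
    """Estimate document-topic distributions with collapsed Gibbs sampling.

    Plain LDA stand-in (assumption): the paper's XML topic model is not
    described in the abstract. `docs` is a list of lists of word ids.
    """
    rng = np.random.default_rng(seed)
    n_dk = np.zeros((len(docs), n_topics))   # topic counts per document
    n_kw = np.zeros((n_topics, vocab_size))  # word counts per topic
    n_k = np.zeros(n_topics)                 # total tokens per topic
    z = []                                   # topic assignment of each token
    for d, doc in enumerate(docs):
        z_d = rng.integers(n_topics, size=len(doc))  # random initialization
        z.append(z_d)
        for w, k in zip(doc, z_d):
            n_dk[d, k] += 1
            n_kw[k, w] += 1
            n_k[k] += 1
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = z[d][i]
                # Remove the current token from the counts.
                n_dk[d, k] -= 1
                n_kw[k, w] -= 1
                n_k[k] -= 1
                # Full conditional over topics for this token.
                p = (n_dk[d] + alpha) * (n_kw[:, w] + beta) / (n_k + vocab_size * beta)
                k = rng.choice(n_topics, p=p / p.sum())
                z[d][i] = k
                n_dk[d, k] += 1
                n_kw[k, w] += 1
                n_k[k] += 1
    # Parameter estimation: smoothed, normalized topic counts per document.
    return (n_dk + alpha) / (n_dk.sum(axis=1, keepdims=True) + n_topics * alpha)

def kmeans(X, n_clusters, n_iter=50, seed=0):
    """Basic Lloyd's k-means on the rows of X; returns one label per row."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=n_clusters, replace=False)]
    for _ in range(n_iter):
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for c in range(n_clusters):
            if np.any(labels == c):
                centers[c] = X[labels == c].mean(axis=0)
    return labels

# Hypothetical toy corpus: word ids 0-2 and 3-5 form two vocabularies.
docs = [[0, 1, 2, 0, 1], [1, 2, 0, 2], [0, 0, 1, 2],
        [3, 4, 5, 3], [4, 5, 3, 5, 4], [5, 3, 4, 4]]
theta = lda_collapsed_gibbs(docs, n_topics=2, vocab_size=6)  # step 1
labels = kmeans(theta, n_clusters=2)                         # step 2
```

Each row of `theta` is a document's inferred topic distribution, and `labels` assigns each document to a cluster based on those rows. A real pipeline would first need to map XML content to tokens, a step the abstract leaves unspecified.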

