Utilizing a Transparency-Driven Environment Toward Trusted Automatic Genre Classification: A Case Study in Journalism History

2018 IEEE 14th International Conference on e-Science (e-Science). Pub Date: 2018-10-01. DOI: 10.1109/eScience.2018.00137

A. Bilgin, L. Hollink, J. V. Ossenbruggen, E. T. K. Sang, Kim Smeenk, Frank Harbers, M. Broersma

Citations: 2

Abstract

With the growing abundance of unlabeled data in real-world tasks, researchers have to rely on the predictions of black-box computational models. However, it is often overlooked that these models may score high on accuracy for the wrong reasons. In this paper, we present a practical impact analysis of enabling model transparency through various presentation forms. For this purpose, we developed an environment that empowers non-computer scientists to become practicing data scientists in their own research fields. Through a real-world use case study on automatic genre classification of newspaper articles, we demonstrate how journalism historians' understanding gradually increases. This study is a first step towards the trusted and responsible use of machine learning pipelines.
