{"title":"ClioQuery: Interactive Query-oriented Text Analytics for Comprehensive Investigation of Historical News Archives","authors":"Abram Handler, Narges Mahyar, Brendan O’Connor","doi":"https://dl.acm.org/doi/10.1145/3524025","DOIUrl":null,"url":null,"abstract":"<p>Historians and archivists often find and analyze the occurrences of query words in newspaper archives to help answer fundamental questions about society. But much work in text analytics focuses on helping people investigate other textual units, such as events, clusters, ranked documents, entity relationships, or thematic hierarchies. Informed by a study into the needs of historians and archivists, we thus propose <span>ClioQuery</span>, a text analytics system uniquely organized around the analysis of query words in context. <span>ClioQuery</span> applies text simplification techniques from natural language processing to help historians quickly and comprehensively gather and analyze all occurrences of a query word across an archive. It also pairs these new NLP methods with more traditional features like linked views and in-text highlighting to help engender trust in summarization techniques. We evaluate <span>ClioQuery</span> with two separate user studies, in which historians explain how <span>ClioQuery</span>’s novel text simplification features can help facilitate historical research. We also evaluate with a separate quantitative comparison study, which shows that <span>ClioQuery</span> helps crowdworkers find and remember historical information. Such results suggest possible new directions for text analytics in other query-oriented settings.</p>","PeriodicalId":48574,"journal":{"name":"ACM Transactions on Interactive Intelligent Systems","volume":"39 1","pages":""},"PeriodicalIF":3.6000,"publicationDate":"2022-07-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACM Transactions on Interactive Intelligent Systems","FirstCategoryId":"94","ListUrlMain":"https://doi.org/https://dl.acm.org/doi/10.1145/3524025","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
引用次数: 0
Abstract
Historians and archivists often find and analyze the occurrences of query words in newspaper archives to help answer fundamental questions about society. But much work in text analytics focuses on helping people investigate other textual units, such as events, clusters, ranked documents, entity relationships, or thematic hierarchies. Informed by a study into the needs of historians and archivists, we thus propose ClioQuery, a text analytics system uniquely organized around the analysis of query words in context. ClioQuery applies text simplification techniques from natural language processing to help historians quickly and comprehensively gather and analyze all occurrences of a query word across an archive. It also pairs these new NLP methods with more traditional features like linked views and in-text highlighting to help engender trust in summarization techniques. We evaluate ClioQuery with two separate user studies, in which historians explain how ClioQuery’s novel text simplification features can help facilitate historical research. We also evaluate with a separate quantitative comparison study, which shows that ClioQuery helps crowdworkers find and remember historical information. Such results suggest possible new directions for text analytics in other query-oriented settings.
期刊介绍:
The ACM Transactions on Interactive Intelligent Systems (TiiS) publishes papers on research concerning the design, realization, or evaluation of interactive systems that incorporate some form of machine intelligence. TIIS articles come from a wide range of research areas and communities. An article can take any of several complementary views of interactive intelligent systems, focusing on:
the intelligent technology,
the interaction of users with the system, or
both aspects at once.