An Interactive Visual Enhancement for Prompted Programmatic Weak Supervision in Text Classification

IF 2.9 · Zone 4, Computer Science · Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING

Computer Graphics Forum Pub Date : 2025-05-23 DOI:10.1111/cgf.70131

Y. Lin, S. Wei, H. Zhang, D. Qu, J. Bai

Computer Graphics Forum, Volume 44, Issue 3 · Journal Article · https://onlinelibrary.wiley.com/doi/10.1111/cgf.70131

Citations: 0

Abstract


An Interactive Visual Enhancement for Prompted Programmatic Weak Supervision in Text Classification

Programmatic Weak Supervision (PWS) has emerged as a powerful technique for text classification. By aggregating weak labels produced by manually written labeling functions (LFs), it allows models to be trained on large-scale unlabeled data without costly manual annotation. As an improvement, Prompted PWS incorporates pre-trained large language models (LLMs) into the labeling function, replacing programs coded by experts with natural language prompts. This makes complex and ambiguous concepts easier to express. However, the existing workflow does not fully exploit the advantages of Prompted PWS: annotators have difficulty converging their ideas into high-quality LFs and lack support during iteration. To address this issue, this study improves the existing PWS workflow through interactive visualization. We first propose a collaborative LF development workflow between humans and LLMs, in which the LLM helps humans create a structured development space for exploration and automatically generates prompted LFs based on human selections. Annotators can integrate their knowledge through informed selection and judgment. Then, we present an interactive visual system that supports efficient development, in-depth exploration, and iteration of LFs. Our evaluation, comprising a quantitative evaluation on the benchmark, a case study, and a user study, demonstrates the effectiveness of our approach.
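
The aggregation idea behind PWS can be illustrated with a minimal Python sketch. This is not the authors' system or label model: the three keyword-based labeling functions, the spam/ham task, and the majority-vote aggregator are all hypothetical stand-ins chosen for illustration. In Prompted PWS, each function body would instead wrap an LLM prompt.

```python
from collections import Counter

ABSTAIN = -1       # returned when a labeling function has no opinion
SPAM, HAM = 1, 0   # hypothetical binary task, for illustration only

# Hypothetical hand-written labeling functions (LFs).
def lf_contains_free(text):
    return SPAM if "free" in text.lower() else ABSTAIN

def lf_contains_meeting(text):
    return HAM if "meeting" in text.lower() else ABSTAIN

def lf_many_exclamations(text):
    return SPAM if text.count("!") >= 2 else ABSTAIN

LFS = [lf_contains_free, lf_contains_meeting, lf_many_exclamations]

def majority_vote(text, lfs=LFS):
    """Aggregate weak labels by majority vote; abstain on no votes or a tie."""
    votes = [v for v in (lf(text) for lf in lfs) if v != ABSTAIN]
    if not votes:
        return ABSTAIN
    ranked = Counter(votes).most_common()
    # A tie between the two most frequent labels gives no usable signal.
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return ABSTAIN
    return ranked[0][0]
```

Majority vote is used here only because it is the simplest aggregator; PWS frameworks typically fit a label model that weights LFs by estimated accuracy.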


Source journal

Computer Graphics Forum · Engineering and Technology, Computer Science: Software Engineering

CiteScore: 5.80

Self-citation rate: 12.00%

Articles published: 175

Review time: 3-6 weeks

About the journal: Computer Graphics Forum is the official journal of Eurographics, published in cooperation with Wiley-Blackwell, and is a unique, international source of information for computer graphics professionals interested in graphics developments worldwide. It is now one of the leading journals for researchers, developers and users of computer graphics in both commercial and academic environments. The journal reports on the latest developments in the field throughout the world and covers all aspects of the theory, practice and application of computer graphics.