Title: Efficient and scalable masked word prediction using concept formation
Authors: Xin Lian, Zekun Wang, Christopher J. MacLellan
Journal: Cognitive Systems Research, Volume 92, Article 101371
DOI: 10.1016/j.cogsys.2025.101371
Publication date: 2025-06-04
URL: https://www.sciencedirect.com/science/article/pii/S1389041725000518
Cited by: 0
Abstract
This paper introduces Cobweb/4L, a novel approach for efficient language model learning that supports masked word prediction. The approach builds on Cobweb, an incremental system that learns a hierarchy of probabilistic concepts. Each concept stores the frequencies of words that appear in instances tagged with the concept label. The system uses an attribute-value representation to encode words and their context into instances. Cobweb/4L uses an information-theoretic variant of category utility as well as a new performance mechanism that leverages multiple concepts to generate predictions. We demonstrate that this new performance mechanism substantially outperforms prior Cobweb performance mechanisms, which use only a single node to generate predictions. Further, we demonstrate that Cobweb/4L outperforms transformer-based language models in a low-data setting, learning more rapidly and achieving better final performance. Lastly, we show that Cobweb/4L, which is hyperparameter-free, is robust across varying scales of training data and does not require any manual tuning. In contrast, Word2Vec performs best with a number of hidden nodes that depends on the total amount of training data, so its hyperparameters must be manually tuned for different amounts of training data. We conclude by discussing future directions for Cobweb/4L.
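To make the abstract's prediction mechanism concrete, here is a minimal, hypothetical sketch of the general idea: each concept in the hierarchy stores word frequencies, and a masked-word prediction mixes the word distributions of several concepts rather than consulting a single node. All names, the weighting scheme, and the example data below are illustrative assumptions, not the paper's actual algorithm.

```python
from collections import Counter

class Concept:
    """Illustrative stand-in for a node in a Cobweb-style concept hierarchy."""
    def __init__(self, word_counts, weight):
        self.word_counts = Counter(word_counts)  # word -> frequency at this concept
        self.weight = weight  # assumed score for how well the concept matches the context

def predict_masked_word(concepts):
    """Combine word distributions from multiple concepts into one prediction."""
    mixture = Counter()
    total_weight = sum(c.weight for c in concepts)
    for c in concepts:
        total = sum(c.word_counts.values())
        for word, count in c.word_counts.items():
            # Weighted mixture of per-concept word probabilities.
            mixture[word] += (c.weight / total_weight) * (count / total)
    return mixture.most_common(1)[0][0]

# Example: two concepts along a hypothetical root-to-leaf path.
path = [
    Concept({"bank": 3, "river": 1}, weight=0.6),
    Concept({"bank": 1, "money": 5}, weight=0.4),
]
print(predict_masked_word(path))  # -> "bank"
```

The point of the sketch is the contrast the abstract draws: a single-node predictor would answer with the top word of one concept only, whereas mixing several concepts lets evidence from different levels of the hierarchy contribute to the prediction.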
About the journal:
Cognitive Systems Research is dedicated to the study of human-level cognition. As such, it welcomes papers which advance the understanding, design and applications of cognitive and intelligent systems, both natural and artificial.
The journal brings together a broad community studying cognition in its many facets, in vivo and in silico, across the developmental spectrum, focusing on individual capacities or on entire architectures. It aims to foster debate and to integrate ideas, concepts, constructs, theories, models and techniques from across different disciplines and different perspectives on human-level cognition. The scope of interest includes the study of cognitive capacities and architectures - both brain-inspired and non-brain-inspired - and the application of cognitive systems to real-world problems, insofar as it offers insights relevant to the understanding of cognition.
Cognitive Systems Research therefore welcomes mature and cutting-edge research approaching cognition from a systems-oriented perspective, both theoretical and empirically-informed, in the form of original manuscripts, short communications, opinion articles, systematic reviews, and topical survey articles from the fields of Cognitive Science (including Philosophy of Cognitive Science), Artificial Intelligence/Computer Science, Cognitive Robotics, Developmental Science, Psychology, and Neuroscience and Neuromorphic Engineering. Empirical studies will be considered if they are supplemented by theoretical analyses and contributions to theory development and/or computational modelling studies.