Performance evaluation of character-level CNNs using tweet data and analysis for weight perturbations

IF 0.8 Q4 ROBOTICS

Artificial Life and Robotics Pub Date : 2024-04-05 DOI:10.1007/s10015-024-00944-9

Kazuteru Miyazaki, Masaaki Ida

Artificial Life and Robotics, volume 29, issue 2, pages 266-273. Journal Article. https://link.springer.com/article/10.1007/s10015-024-00944-9


Abstract

Character-level convolutional neural networks (CLCNNs) are commonly used to classify textual data and serve as a versatile tool. For natural language recognition, a sentence is first decomposed into character units; each unit is then converted into its corresponding character code (e.g., a Unicode value), and the codes are input into the CLCNN. Thus, sentences can be treated like images. We have previously applied a CLCNN to verify whether a university’s diploma and/or curriculum policies are well written. In this study, we experimentally confirm the effectiveness of CLCNNs using tweet data. In particular, we focus on the effect of the number of units on performance using two types of data: one is a real, public tweet dataset on the reputation of a cell phone, and the other is the NTCIR-13 MedWeb task, which consists of pseudo-tweet data and is a well-known test collection for multi-label problems. Experiments that varied the number of units in the fully connected (all-coupled) layer gave results that agree with the theorem introduced in Amari’s book (Amari in Mathematical Science New Development of Information Geometry, For Senior & Graduate Courses. SAIENSU-SHA Co., 2014). Furthermore, for the NTCIR-13 MedWeb task, we analyze two kinds of experiments: the effect of kernel size and the effect of weight perturbation. The kernel-size results suggest the existence of an optimal kernel size for sentence comprehension. The results of perturbing the convolutional and pooling layers indicate a possible relationship between the number of degrees of freedom and the network parameters.
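The character-code input and the weight perturbation described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the fixed length of 140 characters, the padding value 0, and the use of zero-mean Gaussian noise are all assumptions made for illustration, since the abstract does not specify them.

```python
import random


def encode_chars(text, max_len=140, pad=0):
    """Map each character to its Unicode code point.

    The sequence is truncated or right-padded to max_len so that every
    sentence has the same length (max_len and pad are assumed values).
    """
    codes = [ord(ch) for ch in text[:max_len]]
    return codes + [pad] * (max_len - len(codes))


def perturb_weights(weights, sigma=0.01, seed=None):
    """Return a copy of a flat weight list with Gaussian noise added.

    Zero-mean Gaussian noise with standard deviation sigma is an assumed
    perturbation model; the abstract does not state the noise type.
    """
    rng = random.Random(seed)
    return [w + rng.gauss(0.0, sigma) for w in weights]


if __name__ == "__main__":
    tweet = "My phone is great!"
    x = encode_chars(tweet, max_len=24)
    print(x)  # 18 code points followed by 6 padding zeros

    kernel = [0.5, -0.25, 0.125]  # hypothetical convolution weights
    print(perturb_weights(kernel, sigma=0.05, seed=0))
```

In such a setup, the encoded sequence would be fed to the network like a one-dimensional image, and accuracy would be compared before and after the weights of a chosen layer are perturbed.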


Source journal

Artificial Life and Robotics (ROBOTICS)

CiteScore: 2.00

Self-citation rate: 22.20%

Articles published: 101

About the journal: Artificial Life and Robotics is an international journal publishing original technical papers and authoritative state-of-the-art reviews on the development of new technologies concerning artificial life and robotics, especially computer-based simulation and hardware for the twenty-first century. This journal covers a broad multidisciplinary field, including areas such as artificial brain research, artificial intelligence, artificial life, artificial living, artificial mind research, brain science, chaos, cognitive science, complexity, computer graphics, evolutionary computations, fuzzy control, genetic algorithms, innovative computations, intelligent control and modelling, micromachines, micro-robot world cup soccer tournament, mobile vehicles, neural networks, neurocomputers, neurocomputing technologies and applications, robotics, robust virtual engineering, and virtual reality. Hardware-oriented submissions are particularly welcome. Publishing body: International Symposium on Artificial Life and Robotics. Editor-in-Chief: Hiroshi Tanaka, Hatanaka R Apartment 101, Hatanaka 8-7A, Ooaza-Hatanaka, Oita city, Oita, Japan 870-0856. © International Symposium on Artificial Life and Robotics