Miss Predicting Readability of Health Educational Resources for Children Using Semantic Features

International Linguistics Research. Pub Date: 1900-01-01. DOI: 10.30560/ilr.v4n2p10

Yanmeng Liu



Abstract

The success of health education resources largely depends on their readability: target readers can understand and accept health information only when it is presented at an appropriate reading difficulty. Compared with other populations, children have limited knowledge and underdeveloped reading comprehension, which makes readability research on health education resources more challenging. This research explores readability prediction for children's health education resources by training machine learning algorithms on semantic features. A data-driven method was applied: 1,000 health education articles were collected from the websites of international health organizations and grouped, according to their sources, into resources for kids and resources for non-kids. Then 73 semantic features were used to train five machine learning algorithms (decision tree, support vector machine, k-nearest neighbors algorithm, ensemble classifier, and logistic regression). The results showed that the k-nearest neighbors algorithm and the ensemble classifier outperformed the other algorithms in terms of area under the receiver operating characteristic curve, sensitivity, specificity, and accuracy, and achieved good performance in predicting whether the readability of health education resources is suitable for children.
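The four evaluation measures named in the abstract can be illustrated with a minimal sketch. It assumes a binary labelling in which 1 means "resource for kids" and 0 means "resource for non-kids", a hypothetical decision threshold of 0.5, and a handful of made-up classifier scores; none of these values come from the study, and the function names are illustrative only.

```python
from typing import Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fn, tn, fp), treating label 1 ("for kids") as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp, fn, tn, fp


def sensitivity(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Share of true "for kids" articles that were predicted as "for kids"."""
    tp, fn, _, _ = confusion_counts(y_true, y_pred)
    return tp / (tp + fn)


def specificity(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Share of true "non-kids" articles that were predicted as "non-kids"."""
    _, _, tn, fp = confusion_counts(y_true, y_pred)
    return tn / (tn + fp)


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Share of all articles that were classified correctly."""
    tp, fn, tn, fp = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + fn + tn + fp)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve, computed as the probability that a randomly
    chosen positive receives a higher score than a randomly chosen negative
    (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up example data (NOT from the study): 4 "for kids" and 4 "non-kids" articles.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]
THRESHOLD = 0.5  # hypothetical cut-off for turning scores into class labels
y_pred = [1 if s >= THRESHOLD else 0 for s in scores]

print("sensitivity:", sensitivity(y_true, y_pred))
print("specificity:", specificity(y_true, y_pred))
print("accuracy:   ", accuracy(y_true, y_pred))
print("ROC AUC:    ", roc_auc(y_true, scores))
```

Sensitivity, specificity, and accuracy depend on the chosen threshold, whereas the area under the ROC curve summarizes ranking quality across all thresholds, which is one reason the four measures are commonly reported together.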
