Shyness prediction and language style model construction of elementary school students

Authors: Fang Luo, Liming Jiang, Xuetao Tian, Mengge Xiao, Y. Ma, Sheng Zhang
Journal: Acta Psychologica Sinica (心理学报), published 2021-02-25
DOI: 10.3724/SP.J.1041.2021.00155

The present study explored a new method of measuring shyness based on the online writing of 1306 elementary school students. A supervised learning method was used to map students' labels (derived from their scale results) to their text features (extracted from the online writing using a psychological dictionary). Key feature sets were built for the different dimensions of shyness, and a machine learning model was constructed on the selected features to achieve automatic prediction. The labels were obtained from the “National School Children Shyness Scale”, which the students completed online. The scale covers three dimensions of shyness: shy behavior, shy cognition, and shy emotion. For each dimension, students with a Z-score above 1 were labeled as shy and the others as normal. The writing texts were collected from "TeachGrid" (https://www.jiaokee.com/), an online learning platform on which students write texts. The dictionary applied was Textmind, a widely used Chinese psychological dictionary developed from Linguistic Inquiry and Word Count (LIWC). Because Textmind was compiled mainly from an adult corpus, we revised it to ensure the validity of the extracted features, expanding its categories and vocabulary using real writing by elementary students. The revised dictionary contained 118 categories. […] based Chi-square […] sentence and the frequency of social words of shy individuals were lower than those of normal counterparts), and certain features reflected the unique characteristics of a particular dimension (perception words predicted shy behavior, suggesting that individuals high in shy behavior frequently felt watched).
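The labeling rule and the dictionary-based features described above can be illustrated with a minimal sketch. It is not the authors' code: the function names, the toy scores, and the two-category toy dictionary are hypothetical, the population standard deviation is an assumption (the abstract does not say which estimator was used), and pre-tokenized English words stand in for segmented Chinese text.

```python
from statistics import mean, pstdev


def label_by_zscore(scores, threshold=1.0):
    """Return 1 (shy) where the Z-score exceeds the threshold, else 0 (normal)."""
    mu = mean(scores)
    sigma = pstdev(scores)  # assumption: population standard deviation
    if sigma == 0:
        # All scores identical: nobody is above the threshold.
        return [0] * len(scores)
    return [1 if (s - mu) / sigma > threshold else 0 for s in scores]


def category_frequencies(tokens, dictionary):
    """Share of tokens that fall into each dictionary category."""
    total = len(tokens)
    freqs = {}
    for category, words in dictionary.items():
        hits = sum(1 for token in tokens if token in words)
        freqs[category] = hits / total if total else 0.0
    return freqs


# Hypothetical scale scores on one shyness dimension for eight students.
scores = [10, 12, 11, 13, 25, 12, 11, 10]
labels = label_by_zscore(scores)

# Hypothetical two-category dictionary; the revised Textmind has 118 categories.
toy_dictionary = {
    "social": {"friend", "we", "talk"},
    "perception": {"see", "watch", "hear"},
}
tokens = ["we", "watch", "the", "game"]
features = category_frequencies(tokens, toy_dictionary)
```

In this toy data only the student scoring 25 lies more than one standard deviation above the mean, so only that student is labeled shy. The labeling would be repeated for each of the three dimensions.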
Based on the selected features, Python 3.6.2 was used to construct six prediction models: Decision Tree, Random Forest, Support Vector Machine, Logistic Regression, K-Nearest Neighbor, and Multilayer Perceptron. Overall, Random Forest achieved the best results in the present study. The F1 scores were 0.582, 0.552, and 0.545 for behavior, cognition, and emotion, respectively, showing the feasibility of automatically predicting the shyness characteristics of elementary school students from their written language. Applying word embeddings and deep learning models would improve the final prediction.
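For readers unfamiliar with the F1 metric reported above, the following sketch shows how it is computed from binary predictions. It assumes that "shy" is treated as the positive class, which the abstract does not state, and the function name and toy label vectors are hypothetical. It does not reproduce the study's models or results.

```python
def f1_score_binary(y_true, y_pred, positive=1):
    """F1 = harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are both zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels: 1 = shy, 0 = normal.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]
score = f1_score_binary(y_true, y_pred)
```

With three true positives, one false positive, and one false negative, precision and recall are both 0.75, so F1 is 0.75 as well. Because shy students are a minority under the Z-score rule, F1 on the shy class is more informative than plain accuracy.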