Andrey Sakhovskiy, E. Tutubalina, V. Solovyev, M. Solnyshkina
2020 13th International Conference on Developments in eSystems Engineering (DeSE), published 2020-12-14. DOI: 10.1109/DeSE51703.2020.9450232 (https://doi.org/10.1109/DeSE51703.2020.9450232)
Topic Modeling as a Method of Educational Text Structuring
This article addresses two problems: assigning documents to a limited number of topics, and automating the topic structuring of Russian educational texts. For this purpose, we compiled an original corpus of school textbooks on Social Science. We used the Latent Dirichlet Allocation (LDA) model to select topics and compare them across textbooks of different grades. This approach allows the matrix of topics to be reconstructed for each textbook in the corpus. The research showed that the topics in the collection under study are grade-ranked; in particular, topic cohesion is higher in high school. The research also offers a new methodology for quantitatively describing topic dynamics in the textbook collection, which allows the strategies that different authors use to present educational topics to be visualized and compared. The results can benefit textbook writers as well as teachers and schoolchildren.
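
To make the idea of a per-textbook topic matrix concrete, here is a minimal sketch of LDA fitted by collapsed Gibbs sampling in plain Python. It is not the authors' implementation: the abstract does not say which LDA software or settings were used. The function name `fit_lda`, the hyperparameters `alpha` and `beta`, the number of topics, and the tiny English toy corpus are all assumptions made for illustration.

```python
import random


def fit_lda(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA with collapsed Gibbs sampling (illustrative sketch only).

    docs is a list of token lists. Returns (doc_topic, vocab, topic_word_counts),
    where doc_topic[d][k] is the estimated share of topic k in document d.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    n_words = len(vocab)

    ndk = [[0] * n_topics for _ in docs]            # document-topic counts
    nkw = [[0] * n_words for _ in range(n_topics)]  # topic-word counts
    nk = [0] * n_topics                             # tokens assigned to each topic
    z = []                                          # topic assignment of every token

    # Random initial assignment of each token to a topic.
    for d, doc in enumerate(docs):
        assignments = []
        for w in doc:
            k = rng.randrange(n_topics)
            assignments.append(k)
            ndk[d][k] += 1
            nkw[k][word_id[w]] += 1
            nk[k] += 1
        z.append(assignments)

    # Resample each token's topic conditioned on all other assignments.
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                wi = word_id[w]
                k = z[d][i]
                ndk[d][k] -= 1
                nkw[k][wi] -= 1
                nk[k] -= 1
                weights = [
                    (ndk[d][t] + alpha) * (nkw[t][wi] + beta) / (nk[t] + n_words * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                z[d][i] = k
                ndk[d][k] += 1
                nkw[k][wi] += 1
                nk[k] += 1

    # Smoothed document-topic proportions: one row per document.
    doc_topic = [
        [(ndk[d][t] + alpha) / (len(doc) + n_topics * alpha) for t in range(n_topics)]
        for d, doc in enumerate(docs)
    ]
    return doc_topic, vocab, nkw


# Hypothetical toy "textbook fragments"; the real corpus is Russian-language.
toy_docs = [
    "law court rights citizen law state".split(),
    "state law constitution rights court".split(),
    "market price money trade goods market".split(),
    "money bank trade price goods".split(),
]

theta, vocab, topic_word_counts = fit_lda(toy_docs, n_topics=2)
for row in theta:
    print([round(p, 3) for p in row])
```

Each row of `theta` is one document's distribution over topics, so stacking the rows for the sections of one textbook gives a topic matrix of the kind the abstract describes. Averaging rows per grade would be one simple way to compare grades, though the paper's own dynamics measure is not specified here.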