Ke ZHANG, Ramazan YILMAZ, Ahmet Berk USTUN, Fatma Gizem KARAOĞLAN YILMAZ
{"title":"Learning analytics in formative assessment: A systematic literature review","authors":"Ke ZHANG, Ramazan YILMAZ, Ahmet Berk USTUN, Fatma Gizem KARAOĞLAN YILMAZ","doi":"10.21031/epod.1272054","DOIUrl":"https://doi.org/10.21031/epod.1272054","url":null,"abstract":"This systematic review examines the use of learning analytics (LA) in formative assessment (FA). LA is a powerful tool that can support FA by providing real-time feedback to students and teachers. The review analyzes studies published on Web of Science and Scopus databases between 2011 and 2022 that provide an overview of the current state of published research on the use of LA for FA in diverse learning environments and through different delivery modes. This review also explores the significant potential of LA in FA practices in digital learning. A total of 63 studies met all selection criteria and were fully reviewed by conducting multiple analyses including selected bibliometrics, a categorical meta-trends analysis and inductive content analysis. The results indicate that the number of LA in FA studies has experienced a significant surge over the past decade. The results also show the current state of research on LA in FA, through a range of disciplines, journals, research methods, learning environments and delivery modes. This review can help inform the implementation of LA in educational contexts to support effective FA practices. 
However, the review also highlights the need for further research.","PeriodicalId":43015,"journal":{"name":"Journal of Measurement and Evaluation in Education and Psychology-EPOD","volume":"46 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135252156","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Investigation of differential item and step functioning procedurs in polytomously scored items","authors":"Yasemin KUZU, Selahattin GELBAL","doi":"10.21031/epod.1221823","DOIUrl":"https://doi.org/10.21031/epod.1221823","url":null,"abstract":"This study aimed to compare differential item functioning (DIF) and differential step functioning (DSF) detection methods in polytomously scored items under various conditions. In this context, the study examined Kazakhstan, Turkey and USA data obtained from the items related to the frequency of using digital devices at school in the PISA 2018 student “ICT Familiarity Questionnaire”. The Mantel test, Liu-Agresti statistic, Cox β and poly-SIBTEST methods were used for polytomous DIF analysis, while the Adjacent Category Logistic Regression Model and Cumulative Category Log Odds Ratio methods were used for DSF analysis. The study was carried out with a correlational survey model, with conditions defined by “differential category combining, focal group sample size, focal group: reference group sample ratio and DIF/DSF detection method”. SAS and R software were used to create the conditions; the SIBTEST program was used for the poly-SIBTEST analyses and the DIFAS program for the other methods. Analyses demonstrated that the number of items/steps exhibiting a high level of DIF/DSF was higher in the small sample according to polytomous DIF methods and in the large sample according to DSF methods. It was found that the DIF value was lower in items whose steps contained DSF with opposite signs; therefore, not performing DSF analysis in an item with no DIF may yield erroneous results. 
Although the differential category combining conditions created within the scope of the research did not have a systematic effect on the results, it was suggested to examine this situation in future studies, considering that the frequency of marking the combined categories differentiated the results.","PeriodicalId":43015,"journal":{"name":"Journal of Measurement and Evaluation in Education and Psychology-EPOD","volume":"50 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-09-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136279144","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Ability Estimation with Polytomous Items in Computerized Multistage Tests","authors":"Hasibe YAHSİ SARI, Hülya KELECİOĞLU","doi":"10.21031/epod.1056079","DOIUrl":"https://doi.org/10.21031/epod.1056079","url":null,"abstract":"The aim of the study is to examine how the ability estimates of individuals change under different conditions in tests consisting of polytomous items in a computerized multistage test environment. The research is a simulation study. In the study, 108 conditions (3x3x6x2=108) were examined, consisting of three numbers of categories (3, 4 and 5), three test lengths (10, 20 and 30), six panel designs (1-2, 1-2-2, 1-3, 1-3-3, 1-4 and 1-4-4) and two routing methods (Maximum Fisher Information (MFI) and Random). Simulations and analyses were carried out with the mstR package in R, with a pool of 200 items, 1000 people and 100 replications (i.e., iterations). As the outcomes of the research, mean absolute bias, RMSE and correlation values were calculated. It was found that as the number of categories and the test length increase, the mean absolute bias and RMSE values decrease, while the correlation values increase. In terms of routing methods, although the MFI and random methods show similar tendencies, MFI gives better results. The panel designs yield similar results.","PeriodicalId":43015,"journal":{"name":"Journal of Measurement and Evaluation in Education and Psychology-EPOD","volume":"129 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-09-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136278093","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Rubrics in Terms of Development Processes and Misconceptions","authors":"Fuat ELKONCA, Görkem CEYHAN, Mehmet ŞATA","doi":"10.21031/epod.1251470","DOIUrl":"https://doi.org/10.21031/epod.1251470","url":null,"abstract":"The present study aimed to examine the development process of rubrics in theses indexed in the national thesis database and to identify any misconceptions presented in these rubrics. A qualitative research approach utilizing document analysis was employed. The sample of theses was selected based on literature review and criteria established by expert opinions, resulting in a total of 395 theses being included in the study using criterion sampling. Data were collected through a \"thesis review form\" developed by the researchers. Descriptive analysis was employed for data analysis. Findings indicated that approximately 27% of the 395 theses contained misconceptions, with a disproportionate percentage of these misconceptions being found in master's theses. Regarding the field of the thesis, the highest rate of misconceptions was observed in health, social sciences, special education, and fine arts, while the lowest rate was found in education and linguistics. Additionally, theses with misconceptions tended to possess a lower degree of validity and reliability evidence compared to those without misconceptions. This difference was found to be statistically significant for both validity evidence and reliability evidence. In theses without misconceptions, the most frequently presented validity evidence was expert opinion, while the reliability evidence was found to be the percentage of agreement. 
The findings were discussed in relation to the existing literature, and recommendations were proposed.","PeriodicalId":43015,"journal":{"name":"Journal of Measurement and Evaluation in Education and Psychology-EPOD","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-09-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136279405","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Analysis of Peer and Self-Assessments Using the Many-facet Rasch Measurement Model and Student Opinions","authors":"Seda DEMİR","doi":"10.21031/epod.1344196","DOIUrl":"https://doi.org/10.21031/epod.1344196","url":null,"abstract":"The aim of this study is to analyze the peer and self-assessments of higher education students' oral presentation skills with the many-facet Rasch measurement model and to determine students' opinions on peer and self-assessment. In the study, convergent parallel method, one of the mixed-method research approaches, was used. The study group consisted of 11 university students studying at a state university in the 2022-2023 academic year. The FACETS program was used to analyze the data. The three facets identified in the study were the assessee (11 students), the assessor (11 students) and the items (16 items). Therefore, 11 participants scored (peer and self-assessment) on a 16-item assessment form. In addition, students' opinions on peer and self-assessment were obtained through three open-ended interview questions prepared by the researcher. According to the results of the study, it was determined that there was a statistically significant difference between the students in terms of their oral presentation skills, between the assessors in terms of their strictness/generosity in scoring, and between the criteria (items) in terms of the level of difficulty in realization. In addition, the participant opinions obtained from each interview question were analyzed through themes and sub-themes formed according to the general thoughts on peer and self-assessment, experiences, and whether the participants considered themselves as a reliable rater or not. In terms of practice, it can be suggested to provide detailed and enlightening information to students before peer and/or self-assessment in the classroom environment, and to give quick feedback to those who have not done the assessment appropriately. 
In addition, the reasons for the biases identified in peer and self-assessments in the current study can be investigated in future studies.","PeriodicalId":43015,"journal":{"name":"Journal of Measurement and Evaluation in Education and Psychology-EPOD","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-09-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136277919","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Bibliometric Analysis on Power Analysis Studies","authors":"Gül GÜLER","doi":"10.21031/epod.1343984","DOIUrl":"https://doi.org/10.21031/epod.1343984","url":null,"abstract":"The primary purpose of this study was to establish a theoretical framework for studies on power analysis conducted in the fields of education, psychology, and statistics for researchers. Therefore, the bibliometric characteristics of publications related to power analysis in the Web of Science database were analyzed using the Biblioshiny interface in the R programming language. The study identified influential studies on power analysis in education, psychology, and statistics. It also determined which concepts were associated with power analysis over the years and the authors and countries that contributed to the advancement of research regarding this concept. This research was conducted based on 515 studies that were included following specific criteria. It was found that the studies published between 1970 and 2023 were obtained from 183 sources, with a total of 1246 authors. There were 98 single-authored studies, and the number of co-authors per study was 2.88 on average. According to Bradford’s Law, Behavior Research Methods, Psychological Methods, and Multivariate Behavioral Research were the most productive journals concerning power analysis, taking up a larger proportion within the core sources compared to other journals. These journals were among the top three in terms of the number of publications, h-index, total number of citations, and publication rankings. These journals were followed by Structural Equation Modeling-A Multidisciplinary Journal, Frontiers in Psychology, and Educational and Psychological Measurement. 
An examination of studies on power analysis in education, psychology, and statistics according to Lotka's Law indicated that the relevant literature is insufficient and needs further development.","PeriodicalId":43015,"journal":{"name":"Journal of Measurement and Evaluation in Education and Psychology-EPOD","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-09-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136279023","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Investigation of The Measurement Invariance of Affective Characteristics Related to TIMSS 2019 Mathematics Achievement by Gender","authors":"Mehmet ATILGAN, Kaan Zulfikar DENİZ","doi":"10.21031/epod.1221365","DOIUrl":"https://doi.org/10.21031/epod.1221365","url":null,"abstract":"This research examines whether the scales measuring affective characteristics in the TIMSS 2019 Turkey mathematics administration show measurement invariance across gender. The research sample consists of 4048 8th-grade students who participated in TIMSS 2019. Research data were downloaded from the international website of TIMSS. The data collection tools are the “Sense of School Belonging”, “Students Confident in Mathematics”, “Students Like Learning Mathematics”, and “Students Value Mathematics” scales. Exploratory Factor Analysis (EFA) and Confirmatory Factor Analysis (CFA) were performed as validity analyses before examining measurement invariance. For reliability, the Cronbach's alpha internal consistency coefficient was calculated. Of the four scales in the study, only the “Students Confident in Mathematics” scale could not be confirmed in confirmatory factor analysis. Therefore, the “Students Confident in Mathematics” scale was not examined for measurement invariance, while the other three scales were. Measurement invariance was tested with Multiple Group Confirmatory Factor Analysis (MG-CFA), one of the Structural Equation Modeling (SEM) techniques. As a result of the analyses, the strict invariance model was supported for the “Students Like Learning Mathematics” and “Students Value Mathematics” scales, while the strong (scalar) invariance model was supported for the “Sense of School Belonging” scale. It was concluded that there was no gender bias in the three scales for which MG-CFA was performed, and that the mean scores were comparable across gender. 
In this context, it can be said that “Sense of School Belonging”, “Students Like Learning Mathematics”, and “Students Value Mathematics” scales are valid in determining the differences according to gender.","PeriodicalId":43015,"journal":{"name":"Journal of Measurement and Evaluation in Education and Psychology-EPOD","volume":"2013 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-09-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136278878","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A New Weighting Method in Meta-Analysis: The Weighting with Reliability Coefficient","authors":"Yıldız YILDIRIM, Şeref TAN","doi":"10.21031/epod.1351485","DOIUrl":"https://doi.org/10.21031/epod.1351485","url":null,"abstract":"This study aimed to investigate the impact of various weighting methods for effect sizes on the outcomes of meta-analyses that examined the effects of the 5E teaching method on academic achievement in science education. Two effect size weighting methods were explored: one based on the inverse of the sampling error variance and the other utilizing the reliability of measures in primary studies. The study also assessed the influence of including gray literature on the meta-analysis results, considering factors such as high heterogeneity and publication bias. The research followed a basic research design and drew data from 112 studies, encompassing a total of 149 effect sizes. An exhaustive search of databases and archives, including Google Scholar, Dergipark, HEI Thesis Center, Proquest, Science Direct, ERIC, Taylor & Francis, EBSCOhost, Web of Science, and five journals was conducted to gather these studies. Analyses were performed by utilizing the CMA v2 software and employing the random effects model. The findings demonstrated divergent outcomes between the two weighting methods—weighting by reliability coefficient yielded higher overall effect sizes and standard errors compared to weighting by inverse variance. 
Ultimately, the inclusion of gray literature was found to not significantly impact any of the weighting methods employed.","PeriodicalId":43015,"journal":{"name":"Journal of Measurement and Evaluation in Education and Psychology-EPOD","volume":"66 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-09-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136279253","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Training 21st Century English Language Teachers in Turkish Context: Development of a Technology-Enhanced Measurement Curriculum","authors":"Burcu ŞENTÜRK, Beyza AKSU DÜNYA, Mehmet Can DEMİR","doi":"10.21031/epod.1261763","DOIUrl":"https://doi.org/10.21031/epod.1261763","url":null,"abstract":"A case study that included 26 English Language teacher candidates was designed for developing an evidence-based measurement curriculum in Turkey, examining teacher candidates’ experiences on the newly developed course and taking remedial actions for updating the syllabus if needed. Data was collected using multiple sources: a pre-course survey, weekly discussion board on Edmodo and a post-course survey. Survey data obtained from rating-scale items was analyzed using descriptive statistics and data visualization packages embedded in R. Open-ended survey data and discussion board data were content-analyzed using MaxQDA software. The results revealed that students had limited awareness regarding assessment for learning concept and digital tools that could be used for assessment for learning purposes at the beginning of the course. Course content, in-class activities and projects helped them develop hands-on skills in developing sound language assessments as well as raised their awareness with respect to the importance of computer-based language assessment.","PeriodicalId":43015,"journal":{"name":"Journal of Measurement and Evaluation in Education and Psychology-EPOD","volume":"57 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-09-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135297591","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Investigation of the Effect of Online (Web-Based) Formative Assessment Applications on Students' Academic Achievement","authors":"Bayram ÇETİN, Şeref AKPINAR","doi":"10.21031/epod.1320182","DOIUrl":"https://doi.org/10.21031/epod.1320182","url":null,"abstract":"The aim of this research is to examine the effect on the achievement of secondary education 10th-grade students of two practices, the provision of resources for learning disabilities by the system and the provision of feedback for learning disabilities by the teacher, within the scope of an online (web-based) formative assessment application for the quadratic equations topic of the mathematics course. A quasi-experimental design was used. Pre-test and post-test achievement tests and monitoring instruments were used. The research was conducted in the 2022-2023 academic year with a total of 302 students selected from 4 schools and 12 classes in the Göksun and Ağrın districts using the stratified random cluster sampling method. The data were analyzed by one-way analysis of variance (ANOVA) and analysis of covariance (ANCOVA). According to the results of the research, there was no statistically significant difference between the pre-test averages of the groups, but a statistically significant difference appeared in the post-test. The treatment applied to the Experiment-2 group, in which the system provided resources for learning disabilities and the teacher provided detailed feedback based on Cognitive Diagnostic Modeling (BTM), was found to be more effective than both the treatment applied to the Experiment-1 group, in which only the system provided resources for learning disabilities, and the normal teaching applied to the Control group; the treatment applied to the Experiment-1 group was also found to be more effective than the normal teaching applied to the Control group. 
In addition, according to the results of the experimental process, the Experiment-2 group showed a greater improvement between the pre- and post-test averages than the Experiment-1 group, and the Experiment-1 group showed a greater improvement than the Control group.","PeriodicalId":43015,"journal":{"name":"Journal of Measurement and Evaluation in Education and Psychology-EPOD","volume":"37 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136108528","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}