Educational Assessment: Latest Publications

Beyond Agreement: Exploring Rater Effects in Large-Scale Mixed Format Assessments
Stefanie A. Wind, Wenjing Guo
Educational Assessment (IF 1.5), Vol. 26(1), pp. 264-283. Pub Date: 2021-08-17. DOI: 10.1080/10627197.2021.1962277
Abstract: Scoring procedures for the constructed-response (CR) items in large-scale mixed-format educational assessments often involve checks for rater agreement or rater reliability. Although these analyses are important, researchers have documented rater effects that persist despite rater training and that are not always detected in rater agreement and reliability analyses, such as severity/leniency, centrality/extremism, and biases. Left undetected, these effects pose threats to fairness. We illustrate how rater effects analyses can be incorporated into scoring procedures for large-scale mixed-format assessments. We used data from the National Assessment of Educational Progress (NAEP) to illustrate relatively simple analyses that can provide insight into patterns of rater judgment that may warrant additional attention. Our results suggested that the NAEP raters exhibited generally defensible psychometric properties, while also exhibiting some idiosyncrasies that could inform scoring procedures. Similar procedures could be used operationally to inform the interpretation and use of rater judgments in large-scale mixed-format assessments.
Citations: 1
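The severity/leniency and centrality/extremism effects named in this abstract can be screened with simple descriptive indices before any model-based analysis. Below is a minimal sketch in Python (not the authors' NAEP procedure; column names and interpretive cutoffs are hypothetical): severity is read from a rater's mean deviation from the per-response consensus, centrality from the ratio of that rater's score spread to the overall spread.

```python
# A minimal sketch (not the authors' NAEP analysis) of two simple
# rater-effect indices: severity/leniency and centrality/extremism.
# Column names are hypothetical.
import pandas as pd

def rater_effect_summary(scores: pd.DataFrame) -> pd.DataFrame:
    """scores: long format with columns ['rater', 'response_id', 'score']."""
    # Consensus score per response: mean across all raters who scored it.
    consensus = scores.groupby("response_id")["score"].transform("mean")
    scores = scores.assign(deviation=scores["score"] - consensus)
    overall_sd = scores["score"].std()
    return scores.groupby("rater").agg(
        n_scored=("score", "size"),
        # Negative mean deviation suggests severity; positive, leniency.
        severity=("deviation", "mean"),
        # SD ratio well below 1 suggests centrality; above 1, extremism.
        sd_ratio=("score", lambda s: s.std() / overall_sd),
    )

# Toy usage:
df = pd.DataFrame({
    "rater": ["A", "A", "B", "B", "C", "C"],
    "response_id": [1, 2, 1, 2, 1, 2],
    "score": [3, 2, 4, 3, 2, 2],
})
print(rater_effect_summary(df))
```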
Investigating the Use of Assessment Data by Primary School Teachers: Insights from a Large-scale Survey in Ireland
Vasiliki Pitsia, Anastasios Karakolidis, P. Lehane
Educational Assessment (IF 1.5), Vol. 26(1), pp. 145-162. Pub Date: 2021-07-03. DOI: 10.1080/10627197.2021.1917358
Abstract: Evidence suggests that the quality of teachers' instructional practices can be improved when these are informed by relevant assessment data. Drawing on a sample of 1,300 primary school teachers in Ireland, this study examined the extent to which teachers use standardized test results for instructional purposes as well as the role of several factors in predicting this use. Specifically, the study analyzed data from a cross-sectional survey that gathered information about teachers' use of, experiences with, and attitudes toward assessment data from standardized tests. After taking other teacher and school characteristics into consideration, the analysis revealed that teachers with more positive attitudes toward standardized tests and those who were often engaged in some form of professional development on standardized testing tended to use assessment data to inform their teaching more frequently. Based on the findings, policy and practice implications are discussed.
Citations: 3
Using Full-information Item Analysis to Improve Item Quality
T. Haladyna, Michael C. Rodriguez
Educational Assessment (IF 1.5), Vol. 26(1), pp. 198-211. Pub Date: 2021-07-03. DOI: 10.1080/10627197.2021.1946390
Abstract: Full-information item analysis provides item developers and reviewers with comprehensive empirical evidence of item quality, including option response frequency, the point-biserial index (PBI) for distractors, mean scores of respondents selecting each option, and option trace lines. The multi-serial index (MSI) is introduced as a more informative item-total correlation, accounting for variable distractor performance. The overall item PBI is empirically compared to the MSI. For items from an operational mathematics and reading test, poorly performing distractors are systematically removed to recompute the MSI, indicating improvements in item quality. Case studies for specific items with different characteristics are described to illustrate a variety of outcomes, focused on improving item discrimination. Full-information item analyses are presented for each case study item, providing clear examples of interpretation and use of item analyses. A summary of recommendations for item analysts is provided.
Citations: 4
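As a rough illustration of the kind of full-information output described above, the sketch below computes option response frequencies, the mean total score of examinees choosing each option, and a per-option point-biserial. It is a toy on synthetic data, not the authors' procedure, and the multi-serial index (MSI) itself is not reproduced here.

```python
# A minimal sketch of a full-information distractor analysis:
# option frequencies, mean total score of choosers, and a
# point-biserial per option. Illustrative only.
import numpy as np

def option_analysis(responses: np.ndarray, key: int, total: np.ndarray):
    """responses: chosen option per examinee (ints); key: correct option;
    total: each examinee's total test score."""
    results = {}
    for option in np.unique(responses):
        chose = responses == option
        # Point-biserial = Pearson correlation of the 0/1 choice
        # indicator with the total score.
        pbi = np.corrcoef(chose.astype(float), total)[0, 1]
        results[option] = {
            "freq": chose.mean(),               # option response frequency
            "mean_total": total[chose].mean(),  # mean score of choosers
            "pbi": pbi,                         # positive expected for the key,
                                                # negative for good distractors
            "is_key": option == key,
        }
    return results

rng = np.random.default_rng(0)
resp = rng.integers(0, 4, size=200)   # 200 examinees, options 0-3
tot = rng.normal(20, 5, size=200)     # toy total scores
print(option_analysis(resp, key=1, total=tot))
```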
The Impact of Disengaged Test Taking on a State's Accountability Test Results
S. Wise, Sukkeun Im, Jay Lee
Educational Assessment (IF 1.5), Vol. 26(1), pp. 163-174. Pub Date: 2021-07-03. DOI: 10.1080/10627197.2021.1956897
Abstract: This study investigated test-taking engagement on the Spring 2019 administration of a large-scale state summative assessment. Through the identification of rapid-guessing behavior – which is a validated indicator of disengagement – the percentage of Grade 8 test events with meaningful amounts of rapid guessing was 5.5% in mathematics, 6.7% in English Language Arts (ELA), and 3.5% in science. Disengagement rates on the state summative test were also found to vary materially across gender, ethnicity, Individualized Educational Plan (IEP) status, Limited English Proficient (LEP) status, free and reduced lunch (FRL) status, and disability status. However, school mean performance, proficiency rates, and relative ranking were only minimally affected by disengagement. Overall, results of this study indicate that disengagement has a material impact on individual state summative test scores, though its impact on score aggregations may be relatively minor.
Citations: 8
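Rapid-guessing identification of this kind typically flags responses faster than some small fraction of an item's typical response time. A minimal sketch follows, assuming a 10%-of-median-time threshold and a 10%-of-items cutoff for a "meaningful" amount of rapid guessing; both values are illustrative choices, not the study's validated thresholds.

```python
# A minimal sketch of flagging rapid guessing with a response-time
# threshold. The 10% figures are assumptions for illustration.
import numpy as np

def rapid_guess_flags(rt: np.ndarray, threshold_frac: float = 0.10):
    """rt: response times, shape (examinees, items). True = rapid guess."""
    typical = np.median(rt, axis=0)        # per-item typical response time
    return rt < threshold_frac * typical

rt = np.abs(np.random.default_rng(1).normal(30, 10, size=(100, 20)))
flags = rapid_guess_flags(rt)
# Share of test events with a "meaningful" amount of rapid guessing,
# here defined (arbitrarily) as more than 10% of items rapidly guessed.
meaningful = (flags.mean(axis=1) > 0.10).mean()
print(f"{meaningful:.1%} of test events flagged")
```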
Assessing Quality of Teaching from Different Perspectives: Measurement Invariance across Teachers and Classes
G. Krammer, Barbara Pflanzl, Gerlinde Lenske, Johannes Mayr
Educational Assessment (IF 1.5), Vol. 26(1), pp. 88-103. Pub Date: 2021-04-03. DOI: 10.1080/10627197.2020.1858785
Abstract: Comparing teachers' self-assessment to classes' assessment of quality of teaching can offer insights for educational research and be a valuable resource for teachers' continuous professional development. However, the quality of teaching needs to be measured in the same way across perspectives for this comparison to be meaningful. We used data from 622 teachers who self-assessed aspects of quality of teaching and from their classes (12,229 students), which assessed the same aspects. Perspectives were compared with measurement invariance analyses. Teachers and classes agreed on the average level of instructional clarity, and disagreed over teacher-student relationship and performance monitoring, suggesting that mean differences across perspectives may not be as consistent as the literature claims. Results showed a nonuniform measurement bias for only one item of instructional clarity, while measurement of the other aspects was directly comparable. We conclude that comparing teachers' and classes' perspectives on aspects of quality of teaching is viable.
Citations: 4
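Before any formal invariance testing, the raw perspective comparison can be summarized as the gap between each teacher's self-rating and the class-mean rating per aspect. A minimal sketch with hypothetical aspect names; the measurement invariance analyses themselves (e.g., multi-group CFA) are beyond this sketch.

```python
# A minimal sketch of the teacher-vs-class perspective comparison.
# Aspect and column names are hypothetical.
import pandas as pd

def perspective_gap(teacher: pd.DataFrame, students: pd.DataFrame) -> pd.Series:
    """teacher: one row per teacher, columns = aspects (self-ratings).
    students: same aspect columns plus 'teacher_id'."""
    class_means = students.groupby("teacher_id").mean()
    aligned = teacher.set_index("teacher_id")
    # Positive gap = teachers rate themselves higher than their classes do.
    return (aligned - class_means).mean()

teacher = pd.DataFrame({"teacher_id": [1, 2],
                        "clarity": [4.5, 3.8], "monitoring": [4.0, 4.2]})
students = pd.DataFrame({"teacher_id": [1, 1, 2, 2],
                         "clarity": [4.2, 4.6, 3.9, 3.7],
                         "monitoring": [3.1, 3.5, 3.8, 3.6]})
print(perspective_gap(teacher, students))
```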
Predicting Retention in Higher Education from high-stakes Exams or School GPA
M. Meeter, M. V. van Brederode
Educational Assessment (IF 1.5), Vol. 28(1), pp. 1-10. Pub Date: 2021-02-06. DOI: 10.1080/10627197.2022.2130748
Abstract: The transition from secondary to tertiary education varies from country to country. In many countries, secondary school is concluded with high-stakes, national exams, or high-stakes entry tests are used for admissions to tertiary education. In other countries, secondary-school grade point average (GPA) is the determining factor. In the Netherlands, both play a role. With administrative data on close to 180,000 students, we investigated whether national exam scores or secondary school GPA was a better predictor of tertiary first-year retention. For both university education and higher professional education, secondary school GPA was the better predictor of retention, to the extent that national exams did not explain any additional variance. Moreover, for students who failed their exam, being held back by the secondary school for an additional year and entering tertiary education one year later, GPA in the year of failure remained as predictive as for students who had passed their exams and started tertiary education immediately. National exam scores, on the other hand, had no predictive value at all for these students. It is concluded that secondary school GPA measures aspects of student performance that are not included in high-stakes national exams, but that are predictive of subsequent success in tertiary education.
Citations: 2
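The incremental-validity question here (does the national exam score add predictive value for retention beyond school GPA?) can be posed as a comparison of nested logistic regressions. A minimal sketch on synthetic data; the variable names and the data-generating assumptions are hypothetical, not the authors' administrative dataset or model.

```python
# A minimal sketch: likelihood-ratio comparison of a GPA-only
# retention model against GPA + exam score. Synthetic data.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
n = 1000
gpa = rng.normal(7.0, 1.0, n)                 # Dutch-style 1-10 grade scale
exam = 0.8 * gpa + rng.normal(0, 0.6, n)      # exam correlated with GPA
logit_p = -7 + 1.0 * gpa                      # retention driven by GPA only
retained = rng.binomial(1, 1 / (1 + np.exp(-logit_p)))

base = sm.Logit(retained, sm.add_constant(gpa)).fit(disp=0)
full = sm.Logit(retained,
                sm.add_constant(np.column_stack([gpa, exam]))).fit(disp=0)

# Likelihood-ratio statistic: does adding the exam score improve fit?
lr = 2 * (full.llf - base.llf)
print(f"LR statistic = {lr:.2f} (1 df); pseudo-R2: "
      f"{base.prsquared:.3f} -> {full.prsquared:.3f}")
```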
Anchors Aweigh: How the Choice of Anchor Items Affects the Vertical Scaling of 3PL Data with the Rasch Model
Glenn Thomas Waterbury, Christine E. DeMars
Educational Assessment (IF 1.5), Vol. 26(1), pp. 175-197. Pub Date: 2021-01-20. DOI: 10.1080/10627197.2020.1858782
Abstract: Vertical scaling is used to put tests of different difficulty onto a common metric. The Rasch model is often used to perform vertical scaling, despite its strict functional form. Few, if any, studies have examined anchor item choice when using the Rasch model to vertically scale data that do not fit the model. The purpose of this study was to investigate the implications of anchor item choice on bias in growth estimates when data do not fit the Rasch model. Data were generated with varying levels of true difference between grades and levels of the lower asymptote. When true growth or the lower asymptote were zero, estimates were unbiased and anchor item choice was not consequential. As true growth and the lower asymptote both increased, growth was underestimated and choice of anchor items had an impact. Easy anchor items led to less biased estimates of growth than hard anchor items.
Citations: 1
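The mechanism at stake in anchor-item choice is the linking shift computed from the anchors. A minimal sketch of mean/mean linking under the Rasch model, which places the upper grade's item difficulties on the lower grade's scale; the item names and values are illustrative, not the study's simulation design.

```python
# A minimal sketch of mean/mean anchor linking for vertical scaling:
# the shift is the mean difficulty difference on the anchor items,
# so which items serve as anchors directly determines the shift.
import numpy as np

def mean_mean_link(b_lower: dict, b_upper: dict, anchors: list) -> dict:
    """b_lower/b_upper: item -> Rasch difficulty from each grade's separate
    calibration. Returns upper-grade difficulties on the lower-grade scale."""
    shift = np.mean([b_lower[i] - b_upper[i] for i in anchors])
    return {item: b + shift for item, b in b_upper.items()}

b_lower = {"a1": -0.5, "a2": 0.2, "u1": 1.0}
b_upper = {"a1": -1.3, "a2": -0.6, "u2": 0.4}  # anchors look easier here,
                                               # consistent with growth
linked = mean_mean_link(b_lower, b_upper, anchors=["a1", "a2"])
print(linked)  # the applied shift is what growth estimates inherit
```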
Model meets reality: Validating a new behavioral measure for test-taking effort
Esther Ulitzsch, Christiane Penk, Matthias von Davier, S. Pohl
Educational Assessment (IF 1.5), Vol. 26(1), pp. 104-124. Pub Date: 2021-01-12. DOI: 10.1080/10627197.2020.1858786
Abstract: Identifying and considering test-taking effort is of utmost importance for drawing valid inferences on examinee competency in low-stakes tests. Different approaches exist for doing so. The speed-accuracy+engagement model aims at identifying non-effortful test-taking behavior in terms of nonresponse and rapid guessing based on responses and response times. The model allows for identifying rapid-guessing behavior on the item-by-examinee level whilst jointly modeling the processes underlying rapid guessing and effortful responding. To assess whether the model indeed provides a valid measure of test-taking effort, we investigate (1) convergent validity with previously developed behavioral as well as self-report measures on guessing behavior and effort, (2) fit within the nomological network of test-taking motivation derived from expectancy-value theory, and (3) ability to detect differences between groups that can be assumed to differ in test-taking effort. Results suggest that the model captures central aspects of non-effortful test-taking behavior. While it does not cover the whole spectrum of non-effortful test-taking behavior, it provides a measure for some aspects of it, in a manner that is less subjective than self-reports. The article concludes with a discussion of implications for the development of behavioral measures of non-effortful test-taking behavior.
Citations: 17
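A simpler relative of this model-based approach classifies responses by fitting a two-component mixture to log response times and reading the faster component as rapid guessing. The sketch below is a stand-in for, not an implementation of, the speed-accuracy+engagement model, which additionally models the response process jointly with response times.

```python
# A minimal sketch: two-component Gaussian mixture on log response
# times; the faster component is interpreted as rapid guessing.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(3)
# Synthetic log-RTs: 10% rapid guesses (fast), 90% effortful (slow).
log_rt = np.concatenate([rng.normal(0.5, 0.3, 100),
                         rng.normal(3.0, 0.5, 900)]).reshape(-1, 1)

gm = GaussianMixture(n_components=2, random_state=0).fit(log_rt)
fast_component = int(np.argmin(gm.means_.ravel()))
is_rapid = gm.predict(log_rt) == fast_component
print(f"flagged {is_rapid.mean():.1%} of responses as rapid guesses")
```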
Do They See What I See? Toward a Better Understanding of the 7Cs Framework of Teaching Effectiveness
S. Phillips, Ronald F. Ferguson, Jacob F. S. Rowley
Educational Assessment (IF 1.5), Vol. 26(1), pp. 69-87. Pub Date: 2021-01-05. DOI: 10.1080/10627197.2020.1858784
Abstract: School systems are increasingly incorporating student perceptions of teaching effectiveness into educator accountability systems. Using Tripod's 7Cs™ Framework of Teaching Effectiveness, this study examines key issues in validating student perception data for use in this manner. Analyses examine the internal structure of 7Cs scores and the extent to which scores predict key criteria. Results offer the first empirical evidence that 7Cs scores capture seven distinct dimensions of teaching effectiveness even as they also confirm prior research concluding 7Cs scores are largely unidimensional. At the same time, results demonstrate a modest relationship between 7Cs scores and teacher self-assessments of their own effectiveness. Together, findings suggest 7Cs scores can be used to collect meaningful information about over-arching effectiveness. However, additional evidence is warranted before giving 7Cs scores as much weight in high-stakes contexts as value-added test-score gains or expert classroom observations.
Citations: 6
The Effect of Linguistic Factors on Assessment of English Language Learners' Mathematical Ability: A Differential Item Functioning Analysis
Stephanie Buono, E. Jang
Educational Assessment (IF 1.5), Vol. 26(1), pp. 125-144. Pub Date: 2020-12-17. DOI: 10.1080/10627197.2020.1858783
Abstract: Increasing linguistic diversity in classrooms has led researchers to examine the validity and fairness of standardized achievement tests, specifically concerning whether test score interpretations are free of bias and score use is fair for all students. This study examined whether mathematics achievement test items that contain complex language function differently between two language subgroups: native English speakers (EL1, n = 1,000) and English language learners (ELL, n = 1,000). Confirmatory differential item functioning (DIF) analyses using SIBTEST were performed on 28 mathematics assessment items. Eleven items were identified as having complex language features, and DIF analyses revealed that seven of these items (63%) favored EL1s over ELLs. Effect sizes were moderate (0.05 ≤ β̂ < 0.10) for six items and marginal (β̂ < 0.05) for one item. This paper discusses validity issues with math achievement test items assessing ELLs and calls for careful test development and instructional accommodation in the classroom.
Citations: 3
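As an illustration of matched-group DIF screening for a single item, the sketch below computes the Mantel-Haenszel common odds ratio by hand, stratifying on total score. Note the study itself used SIBTEST, a related but distinct procedure; the data here are synthetic with no DIF built in.

```python
# A minimal sketch of a Mantel-Haenszel DIF check for one item,
# contrasting a reference group (EL1) and a focal group (ELL)
# matched on total test score. Illustrative only; not SIBTEST.
import numpy as np

def mantel_haenszel_or(correct, group, total):
    """correct: 0/1 item score; group: 0 = reference, 1 = focal;
    total: matching variable. Returns the MH common odds ratio;
    values far from 1 suggest DIF on this item."""
    num = den = 0.0
    for stratum in np.unique(total):
        m = total == stratum
        a = np.sum((group[m] == 0) & (correct[m] == 1))  # ref correct
        b = np.sum((group[m] == 0) & (correct[m] == 0))  # ref incorrect
        c = np.sum((group[m] == 1) & (correct[m] == 1))  # focal correct
        d = np.sum((group[m] == 1) & (correct[m] == 0))  # focal incorrect
        n = a + b + c + d
        if n > 0:
            num += a * d / n
            den += b * c / n
    return num / den if den > 0 else np.nan

rng = np.random.default_rng(4)
total = rng.integers(5, 25, 2000)
group = rng.integers(0, 2, 2000)
p = 1 / (1 + np.exp(-(total - 15) / 3))   # same curve for both groups
correct = rng.binomial(1, p)
print(f"MH odds ratio: {mantel_haenszel_or(correct, group, total):.2f}")
```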