Practical Assessment, Research and Evaluation: Latest Publications

Assessing the Assessment: Rubrics Training for Pre-Service and New In-Service Teachers.
Practical Assessment, Research and Evaluation Pub Date: 2011-10-01 DOI: 10.7275/SJT6-5K13
Michael G. Lovorn, A. Rezaei
{"title":"Assessing the Assessment: Rubrics Training for Pre-Service and New In-Service Teachers.","authors":"Michael G. Lovorn, A. Rezaei","doi":"10.7275/SJT6-5K13","DOIUrl":"https://doi.org/10.7275/SJT6-5K13","url":null,"abstract":"","PeriodicalId":20361,"journal":{"name":"Practical Assessment, Research and Evaluation","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2011-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74739725","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 23
Best Practices in Using Large, Complex Samples: The Importance of Using Appropriate Weights and Design Effect Compensation.
Practical Assessment, Research and Evaluation Pub Date: 2011-09-01 DOI: 10.7275/2KYG-M659
J. Osborne
{"title":"Best Practices in Using Large, Complex Samples: The Importance of Using Appropriate Weights and Design Effect Compensation.","authors":"J. Osborne","doi":"10.7275/2KYG-M659","DOIUrl":"https://doi.org/10.7275/2KYG-M659","url":null,"abstract":"Large surveys often use probability sampling in order to obtain representative samples, and these data sets are valuable tools for researchers in all areas of science. Yet many researchers are not formally prepared to appropriately utilize these resources. Indeed, users of one popular dataset were generally found not to have modeled the analyses to take account of the complex sample (Johnson & Elliott, 1998) even when publishing in highly-regarded journals. It is well known that failure to appropriately model the complex sample can substantially bias the results of the analysis. Examples presented in this paper highlight the risk of error of inference and mis-estimation of parameters from failure to analyze these data sets appropriately.","PeriodicalId":20361,"journal":{"name":"Practical Assessment, Research and Evaluation","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2011-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79354428","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 30
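The abstract above concerns sampling weights and design-effect compensation in complex samples. As a minimal illustration of the general idea, not the paper's actual analysis, the Python sketch below computes a weighted mean and inflates its standard error by an assumed design effect; the column names, data, and DEFF value are hypothetical.

```python
# Illustrative sketch only: applying sampling weights and a design-effect
# correction to a complex-sample estimate. Data and DEFF are hypothetical.
import numpy as np
import pandas as pd

def weighted_mean_with_deff(df: pd.DataFrame, value_col: str, weight_col: str,
                            deff: float) -> tuple[float, float]:
    """Return the weighted mean and a design-effect-adjusted standard error."""
    w = df[weight_col].to_numpy(dtype=float)
    y = df[value_col].to_numpy(dtype=float)
    mean = np.average(y, weights=w)
    n = len(y)
    # Naive (simple-random-sampling) variance of the weighted mean.
    var_srs = np.average((y - mean) ** 2, weights=w) / n
    # Inflate the variance by the design effect, i.e. shrink the effective n.
    se_adjusted = np.sqrt(var_srs * deff)
    return mean, se_adjusted

# Hypothetical usage
df = pd.DataFrame({"score": [480, 510, 495, 530], "weight": [1.2, 0.8, 1.5, 1.0]})
mean, se = weighted_mean_with_deff(df, "score", "weight", deff=2.0)
print(mean, se)
```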
A Graphical Transition Table for Communicating Status and Growth.
Practical Assessment, Research and Evaluation Pub Date: 2011-06-01 DOI: 10.7275/T9R9-D719
Adam E. Wyse, Ji Zeng, Joseph A. Martineau
{"title":"A Graphical Transition Table for Communicating Status and Growth.","authors":"Adam E. Wyse, Ji Zeng, Joseph A. Martineau","doi":"10.7275/T9R9-D719","DOIUrl":"https://doi.org/10.7275/T9R9-D719","url":null,"abstract":"This paper introduces a simple and intuitive graphical display for transition table based accountability models that can be used to communicate information about students’ status and growth simultaneously. This graphical transition table includes the use of shading to convey year to year transitions and different sized letters for performance categories to depict yearly status. Examples based on Michigan’s transition table used on their Michigan Educational Assessment Program (MEAP) assessments are provided to illustrate the utility of the graphical transition table in practical contexts. Additional potential applications of the graphical transition table are also suggested.","PeriodicalId":20361,"journal":{"name":"Practical Assessment, Research and Evaluation","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2011-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81305010","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Too Reliable to Be True? Response Bias as a Potential Source of Inflation in Paper-and-Pencil Questionnaire Reliability.
Practical Assessment, Research and Evaluation Pub Date: 2011-06-01 DOI: 10.7275/E482-N724
Eyal Péer, Eyal Gamliel
{"title":"Too Reliable to Be True? Response Bias as a Potential Source of Inflation in Paper-and-Pencil Questionnaire Reliability.","authors":"Eyal Péer, Eyal Gamliel","doi":"10.7275/E482-N724","DOIUrl":"https://doi.org/10.7275/E482-N724","url":null,"abstract":"When respondents answer paper-and-pencil (PP) questionnaires, they sometimes modify their responses to correspond to previously answered items. As a result, this response bias might artificially inflate the reliability of PP questionnaires. We compared the internal consistency of PP questionnaires to computerized questionnaires that presented a different number of items on a computer screen simultaneously. Study 1 showed that a PP questionnaire’s internal consistency was higher than that of the same questionnaire presented on a computer screen with one, two or four questions per screen. Study 2 replicated these findings to show that internal consistency was also relatively high when all questions were shown on one screen. This suggests that the differences found in Study 1 were not due to the difference in presentation medium. Thus, this paper suggests that reliability measures of PP questionnaires might be inflated because of a response bias resulting from participants cross-checking their answers against ones given to previous questions.","PeriodicalId":20361,"journal":{"name":"Practical Assessment, Research and Evaluation","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2011-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85264898","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 63
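The internal-consistency comparisons described above are typically based on Cronbach's alpha. The sketch below is a generic alpha calculation in Python for a respondents-by-items matrix, offered only as an illustration of the statistic involved; the response data are invented and this is not the authors' code.

```python
# Illustrative sketch only: Cronbach's alpha for a respondents-by-items matrix.
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """items: 2-D array, rows = respondents, columns = questionnaire items."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]                         # number of items
    item_vars = items.var(axis=0, ddof=1)      # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)  # variance of respondents' total scores
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Hypothetical 5-respondent, 4-item Likert data
responses = np.array([[4, 5, 4, 4],
                      [2, 3, 2, 3],
                      [5, 5, 4, 5],
                      [3, 3, 3, 2],
                      [4, 4, 5, 4]])
print(cronbach_alpha(responses))
```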
Is a Picture Is Worth a Thousand Words? Creating Effective Questionnaires with Pictures.
Practical Assessment, Research and Evaluation Pub Date: 2011-05-01 DOI: 10.7275/BGPE-A067
Laura Reynolds-Keefer, Robert Johnson
{"title":"Is a Picture Is Worth a Thousand Words? Creating Effective Questionnaires with Pictures.","authors":"Laura Reynolds-Keefer, Robert Johnson","doi":"10.7275/BGPE-A067","DOIUrl":"https://doi.org/10.7275/BGPE-A067","url":null,"abstract":"In developing attitudinal instruments for young children, researchers, program evaluators, and clinicians often use response scales with pictures or images (e.g., smiley faces) as anchors. This article considers highlights connections between word-based and picture based Likert scales and highlights the value in translating conventions used in word-based Likert scales to those with pictures or images.","PeriodicalId":20361,"journal":{"name":"Practical Assessment, Research and Evaluation","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2011-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87150928","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 62
Applying Tests of Equivalence for Multiple Group Comparisons: Demonstration of the Confidence Interval Approach.
Practical Assessment, Research and Evaluation Pub Date: 2011-04-01 DOI: 10.7275/D5WF-5P77
Shayna A. Rusticus, C. Lovato
{"title":"Applying Tests of Equivalence for Multiple Group Comparisons: Demonstration of the Confidence Interval Approach.","authors":"Shayna A. Rusticus, C. Lovato","doi":"10.7275/D5WF-5P77","DOIUrl":"https://doi.org/10.7275/D5WF-5P77","url":null,"abstract":"Assessing the comparability of different groups is an issue facing many researchers and evaluators in a variety of settings. Commonly, null hypothesis significance testing (NHST) is incorrectly used to demonstrate comparability when a non-significant result is found. This is problematic because a failure to find a difference between groups is not equivalent to showing that the groups are comparable. This paper provides a comparison of the confidence interval approach to equivalency testing and the more traditional analysis of variance (ANOVA) method using both continuous and rating scale data from three geographically separate medical education teaching sites. Equivalency testing is recommended as a better alternative to demonstrating comparability through its examination of whether mean differences between two groups are small enough that these differences can be considered practically unimportant and thus, the groups can be treated as equivalent.","PeriodicalId":20361,"journal":{"name":"Practical Assessment, Research and Evaluation","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2011-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85520336","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 48
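As a rough illustration of the confidence-interval approach to equivalence testing mentioned above, the Python sketch below declares two groups equivalent when the 90% confidence interval for their mean difference falls inside a pre-specified margin. The margin, the pooled-degrees-of-freedom shortcut, and the data are assumptions for illustration; the paper's multi-group procedure is more elaborate.

```python
# Illustrative sketch only: two-group equivalence via the confidence-interval rule.
import numpy as np
from scipy import stats

def equivalent_by_ci(x: np.ndarray, y: np.ndarray, margin: float,
                     alpha: float = 0.05) -> bool:
    """Equivalent if the 1 - 2*alpha CI for the mean difference lies within +/- margin."""
    diff = x.mean() - y.mean()
    se = np.sqrt(x.var(ddof=1) / len(x) + y.var(ddof=1) / len(y))
    dof = len(x) + len(y) - 2                  # simple pooled-df approximation
    t_crit = stats.t.ppf(1 - alpha, dof)       # one-sided critical value -> 90% CI
    lower, upper = diff - t_crit * se, diff + t_crit * se
    return lower > -margin and upper < margin

# Hypothetical rating-scale data from two teaching sites
site_a = np.array([3.8, 4.1, 3.9, 4.2, 4.0])
site_b = np.array([3.9, 4.0, 4.1, 3.8, 4.2])
print(equivalent_by_ci(site_a, site_b, margin=0.5))
```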
Evaluating the Quantity-Quality Trade-off in the Selection of Anchor Items: a Vertical Scaling Approach
Practical Assessment, Research and Evaluation Pub Date: 2011-04-01 DOI: 10.7275/NNCY-EW26
Florian Pibal, H. Cesnik
{"title":"Evaluating the Quantity-Quality Trade-off in the Selection of Anchor Items: a Vertical Scaling Approach","authors":"Florian Pibal, H. Cesnik","doi":"10.7275/NNCY-EW26","DOIUrl":"https://doi.org/10.7275/NNCY-EW26","url":null,"abstract":"When administering tests across grades, vertical scaling is often employed to place scores from different tests on a common overall scale so that test-takers’ progress can be tracked. In order to be able to link the results across grades, however, common items are needed that are included in both test forms. In the literature there seems to be no clear agreement about the ideal number of common items. In line with some scholars, we argue that a greater number of anchor items bear a higher risk of unwanted effects like displacement, item drift, or undesired fit statistics and that having fewer psychometrically well-functioning anchor items can sometimes be more desirable. In order to demonstrate this, a study was conducted that included the administration of a reading-comprehension test to 1,350 test-takers across grades 6 to 8. In employing a step-by-step approach, we found that the paradox of high item drift in test administrations across grades can be mitigated and eventually even be eliminated. At the same time, a positive side effect was an increase in the explanatory power of the empirical data. Moreover, it was found that scaling adjustment can be used to evaluate the effectiveness of a vertical scaling approach and, in certain cases, can lead to more accurate results than the use of calibrated anchor items.","PeriodicalId":20361,"journal":{"name":"Practical Assessment, Research and Evaluation","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2011-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"91005580","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
Do more online instructional ratings lead to better prediction of instructor quality?
Practical Assessment, Research and Evaluation Pub Date: 2011-02-01 DOI: 10.7275/NHNN-1N13
S. Sanders, Bhavneet Walia, Joel Potter, Kenneth W. Linna
{"title":"Do more online instructional ratings lead to better prediction of instructor quality","authors":"S. Sanders, Bhavneet Walia, Joel Potter, Kenneth W. Linna","doi":"10.7275/NHNN-1N13","DOIUrl":"https://doi.org/10.7275/NHNN-1N13","url":null,"abstract":"Online instructional ratings are taken by many with a grain of salt. This study analyzes the ability of said ratings to estimate the official (university-administered) instructional ratings of the same respective university instructors. Given self-selection among raters, we further test whether more online ratings of instructors lead to better prediction of official ratings in terms of both R-squared value and root mean squared error. We lastly test and correct for heteroskedastic error terms in the regression analysis to allow for the first robust estimations on the topic. Despite having a starkly different distribution of values, online ratings explain much of the variation in official ratings. This conclusion strengthens, and root mean squared error typically falls, as one considers regression subsets over which instructors have a larger number of online ratings. Though (public) online ratings do not mimic the results of (semi-private) official ratings, they provide a reliable source of information for predicting official ratings. There is strong evidence that this reliability increases in online rating usage.","PeriodicalId":20361,"journal":{"name":"Practical Assessment, Research and Evaluation","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2011-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85329773","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 17
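The abstract above mentions regressing official ratings on online ratings with a correction for heteroskedastic errors. A minimal sketch of that kind of model, using statsmodels with a heteroskedasticity-consistent (HC3) covariance estimator, is shown below; the variable names and data are made up and the specification is not the authors' actual model.

```python
# Illustrative sketch only: OLS with heteroskedasticity-robust standard errors.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical ratings data
df = pd.DataFrame({
    "official_rating": [4.2, 3.1, 4.8, 2.9, 3.7, 4.5],
    "online_rating":   [4.5, 2.8, 4.9, 3.2, 3.5, 4.7],
})

# cov_type="HC3" requests a heteroskedasticity-consistent covariance estimator
# in place of the classical homoskedastic one.
model = smf.ols("official_rating ~ online_rating", data=df).fit(cov_type="HC3")
print(model.rsquared)                      # variance explained
print(model.bse)                           # robust standard errors
print(np.sqrt(np.mean(model.resid ** 2)))  # root mean squared error
```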
Termination Criteria for Computerized Classification Testing.
Practical Assessment, Research and Evaluation Pub Date: 2011-02-01 DOI: 10.7275/WQ8M-ZK25
Nathan A. Thompson
{"title":"Termination Criteria for Computerized Classification Testing.","authors":"Nathan A. Thompson","doi":"10.7275/WQ8M-ZK25","DOIUrl":"https://doi.org/10.7275/WQ8M-ZK25","url":null,"abstract":"Computerized classification testing (CCT) is an approach to designing tests with intelligent algorithms, similar to adaptive testing, but specifically designed for the purpose of classifying examinees into categories such as “pass” and “fail.” Like adaptive testing for point estimation of ability, the key component is the termination criterion, namely the algorithm that decides whether to classify the examinee and end the test or to continue and administer another item. This paper applies a newly suggested termination criterion, the generalized likelihood ratio (GLR), to CCT. It also explores the role of the indifference region in the specification of likelihood-ratio based termination criteria, comparing the GLR to the sequential probability ratio test. Results from simulation studies suggest that the GLR is always at least as efficient as existing methods.","PeriodicalId":20361,"journal":{"name":"Practical Assessment, Research and Evaluation","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2011-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74696461","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 21
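For readers unfamiliar with likelihood-ratio termination rules, the sketch below implements a basic sequential probability ratio test (SPRT) for pass/fail classification under a Rasch model, the baseline against which the GLR is compared. The cut score, indifference-region width, error rates, item difficulties, and responses are all hypothetical; this is not the paper's GLR procedure.

```python
# Illustrative sketch only: an SPRT pass/fail termination rule under a Rasch model.
import numpy as np

def rasch_prob(theta: float, b: np.ndarray) -> np.ndarray:
    """P(correct) for items with difficulties b at ability theta."""
    return 1.0 / (1.0 + np.exp(-(theta - b)))

def sprt_decision(responses: np.ndarray, b: np.ndarray,
                  cut: float = 0.0, delta: float = 0.3,
                  alpha: float = 0.05, beta: float = 0.05) -> str:
    """Return 'pass', 'fail', or 'continue' given the items administered so far."""
    theta_lo, theta_hi = cut - delta, cut + delta   # indifference region bounds
    p_hi = rasch_prob(theta_hi, b)
    p_lo = rasch_prob(theta_lo, b)
    # Log-likelihood ratio of the 'pass' vs 'fail' hypotheses for the responses.
    llr = np.sum(responses * np.log(p_hi / p_lo)
                 + (1 - responses) * np.log((1 - p_hi) / (1 - p_lo)))
    upper = np.log((1 - beta) / alpha)              # classify as 'pass' above this
    lower = np.log(beta / (1 - alpha))              # classify as 'fail' below this
    if llr >= upper:
        return "pass"
    if llr <= lower:
        return "fail"
    return "continue"

# Hypothetical 5-item administration
difficulties = np.array([-0.5, 0.2, 0.0, 0.8, -0.3])
answers = np.array([1, 1, 0, 1, 1])
print(sprt_decision(answers, difficulties))
```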
FORMATIVE USE OF ASSESSMENT INFORMATION: IT'S A PROCESS, SO LET'S SAY WHAT WE MEAN
Practical Assessment, Research and Evaluation Pub Date: 2011-02-01 DOI: 10.7275/3YVY-AT83
Robert Good
{"title":"FORMATIVE USE OF ASSESSMENT INFORMATION: IT'S A PROCESS, SO LET'S SAY WHAT WE MEAN","authors":"Robert Good","doi":"10.7275/3YVY-AT83","DOIUrl":"https://doi.org/10.7275/3YVY-AT83","url":null,"abstract":"The term formative assessment is often used to describe a type of assessment. The purpose of this paper is to challenge the use of this phrase given that formative assessment as a noun phrase ignores the well-established understanding that it is a process more than an object. A model that combines content, context, and strategies is presented as one way to view the process nature of assessing formatively. The alternate phrase formative use of assessment information is suggested as a more appropriate way to describe how content, context, and strategies can be used together in order to close the gap between where a student is performing currently and the intended learning goal. Let’s start with an elementary grammar review: adjectives modify nouns; adverbs modify verbs, adjectives, and other adverbs. Applied to recent assessment literature, the term formative assessment would therefore contain the adjective formative modifying the noun assessment, creating a noun phrase representing a thing or object. Indeed, formative assessment as a noun phrase is regularly juxtaposed to summative assessment in both purpose and timing. Formative assessment is commonly understood to occur during instruction with the intent to identify relative strengths and weaknesses and guide instruction, while summative assessment occurs after a unit of instruction with the intent of measuring performance levels of the skills and content related to the unit of instruction (Stiggins, Arter, Chappuis, & Chappuis, 2006). Distinguishing formative and summative assessments in this manner may have served an important introductory purpose, however using formative as a descriptor of a type of assessment has had ramifi cations that merit critical consideration. Given that formative assessment has received considerable attention in the literature over the last 20 or so years, this article contends that it is time to move beyond the well-established broad distinctions between formative and summative assessments and consider the subtle – yet important – distinction between the term formative assessment as an object and the intended meaning. The focus here is to suggest that if we want to realize the true potential of formative practices in our classrooms, then we need to start saying what we mean.","PeriodicalId":20361,"journal":{"name":"Practical Assessment, Research and Evaluation","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2011-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75781333","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 41