{"title":"Distractors--can they be biased too?","authors":"S Alagumalai, J P Keeves","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>Numerous work has been done on item bias and differential item functioning. Although there is some research on distractor analysis, no detailed study has been attempted to examine the way distractors in an item function, with regards to comparing distractor performance. This paper examines how distractors function differentially and compares various methods for identifying this. The Pearson chi-square, likelihood ratio chi-square and Neyman weighted least squares chi-square tests are some of these methods. Possible causes of distractor bias are discussed with illustrations from a physics problem-solving scale.</p>","PeriodicalId":79673,"journal":{"name":"Journal of outcome measurement","volume":"3 1","pages":"89-102"},"PeriodicalIF":0.0,"publicationDate":"1999-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"20936403","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Teacher receptivity to a system-wide change in a centralized education system: a Rasch measurement model analysis.","authors":"R F Waugh","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>The Education Department of Western Australia has implemented a new system called Student Outcome Statements, by trial in 1995/1996, then an a voluntary basis from 1997, with the intention of making it mandatory after 2001. The system describes, in order, the outcomes that students are expected to achieve in eight broad learning areas. The study has three aims. One, to create a scale for teacher receptivity to the use of Student Outcome Statements, based on eight orientations to receptivity: evaluative attitudes, behavior intentions, feelings towards Student Outcome Statements compared to the previous system, the benefits of the new system, support from significant others, alleviation of concerns, collaboration with other teachers, and involvement in decision-making. Two, to analyze the psychometric properties of the scale using the Extended Logistic Model of Rasch (Andrich, 1988; Rasch, 1960/1980) with the computer program RUMM (Andrich, Sheridan & Luo, 1997). Three, to provide advice to decision-makers about how better to implement the system of Student Outcome Statements.</p>","PeriodicalId":79673,"journal":{"name":"Journal of outcome measurement","volume":"3 1","pages":"71-88"},"PeriodicalIF":0.0,"publicationDate":"1999-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"20936402","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Parameter recovery for the rating scale model using PARSCALE.","authors":"G A French, B G Dodd","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>The purpose of the present study was to investigate item and trait parameter recovery for Andrich's rating scale model using the PARSCALE computer program. The four factors upon which the simulated data matrices varied were (a) the distribution of the scale values for the items (skewed or uniform), (b) the number of category response options (4 or 5), (c) the distribution of known trait levels (normal or skewed), and (d) the sample size (60, 125, 250, 500, or 1,000). Each condition was replicated 10 times resulting in 400 data matrices. Accurate item and trait parameter estimates were obtained for all sample sizes examined. As expected, sample size seemed to have little influence on the recovery of trait parameters but did influence item parameter recovery. The distribution of known trait levels did not seriously impact the item parameter recovery. It was concluded that Andrich's rating scale model allows for the use of considerably smaller calibration samples than are typically recommended for other polytomous IRT models.</p>","PeriodicalId":79673,"journal":{"name":"Journal of outcome measurement","volume":"3 2","pages":"176-99"},"PeriodicalIF":0.0,"publicationDate":"1999-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"21075949","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Mapping variables.","authors":"M H Stone, B D Wright, A J Stenner","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>This paper describes Mapping Variables, the principal technique for planning and constructing a test or rating instrument. A variable map is also useful for interpreting results. Modest reference is made to the history of mapping leading to its importance in psychometrics. Several maps are given to show the importance and value of mapping a variable by person and item data. The need for a critical appraisal of maps is also stressed.</p>","PeriodicalId":79673,"journal":{"name":"Journal of outcome measurement","volume":"3 4","pages":"308-22"},"PeriodicalIF":0.0,"publicationDate":"1999-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"21430144","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Many-facet Rasch analysis with crossed, nested, and mixed designs.","authors":"R E Schumacker","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>Many-facet Rasch analysis provides the bases for making fair and meaningful decisions from individual ratings by judges on tasks. The typical measurement design employed in a many-facet Rasch analysis has judges crossed with other facets or conditions of measurement. A nested design does not permit facets to be compared. However, a mixed design can be used to achieve a common vertical ruler when the frame of reference permits commensurate measures to be linked. Examples of crossed, nested, and mixed designs are presented to illustrate how a many-facet Rasch analysis can be modified to meet the connectivity requirement for comparing facet measures.</p>","PeriodicalId":79673,"journal":{"name":"Journal of outcome measurement","volume":"3 4","pages":"323-38"},"PeriodicalIF":0.0,"publicationDate":"1999-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"21430145","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Grades of severity and the validation of an atopic dermatitis assessment measure (ADAM).","authors":"D P Charman, G A Varigos","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>There has generally been a dearth of good clinical descriptions of grades of disease severity. The aim of this study was to produce reliable and valid descriptions of grades of severity of Atopic Dermatitis (AD). The ADAM (AD Assessment Measure) measure was used to assess AD severity in 171 male and female paediatric patients (mean age = 54 months) at the Royal Children's Hospital in Melbourne, Australia. The assessments were subject to Partial Credit analyses to produce clinically relevant \"word pictures\" of grades of severity of AD. Patterns of AD were shown to vary according to age, sex and severity. These descriptions will be useful for clinical training and research. Moreover, the approach to validation adopted here has important implications for the future of measurement in medicine.</p>","PeriodicalId":79673,"journal":{"name":"Journal of outcome measurement","volume":"3 2","pages":"162-75"},"PeriodicalIF":0.0,"publicationDate":"1999-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"21075428","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Round-off error, blind faith, and the powers that be: a caution on numerical error in coefficients for polynomial curves fit to psychophysical data.","authors":"V J Samar, C L De Filippo","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>Graphing and statistics software often permits users to fit polynomial curves, like a parabola or sigmoid, to scatter plots of psychophysical data points. These programs typically calculate the curve using double- or extended-precision numerical algorithms and display the resulting curve overlaid graphically on the scatter plot, but they may simultaneously display the equation that generates that curve with numerical coefficients that have been rounded off to only a few decimal places. If this equation is used for experimental or clinical applications, the round-off error, especially on coefficients for the higher powers, can produce anomalous findings due to systematic and extreme distortions of the fitted curve, even artifactually reversing the algebraic sign of the true slope of the fitted curve at particular data points. Care must be exercised in setting round-off criteria for coefficients of polynomial terms in curve-fit equations to avoid nonsensical measurement and prediction.</p>","PeriodicalId":79673,"journal":{"name":"Journal of outcome measurement","volume":"2 2","pages":"159-67"},"PeriodicalIF":0.0,"publicationDate":"1998-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"20580456","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Man is the measure ... the measurer.","authors":"M H Stone","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>Measures originated from human anatomy. Metrology has moved from man the measure to man the measurer. This transformation is documented using examples taken from the history of metrology. The outcome measure are units constructed and maintained for their utility, constancy and generality.</p>","PeriodicalId":79673,"journal":{"name":"Journal of outcome measurement","volume":"2 1","pages":"25-32"},"PeriodicalIF":0.0,"publicationDate":"1998-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"20579246","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Evidence for the validity of a Rasch model technique for identifying differential item functioning.","authors":"J D Scheuneman, R G Subhiyah","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>This paper presents an analysis of differential item functioning (DIF) in a certification examination for a medical specialty. The groups analyzed were (1) physicians from different subspecialties within this area and (2) physicians who qualified for the examination through two different experiential pathways. The DIF analyses were performed using a simple Rasch model procedure. The results were shown to be readily interpretable in terms of the known differences between the groups being compared. These results serve as validity evidence for the Rasch model procedure as a means for evaluating DIF in examinations. The conclusion is drawn that complex procedures are not required to generate interpretable results if relevant differences between the groups being compared are known. This suggests that the inability of many researchers to interpret results for racial/ethnic or gender groups is not due to inadequacies of the methods, but more likely to lack of pertinent knowledge about group differences.</p>","PeriodicalId":79673,"journal":{"name":"Journal of outcome measurement","volume":"2 1","pages":"33-42"},"PeriodicalIF":0.0,"publicationDate":"1998-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"20579247","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Controlling the judge variable in grading essay-type items: an application of Rasch analyses to the recruitment exam for Korean public school teachers.","authors":"S Chae","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>The purpose of this paper is to show how the Rasch measurement model can be used to control the effects of judge variable on the grading of essay-type items in the recruitment test for Korean teachers. Special attention is given to two aspects of judges' involvement in the grading. One is to identify a way to minimize the variation of grading due to judge severity. The other concern is to figure out a way to reduce the number of judges without threatening objectivity of ability estimates. Results from the FACETS analyses tell us not only how much grading standards vary among judges and how to adjust them but also it produces comparably reliable ability estimates with fewer judges.</p>","PeriodicalId":79673,"journal":{"name":"Journal of outcome measurement","volume":"2 2","pages":"123-41"},"PeriodicalIF":0.0,"publicationDate":"1998-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"20580454","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}