Universität Duisburg-Essen

Faculty of Educational Sciences

Chair of Instructional Psychology

Person Fit Analysis with Simulation-based Methods

Dissertation for the award of the degree Dr. phil.

submitted by Christian Spoden

born on 27 January 1982 in Mülheim an der Ruhr

First reviewer: Prof. Dr. Dr. Detlev Leutner, Universität Duisburg-Essen

Second reviewer: Prof. Dr. Christian Tarnai, Universität der Bundeswehr München

Date of the oral examination: 16 July 2014

3 STUDY I - APPLYING THE RASCH SAMPLER FOR PERSON FIT ANALYSIS

Figure 3.1. Empirical Type I error rates of two person fit statistics and two approaches to generate p-values (normalization formula, NOR; Markov chain Monte Carlo simulation of the Rasch Sampler, MCMC (RS)). A: statistic U3, 20 items; B: statistic l0, 20 items; C: statistic U3, 40 items; D: statistic l0, 40 items; E: statistic U3, 60 items; F: statistic l0, 60 items.

3.4.2 Evaluation of statistical power and Type I error rate

Power and Type I error rate are evaluated under a fixed nominal α-level. Power is estimated in the guessing and the cheating conditions as the percentage of aberrant response vectors with a p-value of the person fit statistic smaller than α. The Type I error rate is estimated as the percentage of non-aberrant response vectors with a p-value of the person fit statistic smaller than α. The same α-levels as in Simulation 1 were used (α = .05 and α = .10).
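The two estimates described above reduce to the same computation applied to different sets of response vectors. The following is a minimal sketch in Python, not the code used in the study; the function name `rejection_rate` and the p-values are hypothetical and serve only to illustrate the arithmetic.

```python
from typing import Sequence


def rejection_rate(p_values: Sequence[float], alpha: float) -> float:
    """Proportion of response vectors whose p-value is smaller than alpha."""
    if not p_values:
        raise ValueError("p_values must not be empty")
    flagged = sum(1 for p in p_values if p < alpha)
    return flagged / len(p_values)


# Hypothetical p-values, invented for illustration only (not study results).
p_aberrant = [0.01, 0.03, 0.20, 0.04, 0.08]      # simulated cheating or guessing
p_non_aberrant = [0.40, 0.02, 0.75, 0.12, 0.60]  # simulated model-conform vectors

for alpha in (0.05, 0.10):
    power = rejection_rate(p_aberrant, alpha)       # estimated power
    type_i = rejection_rate(p_non_aberrant, alpha)  # estimated Type I error rate
    print(f"alpha = {alpha:.2f}: power = {power:.2f}, Type I error = {type_i:.2f}")
```

Applied to the aberrant vectors the function estimates power, and applied to the non-aberrant vectors it estimates the Type I error rate. Multiplying the result by 100 gives the percentages reported in the text.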

3.4.3 Results

Power rates of the statistics are presented in Figures 3.2 and 3.3. For both cheating and guessing, power increases with the number of items, the aberrancy rate and the α-level. Cheating was generally easier to detect than guessing. In the cheating conditions, power is highest for MCMC (RS) in most conditions, except for 20 items and an aberrancy rate of 20 %, where higher rates are found for U3 under NOR. The advantages of MCMC (RS) are stronger for the parametric statistic l0 than for the nonparametric statistic U3. Differences between the two methods also grow as the number of items decreases. Across all test lengths, power rates for U3 and l0 are very similar under MCMC (RS), while under NOR U3 outperforms l0. These differences reflect the inflated Type I error rates of U3 under NOR and the deflated Type I error rates of l0 under NOR (Emons et al., 2002). The highest power in the cheating conditions is found, for both U3 and l0, with 60 items, an aberrancy rate of 40 % and α = .10, where the percentage of correctly detected vectors is near 100 % with MCMC (RS). The lowest rates are found for 20 items, an aberrancy rate of 20 % and α = .05, where rates below .35 indicate that model violations are generally hard to detect.
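To make the contrast between the parametric statistic l0 and the nonparametric statistic U3 concrete, the sketch below computes both for a single dichotomous response vector. It assumes the commonly used definitions (l0 as the log-likelihood of the response vector under the Rasch model, U3 as a weighted distance from the Guttman pattern based on the logits of the item proportions correct); the exact formulation used in the study may differ. All function names and input values are hypothetical, and the sketch does not cover the generation of p-values by NOR or by the Rasch Sampler.

```python
import math
from typing import Sequence


def rasch_prob(theta: float, b: float) -> float:
    """Rasch probability of a correct response for ability theta and difficulty b."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def l0(responses: Sequence[int], theta: float, difficulties: Sequence[float]) -> float:
    """Log-likelihood of a 0/1 response vector under the Rasch model."""
    total = 0.0
    for x, b in zip(responses, difficulties):
        p = rasch_prob(theta, b)
        total += math.log(p) if x == 1 else math.log(1.0 - p)
    return total


def u3(responses: Sequence[int], prop_correct: Sequence[float]) -> float:
    """U3: 0 for a perfect Guttman pattern, 1 for a reversed Guttman pattern."""
    # Order items from easiest to hardest (highest proportion correct first).
    order = sorted(range(len(responses)), key=lambda i: -prop_correct[i])
    logits = [math.log(prop_correct[i] / (1.0 - prop_correct[i])) for i in order]
    x = [responses[i] for i in order]
    n, r = len(x), sum(x)
    if r == 0 or r == n:
        raise ValueError("U3 is undefined for all-0 or all-1 response vectors")
    best = sum(logits[:r])        # the r easiest items solved (Guttman pattern)
    worst = sum(logits[n - r:])   # the r hardest items solved (reversed pattern)
    if best == worst:
        raise ValueError("U3 is undefined when all items are equally difficult")
    observed = sum(w for w, xi in zip(logits, x) if xi == 1)
    return (best - observed) / (best - worst)


# Hypothetical item proportions correct and difficulties, for illustration only.
prop_correct = [0.9, 0.8, 0.6, 0.4, 0.2]
difficulties = [-2.0, -1.0, 0.0, 1.0, 2.0]

print(u3([1, 1, 1, 0, 0], prop_correct))        # Guttman pattern
print(u3([0, 0, 1, 1, 1], prop_correct))        # reversed Guttman pattern
print(l0([1, 1, 1, 0, 0], 0.0, difficulties))   # log-likelihood at theta = 0
```

Larger values of U3 and smaller (more negative) values of l0 point towards misfit. Whether a given value counts as significant depends on the reference distribution, which is where the NOR and MCMC (RS) approaches compared in this section differ.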

5 GENERAL DISCUSSION

Schuster, C., & Yuan, K.-H. (2011). Robust estimation of latent ability in item response models. Journal of Educational and Behavioral Statistics, 36, 720–735.

Smith, R. M. (1985). A comparison of Rasch person analysis and robust estimators. Educational and Psychological Measurement, 45, 433–444.

Snijders, T. A. B. (2001). Asymptotic null distribution of person fit statistics with estimated person parameter. Psychometrika, 66, 331–342.

Steinbakk, G. H., & Storvik, G. O. (2009). Posterior predictive p-values in Bayesian hierarchical models. Scandinavian Journal of Statistics, 36, 320–336.

van Krimpen-Stoop, E. M. L. A., & Meijer, R. R. (1999). The null distribution of person-fit statistics for conventional and adaptive tests. Applied Psychological Measurement, 23, 327–345.

van Krimpen-Stoop, E. M. L. A., & Meijer, R. R. (2002). Detection of person misfit in computerized adaptive tests with polytomous items. Applied Psychological Measurement, 26, 164–180.

Verhelst, N. D., Hatzinger, R., & Mair, P. (2007). The Rasch sampler. Journal of Statistical Software, 20(4), 1–14.

von Davier, M., & Molenaar, I. (2003). A person-fit index for polytomous Rasch models, latent class models, and their mixture generalizations. Psychometrika, 68, 213–228.

Wainer, H., & Wright, B. D. (1980). Robust estimation of ability in the Rasch model. Psychometrika, 45, 373–391.

Warm, T. A. (1989). Weighted likelihood estimation of ability in item response theory. Psychometrika, 54, 427–450.

Woods, C. M. (2008). Monte Carlo evaluation of two-level logistic regression for assessing person fit. Multivariate Behavioral Research, 43, 50–76.

Woods, C. M., Oltmanns, T. F., & Turkheimer, E. (2008). Detection of aberrant responding on a personality scale in a military sample: An application of evaluating person fit with two-level logistic regression. Psychological Assessment, 20, 159–168.

The curriculum vitae is not included in the online version for reasons of data protection.