
Entwicklung einer computerbasierten Schwierigkeitsabschätzung von Leseverstehensaufgaben (Development of a computer-based difficulty estimation for reading comprehension items)

NEPS Working Paper (January 2016)

Abstract

The prediction of item difficulties in reading assessments has major implications for test development: item writing can be organised more efficiently because the need for costly pilot studies decreases. In addition, it yields new insights into the construct of reading competency and the processing of multiple-choice items. Test development is time-consuming: more items have to be written than the final test form requires, because many items are discarded in pilot studies on account of their psychometric properties. Predicting item difficulty could help make this process more efficient, as very easy or very difficult items could be identified and revised beforehand. Furthermore, the test difficulty can be matched in an informed way to the competency level of the population to maximise test information. For practical application in test development, the estimated item difficulty must be available promptly. Previous research has shown that rating item features can be cumbersome. To overcome these issues, a new methodological approach is suggested that draws on quantitative linguistics: word frequencies are analysed to assess vocabulary, and part-of-speech tagging is used to assess propositional density. These item features are then used in a linear logistic test model (LLTM) to predict item difficulties in a main study with 9th graders. The data analysis shows that the resulting model fits the data reasonably well and that the construct validity of the NEPS reading tests is supported. As age differences in language components such as vocabulary and verbal working memory are well known in adulthood, a further pilot study investigated whether the results could be generalized. The results showed a similarly good prediction of the item difficulties. However, in the LLTM the effects of the item components had much larger confidence intervals. Thus, the generalization of the results in adolescence remains uncertain.
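To illustrate how an LLTM turns item features into predicted difficulties, the following minimal sketch applies the standard decomposition, in which each item difficulty is a weighted sum of its feature values plus a constant, and then evaluates the Rasch response probability. The feature columns, the weights in `ETA`, the constant `C`, and all numbers are hypothetical toy values chosen for illustration; they are not estimates from the working paper, and the sketch does not estimate any parameters from data.

```python
import math

# Hypothetical feature matrix: one row per item.
# Column 0: assumed share of low-frequency words in the item.
# Column 1: assumed propositional density of the item.
# All values are made-up toy numbers, not results from the paper.
Q = [
    [0.10, 0.30],
    [0.25, 0.45],
    [0.40, 0.60],
]
ETA = [2.0, 1.5]  # assumed basic parameters (one weight per feature)
C = -1.0          # assumed normalization constant


def lltm_difficulty(features, eta, c=0.0):
    """LLTM decomposition: beta_i = sum_k q_ik * eta_k + c."""
    return sum(q * e for q, e in zip(features, eta)) + c


def rasch_probability(theta, beta):
    """Probability of a correct response for ability theta and difficulty beta."""
    return 1.0 / (1.0 + math.exp(-(theta - beta)))


# Predicted difficulty for each item from its features alone.
difficulties = [lltm_difficulty(row, ETA, C) for row in Q]

# Solution probabilities for a person of average ability (theta = 0).
probabilities = [rasch_probability(0.0, beta) for beta in difficulties]
```

In this toy setup, items with more rare words and higher propositional density receive larger predicted difficulties and therefore lower solution probabilities. In practice the weights would be estimated from response data rather than set by hand.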
