Aptitude and achievement tests typically use different response formats. A fundamental distinction is made between the class of multiple-choice formats and constructed response formats. Previous studies have examined the impact of different response formats using traditional statistical approaches, but these influences can also be studied with item response theory methods that handle incomplete data. Response formats can influence item attributes in two ways: different response formats could cause items to measure different latent traits, or they could contribute differently to item difficulty. In contrast to previous research, the present study examines the impact of response formats on the item attributes of a language awareness test using different item response theory models. Results indicate that although the language awareness test contains items with different response formats, only one latent trait is measured; no format-specific dimensions were found. Response formats do, however, have a distinct impact on item difficulty. In addition to the effects of the three administered item types, a fourth component that makes items more difficult was identified.
Large-scale assessments usually use booklet designs in which the same item is administered at different positions within a booklet. The occurrence of position effects influencing item difficulty is therefore a crucial issue: not taking learning or fatigue effects into account would bias the estimated item difficulties. The occurrence of position effects is examined for a 4th-grade mathematical competence test of the Austrian Educational Standards by means of the linear logistic test model (LLTM). A small simulation study assesses the statistical power of this model test. Overall, the LLTM without a modelled position effect yielded a good model fit; therefore, no relevant global item position effect was found for the analysed mathematical competence test.
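As background for the approach sketched in this abstract, the LLTM constrains Rasch item difficulties to a weighted sum of basic parameters, so a global position effect can be modelled as one additional basic parameter. A standard formulation (the notation here is generic textbook notation, not taken from the study itself):

```latex
% Rasch model: probability that person v solves item i
P(X_{vi} = 1 \mid \theta_v, \beta_i) = \frac{\exp(\theta_v - \beta_i)}{1 + \exp(\theta_v - \beta_i)}

% LLTM: item difficulty decomposed into m basic parameters \eta_j
% with known weights w_{ij} and a normalization constant c
\beta_i = \sum_{j=1}^{m} w_{ij}\,\eta_j + c
```

A global item position effect would enter as an extra basic parameter weighted by the item's booklet position; comparing the fit of this extended model against the LLTM without the position parameter tests whether such an effect is present.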
Multiple-choice response formats are troublesome, as an item is often scored as solved simply because the examinee was lucky in guessing the correct option. Instead of pertinent Item Response Theory models, which take guessing effects into account, this paper considers a psycho-technological approach to re-conceptualizing multiple-choice response formats. The free-response format is compared with two different multiple-choice formats: a traditional format with a single correct response option and five distractors ('1 of 6'), and another with five response options, three of them distractors and two of them correct ('2 of 5'). For the latter format, an item is scored as mastered only if both correct response options and none of the distractors are marked. After the exclusion of a few items, the Rasch model analyses revealed appropriate fit for 188 items altogether. The resulting item-difficulty parameters were used for comparison. The multiple-choice format '1 of 6' differs significantly from the multiple-choice format '2 of 5', while the latter does not differ significantly from the free-response format. The lower difficulty of the '1 of 6' items suggests guessing effects.
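The dichotomous scoring rules described in this abstract can be sketched in a few lines. This is a minimal illustration, assuming responses are recorded as marked option labels (the function names and data representation are illustrative, not from the paper):

```python
def score_2_of_5(marked, correct):
    """Score a '2 of 5' multiple-choice item dichotomously.

    The item counts as mastered (1) only if exactly the two correct
    options, and none of the three distractors, are marked; any other
    response pattern scores 0.
    """
    return 1 if set(marked) == set(correct) else 0


def score_1_of_6(marked_option, correct_option):
    """Score a traditional '1 of 6' item: the single marked option
    is compared against the one correct option."""
    return 1 if marked_option == correct_option else 0
```

For example, `score_2_of_5({'B', 'D'}, {'B', 'D'})` yields 1, whereas marking an extra distractor, as in `score_2_of_5({'B', 'D', 'E'}, {'B', 'D'})`, yields 0. This all-or-nothing rule is what reduces the chance of scoring by lucky guessing relative to the '1 of 6' format.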
The validity and psychometric properties of a new Persian adaptation of the Foreign Language Reading Anxiety Scale were investigated. The scale was translated into Persian and administered to 160 undergraduate students (131 women, 29 men; M age = 23.4 yr., SD = 4.3). Rasch model analysis of the scale's original 20 items revealed that the data do not fit the partial credit model. Principal components analysis identified three factors: one related to feelings of anxiety about reading, a second reflecting the reverse-worded items, and a third related to general ideas about reading in a foreign language. In a re-analysis, the 12 items that loaded on the first factor showed a good fit with the partial credit model.
The two intelligence test batteries HAWIK-IV and AID 2 are compared with respect to the assessment of intellectual giftedness. Two models of giftedness assessment serve as the starting point: the traditional approach on the one hand, in which (cognitive) giftedness is assumed at an IQ > 130, and the "Wiener Diagnosemodell zum Hochleistungspotenzial" (Viennese diagnostic model of high achievement potential) on the other. Following the "Münchner Hochbegabungsmodell" (Munich model of giftedness), the latter postulates, in addition to ability factors such as intelligence in particular, certain personality and environmental characteristics as moderators of the manifestation of achievement. The discussion of HAWIK-IV and AID 2 shows that neither test battery does equal justice to both models: the HAWIK-IV is more suitable for traditional giftedness assessment, whereas the AID 2 is particularly well suited for intervention-oriented assessment in the sense of the "Wiener Diagnosemodell zum Hochleistungspotenzial". In practice, one must therefore first decide which model one is committed to before selecting the optimal intelligence test battery.