In this study, the researchers assessed the comparability of item statistics from the 2017 National Examinations Council (NECO) Basic Education Certificate mathematics examination under the Classical Test Theory (CTT) and Item Response Theory (IRT) measurement frameworks. The study adopted an instrumentation design. The 60-item NECO Basic Education Certificate mathematics objective test, Paper I, was administered to 978 Basic Nine examinees randomly selected from Osogbo and Olorunda Local Government Areas, Osun State, Nigeria. The examinees' responses were analysed using Marginal Maximum Likelihood Estimation in the jMetrik software. The results showed that the test data satisfied the unidimensionality assumption of the three-parameter logistic (3PL) model, and that the CTT framework led to the deletion of more items, 33 (55%), than the IRT framework, 12 (20%). Item statistics from the two contrasting frameworks (CTT and IRT) were not comparable, and further analysis showed low correlations among the item statistic indices. The implication is that NECO should jettison Classical Test Theory and adopt the Item Response Theory framework for test development and item analysis.
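To make the two frameworks concrete, the following is a minimal sketch of the kind of item statistics being compared: the CTT difficulty index (proportion correct) and a corrected item-total correlation as the discrimination index, alongside the standard 3PL item response function. It is not the study's jMetrik analysis; the function names `ctt_item_stats` and `three_pl` and the toy response matrix are illustrative assumptions, and no IRT parameter estimation is performed here.

```python
import numpy as np


def ctt_item_stats(responses):
    """CTT statistics for a 0/1 response matrix (rows: examinees, columns: items).

    Returns (difficulty, discrimination), where difficulty is the proportion
    correct and discrimination is the correlation between each item and the
    total score on the remaining items (corrected item-total correlation).
    """
    responses = np.asarray(responses, dtype=float)
    difficulty = responses.mean(axis=0)
    total = responses.sum(axis=1)
    n_items = responses.shape[1]
    discrimination = np.full(n_items, np.nan)
    for j in range(n_items):
        item = responses[:, j]
        rest = total - item  # exclude the item from its own criterion score
        if item.std() > 0 and rest.std() > 0:
            discrimination[j] = np.corrcoef(item, rest)[0, 1]
    return difficulty, discrimination


def three_pl(theta, a, b, c):
    """3PL probability of a correct response at ability theta.

    a: discrimination, b: difficulty, c: pseudo-guessing (lower asymptote).
    """
    return c + (1.0 - c) / (1.0 + np.exp(-a * (theta - b)))


# Hypothetical toy data: 4 examinees, 2 items (not from the study).
toy = [[1, 0], [1, 1], [0, 0], [1, 1]]
difficulty, discrimination = ctt_item_stats(toy)
```

Note the opposite direction of the two difficulty scales: a higher CTT proportion correct means an easier item, whereas a higher IRT `b` means a harder one, which is one reason the indices are usually compared through correlations rather than raw values.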
A good item that measures the intended domain is expected to be free of bias. However, several studies have confirmed that some test items are biased with respect to the group to which testees belong. A generally accepted analytical technique for detecting such bias is Differential Item Functioning (DIF) analysis, which Item Response Theory (IRT) offers as a way to check for differences in psychometric properties across the groups to which testees belong. This study therefore used the DIF technique to detect gender-biased items in a teacher-made Chemistry test. BILOG-MG was used to analyse the responses of 350 Senior Secondary School Two (SSII) students (183 males and 167 females) randomly drawn from 10 schools in Obio/Akpor Local Government Area of Port Harcourt, Rivers State, Nigeria. The study showed that 53 of the 100 items were biased: 26 (49.1%) of the 53 favoured females and 27 (50.9%) favoured males, confirming the presence of gender bias. DIF analysis is effective in detecting group bias in test items. The study concluded that scale developers should always carry out Differential Item Functioning analysis before collating the final items for a test.
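As a rough illustration of what a DIF check does, the sketch below computes the Mantel-Haenszel common odds ratio for one item, matching examinees on total score. This is a different, non-IRT technique from the BILOG-MG analysis used in the study, swapped in here only because it fits in a few lines; the function name `mantel_haenszel_odds_ratio`, the 0/1 group coding, and the toy data are all assumptions.

```python
import numpy as np


def mantel_haenszel_odds_ratio(item, total, group):
    """Mantel-Haenszel common odds ratio for a single dichotomous item.

    item:  0/1 responses to the studied item
    total: matching variable (e.g. total test score) used to form strata
    group: 0 for the reference group, 1 for the focal group (assumed coding)

    A value near 1 suggests no DIF; above 1 the item favours the reference
    group, below 1 the focal group, among examinees of equal total score.
    """
    item = np.asarray(item)
    total = np.asarray(total)
    group = np.asarray(group)
    numerator = 0.0
    denominator = 0.0
    for score in np.unique(total):
        stratum = total == score
        ref = stratum & (group == 0)
        foc = stratum & (group == 1)
        a = np.sum(item[ref] == 1)  # reference group, correct
        b = np.sum(item[ref] == 0)  # reference group, incorrect
        c = np.sum(item[foc] == 1)  # focal group, correct
        d = np.sum(item[foc] == 0)  # focal group, incorrect
        n = a + b + c + d
        numerator += a * d / n
        denominator += b * c / n
    return numerator / denominator if denominator > 0 else float("nan")


# Hypothetical toy data: one score stratum, 4 reference and 4 focal examinees.
toy_item = [1, 1, 1, 0, 1, 0, 0, 0]
toy_total = [5] * 8
toy_group = [0, 0, 0, 0, 1, 1, 1, 1]
alpha_mh = mantel_haenszel_odds_ratio(toy_item, toy_total, toy_group)
```

In practice the odds ratio would be computed for every item and accompanied by a significance test before an item is labelled biased; the sketch omits that step.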