The MATH taxonomy classifies questions according to the mathematical skills required to answer them. It was created to aid the development of more balanced assessments in undergraduate mathematics and has since been used to compare different assessment regimes across school and university. To date, there has been no systematic investigation of the reliability of the taxonomy when applied by multiple coders, and it has only been applied in a limited range of contexts. In this paper, we outline a calibration process that enabled four novice coders to attain a high level of inter-rater reliability. In addition, we report the results of applying the taxonomy to different secondary school exams and to all assessment questions in a first-year university mathematics module. The results confirm previous findings that the mix of skills assessed differs between school and university mathematics exams, although the assessment profile we find in the university module differs notably from that reported in previous work. The calibration process we describe has the potential to be used more widely, enabling reliable use of the MATH taxonomy to give insight into assessment practices.