“…Hubwieser and Mühling (2015) applied item response theory (IRT) to an older version of the contest (the German Bebras Contest of 2009) than the versions referenced in the current study. Other studies have used qualitative evaluation (e.g., content analysis or rubrics) or quantitative measures, such as questionnaires or success rates, to identify item difficulty (Izu et al., 2017; van der Vegt, 2018). Past studies have pointed out that the items chosen from the Bebras item pool represent a joint construct; nevertheless, problems with item quality remain (Hubwieser & Mühling, 2015).…”