In their research articles, scholars often use 2 × 2 tables or tree diagrams including natural frequencies in order to illustrate Bayesian reasoning situations to their peers. Interestingly, the effect of these visualizations on participants’ performance has not been tested empirically so far (apart from explicit training studies). In the present article, we report on an empirical study (3 × 2 × 2 design) in which we systematically vary visualization (no visualization vs. 2 × 2 table vs. tree diagram) and information format (probabilities vs. natural frequencies) for two contexts (medical vs. economical context; not a factor of interest). Each of N = 259 participants (students of age 16–18) had to solve two typical Bayesian reasoning tasks (“mammography problem” and “economics problem”). The hypothesis is that 2 × 2 tables and tree diagrams – especially when natural frequencies are included – can foster insight into the notoriously difficult structure of Bayesian reasoning situations. In contrast to many other visualizations (e.g., icon arrays, Euler diagrams), 2 × 2 tables and tree diagrams have the advantage that they can be constructed easily. The implications of our findings for teaching Bayesian reasoning will be discussed.
Changing the information format from probabilities into frequencies as well as employing appropriate visualizations such as tree diagrams or 2 × 2 tables are important tools that can facilitate people’s statistical reasoning. Previous studies have shown that despite their widespread use in statistical textbooks, both of those visualization types are only of restricted help when they are provided with probabilities, but that they can foster insight when presented with frequencies instead. In the present study, we attempt to replicate this effect and also examine, by the method of eye tracking, why probabilistic 2 × 2 tables and tree diagrams do not facilitate reasoning with regard to Bayesian inferences (i.e., determining what errors occur and whether they can be explained by scan paths), and why the same visualizations are of great help to an individual when they are combined with frequencies. All ten inferences of N = 24 participants were based solely on tree diagrams or 2 × 2 tables that presented either the famous “mammography context” or an “economics context” (without additional textual wording). We first asked participants for marginal, conjoint, and (non-inverted) conditional probabilities (or frequencies), followed by related Bayesian tasks. While solution rates were higher for natural frequency questions as compared to probability versions, eye-tracking analyses indeed yielded noticeable differences regarding eye movements between correct and incorrect solutions. For instance, heat maps (aggregated scan paths) of distinct results differed remarkably, thereby making correct and faulty strategies visible in the line of theoretical classifications. Moreover, the inherent structure of 2 × 2 tables seems to help participants avoid certain Bayesian mistakes (e.g., “Fisherian” error) while tree diagrams seem to help steer them away from others (e.g., “joint occurrence”). We will discuss resulting educational consequences at the end of the paper.
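The four kinds of inference named above (marginal, conjoint, non-inverted conditional, and Bayesian) can each be read from a frequency 2 × 2 table. The following is a minimal sketch; the cell counts are hypothetical illustration values and are not taken from the study's materials.

```python
# Hypothetical 2 x 2 frequency table (rows: health status, columns: test result).
# The counts are illustrative assumptions, not data from the study.
table = {
    ("cancer", "positive"): 8,
    ("cancer", "negative"): 2,
    ("no cancer", "positive"): 99,
    ("no cancer", "negative"): 891,
}

total = sum(table.values())


def row_sum(status):
    """Number of people with the given health status (row margin)."""
    return sum(n for (s, _), n in table.items() if s == status)


def col_sum(result):
    """Number of people with the given test result (column margin)."""
    return sum(n for (_, r), n in table.items() if r == result)


# Marginal: P(cancer) and P(positive)
p_cancer = row_sum("cancer") / total
p_positive = col_sum("positive") / total

# Conjoint (joint): P(cancer and positive)
p_joint = table[("cancer", "positive")] / total

# Non-inverted conditional: P(positive | cancer), read along the row
p_pos_given_cancer = table[("cancer", "positive")] / row_sum("cancer")

# Bayesian (inverted) conditional: P(cancer | positive), read along the column
p_cancer_given_pos = table[("cancer", "positive")] / col_sum("positive")
```

With these assumed counts the Bayesian answer is 8 out of 107, roughly 7%. Both conditional quantities use the same cell and differ only in whether the row margin or the column margin serves as the reference set.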
This paper systematically compares two tools for assessing the pedagogical content knowledge (PCK) of mathematics teachers that were used in the framework of the COACTIV study. The first is a paper-and-pencil test consisting of items on three facets: knowledge of explaining and representation, knowledge of student thinking and typical mistakes, and knowledge of the potential of mathematical tasks. The second is a video vignettes instrument that examines teachers' proposed continuations for presented lesson video clips with respect to subject-related and methodological competence aspects. First, both COACTIV PCK assessment tools are contrasted for the first time with respect to their predictive validity for instructional quality (N = 163 German secondary mathematics teachers) as well as for student learning gains (N = 3806 PISA students from 169 different classes) by means of path models. These models show that PCK assessed by the paper-and-pencil test predicts instructional quality better than PCK assessed by the video vignettes instrument. Next, we theoretically propose the cascade model as capable of integrating pertinent theories on teacher competence and instructional quality. This model implies five 'columns' that are ordered according to a sequential causal chain (teacher disposition → situation-specific skills → observable teaching behavior → student mediation → learning gains). Finally, we specify four of the five 'columns' of this cascade model empirically, based on the COACTIV data.
In medicine, diagnoses based on medical test results are probabilistic by nature. Unfortunately, cognitive illusions regarding the statistical meaning of test results are well documented among patients, medical students, and even physicians. There are two effective strategies that can foster insight into what are known as Bayesian reasoning situations: (1) translating the statistical information on the prevalence of a disease and the sensitivity and the false-alarm rate of a specific test for that disease from probabilities into natural frequencies, and (2) illustrating the statistical information with tree diagrams, for instance, or with other pictorial representations. So far, such strategies have only been empirically tested in combination for “1-test cases”, where one binary hypothesis (“disease” vs. “no disease”) has to be diagnosed based on one binary test result (“positive” vs. “negative”). However, in reality, often more than one medical test is conducted to derive a diagnosis. In two studies, we examined a total of 388 medical students from the University of Regensburg (Germany) with medical “2-test scenarios”. Each student had to work on two problems: diagnosing breast cancer with mammography and sonography test results, and diagnosing HIV infection with the ELISA and Western Blot tests. In Study 1 (N = 190 participants), we systematically varied the presentation of statistical information (“only textual information” vs. “only tree diagram” vs. “text and tree diagram in combination”), whereas in Study 2 (N = 198 participants), we varied the kinds of tree diagrams (“complete tree” vs. “highlighted tree” vs. “pruned tree”). All versions were implemented in probability format (including probability trees) and in natural frequency format (including frequency trees). We found that natural frequency trees, especially when the question-related branches were highlighted, improved performance, but that none of the corresponding probabilistic visualizations did.
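Strategy (1), translating prevalence, sensitivity, and false-alarm rate into natural frequencies, can be sketched for the simpler 1-test case as follows. The function names, the reference population of 1,000, and the input values are assumptions chosen for illustration; they do not come from the studies described above.

```python
def natural_frequencies(prevalence, sensitivity, false_alarm_rate, population=1000):
    """Translate three probabilities into expected counts (the nodes of a frequency tree).

    Counts are rounded to whole persons, as natural frequencies usually are.
    """
    sick = round(population * prevalence)
    healthy = population - sick
    true_pos = round(sick * sensitivity)
    false_pos = round(healthy * false_alarm_rate)
    return {"sick": sick, "healthy": healthy,
            "true_pos": true_pos, "false_pos": false_pos}


def ppv_from_frequencies(tree):
    """Positive predictive value: sick-and-positive out of all positives."""
    return tree["true_pos"] / (tree["true_pos"] + tree["false_pos"])


def ppv_from_probabilities(prevalence, sensitivity, false_alarm_rate):
    """The same quantity computed with Bayes' rule in probability format."""
    hit = prevalence * sensitivity
    return hit / (hit + (1 - prevalence) * false_alarm_rate)


# Hypothetical inputs: 1% prevalence, 80% sensitivity, 10% false-alarm rate.
tree = natural_frequencies(0.01, 0.80, 0.10)
ppv_freq = ppv_from_frequencies(tree)
ppv_prob = ppv_from_probabilities(0.01, 0.80, 0.10)
```

Under these assumed inputs the tree reads "10 of 1,000 are sick, 8 of those 10 test positive, 99 of the 990 healthy also test positive", so the answer is 8 out of 107. Both routes give the same value; the frequency route simply replaces the normalization step of Bayes' rule with a count of positives.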
For more than 20 years, research has proven the beneficial effect of natural frequencies when it comes to solving Bayesian reasoning tasks (Gigerenzer and Hoffrage, 1995). In a recent meta-analysis, McDowell and Jacobs (2017) showed that presenting a task in natural frequency format increases performance rates to 24% compared to only 4% when the same task is presented in probability format. Nevertheless, on average three quarters of participants in their meta-analysis failed to obtain the correct solution for such a task in frequency format. In this paper, we present an empirical study on what participants typically do wrong when confronted with natural frequencies. We found that many of them did not actually use natural frequencies for their calculations, but translated them back into complicated probabilities instead. This switch from the intuitive presentation format to a less intuitive calculation format will be discussed within the framework of psychological theories (e.g., the Einstellung effect).
Humans are exposed to pyrrolizidine alkaloids (PA) through different sources, mainly from contaminated foodstuff. Teas and herbal infusions (T&HI) can be contaminated by PA-producing weeds. PA can possess toxic, mutagenic, genotoxic, and carcinogenic properties. Thus, possible health risks for the general population are under debate. There is a strong safety record for T&HI and, additionally, epidemiological evidence for the preventive effects of regular tea consumption on cardiovascular events and certain types of cancer. There is no epidemiological evidence, however, for human risks of regular low-dose PA exposure. Recommended regulatory PA threshold values are based on experimental data only and accept large uncertainties. If a general risk exists through PA-contaminated T&HI, it must be small compared to other frequently accepted risks of daily living and the proven health effects of T&HI. Decision making should be based on a balanced risk-benefit analysis. Based on analyses of the scientific data currently available, it is concluded that the benefits of drinking T&HI clearly outweigh the negligible health risk of possible PA contamination. At the same time, manufacturers must continue their efforts to secure good product quality and to be transparent about their measures of quality control and risk communication.
When physicians are asked to determine the positive predictive value from the a priori probability of a disease and the sensitivity and false positive rate of a medical test (Bayesian reasoning), misjudgments with serious consequences often occur. In daily clinical practice, however, it is not only important that doctors have a tool with which they can judge correctly; the speed of these judgments is also a crucial factor. In this study, we analyzed accuracy and efficiency in medical Bayesian inferences. In an empirical study, we varied information format (probabilities vs. natural frequencies) and visualization (text only vs. tree only) for four contexts. A total of 111 medical students participated by working on four Bayesian tasks with common medical problems. The correctness of their answers was coded and the time spent on each task was recorded. The median time for a correct Bayesian inference was shortest in the version with a frequency tree (2:55 min), compared to the version with a probability tree (5:47 min) and to the text-only versions based on natural frequencies (4:13 min) or probabilities (9:59 min). The diagnostic efficiency score (calculated as median time divided by the percentage of correct inferences) was best in the version with a frequency tree (4:53 min). Frequency trees allow more accurate and faster judgments. Improving correctness and efficiency in Bayesian tasks might help to decrease overdiagnosis in daily clinical practice, which causes costs and might endanger patient safety.
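The diagnostic efficiency score described above (median time divided by the share of correct inferences) can be sketched as follows. This assumes the percentage enters the formula as a proportion between 0 and 1, which the abstract does not state explicitly; the helper names and the example inputs are hypothetical.

```python
def to_seconds(mmss):
    """Convert a 'minutes:seconds' string such as '2:55' into seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def to_mmss(seconds):
    """Convert a number of seconds back into 'minutes:seconds', rounded to whole seconds."""
    whole = round(seconds)
    return f"{whole // 60}:{whole % 60:02d}"


def diagnostic_efficiency(median_time_s, proportion_correct):
    """Median time per correct inference, scaled up by how rarely inferences are correct.

    Assumption: proportion_correct is a share in (0, 1], not a value in percent.
    """
    if not 0 < proportion_correct <= 1:
        raise ValueError("proportion_correct must lie in (0, 1]")
    return median_time_s / proportion_correct


# Hypothetical example: a median of 3:00 min with half of the inferences correct.
score = to_mmss(diagnostic_efficiency(to_seconds("3:00"), 0.5))
```

Read this way, the score can be interpreted as the expected time invested per correct judgment: a version that is fast but rarely solved correctly scores worse than its raw median time suggests.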