In ∼20 BCE, Marcus Vipsanius Agrippa, a powerful Roman general and Augustus Caesar's right-hand man, decreed that the Roman 'foot' would be precisely 11.65 in., the length of Agrippa's own foot.¹ Prior to this, the length of a foot was based on each individual's own foot, which can vary by several inches depending on height. The importance of standardizing the length of a foot cannot be overstated, as it paved the way for the Romans' achievements in engineering and construction, including aqueducts, roadways, and monuments. This system of measurement brought order to randomness. It did not really matter what the actual length of a foot was, only that it was precise, reproducible, and accepted as the standard.

Although the Kansas City Cardiomyopathy Questionnaire (KCCQ) can reliably and validly assess health status in patients with heart failure, the current reporting of health status measures in clinical trials resembles the era before Agrippa. Stogios et al.,² in their perspective piece, call for standardized reporting of these measures. We could not agree more.

The KCCQ is a continuous measure with scores ranging from 0 to 100. As with any continuous measure, the interpretation of outcomes, particularly average outcomes, can be more challenging than with discrete events such as hospitalizations or deaths. When trials compare means or mean changes in scores, the clinical significance of these differences may not be clear, particularly in large studies with higher statistical power. For example, few individual patients, if any, actually change by the same amount as the mean of the population. More often, some patients improve a lot, some a little, some do not change, and others get worse. Without reporting the proportions of patients with different magnitudes of clinical change, it is difficult to interpret the benefits of therapy on patients' health status.³ Of course, the same applies to dichotomous outcomes such as death. If a trial shows a 2% difference in deaths between the groups, obviously this does not mean that every patient in the worse group is 2% dead or 2% more dead. Moreover, it may be difficult to judge whether a certain percentage reduction in death is meaningful, especially if there are risks from treatment, as this also requires interpretation and judgement.
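
The contrast between a mean change and the distribution of individual changes can be made concrete with a small numerical sketch. The per-patient changes below are invented for illustration, and the cut points of 5 and 20 points are assumptions chosen for this example rather than values taken from the text; the function names are likewise hypothetical.

```python
from statistics import mean

# Hypothetical per-patient changes in score (follow-up minus baseline).
# These values are invented for illustration only.
changes = [25, 22, 8, 6, 3, 0, -1, -2, -7, -14]

# Illustrative cut points for categorizing change (assumptions).
SMALL = 5
LARGE = 20


def categorize(change):
    """Assign one patient's change in score to a category."""
    if change >= LARGE:
        return "large improvement"
    if change >= SMALL:
        return "small-to-moderate improvement"
    if change > -SMALL:
        return "no meaningful change"
    return "worsening"


def summarize(values):
    """Return the proportion of patients falling in each category."""
    counts = {}
    for value in values:
        label = categorize(value)
        counts[label] = counts.get(label, 0) + 1
    return {label: count / len(values) for label, count in counts.items()}


mean_change = mean(changes)
proportions = summarize(changes)

print(f"Mean change: {mean_change}")
for label, share in proportions.items():
    print(f"{label}: {share:.0%}")
```

In this made-up sample, no patient changes by exactly the mean amount, while the proportions show how many patients improved, stayed the same, or got worse, which is the information a mean alone does not convey.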