Comparing the slopes of aqueous-based and standard addition calibration lines is an almost daily task in analytical laboratories. Because usual protocols involve very few standards, sound statistical inference is hard to obtain from classical tests (e.g., the t-test), which may greatly affect decision-making. There is therefore a need for robust statistics that are not distorted by the small samples of experimental values typical of analytical studies. Several promising alternatives based on bootstrapping are studied in this paper under the constraints common in laboratory work. The impact of the number of standards, homoscedasticity or heteroscedasticity, three variance patterns, and three error distributions on least-squares fits was considered (144 simulation scenarios in total). Student's t-test is the most valuable procedure when the errors are normally distributed and homoscedastic, although it can be highly affected by outliers. A wild bootstrap method yields average rejection percentages closer to the nominal level in almost every situation, and it is recommended for laboratories working with a small number of standards. The Theil-Sen percentile bootstrap statistic is very robust, but its rejection percentages depart from the nominal ones (<5%), so its use is not recommended when the number of standards is very small. Finally, a tutorial and free software are provided to encourage analytical laboratories to apply bootstrap principles when comparing the slopes of two calibration lines.
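To make the wild bootstrap idea concrete, the following is a minimal sketch of one possible slope-comparison test. It is not necessarily the variant evaluated in this paper or implemented in the accompanying software. It assumes Rademacher (±1) weights, resampling under the null hypothesis of a common slope with separate intercepts, and a studentized slope difference as the test statistic. The function names (`slope_and_se`, `t_stat`, `wild_bootstrap_slope_test`) and the default of 2000 replicates are hypothetical choices made for illustration.

```python
import numpy as np


def slope_and_se(x, y):
    """Ordinary least-squares slope and its standard error."""
    xc = x - x.mean()
    sxx = np.sum(xc ** 2)
    b = np.sum(xc * y) / sxx
    a = y.mean() - b * x.mean()
    resid = y - (a + b * x)
    s2 = np.sum(resid ** 2) / (len(x) - 2)  # residual variance, n - 2 d.o.f.
    return b, np.sqrt(s2 / sxx)


def t_stat(x1, y1, x2, y2):
    """Studentized slope difference (variances not pooled)."""
    b1, se1 = slope_and_se(x1, y1)
    b2, se2 = slope_and_se(x2, y2)
    return (b1 - b2) / np.sqrt(se1 ** 2 + se2 ** 2)


def wild_bootstrap_slope_test(x1, y1, x2, y2, n_boot=2000, seed=0):
    """Return (observed statistic, two-sided wild bootstrap p-value)."""
    rng = np.random.default_rng(seed)
    x1, y1, x2, y2 = (np.asarray(v, dtype=float) for v in (x1, y1, x2, y2))
    t_obs = t_stat(x1, y1, x2, y2)

    # Null model (assumption of this sketch): common slope, separate intercepts.
    xc1, xc2 = x1 - x1.mean(), x2 - x2.mean()
    b0 = (np.sum(xc1 * y1) + np.sum(xc2 * y2)) / (np.sum(xc1 ** 2) + np.sum(xc2 ** 2))
    fit1 = y1.mean() + b0 * xc1
    fit2 = y2.mean() + b0 * xc2
    r1, r2 = y1 - fit1, y2 - fit2

    count = 0
    for _ in range(n_boot):
        # Each residual keeps its own magnitude (so heteroscedasticity is
        # preserved) and only its sign is randomized.
        ys1 = fit1 + r1 * rng.choice([-1.0, 1.0], size=r1.size)
        ys2 = fit2 + r2 * rng.choice([-1.0, 1.0], size=r2.size)
        if abs(t_stat(x1, ys1, x2, ys2)) >= abs(t_obs):
            count += 1

    # Add-one correction keeps the p-value strictly positive.
    return t_obs, (count + 1) / (n_boot + 1)
```

A call such as `wild_bootstrap_slope_test(x_aq, y_aq, x_sa, y_sa)`, where the four arrays hold the concentrations and signals of the aqueous and standard addition calibrations (hypothetical names), returns the observed statistic and a p-value to compare with the chosen nominal level. With very few standards the number of distinct sign patterns is small (2^n per line), so the attainable p-values are coarse.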