Muscle strength, the maximal force-generating capacity of a muscle or group of muscles, is regularly assessed in physiological experiments and clinical trials. An understanding of the expected variation in strength and the factors that contribute to this variation is important when designing experiments, describing methodologies, interpreting results, and attempting to replicate the methods of others and reproduce their findings. In this review (Cores of Reproducibility in Physiology), we report on the intra- and inter-rater reliability of tests of upper and lower limb muscle strength and voluntary activation in humans. Isometric, isokinetic, and isoinertial strength exhibit good intra-rater reliability in most samples (correlation coefficients ≥0.90). However, some tests of isoinertial strength exhibit systematic bias that is not resolved by familiarization. With the exception of grip strength, few attempts have been made to examine the inter-rater reliability of tests of muscle strength. The acute factors most likely to affect muscle strength and serve as a source of its variation from trial to trial or day to day include attentional focus, breathing technique, remote muscle contractions, rest periods, temperature (core and muscle), time of day, visual feedback, body and limb posture, body stabilization, acute caffeine consumption, dehydration, pain, fatigue from preceding exercise, and static stretching lasting >60 s. Voluntary activation, the nervous system’s ability to drive a muscle to create its maximal force, exhibits good intra-rater reliability when examined with twitch interpolation (correlation coefficients >0.80). However, its inter-rater reliability has not been formally examined. The methodological factors most likely to influence voluntary activation are myograph compliance and sensitivity; stimulation location, intensity, and inadvertent stimulation of antagonists; joint angle (muscle length); and the resting twitch.
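
For orientation, voluntary activation estimated by twitch interpolation is conventionally expressed as a percentage. The abstract does not state the formula, so the expression below is the commonly used form, given here as an assumption, and the symbols $T_{\mathrm{superimposed}}$ and $T_{\mathrm{resting}}$ are illustrative names rather than the authors' notation.

```latex
% Conventional twitch-interpolation estimate of voluntary activation (VA).
% T_superimposed: twitch evoked by stimulation during a maximal voluntary contraction.
% T_resting:      twitch evoked by the same stimulus in the relaxed muscle.
\[
  \mathrm{VA}\,(\%) = \left( 1 - \frac{T_{\mathrm{superimposed}}}{T_{\mathrm{resting}}} \right) \times 100
\]
```

Because the resting twitch is the denominator in this form, any factor that changes its amplitude changes the computed voluntary activation even when the superimposed twitch is unchanged, which is consistent with its inclusion among the methodological factors listed above.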