Massive neutrinos suppress the growth of structure below their free-streaming scale and leave an imprint on large-scale structure. Measuring this imprint allows us to constrain the sum of neutrino masses, $M_\nu$, a key parameter in particle physics beyond the Standard Model. However, degeneracies among cosmological parameters, especially between $M_\nu$ and $\sigma_8$, limit the constraining power of standard two-point clustering statistics. In this work, we investigate whether we can break these degeneracies and constrain $M_\nu$ with the next higher-order correlation function: the bispectrum. We first examine the redshift-space halo bispectrum of 800 $N$-body simulations from the HADES suite and demonstrate that the bispectrum helps break the $M_\nu$-$\sigma_8$ degeneracy. Then, using 22,000 $N$-body simulations of the Quijote suite, we quantify for the first time the full information content of the redshift-space halo bispectrum down to nonlinear scales using a Fisher matrix forecast of $\{\Omega_m, \Omega_b, h, n_s, \sigma_8, M_\nu\}$. For $k_{\rm max} = 0.5\,h/{\rm Mpc}$, the bispectrum provides $\Omega_m$, $\Omega_b$, $h$, $n_s$, and $\sigma_8$ constraints 1.9, 2.6, 3.1, 3.6, and 2.6 times tighter than the power spectrum. For $M_\nu$, the bispectrum improves the $1\sigma$ constraint from 0.2968 to 0.0572 eV, over 5 times tighter than the power spectrum. Even with priors from Planck, the bispectrum improves $M_\nu$ constraints by a factor of 2.7. Although we reserve marginalizing over a more complete set of bias parameters for the next paper of the series, these constraints are derived for a $(1\,h^{-1}\,{\rm Gpc})^3$ box, a substantially smaller volume than upcoming surveys. Thus, our results demonstrate that the bispectrum offers significant improvements over the power spectrum, especially for constraining $M_\nu$.
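
As an illustration of how a Fisher matrix forecast turns parameter degeneracies into marginalized errors, the following is a minimal toy sketch, not the analysis pipeline of this work. It assumes a Gaussian likelihood with a diagonal, parameter-independent covariance and only two parameters, which here stand in loosely for $M_\nu$ and $\sigma_8$. All derivative values, variances, and function names are hypothetical and chosen only so that the first statistic responds almost identically to both parameters while the second does not.

```python
import math

def fisher_matrix(derivs, variances):
    """Fisher matrix F_ij = sum_k (dD_k/dtheta_i)(dD_k/dtheta_j) / var_k.

    Assumes a diagonal data covariance (an illustrative simplification).
    derivs[i][k] is the derivative of data bin k with respect to parameter i.
    """
    n = len(derivs)
    return [
        [sum(di * dj / v for di, dj, v in zip(derivs[i], derivs[j], variances))
         for j in range(n)]
        for i in range(n)
    ]

def marginalized_errors_2x2(F):
    """Return sqrt of the diagonal of F^{-1} for a 2x2 Fisher matrix."""
    det = F[0][0] * F[1][1] - F[0][1] * F[1][0]
    return math.sqrt(F[1][1] / det), math.sqrt(F[0][0] / det)

# Hypothetical toy inputs: three data bins with unit variance each.
variances = [1.0, 1.0, 1.0]

# Statistic "P": responses to the two parameters are nearly parallel,
# so the parameters are almost degenerate.
derivs_P = [[1.0, 1.0, 1.0],
            [1.0, 1.0, 1.1]]

# Statistic "B": responses differ strongly from bin to bin,
# so the two parameters can be told apart.
derivs_B = [[1.0, 0.5, 0.2],
            [0.2, 0.5, 1.0]]

sigma_P = marginalized_errors_2x2(fisher_matrix(derivs_P, variances))
sigma_B = marginalized_errors_2x2(fisher_matrix(derivs_B, variances))

print("marginalized errors from P:", sigma_P)
print("marginalized errors from B:", sigma_B)
print("improvement factors:", tuple(p / b for p, b in zip(sigma_P, sigma_B)))
```

In this toy setup the marginalized errors from the degenerate statistic are much larger than those from the second one, even though both have the same number of bins and the same noise level. The gain comes from the different directions of the derivative vectors, which is the sense in which an additional statistic can break a degeneracy.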