In this paper, we present a novel implementation of Bayesian cosmic microwave background (CMB) component separation. We sample from the full posterior distribution using the No-U-Turn Sampler (NUTS), a gradient-based sampling algorithm. Alongside this, we introduce new foreground modelling approaches. We use the mean shift algorithm to define regions on the sky, clustering according to naively estimated foreground spectral parameters. Over these regions we adopt a complete pooling model, where we assume constant spectral parameters, and a hierarchical model, where we model individual pixel spectral parameters as being drawn from underlying hyperdistributions. We validate the algorithm against simulations of the LiteBIRD and C-Band All-Sky Survey (C-BASS) experiments, with an input tensor-to-scalar ratio of r = 5 × 10−3. Considering multipoles 30 ≤ ℓ < 180, we are able to recover estimates for r. With LiteBIRD-only observations, and using the complete pooling model, we recover r = (12.9 ± 1.4) × 10−3. For C-BASS and LiteBIRD observations we find r = (9.0 ± 1.1) × 10−3 using the complete pooling model, and r = (5.2 ± 1.0) × 10−3 using the hierarchical model. Unlike the complete pooling model, the hierarchical model captures pixel-scale spatial variations in the foreground spectral parameters, and therefore produces cosmological parameter estimates with reduced bias, without inflating their uncertainties. Measured by the rate of effective sample generation, NUTS offers performance improvements of ∼103 over using Metropolis–Hastings to fit the complete pooling model. The efficiency of NUTS allows us to fit the more sophisticated hierarchical foreground model that would likely be intractable with non-gradient-based sampling algorithms.