For variational quantum algorithms with a finite amount of measurement data, the present author reported in [Y. S. Teo, Phys. Rev. A 107, 042421 (2023)] the statistical benefits of several optimized numerical estimation schemes, including the scaled parameter-shift (SPS) rule and finite-difference (FD) method, over analytical schemes [unscaled parameter-shift (PS) rule] for estimating gradient and Hessian functions. We continue the saga by exploring the extent to which these numerical schemes remain statistically more accurate for a given number of sampling copies in the presence of noise. For noise-channel error terms that are independent of the circuit parameters, we demonstrate that, without any knowledge about the noise channel, the SPS and FD estimators optimized specifically for noiseless circuits can still give lower mean-squared errors than PS estimators over substantially wide ranges of sampling-copy numbers. For SPS specifically, closed-form mean-squared-error expressions reveal that these ranges grow exponentially in the qubit number and reciprocally with a decreasing error rate. Simulations demonstrate similar characteristics for the FD scheme. Lastly, if the error rate is known, we propose a noise-model-agnostic error-mitigation procedure to optimize the SPS estimators under the assumptions of two-design circuits and circuit-parameter-independent noise-channel error terms. We show that these heuristically optimized SPS estimators can significantly reduce the mean-squared-error biases that naive SPS estimators possess, even with realistic circuits and noise channels, thereby improving their estimation quality even further. The heuristically optimized FD estimators possess as much mean-squared-error bias as their naively optimized counterparts, and are thus not beneficial with noisy circuits.