Explaining the results of defect prediction models is of practical value but challenging to achieve. Recently, Jiarpakdee et al. [1] proposed using two state-of-the-art model-agnostic techniques (i.e., LIME and BreakDown) to explain prediction results. Their study showed that model-agnostic techniques can achieve remarkable performance and that the generated explanations can assist developers in understanding the prediction results. However, they examined LIME and BreakDown in only a single defect prediction setting, which calls into question the consistency and reliability of model-agnostic techniques on defect prediction models under various settings. In this paper, we set out to investigate the reliability and stability of explanation generation approaches based on model-agnostic techniques, i.e., LIME and BreakDown, on defect prediction models under different settings, e.g., the data sampling techniques, machine learning classifiers, and prediction scenarios used when building the models. Specifically, we use both LIME and BreakDown to generate explanations for the same instance under defect prediction models built with different settings and then check the consistency of the generated explanations for that instance. We reuse the defect data from Jiarpakdee et al. in our experiments. The results show that both LIME and BreakDown generate inconsistent explanations for the same test instances under different defect prediction settings, which implies that model-agnostic techniques are unreliable for practical explanation generation. In addition, our manual analysis shows that none of the generated explanations reflect the root causes of the predicted defects, which further weakens the usefulness of model-agnostic-based explanation generation. Overall, with this study, we urge a revisit of existing model-agnostic-based studies in software engineering and call for more research on explainable defect prediction towards achieving reliable and stable explanation generation.
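
To illustrate the kind of consistency check described above, the following is a minimal sketch (not the authors' exact pipeline) of generating LIME explanations for one test instance under two different classifier settings and comparing the top-ranked features. It assumes scikit-learn and the `lime` package; the random data, feature names, and overlap metric are placeholders for illustration only.

```python
# Hypothetical sketch: compare LIME explanations for the same instance
# across two defect prediction settings (different classifiers).
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from lime.lime_tabular import LimeTabularExplainer

rng = np.random.RandomState(0)
X_train = rng.rand(200, 5)                       # placeholder software metrics
y_train = rng.randint(0, 2, 200)                 # placeholder defect labels
feature_names = [f"metric_{i}" for i in range(5)]
test_instance = rng.rand(5)                      # the instance to be explained

def top_features(clf, k=3):
    """Train a classifier and return LIME's top-k features for the instance."""
    clf.fit(X_train, y_train)
    explainer = LimeTabularExplainer(
        X_train,
        feature_names=feature_names,
        class_names=["clean", "defective"],
        discretize_continuous=True,
    )
    exp = explainer.explain_instance(
        test_instance, clf.predict_proba, num_features=k
    )
    # exp.as_map()[1] yields (feature index, weight) pairs for the positive class
    return [feature_names[idx] for idx, _ in exp.as_map()[1]]

top_rf = top_features(RandomForestClassifier(random_state=0))
top_lr = top_features(LogisticRegression(max_iter=1000))

# A simple consistency measure: overlap of the top-ranked features across settings
overlap = len(set(top_rf) & set(top_lr)) / len(top_rf)
print("RandomForest top features:     ", top_rf)
print("LogisticRegression top features:", top_lr)
print("Top-feature overlap:", overlap)
```

If the explanations were stable, the top-ranked features would largely agree across settings; a low overlap for the same instance signals the kind of inconsistency investigated in this paper.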