With the development of large language model technologies, the capability of social robots to interact emotionally with users has been steadily increasing. However, the existing research insufficiently examines the influence of robot stance attribution design cues on the construction of users’ mental models and their effects on human–robot interaction (HRI). This study innovatively combines mental models with the associative–propositional evaluation (APE) model, unveiling the impact of the stance attribution explanations of this design cue on the construction of user mental models and the interaction between the two types of mental models through EEG experiments and survey investigations. The results found that under the influence of intentional stance explanations (compared to design stance explanations), participants displayed higher error rates, higher θ- and β-band Event-Related Spectral Perturbations (ERSPs), and phase-locking value (PLV). Intentional stance explanations trigger a primarily associatively based mental model of users towards robots, which conflicts with the propositionally based mental models of individuals. Users might adjust or “correct” their immediate reactions caused by stance attribution explanations after logical analysis. This study reveals that stance attribution interpretation can significantly affect users’ mental model construction of robots, which provides a new theoretical framework for exploring human interaction with non-human agents and provides theoretical support for the sustainable development of human–robot relations. It also provides new ideas for designing robots that are more humane and can better interact with human users.