An emotion recognition method based on multispectral imaging technology and tissue oxygen saturation (StO2) is proposed in this study. This method is called spatial-spectral-temporal adjustment convolutional neural network (SACNN). First, we use the algorithm to extract the StO2 content of an emotionally sensitive nose area through real-time multispectral imaging technology. Compared with facial expression data, StO2 data are more objective and cannot be controlled and changed artificially. Second, we construct a clustering algorithm based on the emotional state by extracting the spectral, StO2, and spatial features of the nose image to obtain accurate signals of emotionally sensitive areas. To utilize the correlation between spectral and spatial signals, we propose an adjustment-based CNN module, which reorganizes the relationship between all previous layers of the feature map, thereby making the relationship among layers close and highly quantitative. The features extracted through this method are consistent with spatial-spectral features. Third, we incorporate the extracted temporal feature signal into the long short-term memory module and finally complete the correlation between the spatial-spectral-temporal features. Experimental results show that the accuracy of the SACNN algorithm in emotional recognition reaches 90%, and the proposed method is more competitive than state-of-the-art approaches. To the best of our knowledge, this study is the first to use time-series StO2 signals for emotion recognition. INDEX TERMS Multispectral imaging, oxygen saturation, spatial-spectral-temporal adjustment convolutional neural network. I.