Multimodal classification is increasingly common in biomedical informatics studies. Many such studies use deep learning classifiers with raw data, which makes explainability difficult. As such, only a few studies have applied explainability methods, and new methods are needed. In this study, we propose sleep stage classification as a testbed for method development and train a convolutional neural network with electroencephalogram (EEG), electrooculogram, and electromyogram data. We then present a global approach that is uniquely adapted for electrophysiology analysis. We further present two local approaches that can identify subject-level differences in explanations that would be obscured by global methods and that can provide insight into the effects of clinical and demographic variables upon the patterns learned by the classifier. We find that EEG is globally the most important modality for all sleep stages, except non-rapid eye movement stage 1 and that local subject-level differences in importance arise. We further show that sex, followed by medication and age had significant effects upon the patterns learned by the classifier. Our novel methods enhance explainability for the growing field of multimodal classification, provide avenues for the advancement of personalized medicine, and yield novel insights into the effects of demographic and clinical variables upon classifiers.