Speech emotion recognition has many applications in daily life, such as conversational agents, human-robot interaction, and call centres. However, recognizing emotion from a speech signal is not trivial, owing to the difficulty of determining an effective feature set that accurately captures the emotion conveyed within the signal. In this paper, image processing techniques are exploited to address the speech emotion recognition problem. After converting the signal into a 2D spectrogram image representation, four variants of the Extended Local Binary Pattern (ELBP) are generated to serve as the source for the feature extraction stage. The histograms of multiple blocks from the ELBP variants are computed and fed to a Deep Belief Network (DBN) for classification. Different tests were performed on the Surrey Audio-Visual Expressed Emotion (SAVEE) database, and the results show that combined MELBP vectors give the best accuracy, 72.14%, outperforming state-of-the-art results on the same database.
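The pipeline described above can be sketched in simplified form. The snippet below is a minimal illustration, not the paper's implementation: it computes a magnitude spectrogram with NumPy, applies the *basic* 8-neighbour Local Binary Pattern (the paper's ELBP variants extend this operator), and builds block-wise histograms as a feature vector; the block grid and frame sizes are assumed values, and the DBN classifier stage is omitted.

```python
import numpy as np

def spectrogram(signal, frame_len=256, hop=128):
    # Magnitude spectrogram via short-time FFT with a Hann window.
    # frame_len and hop are illustrative choices, not the paper's settings.
    window = np.hanning(frame_len)
    frames = [signal[i:i + frame_len] * window
              for i in range(0, len(signal) - frame_len + 1, hop)]
    return np.abs(np.fft.rfft(np.asarray(frames), axis=1)).T

def lbp_image(img):
    # Basic 8-neighbour LBP: each pixel gets an 8-bit code, one bit per
    # neighbour, set when the neighbour is >= the centre value.
    padded = np.pad(img, 1, mode='edge')
    center = padded[1:-1, 1:-1]
    codes = np.zeros(center.shape, dtype=np.uint8)
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    for bit, (dy, dx) in enumerate(offsets):
        neighbour = padded[1 + dy:padded.shape[0] - 1 + dy,
                           1 + dx:padded.shape[1] - 1 + dx]
        codes |= ((neighbour >= center).astype(np.uint8) << bit)
    return codes

def block_histograms(codes, blocks=(4, 4)):
    # Concatenate 256-bin histograms of non-overlapping blocks; in the
    # paper such histograms form the vectors fed to the DBN classifier.
    h, w = codes.shape
    bh, bw = h // blocks[0], w // blocks[1]
    feats = []
    for by in range(blocks[0]):
        for bx in range(blocks[1]):
            block = codes[by * bh:(by + 1) * bh, bx * bw:(bx + 1) * bw]
            hist, _ = np.histogram(block, bins=256, range=(0, 256))
            feats.append(hist)
    return np.concatenate(feats)

rng = np.random.default_rng(0)
sig = rng.standard_normal(16000)          # stand-in for a speech signal
spec = spectrogram(sig)
feat = block_histograms(lbp_image(spec))
print(feat.shape)  # (4096,) for a 4x4 grid of 256-bin histograms
```

In practice, the feature vector would be computed per utterance and passed to a trained classifier; swapping `lbp_image` for an ELBP variant changes only the texture-coding step while the block-histogram stage stays the same.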