This paper proposes an emotion recognition method for tweets containing emoticons using their emoticon image and language features. Some of the existing methods register emoticons and their facial expression categories in a dictionary and use them, while other methods recognize emoticon facial expressions based on the various elements of the emoticons. However, highly accurate emotion recognition cannot be performed unless the recognition is based on a combination of the features of sentences and emoticons. Therefore, we propose a model that recognizes emotions by extracting the shape features of emoticons from their image data and applying the feature vector input that combines the image features with features extracted from the text of the tweets. Based on evaluation experiments, the proposed method is confirmed to achieve high accuracy and shown to be more effective than methods that use text features only.