In this study, we propose several semantic kernels for word sense disambiguation (WSD). Our approaches adopt the intuition that class-based term values help resolve the ambiguity of polysemous words in WSD. We evaluate the proposed approaches experimentally, using training sets of various sizes drawn from a sense-disambiguated corpus (SensEval-1). With these experiments we try to answer the following questions: (1) Do our semantic kernel formulations yield higher classification performance than the traditional linear kernel? (2) Under which conditions does one kernel design perform better than the others? (3) Does adding class labels to the standard term-document matrix improve classification accuracy? (4) Is their combination superior to either type alone? (5) Does an ensemble of these kernels perform better than the baseline? (6) What is the effect of training set size? Our experiments demonstrate that our kernel-based WSD algorithms can outperform the baseline in terms of F-score.

Under the bag-of-words (BOW) representation, a well-known feature representation technique that regards only the frequency of words, a basic similarity calculation such as cosine or Jaccard among sentences id_1, id_3, and id_4 will be zero, since they have no words in common. The same holds for the similarity between sentences id_2 and id_5. On the other hand, the similarity between sentences id_1 and id_2 will probably be greater than zero, since they share the word "mouse". Moreover, although they convey different messages, the similarity between sentences id_2 and id_4 will probably be greater than zero, since they share the word "cell".
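The BOW behavior described above can be sketched with a small cosine-similarity computation over term-frequency vectors. The sentences below are hypothetical stand-ins for id_1, id_2, and id_4 (the original example table is not shown here); they are crafted so that one pair shares "mouse", another shares "cell", and a third pair shares nothing:

```python
import math
from collections import Counter

def cosine_sim(a: str, b: str) -> float:
    """Cosine similarity between bag-of-words (term-frequency) vectors."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)          # only shared words contribute
    na = math.sqrt(sum(c * c for c in va.values()))
    nb = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (na * nb) if na and nb else 0.0

# Hypothetical sentences illustrating the ambiguity of "mouse" and "cell":
s1 = "mouse ran across kitchen floor"            # "mouse" as animal
s2 = "click mouse to select spreadsheet cell"    # "mouse" as device, "cell" as spreadsheet cell
s4 = "blood cell samples were examined"          # "cell" as biological cell

print(cosine_sim(s1, s4))  # no shared words -> 0.0
print(cosine_sim(s1, s2))  # shares "mouse"   -> > 0, despite different senses
print(cosine_sim(s2, s4))  # shares "cell"    -> > 0, despite different senses
```

This illustrates the limitation motivating semantic kernels: plain BOW similarity is zero for related sentences with disjoint vocabularies, yet positive for unrelated sentences that happen to share a surface form.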