We present an experimental setup to evaluate the relative performance of single gaussian and mixture of gaussians models for skin color modeling. First, a sample set of 1,120,000 skin pixels from a number of ethnic groups is selected and represented in the chromaticity space. Next, parameters are estimated for both the single gaussian model and seven gaussian mixture models (with 2 to 8 gaussian components). For the mixture models, learning is carried out via the expectation-maximisation (EM) algorithm. To compare the performance of the 8 models, we apply each model to a test set of 800 images, none of which belongs to the training set. True skin regions, representing the ground truth, are manually selected, and false positive and true positive rates are computed for each value of a specific threshold. Finally, receiver operating characteristic (ROC) curves are plotted for each model, which makes it possible to analyze and compare their relative performance. The results show that, for medium to high true positive rates, the mixture models (with 2 to 8 components) outperform the single gaussian model. Nevertheless, for low false positive rates, all the models behave similarly.

... kernel to represent the skin cluster in some color space (e.g. [2,5,6]). An alternative to the single-gaussian model is presented in [1], where the skin color distribution is modelled with a mixture of bivariate gaussian components. A comparative study between the single-gaussian (SG) and the double-gaussian models has been presented there, but a more detailed analysis of how the model behaves when a higher number of gaussian clusters is used in the mixture is lacking.

This work proposes a performance evaluation technique to analyse the behaviour of the models with respect to the number of gaussians. The analysis took into consideration models ranging from 1 to 8 gaussian components and led to two main conclusions.
Firstly, skin color mixture models clearly outperform the single gaussian model for medium to high true positive rates. In this case, however, no major differences are noticed among the performances of the 7 mixture models. Secondly, when low false-positive rates are required, all 8 models tested perform quite similarly.

The paper is organized as follows. In section 2, the feature space for representing human skin color and the sample set used to train the models are presented. In section 3, the standard SG model is described. In section 4, we review our GM model approach as an alternative to SG. Section 5 reports the experiments applied to a test set, with models ranging from one to eight gaussian components. Finally, in section 7 we draw the conclusions and outline future work.
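The evaluation procedure summarised above (fit the models on skin samples, score test pixels, sweep a threshold, compare ROC curves) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The normalized (r, g) definition of chromaticity, the random initialisation of EM, the fixed iteration count, the small variance ridge, all function names, and the synthetic data are assumptions made only for illustration. Only models with 1 and 2 components are fitted to keep the example short.

```python
import math
import random


def to_chromaticity(rgb):
    """Map (R, G, B) to normalized (r, g); this definition is an assumption."""
    total = sum(rgb)
    if total == 0:  # black pixel: chromaticity undefined, use the neutral point
        return (1 / 3, 1 / 3)
    return (rgb[0] / total, rgb[1] / total)


def fit_gaussian(points, weights=None):
    """Weighted mean and covariance (sxx, sxy, syy) of 2-D points."""
    if weights is None:
        weights = [1.0] * len(points)
    w = sum(weights) + 1e-12  # guard against an empty component
    mx = sum(wi * p[0] for wi, p in zip(weights, points)) / w
    my = sum(wi * p[1] for wi, p in zip(weights, points)) / w
    sxx = sum(wi * (p[0] - mx) ** 2 for wi, p in zip(weights, points)) / w
    syy = sum(wi * (p[1] - my) ** 2 for wi, p in zip(weights, points)) / w
    sxy = sum(wi * (p[0] - mx) * (p[1] - my) for wi, p in zip(weights, points)) / w
    return (mx, my), (sxx + 1e-8, sxy, syy + 1e-8)  # small ridge keeps det > 0


def gaussian_pdf(p, mean, cov):
    """Bivariate gaussian density, using the closed-form 2x2 inverse."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    dx, dy = p[0] - mean[0], p[1] - mean[1]
    maha = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return math.exp(-0.5 * maha) / (2 * math.pi * math.sqrt(det))


def mixture_pdf(p, comps):
    """Density of a mixture given as a list of (weight, mean, cov)."""
    return sum(w * gaussian_pdf(p, m, c) for w, m, c in comps)


def fit_mixture(points, k, iters=30, seed=0):
    """Plain EM for a k-component mixture; k = 1 gives the single gaussian."""
    rng = random.Random(seed)
    _, cov0 = fit_gaussian(points)
    comps = [(1.0 / k, m, cov0) for m in rng.sample(points, k)]
    for _ in range(iters):
        resp = []  # E-step: responsibility of each component for each point
        for p in points:
            dens = [w * gaussian_pdf(p, m, c) for w, m, c in comps]
            total = sum(dens) or 1e-300
            resp.append([d / total for d in dens])
        comps = []  # M-step: re-estimate weight, mean and covariance
        for j in range(k):
            wj = [r[j] for r in resp]
            mean, cov = fit_gaussian(points, wj)
            comps.append((sum(wj) / len(points), mean, cov))
    return comps


def roc_points(scores, labels):
    """(false positive rate, true positive rate) pairs from a threshold sweep."""
    pos = sum(labels)
    neg = len(labels) - pos
    tp = fp = 0
    curve = [(0.0, 0.0)]
    for _, is_skin in sorted(zip(scores, labels), reverse=True):
        if is_skin:
            tp += 1
        else:
            fp += 1
        curve.append((fp / neg, tp / pos))
    return curve


def auc(curve):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2 for (x0, y0), (x1, y1) in zip(curve, curve[1:]))


# Synthetic stand-in data (not the paper's sample set): a bimodal "skin"
# cluster and a uniform "background" in the (r, g) plane.
rng = random.Random(0)
centers = [(0.45, 0.30), (0.38, 0.33)]


def sample_skin(n):
    picks = [rng.choice(centers) for _ in range(n)]
    return [(rng.gauss(cx, 0.015), rng.gauss(cy, 0.010)) for cx, cy in picks]


def sample_background(n):
    return [(rng.uniform(0.2, 0.6), rng.uniform(0.2, 0.5)) for _ in range(n)]


train = sample_skin(400)
test_pts = sample_skin(300) + sample_background(300)
labels = [True] * 300 + [False] * 300
curves = {}
for k in (1, 2):
    model = fit_mixture(train, k)
    curves[k] = roc_points([mixture_pdf(p, model) for p in test_pts], labels)
    print(k, "component(s): AUC =", round(auc(curves[k]), 3))
```

Each entry of `curves` is a list of (false positive rate, true positive rate) pairs, so the models can be compared at a chosen operating point, for example by reading off the true positive rate at a fixed low false positive rate. Real pixels would first be passed through `to_chromaticity`.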