We address the problem of identifying key domain concepts automatically from an unannotated corpus of goal-oriented human-human conversations. We examine two clustering algorithms, one based on mutual information and another one based on Kullback-Liebler distance. In order to compare the results from both techniques quantitatively, we evaluate the outcome clusters against reference concept labels using precision and recall metrics adopted from the evaluation of topic identification task. However, since our system allows more than one cluster to associate with each concept an additional metric, a singularity score, is added to better capture cluster quality. Based on the proposed quality metrics, the results show that Kullback-Liebler-based clustering outperforms mutual information-based clustering for both the optimal quality and the quality achieved using an automatic stopping criterion.