A dialogue-based interface for information systems is considered a potentially very useful approach to information access. A key step in computer processing of natural-language dialogues is dialogue-act (DA) recognition. In this paper, we apply a feature-based classification approach for DA recognition, by using the maximum entropy (ME) method to build a classifier for labeling utterances with DA tags. The ME method has the advantage that a large number of heterogeneous features can be flexibly combined in one classifier, which can facilitate feature selection. A unique characteristic of our approach is that it does not need to model the prior probability of DAs directly, and thus avoids the use of a discourse grammar. This simplifies the implementation of the classifier and improves the efficiency of DA recognition, without sacrificing the classification accuracy. We evaluate the classifier using a large data set based on the Switchboard corpus. Encouraging performance is observed; the highest classification accuracy achieved is 75.03%. We also propose a heuristic to address the problem of sparseness of the data set. This problem has resulted in poor classification accuracies of some DA types that have very low occurrence frequencies in the data set. Preliminary evaluation shows that the method is effective in improving the macroaverage classification accuracy of the ME classifier.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.