If a spoken dialog system can respond to a user as naturally as a human, the interaction will appear smoother. In this research, we aim to develop a spoken dialog system that emulates human behavior in a dialog. The proposed system makes use of a decision tree to generate responses at the appropriate times. These responses include 'aizuchi ' (back-channel), 'repetition', 'collaborative completion', etc. At each time interval, the decision tree generates the response timing features referring to the pitch and energy contours, recognition hypotheses, and the preparation status of the response generator. A subjective evaluation shows that there is a high degree of naturalness in the timing of ordinary responses and aizuchi, and that the spoken dialog system exhibits user-friendly behavior. The recorded voice system was preferred to a text-to-speech system (synthesized speech), and almost all subjects felt familiarity with the aizuchi.