While current approaches to video segmentation and indexing focus mostly on visual information, audio signals may actually play a primary role in video content parsing. In this paper, we present an approach for automatic segmentation, indexing, and retrieval of audiovisual data based on audio content analysis. The audio signal accompanying the audiovisual data is first segmented and classified into four basic types: speech, music, environmental sound, and silence. This coarse-level segmentation and indexing step is based on morphological and statistical analysis of several short-term features of the audio signal. Environmental sounds are then classified into finer classes such as applause, explosions, and bird sounds. This fine-level classification and indexing step is based on time-frequency analysis of the audio signal and uses a hidden Markov model (HMM) as the classifier. On top of this archiving scheme, an audiovisual data retrieval system is proposed. Experimental results show that the proposed approach achieves an accuracy above 90% for the coarse-level classification and above 85% for the fine-level classification. Examples of audiovisual data segmentation and retrieval are also provided.
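
To illustrate what short-term feature analysis of an audio signal can look like, the following minimal sketch computes two commonly used frame-level features, short-time energy and zero-crossing rate, and flags low-energy frames as silence. The abstract does not name the features or decision rules actually used, so the choice of features, the function names, the frame and hop sizes, and the energy threshold are all assumptions made for illustration only.

```python
import math

def frame_signal(samples, frame_len, hop):
    """Split samples into overlapping frames; a trailing partial frame is dropped."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]

def short_time_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def label_silence(samples, frame_len=400, hop=200, energy_threshold=1e-4):
    """Label each frame 'silence' or 'non-silence' with a fixed energy threshold.

    The threshold and frame sizes are hypothetical values, not taken from the paper.
    """
    labels = []
    for frame in frame_signal(samples, frame_len, hop):
        if short_time_energy(frame) < energy_threshold:
            labels.append("silence")
        else:
            labels.append("non-silence")
    return labels

# Synthetic test signal: 0.1 s of silence followed by 0.1 s of a 440 Hz tone at 8 kHz.
SAMPLE_RATE = 8000
silence = [0.0] * 800
tone = [0.5 * math.sin(2 * math.pi * 440 * n / SAMPLE_RATE) for n in range(800)]
signal = silence + tone

labels = label_silence(signal)
tone_zcr = zero_crossing_rate(tone[:400])
silence_zcr = zero_crossing_rate(silence[:400])
```

A full coarse-level classifier in the spirit of the abstract would combine several such features over time to separate speech, music, and environmental sound as well; this sketch covers only the silence decision.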