With the rapid development of digital information technology and the spread of Internet applications, the volume of video data is growing explosively. Applications with strict real-time requirements, such as object detection, need strong online video storage and analysis capabilities. Key-frame extraction is an important technique in video analysis: it provides an organizational framework for handling video content and reduces the amount of data required for video indexing. To meet this demand, this study proposes a key-frame extraction method based on the HSV (hue, saturation, value) histogram and adaptive clustering. The HSV histogram serves as the color feature of each frame, which reduces the amount of data to be processed. Furthermore, because the histogram is transformed into a one-dimensional feature vector, a fixed number of features is extracted from images of different sizes. A cluster validation technique, the silhouette coefficient, is then used to determine an appropriate number of clusters without setting any clustering parameters manually. Finally, several algorithms are compared experimentally. The density peak clustering algorithm (DPCA) model is shown to be more effective than the other four models in terms of precision and F-measure.
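The two core ideas above, a fixed-length HSV histogram per frame and silhouette-based selection of the cluster count, can be illustrated with a minimal sketch. It is not the implementation used in this study: the function names (`hsv_histogram`, `silhouette`, `kmeans`, `choose_k`) and the 8 x 4 x 4 bin layout are hypothetical, and a simple k-means with farthest-first initialization stands in for the clustering step purely for brevity.

```python
import colorsys
import math


def hsv_histogram(pixels, h_bins=8, s_bins=4, v_bins=4):
    """Normalized 1-D HSV histogram of (r, g, b) pixels in 0..255.

    The bin counts are an assumption made for illustration. The output
    length is fixed (h_bins * s_bins * v_bins) regardless of image size.
    """
    hist = [0.0] * (h_bins * s_bins * v_bins)
    count = 0
    for r, g, b in pixels:
        h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        hi = min(int(h * h_bins), h_bins - 1)
        si = min(int(s * s_bins), s_bins - 1)
        vi = min(int(v * v_bins), v_bins - 1)
        hist[(hi * s_bins + si) * v_bins + vi] += 1.0
        count += 1
    return [x / count for x in hist] if count else hist


def silhouette(points, labels):
    """Mean silhouette coefficient; points in singleton clusters score 0."""
    if len(set(labels)) < 2:
        return 0.0  # undefined for a single cluster; treated as 0 here
    total = 0.0
    for i, p in enumerate(points):
        sums, counts = {}, {}
        for j, q in enumerate(points):
            if i != j:
                sums[labels[j]] = sums.get(labels[j], 0.0) + math.dist(p, q)
                counts[labels[j]] = counts.get(labels[j], 0) + 1
        own = labels[i]
        if own not in counts:
            continue  # singleton cluster contributes 0
        a = sums[own] / counts[own]  # mean intra-cluster distance
        b = min(sums[c] / counts[c] for c in counts if c != own)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


def kmeans(points, k, iters=20):
    """Stand-in clustering step (not DPCA): deterministic k-means."""
    centers = [points[0]]
    while len(centers) < k:  # farthest-first initialization
        centers.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centers))
        )
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [
            min(range(k), key=lambda c: math.dist(p, centers[c]))
            for p in points
        ]
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def choose_k(points, k_max):
    """Pick the cluster count in 2..k_max with the highest silhouette."""
    best_k, best_score, best_labels = None, -2.0, None
    for k in range(2, k_max + 1):
        labels = kmeans(points, k)
        score = silhouette(points, labels)
        if score > best_score:
            best_k, best_score, best_labels = k, score, labels
    return best_k, best_labels
```

In this sketch each frame would be passed to `hsv_histogram` as a flat sequence of RGB pixels, and the resulting vectors to `choose_k`. A key frame could then be taken as the frame nearest each cluster center, although that selection rule is likewise an assumption and is not stated in the text above.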