Redundancy in a benchmark suite increases the time required for computer system performance evaluation and simulation. The most common remedy is to select a subset of benchmarks by clustering. However, validating the subsetting results is difficult when the benchmark suite is unlabeled, and existing research has not addressed this problem. In addition, no quantitative evaluation method for subsetting reflects both the universality and the diversity of a benchmark suite at the same time. To address these problems, we propose BenchSubset, a framework for selecting benchmark subsets based on consensus clustering. It comprises Group Principal Components Analysis, consensus clustering, and a new evaluation method that considers both the universality and the diversity of the benchmark suite. We conducted SPEC CPU2017 subsetting experiments on Huawei's Taishan 200 and verified the effectiveness of BenchSubset in selecting a benchmark subset. Compared with the mainstream principal components analysis with hierarchical clustering (PCA-H) method, the