The challenges in fault prediction today are to get a prediction as early as possible, at as low a cost as possible, needing as little data as possible and preferably in such a language that your average developer can understand where it came from. This paper presents a fault sampling method where a summary of a few, easily available metrics is used together with the results of a few sampled classes to generalize the fault content to an entire system. The method is tested on a large software system written in Java, that currently consists of around 2 000 classes and 300 000 lines of code. The evaluation shows that the fault generalization method is good at predicting fault-prone clusters and that it is possible to generalize the values of a few representative classes.