We report an advanced 'hybrid fingerprint' design concept developed specifically for scaffold hopping. The generation of hybrid fingerprints involves two major steps. In the 'fingerprint reduction' step, bit positions of different types of fingerprints (e.g. substructural and pharmacophore fingerprints) are ranked according to their statistical significance and their ability to discriminate between specifically active compounds and database decoys. On the basis of this ranking, subsets containing the most discriminatory bit positions are determined. In the subsequent 'fingerprint recombination' step, bit subsets from different fingerprints are combined to yield a new compound class-directed fingerprint representation for similarity searching. Here, we generate hybrids from multiple fingerprints and compare their search performance with that of the parental fingerprints on compound activity classes that consist exclusively of molecules with unique core structures and that exhibit different levels of intra-class structural diversity. Fingerprint reduction is found to be a critical component of hybrid design. The resulting compound class-directed hybrid fingerprints further increase the similarity search performance and scaffold hopping potential of their parental fingerprints. Thus, fingerprint reduction and recombination improve compound recall and increase the structural diversity of hits.

Key words: compound class-directed similarity searching, fingerprint recombination, fingerprint reduction, Kullback-Leibler divergence, scaffold hopping

Similarity searching using two-dimensional (2D) molecular fingerprints is a widely applied technique in pharmaceutical research for the computational screening of large compound databases (1-3). Binary keyed fingerprints are bit-string representations of molecular structure and properties in which each bit encodes the presence or absence of a specific chemical feature.
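The reduction and recombination steps summarized in the abstract can be illustrated with a minimal sketch. It assumes that fingerprints are stored as sets of on-bit indices and that each bit position is scored by the Kullback-Leibler divergence between its Laplace-smoothed frequency in the active compounds and in the decoys. The key words name Kullback-Leibler divergence, but the exact scoring formula, the smoothing, and all function names below are illustrative assumptions rather than the published procedure.

```python
import math


def bit_frequencies(fps, n_bits, pseudo=1.0):
    """Smoothed fraction of fingerprints in which each bit is set.

    fps is a list of sets of on-bit indices; the pseudocount (an assumption
    of this sketch) keeps every frequency strictly between 0 and 1.
    """
    counts = [0] * n_bits
    for fp in fps:
        for bit in fp:
            counts[bit] += 1
    return [(c + pseudo) / (len(fps) + 2 * pseudo) for c in counts]


def kl_bernoulli(p, q):
    """Kullback-Leibler divergence between Bernoulli(p) and Bernoulli(q)."""
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))


def reduce_fingerprint(actives, decoys, n_bits, top_k):
    """Fingerprint reduction: keep the top_k most discriminatory positions."""
    p = bit_frequencies(actives, n_bits)
    q = bit_frequencies(decoys, n_bits)
    scores = [kl_bernoulli(pi, qi) for pi, qi in zip(p, q)]
    ranked = sorted(range(n_bits), key=lambda i: scores[i], reverse=True)
    return ranked[:top_k]


def recombine(selected_a, selected_b):
    """Fingerprint recombination: concatenate the bit subsets of two parents.

    The hybrid layout is a list of (parent label, parent bit index) pairs.
    """
    return [("A", i) for i in selected_a] + [("B", i) for i in selected_b]


def project(fp_a, fp_b, layout):
    """Map one molecule's two parental fingerprints onto the hybrid layout."""
    parents = {"A": fp_a, "B": fp_b}
    return {pos for pos, (src, i) in enumerate(layout) if i in parents[src]}


def tanimoto(x, y):
    """Tanimoto coefficient of two fingerprints given as sets of on-bits."""
    union = len(x | y)
    return len(x & y) / union if union else 0.0
```

In use, `reduce_fingerprint` would be called once per parental fingerprint, `recombine` would merge the selected positions into a hybrid layout, and database compounds would then be projected onto that layout with `project` and ranked by `tanimoto` against the projected reference compounds.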
Diverse types of fingerprints have been introduced that encode molecular information in different ways, for example, as substructures, topological pathways, pharmacophore patterns, or numerical property descriptors (2). Most conventional fingerprint designs have a fixed length, but molecule- or compound class-specific fingerprints of variable format have also been introduced (4,5). A systematic comparison of standard fingerprints capturing different aspects of molecular structure has recently been reported (6).

Regardless of fingerprint design, conventional similarity searching relies on the use of complete fingerprint representations as descriptors and on the quantification of fingerprint overlap as a measure of molecular similarity (1). However, in recent years, modifications to standard fingerprint searching have been reported that include consensus fingerprints (7), bit scaling (8), or reverse fingerprinting (9).

Furthermore, concepts for the generation of minimal fingerprint representations retaining high search performance have recently also been introduced. These methods include bit density reduction (10), bit silen...