Protein structure comparison algorithms can be used to identify distantly related proteins or to categorize differences in binding specificities. When they are presented in different conformations, distantly related proteins can go unrecognized unless flexible representations of whole protein structures are used. Such representations offer a sophisticated description of backbone motion, but they do not incorporate the potential motion of every atom. Thus, existing representations, both rigid and flexible, cannot compensate for atomic motions that can make binding sites with similar binding preferences appear different. To bridge this gap, this paper presents a tool for comparing protein binding sites despite conformational changes in the binding site. Our method employs ensemble clustering techniques to incorporate the diversity of binding site variations observed in conformational samples of binding site motion. We applied the method to protein conformations from the serine protease and enolase superfamilies. Our results demonstrate that this approach can distinguish proteins with similar binding preferences in the presence of considerable binding site flexibility.
I. INTRODUCTION
Conformational flexibility is a significant complication for the accurate comparison of protein structures. Many algorithms perform efficiently because they apply rigid transformations to superpose atoms from different structures without considering alternative conformations. With this simplifying assumption, existing methods can rapidly align backbone carbon atoms [1]-[7], distance matrices [8], graphical topologies [9]-[11] and geometric surfaces [12]-[15] to detect structural similarities between remote homologs. The rigidity assumption also enables methods to rapidly discover structural variations between closely related proteins with different binding specificities [16], [17].
However, without the simplification of rigidity, structural comparisons would be more difficult because all protein conformations must be considered. A recent class of algorithms uses rigid secondary structural elements with flexible linkers to represent protein structures via hinges [18], [19], graphs [20], [21], fragments [22] and dynamic programming [23]-[25]. Most of these approaches are designed to identify remote homologs that could be overlooked when the proteins are presented in different conformations. But flexible linkers do not describe smaller atomic motions inside binding cavities, and they thus have limited applicability for categorizing flexible binding sites with subtly different binding preferences.
This paper presents an algorithm for categorizing protein structures based on ligand binding preferences, despite the presence of structural variations. Beginning with conformational samples of several proteins, our method clusters binding sites found in randomly selected samples of each protein. Integrating many randomly generated clusterings yields an average categorization of the structural similarities and variations between the binding sites. This approach reflects all-atom mo...