In recent years, many virtual screening (VS) tools have been developed that employ different molecular representations and have different speed and accuracy characteristics. In this paper, we compare ten popular ligand-based VS tools using the publicly available Directory of Useful Decoys (DUD) dataset comprising over 100,000 compounds distributed across 40 protein targets. The DUD was developed initially to evaluate docking algorithms, but our results from an operational correlation analysis show that it is also well suited for comparing ligand-based VS tools. Although it is conventional wisdom that 3D molecular shape is an important determinant of biological activity, our results based on permutational significance tests of several commonly used VS metrics show that the 2D fingerprint-based methods generally give better VS performance than the 3D shape-based approaches for surprisingly many of the DUD targets. In order to help understand this finding, we have analysed the nature of the scoring functions used and the composition of the DUD dataset itself. We propose that in order to improve the VS performance of current 3D methods, it will be necessary to devise screening queries which can represent multiple possible conformations and which can exploit knowledge of known actives that span multiple scaffold families.

* To whom correspondence should be addressed
† INRIA Nancy Grand Est, LORIA, 54506, Vandoeuvre-lès-Nancy, France
‡ Contributed equally to this work