A database consisting of 780 ligand-receptor complexes, termed SB2010, has been derived from the Protein Data Bank to evaluate the accuracy of docking protocols for regenerating bound ligand conformations. The goal is to provide easily accessible community resources for development of improved procedures to aid virtual screening for ligands with a wide range of flexibilities. Three core experiments using the program DOCK, which employ rigid (RGD), fixed anchor (FAD), and flexible (FLX) protocols, were used to gauge performance by several different metrics: (1) global results, (2) ligand flexibility, (3) protein family, and (4) cross-docking. Global spectrum plots of successes and failures vs rmsd reveal well-defined inflection regions, which suggest the commonly used 2 Å criterion is a reasonable choice for defining success. Across all 780 systems, success tracks with the relative difficulty of the calculations: RGD (82.3%) > FAD (78.1%) > FLX (63.8%). In general, failures due to scoring strongly outweigh those due to sampling. Subsets of SB2010 grouped by ligand flexibility (7-or-less, 8-to-15, and 15-plus rotatable bonds) reveal that success degrades linearly for FAD and FLX protocols, in contrast to RGD, which remains constant. Despite the challenges associated with FLX anchor orientation and on-the-fly flexible growth, success rates for the 7-or-less (74.5%) and, in particular, the 8-to-15 (55.2%) subset are encouraging. Poorer results for the very flexible 15-plus set (39.3%) indicate substantial room for improvement. Family-based success appears largely independent of ligand flexibility, suggesting a strong dependence on the binding site environment. For example, zinc-containing proteins are generally problematic, despite moderately flexible ligands.
Finally, representative cross-docking examples, for carbonic anhydrase, thermolysin, and neuraminidase families, show the utility of family-based analysis for rapid identification of particularly good or bad docking trends, and the type of failures involved (scoring/sampling), which will likely be of interest to researchers making specific receptor choices for virtual screening. SB2010 is available for download at http://rizzolab.org.
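
The success and failure categories discussed above can be illustrated with a minimal sketch. It is not the DOCK implementation or the authors' analysis scripts. It assumes each docked system yields a list of (score, rmsd) pairs, that a lower score is better, that a "success" means the best-scored pose lies within the 2 Å cutoff, that a "scoring failure" means some sampled pose lies within the cutoff but is not the best-scored one, and that a "sampling failure" means no pose lies within the cutoff. The function names `classify_outcome` and `outcome_rates` are hypothetical.

```python
from collections import Counter

RMSD_CUTOFF = 2.0  # angstroms; the success threshold discussed in the text

OUTCOMES = ("success", "scoring_failure", "sampling_failure")


def classify_outcome(poses, cutoff=RMSD_CUTOFF):
    """Classify one docked system.

    poses: non-empty list of (score, rmsd) tuples, one per docked pose.
    Assumption: lower score means a more favorable pose.
    """
    if not poses:
        raise ValueError("at least one pose is required")
    # Pose the scoring function ranks first.
    _, best_rmsd = min(poses, key=lambda p: p[0])
    if best_rmsd <= cutoff:
        return "success"
    # A near-native pose was sampled but not ranked first.
    if any(rmsd <= cutoff for _, rmsd in poses):
        return "scoring_failure"
    # No near-native pose was generated at all.
    return "sampling_failure"


def outcome_rates(systems, cutoff=RMSD_CUTOFF):
    """Return the percentage of each outcome over a dict of system -> poses."""
    if not systems:
        raise ValueError("at least one system is required")
    counts = Counter(classify_outcome(p, cutoff) for p in systems.values())
    total = sum(counts.values())
    return {name: 100.0 * counts[name] / total for name in OUTCOMES}


if __name__ == "__main__":
    # Made-up poses for three hypothetical systems, for illustration only.
    demo = {
        "sysA": [(-50.0, 1.2), (-40.0, 5.0)],
        "sysB": [(-50.0, 4.0), (-40.0, 1.0)],
        "sysC": [(-50.0, 4.0), (-40.0, 3.0)],
    }
    for name, pct in outcome_rates(demo).items():
        print(f"{name}: {pct:.1f}%")
```

Applying the same tally to subsets grouped by protein family or by number of rotatable bonds would give per-subset rates of the kind reported above; the grouping keys themselves would have to come from the SB2010 distribution.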