Random Research highlight: Binding sites in PDB: The fast-growing Protein Data Bank is the richest source of structural biological information on Earth. We described a graph-theoretical method for automatically repairing, reorganizing and restructuring the PDB data. The most important result of this cleaning procedure is the reliable and automatic identification of all the protein-ligand complexes and binding sites in the PDB. We examined the residue composition of the binding sites in the whole PDB and identified strong cysteine and tryptophan irregularities in the data. Int. J. of Bioinformatics Research and Applications, 2010, Vol. 6, No. 6, pp. 594-608.
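
To make "residue composition of a binding site" concrete, here is a minimal sketch. It is not the graph-theoretical method of the paper, whose details are not given here. It assumes a simple distance criterion: a residue belongs to the binding site if any of its atoms lies within a cutoff of any ligand atom. The 4.0 Å cutoff, the function names and the toy coordinates are all illustrative assumptions.

```python
from collections import Counter
from math import dist

def binding_site_residues(protein_atoms, ligand_atoms, cutoff=4.0):
    """Collect residues with at least one atom within `cutoff` of a ligand atom.

    protein_atoms: list of ((chain, residue_number, residue_name), (x, y, z))
    ligand_atoms:  list of (x, y, z)
    The cutoff value (in angstroms) is an assumption, not taken from the paper.
    """
    site = set()
    for residue_key, coord in protein_atoms:
        if any(dist(coord, ligand) <= cutoff for ligand in ligand_atoms):
            site.add(residue_key)
    return site

def residue_composition(site):
    """Return the fraction of each residue type among the binding-site residues."""
    counts = Counter(name for (_chain, _number, name) in site)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {name: n / total for name, n in counts.items()}

# Toy, hand-made coordinates (hypothetical, not real PDB data).
protein_atoms = [
    (("A", 10, "CYS"), (0.0, 0.0, 0.0)),
    (("A", 10, "CYS"), (1.5, 0.0, 0.0)),
    (("A", 11, "TRP"), (3.0, 0.0, 0.0)),
    (("A", 12, "GLY"), (20.0, 0.0, 0.0)),
]
ligand_atoms = [(1.0, 1.0, 0.0)]

site = binding_site_residues(protein_atoms, ligand_atoms)
composition = residue_composition(site)
```

Summing such per-site counts over many complexes and comparing them with the overall residue frequencies is one way an over- or under-representation of residues such as cysteine or tryptophan could be noticed.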