We study the luminosity gap, $\Delta m_{12}$, between the first and second ranked galaxies in a sample of 59 massive ($\sim 10^{15}\,M_{\odot}$) galaxy clusters, using data from the Hale Telescope, the Hubble Space Telescope (HST), Chandra, and Spitzer. We find that the $\Delta m_{12}$ distribution, $p(\Delta m_{12})$, is a declining function of $\Delta m_{12}$, to which we fitted a straight line: $p(\Delta m_{12}) \propto -(0.13 \pm 0.02)\,\Delta m_{12}$. The fraction of clusters with “large” luminosity gaps is $p(\Delta m_{12} \geq 1) = 0.37 \pm 0.08$, which represents a $3\sigma$ excess over that obtained from Monte Carlo simulations of a Schechter function that matches the mean cluster galaxy luminosity function. We also identify four clusters with “extreme” luminosity gaps, $\Delta m_{12} \geq 2$, giving a fraction of $p(\Delta m_{12} \geq 2) = 0.07^{+0.05}_{-0.03}$. More generally, large luminosity gap clusters are relatively homogeneous, with elliptical/disky brightest cluster galaxies (BCGs), cuspy gas density profiles (i.e. strong cool cores), high concentrations, and low substructure fractions. In contrast, small luminosity gap clusters are heterogeneous: they span the full range of boxy/elliptical/disky BCG morphologies and the full range of cool core strengths and dark matter concentrations, and they have large substructure fractions. Taken together, these results imply that the amplitude of the luminosity gap is a function of both the formation epoch and the recent infall history of the cluster. “BCG dominance” is therefore a phase that a cluster may evolve through, and is not an evolutionary “cul-de-sac”. We also compare our results with semi-analytic model predictions based on the Millennium Simulation. None of the models is able to reproduce all of the observational results on $\Delta m_{12}$, underlining the inability of the current generation of models to match the empirical properties of BCGs. We identify the strength of AGN feedback and the efficiency with which cluster galaxies are replenished after they merge with the BCG in each model as possible causes of these discrepancies.
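
The Monte Carlo comparison mentioned above can be illustrated with a minimal sketch: draw galaxy luminosities from a Schechter function, rank them within each mock cluster, and measure the magnitude gap between the two brightest members. This is not the procedure used in this work. The faint-end slope `alpha`, the luminosity limits `x_min` and `x_max` (in units of $L^{*}$), the number of galaxies per cluster, and all function names are assumptions chosen for illustration only.

```python
import numpy as np


def sample_schechter(rng, size, alpha=-1.0, x_min=0.01, x_max=30.0, n_grid=4096):
    """Draw x = L/L* from phi(x) ~ x**alpha * exp(-x), truncated to [x_min, x_max].

    Uses a numerical inverse-CDF on a log-spaced grid (illustrative choice).
    """
    x = np.logspace(np.log10(x_min), np.log10(x_max), n_grid)
    pdf = x**alpha * np.exp(-x)
    # Trapezoid-rule cumulative integral, normalised to run from 0 to 1.
    steps = 0.5 * (pdf[1:] + pdf[:-1]) * np.diff(x)
    cdf = np.concatenate([[0.0], np.cumsum(steps)])
    cdf /= cdf[-1]
    u = rng.random(size)
    return np.interp(u.ravel(), cdf, x).reshape(u.shape)


def magnitude_gaps(rng, n_clusters, n_galaxies, **schechter_kwargs):
    """Return the gap (in magnitudes) between the two brightest galaxies per mock cluster."""
    lum = sample_schechter(rng, (n_clusters, n_galaxies), **schechter_kwargs)
    lum.sort(axis=1)  # ascending, so the brightest galaxy is in the last column
    return 2.5 * np.log10(lum[:, -1] / lum[:, -2])


def large_gap_fraction(gaps, threshold=1.0):
    """Fraction of mock clusters whose gap is at least `threshold` magnitudes."""
    return float(np.mean(gaps >= threshold))


if __name__ == "__main__":
    rng = np.random.default_rng(42)
    # 10,000 mock clusters of 100 galaxies each (both numbers are assumptions).
    gaps = magnitude_gaps(rng, n_clusters=10_000, n_galaxies=100)
    print("p(gap >= 1) =", large_gap_fraction(gaps, 1.0))
    print("p(gap >= 2) =", large_gap_fraction(gaps, 2.0))
```

The simulated fractions depend strongly on the assumed slope, luminosity limit, and cluster richness, so any comparison with the observed $p(\Delta m_{12} \geq 1)$ would need those parameters matched to the measured mean cluster luminosity function.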