We present a sample of 40 Ultra Steep Spectrum (USS, $\alpha \leq -1.3$, $S_{\nu} \propto \nu^{\alpha}$) radio sources selected from the Westerbork in the Southern Hemisphere (WISH) catalog. The USS sources have been imaged in the $K$-band at the Cerro Tololo Inter-American Observatory (CTIO) and with the Very Large Telescope (VLT) at Cerro Paranal. We also present VLT, Keck and William Herschel Telescope (WHT) optical spectroscopy of 14 targets selected from 4 different USS samples. For 12 sources, we have been able to determine the redshifts, including 4 new radio galaxies at $z > 3$. We find that most of our USS sources have small (< 6″) radio sizes and faint magnitudes ($K \gtrsim 18$). The mean $K$-band counterpart magnitude is $\overline{K} = 18.6$. The expected redshift distribution estimated using the Hubble $K$-$z$ diagram has a mean of $\overline{z}_{exp} \sim 2.13$, which is higher than the predicted redshift obtained for the SUMSS–NVSS sample and the expected redshift obtained in the 6C$^{**}$ survey. The compact USS sample analyzed here may contain a higher fraction of galaxies that are at high redshift and/or heavily obscured by dust. Using the 74, 352 and 1400 MHz flux densities of a sub-sample, we construct a radio colour-colour diagram. We find that all but one of our USS sources have a strong tendency to flatten below 352 MHz. We also find that the highest redshift source from this paper (at $z = 3.84$) does not show evidence for spectral flattening down to 151 MHz. This suggests that USS samples selected at very low frequencies will likely be more efficient at finding high redshift galaxies.
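As an illustration of the power-law convention $S_{\nu} \propto \nu^{\alpha}$ and of the two spectral indices that enter a radio colour-colour diagram, the following minimal Python sketch computes two-point spectral indices between 74 and 352 MHz and between 352 and 1400 MHz. The flux densities are hypothetical values chosen for illustration, not measurements from the sample, and applying the $\alpha \leq -1.3$ cut to the 352 to 1400 MHz index is an assumption of this sketch rather than something stated above.

```python
import math


def spectral_index(s1, nu1, s2, nu2):
    """Two-point spectral index alpha, assuming S_nu is proportional to nu**alpha.

    s1, s2 are flux densities (same units); nu1, nu2 are frequencies (same units).
    """
    return math.log(s1 / s2) / math.log(nu1 / nu2)


# Hypothetical flux densities in mJy, keyed by frequency in MHz (illustrative only).
flux_mjy = {74.0: 2400.0, 352.0: 400.0, 1400.0: 60.0}

# Low-frequency and high-frequency spectral indices (the two "radio colours").
alpha_low = spectral_index(flux_mjy[74.0], 74.0, flux_mjy[352.0], 352.0)
alpha_high = spectral_index(flux_mjy[352.0], 352.0, flux_mjy[1400.0], 1400.0)

# Assumed USS criterion: alpha <= -1.3 on the 352-1400 MHz index.
is_uss = alpha_high <= -1.3

# The spectrum flattens toward low frequency if the low-frequency index
# is less steep (less negative) than the high-frequency index.
flattens_below_352 = alpha_low > alpha_high

print(f"alpha(74-352)   = {alpha_low:.2f}")
print(f"alpha(352-1400) = {alpha_high:.2f}")
print(f"USS: {is_uss}, flattens below 352 MHz: {flattens_below_352}")
```

A source lying on the one-to-one line of such a diagram (equal indices) has a pure power-law spectrum; points with a less negative low-frequency index correspond to the flattening described above.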