I derive the mass-temperature (M-T) relation and its time evolution for clusters of galaxies in different cosmologies by means of two models. The first is a modification and improvement of a model by Del Popolo & Gambera (1999), based on a modification of the top-hat model that takes into account angular momentum acquisition by protostructures and an external pressure term in the virial theorem. The second is based on the merging-halo formalism of Lacey & Cole (1993), which accounts for the fact that massive clusters accrete matter quasi-continuously; it is an improvement of a model proposed by Voit (2000) (hereafter V2000), again to take into account angular momentum acquisition by protostructures. The final result is that, in both models, the M-T relation shows a break at T \sim 3-4 keV. The relation has the usual behavior, M \propto T^{3/2}, at the high-mass end, and M \propto T^{\gamma} below the break, with a value of \gamma > 3/2 that depends on the chosen cosmology. Larger values of \gamma are associated with open cosmologies, while \Lambda CDM cosmologies give a slope intermediate between the flat case and the open case. The evolution of the M-T relation, for a given M_{vir}, is more modest in both flat and open universes than previous estimates found in the literature, and even more modest than that found by V2000. Moreover, the time evolution is more rapid in models with L = 0 than in models in which angular momentum acquisition by protostructures is taken into account (L \neq 0). A non-zero cosmological constant slightly increases the evolution of the M-T relation with respect to open models with L \neq 0. The evolution is more rapid for larger absolute values of the spectral index, n.
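As a reading aid, the two regimes of the M-T relation described above can be written schematically as a broken power law. This is only a sketch: the normalization M_b, the break temperature T_b, and the idealization of a sharp break that is continuous at T_b are assumptions introduced here for illustration, not quantities or forms taken from the models themselves.

```latex
% Schematic only: M_b, T_b and the sharp, continuous break are illustrative
% assumptions. Requires amsmath for the cases environment.
\begin{equation}
M(T) =
\begin{cases}
M_b \left( T / T_b \right)^{3/2}, & T \ge T_b, \\
M_b \left( T / T_b \right)^{\gamma}, & T < T_b,
\end{cases}
\qquad \gamma > \frac{3}{2}, \quad T_b \sim 3\mbox{--}4~\mathrm{keV}.
\end{equation}
```

Under this sketch, taking an illustrative \gamma = 2 and T = T_b / 2, the extrapolated high-mass scaling would give M = 2^{-3/2} M_b \approx 0.35 M_b, whereas the steeper branch gives M = 2^{-2} M_b = 0.25 M_b. A slope \gamma > 3/2 therefore implies a lower mass at fixed temperature below the break than the M \propto T^{3/2} extrapolation.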
The mass-temperature relation obtained using these models is also compared with the data of Finoguenov, Reiprich & Bohringer (2001) (hereafter FRB). The comparison shows that the FRB data are able to rule out models with very low \Omega_{0} (\Omega_{0} < 0.3), particularly in the open case, and that better fits are obtained by \Lambda CDM models and by CDM models with \Omega_{0} > 0.3.