Temperature profiles based on radio occultation (RO) measurements with the operational European METOP satellites are used to derive monthly mean global distributions of stratospheric (20–40 km) gravity wave (GW) potential energy densities (<i>E<sub>P</sub></i>) for the period July 2014–December 2016. To test whether the sampling and data quality of this data set are sufficient for scientific analysis, we investigate to what degree the METOP observations agree quantitatively with ECMWF operational analysis (IFS) and reanalysis (ERA-Interim) data. A systematic comparison is carried out between corresponding monthly mean temperature fields determined on a latitude-longitude-altitude grid of 5° by 10° by 1 km. This yields very small systematic differences between RO and model data below 30 km (i.e., median temperature differences are between −0.2 and +0.3 K), which increase with height to median differences of +1.0 K at 34 km and +2.2 K at 40 km. Comparing <i>E<sub>P</sub></i> values for three selected locations at which ground-based lidar measurements are also available yields excellent agreement between RO and IFS data below 35 km. ERA-Interim underestimates <i>E<sub>P</sub></i> under conditions of strong local mountain wave forcing over Northern Scandinavia, which is apparently not resolved by the model. Above 35 km, RO values are consistently much larger than model values, which is likely caused by the model sponge layer that damps small-scale fluctuations above ~32 km altitude. The comparison between RO and lidar data reveals very good qualitative agreement in terms of the seasonal variation of <i>E<sub>P</sub></i>; however, RO values are consistently smaller than lidar values by about a factor of two. This discrepancy is likely caused by the very different sampling characteristics of RO and lidar observations.
Direct comparison of the global data sets of RO and model <i>E<sub>P</sub></i> fields shows large correlation coefficients (0.4–1.0), with a general degradation with increasing altitude. Concerning absolute differences between observed and modelled <i>E<sub>P</sub></i> values, the median difference is relatively small at all altitudes (but increases with altitude), with the exception of the range between 20 and 25 km, where the median difference between RO and model data is larger and where the corresponding variability is also found to be very large. The reason for this is identified as an artifact of the <i>E<sub>P</sub></i> algorithm: it erroneously interprets the pronounced climatological feature of the tropical tropopause inversion layer (TTIL) as GW activity, yielding very large <i>E<sub>P</sub></i> values in this region. Because the RO data show a more pronounced TTIL than IFS and ERA-Interim, this also produces large differences between model and observations. We suggest a correction for this effect based on an estimate of this 'artificial' <i>E<sub>P</sub></i> using monthly mean zonal mean temperature profiles. This correction may be used in the future to study, for example, the annual cycle of zonal mean GW activity using the data considered here.
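The idea behind the suggested correction can be illustrated with a minimal sketch. It is not the algorithm used in this study; everything in it is an assumption made for illustration: the function names, the running-mean background with a 7 km window, the pointwise form <i>E<sub>P</sub></i> = ½ (g/N)² (T′/T₀)² without vertical averaging, and the choice to subtract the 'artificial' <i>E<sub>P</sub></i> and clip the result at zero. The sketch applies the same <i>E<sub>P</sub></i> estimate to a wave-free monthly mean zonal mean profile, so that whatever energy results there can only stem from climatological structure such as a sharp temperature minimum.

```python
import math

G = 9.81      # gravitational acceleration, m s^-2
CP = 1004.0   # specific heat of dry air at constant pressure, J kg^-1 K^-1


def moving_average(values, half_width):
    """Running mean with a window truncated at the profile edges."""
    n = len(values)
    out = []
    for i in range(n):
        lo = max(0, i - half_width)
        hi = min(n, i + half_width + 1)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out


def potential_energy_density(alt_km, temp_k, half_width=3):
    """Pointwise E_P (J/kg) from one temperature profile.

    Assumption: the background T0 is a running mean over
    2 * half_width + 1 levels; the study's own filter may differ.
    """
    background = moving_average(temp_k, half_width)
    n = len(temp_k)
    ep = []
    for i in range(n):
        # Centred difference (one-sided at the edges) for dT0/dz in K/m.
        lo, hi = max(0, i - 1), min(n - 1, i + 1)
        dz_m = (alt_km[hi] - alt_km[lo]) * 1000.0
        dtdz = (background[hi] - background[lo]) / dz_m
        # Squared buoyancy frequency N^2 = (g / T0) * (dT0/dz + g / cp).
        n2 = G / background[i] * (dtdz + G / CP)
        if n2 <= 0.0:
            raise ValueError("statically unstable background at level %d" % i)
        rel = (temp_k[i] - background[i]) / background[i]
        ep.append(0.5 * G ** 2 / n2 * rel ** 2)
    return ep


def corrected_energy(alt_km, temp_k, zonal_mean_temp_k, half_width=3):
    """Subtract the 'artificial' E_P of the wave-free zonal mean profile.

    Assumption: simple subtraction, clipped at zero.
    """
    raw = potential_energy_density(alt_km, temp_k, half_width)
    artificial = potential_energy_density(alt_km, zonal_mean_temp_k, half_width)
    return [max(r - a, 0.0) for r, a in zip(raw, artificial)]


# Synthetic example (hypothetical values): a sharp temperature minimum at
# 17 km stands in for the TTIL, and a 5 km wave is added on top of it.
alt = [10.0 + k for k in range(31)]
zonal_mean = [200.0 + 2.0 * abs(z - 17.0) for z in alt]
profile = [t + 1.5 * math.sin(2.0 * math.pi * z / 5.0)
           for t, z in zip(zonal_mean, alt)]
corrected = corrected_energy(alt, profile, zonal_mean)
```

In this synthetic case the wave-free profile alone already produces non-zero <i>E<sub>P</sub></i> at the temperature minimum, which is the kind of artifact described above; the subtraction removes that contribution from the estimate for the perturbed profile.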