NOOS sea level error statistics, 2008-2012
Data handling
We process five calendar years of tide gauge data and modelled sea level: 2008-2012.
Tide gauge data is retrieved through the DMI MySQL database.
Data is collected in real time.
Model data is retrieved from DMI's NOOS Exchange Archive. Partners issue forecasts
on an N = 6-, 12-, or 24-hour schedule. We establish one-year time series by
concatenating all forecasts, using only the part ranging from analysis time to
N hours ahead. If a forecast is missing, N more hours of the
previous forecast are used to bridge the gap. If two or more consecutive forecasts are missing,
this is repeated as long as the previous forecast reaches far enough ahead.
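The concatenation and gap bridging described above can be sketched as follows. This is a minimal illustration, not the production code: the function name `concatenate_forecasts`, the dictionary layout, and the choice to walk back one schedule step at a time are assumptions made for the example.

```python
from datetime import datetime, timedelta


def concatenate_forecasts(forecasts, start, end, n_hours):
    """Build one time series from successive forecasts.

    forecasts: {analysis time: {valid time: sea level}} (hypothetical layout)
    Each N-hour slot is filled from the forecast issued at the slot start.
    If that forecast is missing, the most recent earlier forecast is used,
    provided it reaches far enough ahead to cover the slot.
    """
    if not forecasts:
        return {}
    step = timedelta(hours=n_hours)
    earliest = min(forecasts)
    series = {}
    slot = start
    while slot < end:
        # Walk back one schedule step at a time until a forecast is found.
        source = slot
        while source not in forecasts and source > earliest:
            source -= step
        run = forecasts.get(source, {})
        # Keep only the values that fall inside the current slot.
        for valid_time, level in run.items():
            if slot <= valid_time < slot + step:
                series[valid_time] = level
        slot += step
    return dict(sorted(series.items()))
```

If no earlier forecast covers a slot, the slot simply stays empty, which corresponds to the "if possible" condition in the text.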
All data, tide gauge and model, is interpolated or sampled onto 10-minute series.
- 5-minute data is sampled
- 10-minute data is used as is
- 15-minute data at regular minutes (00, 15, 30, 45) is reduced to 30-minute data, then interpolated
- 15-minute data at odd minutes (07, 22, 37, 52) is shifted onto the nearest 10-minute mark (10, 20, 40, 50); 00 and 30 will then be missing data
- 30- and 60-minute data is interpolated in time
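The rules above can be illustrated with a short sketch. It assumes time stamps given as integer minutes, linear interpolation, and no special treatment of data gaps; none of this is stated in the source, and the names `to_ten_minute` and `interpolate` are hypothetical.

```python
def interpolate(samples, grid_step=10):
    """Linearly interpolate {minute: level} onto a grid_step-minute grid."""
    times = sorted(samples)
    out = {}
    for t0, t1 in zip(times, times[1:]):
        for t in range(t0, t1, grid_step):
            w = (t - t0) / (t1 - t0)
            out[t] = (1 - w) * samples[t0] + w * samples[t1]
    if times:
        out[times[-1]] = samples[times[-1]]
    return out


def to_ten_minute(samples, step):
    """Map {minute: level} with native sampling interval `step` onto 10 minutes."""
    if step == 5:
        # Sample: keep only the values at the 10-minute marks.
        return {t: v for t, v in samples.items() if t % 10 == 0}
    if step == 10:
        return dict(samples)
    if step == 15:
        if all(t % 15 == 0 for t in samples):
            # Regular minutes: reduce to 30-minute data, then interpolate.
            reduced = {t: v for t, v in samples.items() if t % 30 == 0}
            return interpolate(reduced)
        # Odd minutes (07, 22, 37, 52): shift to the nearest 10-minute mark.
        return {10 * round(t / 10): v for t, v in samples.items()}
    if step in (30, 60):
        return interpolate(samples)
    raise ValueError(f"unsupported sampling interval: {step} minutes")
```

For the odd-minute case the output has no values at minutes 00 and 30, matching the missing data noted in the list.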
QC
Tide gauge data QC checks for large jumps and spurious zeroes. Forecast-minus-observation (frc-obs) difference time series and scatter diagrams
are inspected visually to remove any remaining suspicious data.
Forecasts are QC'ed for grossly invalid output (this does happen). Files with format overflow or wrong time stamps are discarded.
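An automated check for jumps and spurious zeroes might look like the sketch below. The source gives no thresholds, so `max_jump` and `zero_margin` (both in metres) and the function name `flag_suspicious` are purely illustrative assumptions; flagged values would still go through the visual inspection described above.

```python
def flag_suspicious(levels, max_jump=0.5, zero_margin=0.2):
    """Flag values that look like spurious zeroes or large jumps.

    levels: consecutive sea level values in metres.
    max_jump, zero_margin: hypothetical thresholds, not from the source.
    Returns a list of (index, reason) pairs.
    """
    flagged = []
    for i in range(1, len(levels)):
        prev, cur = levels[i - 1], levels[i]
        if cur == 0.0 and abs(prev) > zero_margin:
            # An exact zero right after a clearly non-zero value.
            flagged.append((i, "zero"))
        elif abs(cur - prev) > max_jump:
            # A step change larger than the allowed jump.
            flagged.append((i, "jump"))
    return flagged
```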
Error measures
We examine
- data coverage
- mean sea level, observed and predicted
- range (max-min)
- skewness (max/min)
- variance
- peaks (sum(frc_peak)/sum(obs_peak))
- hit rate using a 20 cm tolerance band
- peak hit rate using a tolerance level of 0.1
- largest forecast error
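As an illustration, some of the simpler measures in this list could be computed as in the sketch below. It assumes paired observed and forecast values in metres on the same 10-minute time stamps, and it reads the 20 cm tolerance band as an absolute error of at most 0.20 m; the source does not say whether the band is meant as ±20 cm or as a total width of 20 cm. The function name `basic_measures` is hypothetical.

```python
from statistics import mean, pvariance


def basic_measures(obs, frc, band=0.20):
    """Basic error measures for paired observed/forecast levels in metres."""
    errors = [f - o for o, f in zip(obs, frc)]
    return {
        "mean_obs": mean(obs),
        "mean_frc": mean(frc),
        "range_obs": max(obs) - min(obs),
        "range_frc": max(frc) - min(frc),
        "variance_obs": pvariance(obs),
        "variance_frc": pvariance(frc),
        # Assumption: a hit is an absolute error within the band.
        "hit_rate": sum(abs(e) <= band for e in errors) / len(errors),
        # Signed error with the largest magnitude.
        "largest_error": max(errors, key=abs),
    }
```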
For peaks, we use the 10 highest observed peaks, with a minimum time separation of 12 hours. The forecasted peak must lie within ±6 hours of the time of the observed peak.
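The peak selection and matching can be sketched as follows. Several details are assumptions made for the example and are not given in the source: peaks are picked greedily from the highest value downwards, the forecasted peak is taken as the highest forecast value inside the ±6 hour window, and the tolerance of 0.1 is treated as an absolute difference in the same unit as the levels. All function names are hypothetical.

```python
def select_peaks(times, levels, count=10, min_separation=12.0):
    """Indices of the `count` highest values at least min_separation hours apart."""
    order = sorted(range(len(levels)), key=lambda i: levels[i], reverse=True)
    chosen = []
    for i in order:
        if all(abs(times[i] - times[j]) >= min_separation for j in chosen):
            chosen.append(i)
            if len(chosen) == count:
                break
    return sorted(chosen)


def forecast_peak(times, frc, t_peak, window=6.0):
    """Highest forecast value within +/- window hours of an observed peak."""
    return max(f for t, f in zip(times, frc) if abs(t - t_peak) <= window)


def peak_measures(times, obs, frc, count=10, tolerance=0.1):
    """Return (sum(frc_peak)/sum(obs_peak), peak hit rate). Times in hours."""
    idx = select_peaks(times, obs, count)
    obs_peaks = [obs[i] for i in idx]
    frc_peaks = [forecast_peak(times, frc, times[i]) for i in idx]
    ratio = sum(frc_peaks) / sum(obs_peaks)
    hits = sum(abs(f - o) <= tolerance for o, f in zip(obs_peaks, frc_peaks))
    return ratio, hits / len(idx)
```

The minimum separation keeps two neighbouring samples from the same surge event from both being counted as peaks.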
Unbiasing
Skewness, peaks, hit rate, peak hit rate and worst case are examined twice:
- using model data 'as is'
- using unbiased model data, where the bias for the year and station in question has been subtracted
Bias may appear either as a general offset, or as a different representation of low waters only.
The latter happens typically when tide gauge data is skewed.
In that case, unbiasing will not lead to a better forecast.
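A minimal sketch of the unbiasing step, assuming the bias is the mean of forecast minus observation over all paired values for one year and station (the source does not spell out the formula, and the function name `unbias` is hypothetical):

```python
from statistics import mean


def unbias(obs, frc):
    """Subtract the mean forecast-minus-observation difference from the forecast.

    obs, frc: paired levels for one year and one station.
    Returns (bias, unbiased forecast values).
    """
    bias = mean(f - o for o, f in zip(obs, frc))
    return bias, [f - bias for f in frc]
```

With a general offset, for example `unbias([0.0, 1.0, 2.0], [0.5, 1.5, 2.5])`, the bias of 0.5 is removed and the forecast matches the observations. If the difference sits in the low waters only, subtracting the same mean value everywhere shifts the high waters away from the observations, which is why unbiasing does not help in that case.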
Average statistics for all stations
All stations
Pick an error measure
data coverage
mean sea level
range
skewness
variance
peaks
hit rate
peak hit rate
worst case
Pick a station
TorsmindeKyst
Esbjerg
Vidaa
Wick
Aberdeen
Immingham
Lowestoft
NorthShields
Sheerness
Helgoland
Cuxhaven
Borkum
Bremerhaven
Husum
HoekVanHolland
Vlissingen
RoompotBuiten
IJmuiden
DenHelder
Delfzijl
Tregde
Stavanger
Oostende