DTC Winter Forecast Experiment (DWFE) User Manual

Model Verification

Version 20050418

Table of Contents

1.0	Models
2.0	Parameter
3.0	Observation Type
4.0	Stratification
5.0	Time Increment
6.0	Output

The DTC Winter Forecast Experiment (DWFE) was motivated by the need of the National Weather Service to improve model guidance in support of its winter weather forecast and warning mission. To address that need, DWFE uses high-resolution (5 km) NWP models with improved physics. The experiment runs from 15 January 2005 through 31 March 2005 over a CONUS domain, with special emphasis on the Eastern United States. More information can be found on the DTC's website.

1.0 Models

The model refers to the forecast model being verified. Three selections are available to the user for DWFE verification: the Advanced Research WRF (WRF/ARW), the Non-Hydrostatic Mesoscale Model (WRF/NMM), and the operational Eta Model, also known as the North American Mesoscale (NAM) model. Statistics can be generated for any combination of models by checking the boxes next to the model names.

1.1 Advanced Research WRF (WRF/ARW)

The ARW is run at NCAR at 5 km resolution. It features an Eulerian mass (hydrostatic pressure) vertical coordinate and an Eulerian solver for the fully compressible nonhydrostatic equations. Detailed information about the WRF/ARW can be found at the following website.

1.2 Non-Hydrostatic Mesoscale Model (WRF/NMM)

The NMM was developed at NCEP and is currently being run at the Earth System Research Laboratory (ESRL) at a 5 km resolution. The NMM offers a non-hydrostatic dynamic core as a second option to the Eulerian mass core model. More information about the NMM can be found in the following PDF file.

1.3 NCEP Operational Eta Model

The Eta model is run operationally at NCEP on a 12 km grid. More information about this model can be found at the following website.

2.0 Parameter

The parameter refers to the forecast variable being verified. Verification is performed only on the 00Z model runs. The observation type and level options are dependent upon the parameter selected. Figure 2.1 shows the observation types and levels that are available for each parameter.

Figure 2.1. A schematic showing the possible combinations for the model, parameter, observation type, and level selections.

2.1 Precipitation (3-hourly)

The 3-hourly precipitation forecasts are verified grid-to-grid against NCEP Stage II radar/gauge precipitation analyses. Only 00Z model runs are verified. The verification always covers the last 3 hours of the specified lead time (forecast hour). For example, if one chooses a 24-hour forecast period, the verification is performed over hours 22, 23, and 24. Forecast and analysis products are remapped to a common grid in order to perform grid-to-grid verification.
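
As a rough illustration of the window rule described above, the following Python sketch lists the forecast hours covered by the 3-hourly verification for a given lead time. The function name is hypothetical and is not part of the DWFE verification system.

```python
def three_hour_window(lead_time_hours):
    """Return the forecast hours covered by the 3-hourly verification.

    Hypothetical helper: the window is the last 3 hours of the lead time.
    """
    if lead_time_hours < 3:
        raise ValueError("lead time must be at least 3 hours")
    return [lead_time_hours - 2, lead_time_hours - 1, lead_time_hours]


print(three_hour_window(24))  # [22, 23, 24]
```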

2.2 Precipitation (Daily)

Daily precipitation forecasts are verified grid-to-grid against NCEP's Climate Prediction Center (CPC) gauge-only 1/8th degree daily precipitation analysis. The CPC daily analysis is valid from 12Z one day to 12Z the next day. Because all DWFE model runs are performed at 00Z, daily verification is available only for forecast hours 12 through 36. Forecast and analysis products are remapped to a common grid in order to perform grid-to-grid verification.

2.3 Relative Humidity (RH)

The ratio of the vapor pressure to the saturation vapor pressure with respect to water.

2.4 Sea Level Pressure (SLP)

The atmospheric pressure reduced to sea level.

2.5 Temperature

The temperature at a given level.

2.6 Vector Wind

The magnitude and direction of the wind at a given level.

3.0 Observation Type

3.1 2m Data

These data include temperature and RH at 2 meters above the surface. Surface (SFC) is the only level option for this observation type.

3.2 10m Data

These data are winds at 10 meters above the surface. Surface (SFC) is the only level option for this observation type.

3.3 Any Surface Data

Includes data from any surface data source. Surface (SFC) is the only level option for this observation type.

3.4 Any Upper-Air

Includes data from any upper-air data source. These data are available at the following levels: 1000-850mb, 850-700mb, 700-550mb, 550-400mb, 400-300mb, 300-250mb, 250-200mb, and ALL.

3.5 Conventional Upper-Air

Conventional upper-air data (ADPUPA) come from radiosonde observations (RAOB), pilot balloon data (PIBAL), reconnaissance (RECCO), and dropsondes. Conventional upper-air data are available for the following levels: 1000mb, 850mb, 700mb, 500mb, 400mb, 300mb, 250mb, 200mb, 150mb, 100mb, and ALL.

3.6 Precipitation analysis

The 3-hourly precipitation data come from NCEP's Stage II Precipitation Analysis (radar + gauge). Daily (24-h) precipitation data come from NCEP's CPC 1/8th Degree Precipitation Analysis (gauge only). Both these data are available only at the surface. Forecast and analysis products are remapped to a common grid in order to perform the appropriate grid-to-grid verification.

3.7 Profiler

Profiler data come from NOAA wind profilers, which are designed to measure vertical profiles of horizontal wind speed and direction from near the surface to above the tropopause. Profiler data are available for the following levels: 1000-850mb, 850-700mb, 700-550mb, 550-400mb, 400-300mb, 300-250mb, 250-200mb, and ALL. More information about the NOAA Profiler Network can be found at http://www.profiler.noaa.gov/.

3.8 WSR-88D

These data come from the National Weather Service WSR-88D Doppler radar. WSR-88D data are available for the following levels: 1000-850mb, 850-700mb, 700-550mb, 550-400mb, 400-300mb, 300-250mb, 250-200mb, and ALL (includes all of the layers). More information about the WSR-88D radar can be found at http://www.srh.noaa.gov/radar/radinfo/radinfo.htm.

4.0 Stratification

4.1 Verification Grid

The models are run on a full U.S. domain. Verification is performed on the National domain (covers the CONUS), as well as the West, Central, and Eastern domains (Figure 4.1).

Figure 4.1. Map of the U.S. showing the West, Central, and Eastern domains.

4.2 Level

The Level is the pressure level in the atmosphere at which the forecast is evaluated. Level options include: Surface (SFC), 1000mb, 850mb, 700mb, 500mb, 400mb, 300mb, 250mb, 200mb, 150mb, 100mb. Several layers are also available, including: 1000-850mb, 850-700mb, 700-550mb, 550-400mb, 400-300mb, 300-250mb, 250-200mb. There is also an "ALL" option, which includes all relevant levels, depending on the observation type selected.

5.0 Time Increment

Only model runs from 00Z are verified. No other run times (06Z, 12Z, 18Z) are verified because the WRF models are run only at 00Z, out to 48 hours. For daily precipitation, verification scores are available only from 12Z to 12Z, so the only lead time (forecast hour) available is 36 hours. The 3-hourly verification, out to 24 hours, is for the last 3 hours in the 24-h period.

Figure 5.1. The NCEP verification time periods.

5.1 Beginning Date

The Beginning Date will default to either the previous date chosen by the user or to the earliest date for which data are available. The dates are used to allow access to statistics for any user-defined period of time (e.g., day, week, month, year). All dates in the NCEP verification system refer to the valid dates of the forecasts.

5.2 Ending Date

The Ending Date will default to either the previous date chosen by the user or to the latest date for which data are available (usually 2 days before the current date). All dates in the NCEP verification system refer to the valid dates of the forecasts.

5.3 Date Event Equalization

Date Event Equalization can be applied to the verification analyses to ensure that an equal number of observations is used to verify each of the models. When this function is turned off, all of the data available will be used for verification, regardless of whether data are available for each of the selected models. When event equalization is turned on, only data that are available for all models selected will be used in the verification analysis.
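
The effect of event equalization can be sketched in Python as an intersection over the cases available to each model. This is only an illustration under stated assumptions: the function name, the use of valid dates as case keys, and the sample values are hypothetical and do not describe how the NCEP verification system stores its data.

```python
def equalize_events(records_by_model):
    """Keep only the cases that are present for every selected model.

    records_by_model maps a model name to a dict keyed by case
    (here a valid date string) holding that model's verification data.
    """
    if not records_by_model:
        return {}
    # Cases shared by all models.
    common = set.intersection(*(set(r) for r in records_by_model.values()))
    return {
        model: {case: recs[case] for case in sorted(common)}
        for model, recs in records_by_model.items()
    }


# Made-up sample data: only 2005-03-15 is available for both models.
records = {
    "WRF/ARW": {"2005-03-14": 1.2, "2005-03-15": 0.8},
    "WRF/NMM": {"2005-03-15": 1.1, "2005-03-16": 0.9},
}
print(equalize_events(records))
```

With equalization turned off, the equivalent behavior would be to use `records` unchanged.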

5.4 Lead Time (Forecast Hour)

The Lead Time (Forecast Hour) is the period of the model forecast, which ranges from 0-48 hours. For example, if a user chooses a beginning and ending period of March 15 and a lead time (forecast hour) of 36 hours, then the model run for that verification occurred on 00Z of the previous day (March 14) and the verification is valid at 12Z on March 15. Recall that all dates in the NCEP verification system refer to the valid dates of the forecasts.
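
The relationship between valid time, lead time, and model run time in the example above can be checked with a few lines of Python. The function name is hypothetical; the arithmetic simply subtracts the lead time from the valid time.

```python
from datetime import datetime, timedelta


def model_run_time(valid_time, lead_time_hours):
    """Return the model initialization time for a given valid time and lead time."""
    return valid_time - timedelta(hours=lead_time_hours)


# A 36-hour forecast valid at 12Z on March 15 comes from the 00Z run on March 14.
valid = datetime(2005, 3, 15, 12)
print(model_run_time(valid, 36))  # 2005-03-14 00:00:00
```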

6.0 Output

6.1 Plot Type

6.1.1 Time Series

By default, the time series plot shows the date increment on the x-axis with the statistic plotted on the y-axis. Figure 6.1 shows an example of a time series plot.

Figure 6.1. Example of time series output, showing two time series plots with the valid time increment on the x-axis and the bias and RMSE statistics on the y-axis.

6.1.2 Height Series

Height series plots show the time averaged statistic on the x-axis and the pressure (height) levels on the y-axis. This gives a vertical representation of the statistical scores. Figure 6.2 shows an example of a height series plot.

Figure 6.2. Example of height series output. The top plot shows bias on the x-axis and the bottom plot shows the RMSE score.

6.1.3 All-Level Time Period Aggregation

The All-Level Time Period Aggregation plot can be created by selecting "ALL" for the Forecast Hour and "ALL" for the Level. The time-averaged skill score is displayed on the x-axis and the pressure levels are displayed on the y-axis.

Figure 6.3. Example of the Time Period Aggregation plot.

6.1.4 Single-Level Time Period Aggregation

The Single-Level Time Period Aggregation plot can be created by selecting "ALL" for the Forecast Hour and any desired Level. The forecast hour is plotted on the x-axis and the time-averaged skill score is plotted on the y-axis for the selected pressure level.

Figure 6.4. Example of the Single-Level Time Period Aggregation plot.

6.2 Statistics for Surface & Upper-Air Verification

Verification for continuous variables (RH, Temp, SLP, Vector Wind) is performed by 3-dimensionally interpolating model forecast values to observation locations. The technique is to bi-linearly interpolate in the horizontal and to perform linear in ln(p) interpolation in the vertical.
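
A minimal Python sketch of the two interpolation steps follows. It assumes a single rectangular grid cell with the observation at fractional position (x, y) inside it, and two bounding pressure levels. The function names, argument layout, and sample values are illustrative assumptions, not the actual verification code.

```python
import math


def bilinear(f00, f10, f01, f11, x, y):
    """Bilinear interpolation inside one grid cell.

    f00, f10, f01, f11 are the values at the corners (x, y) = (0, 0),
    (1, 0), (0, 1), (1, 1); x and y are fractional positions in [0, 1].
    """
    bottom = f00 * (1 - x) + f10 * x
    top = f01 * (1 - x) + f11 * x
    return bottom * (1 - y) + top * y


def interp_ln_p(p, p_lower, f_lower, p_upper, f_upper):
    """Interpolate linearly in ln(p) between two pressure levels."""
    w = (math.log(p) - math.log(p_lower)) / (math.log(p_upper) - math.log(p_lower))
    return f_lower + w * (f_upper - f_lower)


# Horizontal step on each bounding level, then the vertical step to 600 mb.
t_700 = bilinear(270.0, 271.0, 272.0, 273.0, 0.5, 0.5)
t_500 = bilinear(250.0, 251.0, 252.0, 253.0, 0.5, 0.5)
print(interp_ln_p(600.0, 700.0, t_700, 500.0, t_500))
```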

6.2.1 Bias

Bias is the difference between the forecast and the observed value.

Bias = (F - O)

6.2.2 Correlation

Correlation is the measure of association between two variables. It measures how strongly the variables are related, or change, with each other.

6.2.3 Covariance

Covariance is a measure of the relationship between the forecasts and observations and is defined as the average of the products of the deviations of each forecast/observation pair from their respective mean.

6.2.4 Forecast Average

The mean of the forecast data.

6.2.5 Forecast*Observation Standard Deviation (F*O STDV)

The standard deviation of the forecast times the observation.

6.2.6 Forecast Standard Deviation (STDV)

The square root of the average of the squares of deviations about the mean of the forecast data.

6.2.7 Forecast Variance

A measure of the dispersion of the forecast data set.

6.2.8 Mean Absolute Error (MAE)

The mean absolute error is the average of the absolute values of the differences between the forecast and observed values (the errors).

6.2.9 Mean Squared Error (MSE)

The average of the square of the difference between the forecast and observed values (the error).

6.2.10 Observation Average

The mean of the observations.

6.2.11 Observation Standard Deviation (STDV)

The square root of the average of the squares of deviations about the mean of the observation data.

6.2.12 Observation Variance

A measure of the dispersion of the observation data set.

6.2.13 Root Mean Squared Error (RMSE)

The square root of the mean square error (MSE).
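
The continuous statistics defined in this section can be sketched in Python as follows. This is an illustration only. It assumes that each statistic is averaged over all forecast/observation pairs (so bias is the mean of F - O) and that variances and covariance divide by the number of pairs; the function name and the sample values are hypothetical. The F*O standard deviation is omitted because its exact formula is not given here.

```python
import math


def continuous_stats(forecasts, observations):
    """Compute basic continuous verification statistics for matched pairs."""
    n = len(forecasts)
    if n == 0 or n != len(observations):
        raise ValueError("need equal-length, non-empty forecast and observation lists")
    f_mean = sum(forecasts) / n
    o_mean = sum(observations) / n
    errors = [f - o for f, o in zip(forecasts, observations)]
    mse = sum(e * e for e in errors) / n
    # Assumption: population form (divide by n), not the sample form (n - 1).
    f_var = sum((f - f_mean) ** 2 for f in forecasts) / n
    o_var = sum((o - o_mean) ** 2 for o in observations) / n
    cov = sum((f - f_mean) * (o - o_mean)
              for f, o in zip(forecasts, observations)) / n
    f_std, o_std = math.sqrt(f_var), math.sqrt(o_var)
    corr = cov / (f_std * o_std) if f_std > 0 and o_std > 0 else float("nan")
    return {
        "bias": sum(errors) / n,
        "mae": sum(abs(e) for e in errors) / n,
        "mse": mse,
        "rmse": math.sqrt(mse),
        "forecast_average": f_mean,
        "observation_average": o_mean,
        "forecast_variance": f_var,
        "observation_variance": o_var,
        "forecast_stdv": f_std,
        "observation_stdv": o_std,
        "covariance": cov,
        "correlation": corr,
    }


stats = continuous_stats([2.0, 4.0, 6.0], [1.0, 5.0, 6.0])
print(stats["bias"], stats["rmse"], stats["correlation"])
```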

6.3 Statistics for Precipitation Verification

Dichotomous statistics are used to verify precipitation at specified thresholds of 0.01, 0.1, 0.25, 0.5, 0.75, 1.0, 2.0, and 3.0 inches. The forecast/observation pairs used to compute the skill scores are summarized in Table 6.1. The rows in the table represent the forecasts, the columns represent the observations, and the elements in the cells represent the counts of forecast/observation pairs (YY, YN, NY, NN). In each count, the first letter refers to the forecast and the second to the observation.

Table 6.1 Contingency table for evaluation of dichotomous (Yes/No) forecasts. Elements in the cells are the counts of forecast/observation pairs.

                    Observation
Forecast    Yes         No          Total
Yes         YY          YN          YY+YN
No          NY          NN          NY+NN
Total       YY+NY       YN+NN       YY+YN+NY+NN
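
As an illustration of how the four counts might be accumulated from gridded values, the Python sketch below classifies each forecast/observation pair against a threshold. It assumes that "Yes" means a value at or above the threshold, which this manual does not state explicitly; the function name and the sample values are hypothetical.

```python
def contingency_counts(forecasts, observations, threshold):
    """Count YY, YN, NY, NN pairs (first letter forecast, second observation)."""
    counts = {"YY": 0, "YN": 0, "NY": 0, "NN": 0}
    for f, o in zip(forecasts, observations):
        # Assumption: an event is a value greater than or equal to the threshold.
        key = ("Y" if f >= threshold else "N") + ("Y" if o >= threshold else "N")
        counts[key] += 1
    return counts


# Made-up precipitation amounts in inches, verified at the 0.1 inch threshold.
fcst = [0.30, 0.05, 0.00, 0.50]
obs = [0.25, 0.20, 0.00, 0.02]
print(contingency_counts(fcst, obs, 0.1))  # {'YY': 1, 'YN': 1, 'NY': 1, 'NN': 1}
```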

6.3.1 Bias

Bias is the ratio of the number of Yes forecasts to the number of Yes observations. A bias greater than one indicates over-forecasting and a bias less than one indicates under-forecasting.

Bias = (YY + YN) / (YY + NY)

6.3.2 Conditional Miss Rate (CMR)

The detection failure ratio. The proportion of No forecasts for which the event was observed.

CMR = NY / (NY + NN)

6.3.3 Critical Success Index (CSI)

CSI is the proportion of events that were either forecast or observed that were correctly forecast. CSI is also known as Threat Score.

CSI = YY / (YY + NY + YN)

6.3.4 Equitable Threat Score (ETS)

The Equitable Threat Score (ETS) measures the fraction of all events forecast and/or observed that were correctly diagnosed, accounting for the hits that would occur purely due to random chance. A score of one is considered perfect for this statistic.

ETS = (YY - C2) / [(YY + YN + NY) - C2]

where C2 = [(YY + YN)*(YY + NY)] / (YY + YN + NY + NN)
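
A short worked example of the ETS formula in Python is given below. The function name and the counts are made up for illustration.

```python
def equitable_threat_score(yy, yn, ny, nn):
    """ETS = (YY - C2) / [(YY + YN + NY) - C2], with C2 the chance hits."""
    total = yy + yn + ny + nn
    c2 = (yy + yn) * (yy + ny) / total
    denominator = (yy + yn + ny) - c2
    if denominator == 0:
        return float("nan")  # score is undefined for this table
    return (yy - c2) / denominator


# C2 = 40 * 50 / 200 = 10, so ETS = (30 - 10) / (60 - 10) = 0.4
print(equitable_threat_score(30, 10, 20, 140))
```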

6.3.5 False Alarm Rate (FARate)

The probability of false detection (POFD). The proportion of observed No events that were incorrectly forecast as Yes.

FARate = YN / (YN + NN)

6.3.6 False Alarm Ratio (FAR)

FAR is the proportion of Yes forecasts that were not accurate.

FAR = YN / (YY + YN)

6.3.7 Heidke Skill Score

Heidke Skill Score (Heidke) measures the increase in proportion correct for the forecast system, relative to that of random chance.

Heidke = (YY + NN - C1) / (N - C1)

where N = YY + NY + YN + NN

C1 = [ (YY + YN)*(YY + NY) + (NY + NN)*(YN + NN) ] / N
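
The Heidke calculation can be illustrated in Python as follows. The sketch takes C1 to be the number of correct forecasts expected by chance, that is, the sum of the two marginal products divided by N. The function name and the counts are made up for illustration.

```python
def heidke_skill_score(yy, yn, ny, nn):
    """Heidke = (YY + NN - C1) / (N - C1)."""
    n = yy + yn + ny + nn
    # Expected number of correct forecasts (hits plus correct negatives) by chance.
    c1 = ((yy + yn) * (yy + ny) + (ny + nn) * (yn + nn)) / n
    if n == c1:
        return float("nan")  # score is undefined for this table
    return (yy + nn - c1) / (n - c1)


# C1 = (40 * 50 + 160 * 150) / 200 = 130, so Heidke = (170 - 130) / (200 - 130)
print(heidke_skill_score(30, 10, 20, 140))
```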

6.3.8 Log of Odds Ratio ( ln(OR) )

The log of the Odds Ratio.

6.3.9 Odds Ratio (OR)

The Odds Ratio (OR) gives the ratio of the odds of making a hit to the odds of making a false alarm, and takes prior probability into account.

OR = (YY * NN) / (NY * YN)

6.3.10 Odds Ratio Skill Score (ORSS)

The Odds Ratio Skill Score (ORSS) is a transformation of the odds ratio to have the range [-1, +1].

ORSS = [(YY * NN) - (NY * YN)] / [(YY * NN) + (NY * YN)]

6.3.11 Odds Ratio Skill Score Cubed (ORSS^3)

The ORSS to the third power.
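
The odds ratio and the scores derived from it can be computed together, as in the Python sketch below. The function name and the counts are made up for illustration, and the sketch simply rejects tables in which either product is zero, since the odds ratio or its logarithm is then undefined.

```python
import math


def odds_ratio_scores(yy, yn, ny, nn):
    """Return OR, ln(OR), ORSS, and ORSS^3 from the contingency table counts."""
    correct = yy * nn    # hits times correct negatives
    incorrect = ny * yn  # misses times false alarms
    if correct == 0 or incorrect == 0:
        raise ValueError("odds ratio scores need non-zero YY*NN and NY*YN")
    odds_ratio = correct / incorrect
    orss = (correct - incorrect) / (correct + incorrect)
    return {
        "OR": odds_ratio,
        "ln(OR)": math.log(odds_ratio),
        "ORSS": orss,
        "ORSS^3": orss ** 3,
    }


# OR = (30 * 140) / (20 * 10) = 21
print(odds_ratio_scores(30, 10, 20, 140))
```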

6.3.12 Peirce-Hanssen-Kuipers Skill Score (PHK)

The probability of detection minus the probability of false detection (the False Alarm Rate). Also referred to as the True Skill Statistic.

PHK = ( YY / (YY + NY) ) - ( YN / (YN + NN) )

6.3.13 Probability of Detection (POD)

The PODy is defined as the probability of detecting a YES event. It is the proportion of YES events that were correctly forecast.

PODy = YY / (YY + NY)

6.3.14 Post Agreement

The proportion of "forecast events" that were correctly forecast. The frequency of hits (1-FAR).

PA = YY / (YY + YN)

6.3.15 STDV ln(OR)

The standard deviation of the log of the Odds Ratio.

6.3.16 Threat Score

The threat score is the proportion of events that were either forecast or observed that were correctly forecast (hits). It is also known as CSI.

Threat Score = YY / (YY + NY + YN)
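
For reference, the simple ratio scores defined in this section can be evaluated side by side from one set of counts. The Python sketch below is an illustration only: the function name and the counts are made up, and it assumes that every denominator is non-zero.

```python
def ratio_scores(yy, yn, ny, nn):
    """Ratio-based scores from the four contingency table counts.

    The first letter of each count is the forecast, the second the observation.
    Assumes every denominator is non-zero.
    """
    pod = yy / (yy + ny)
    farate = yn / (yn + nn)
    return {
        "Bias": (yy + yn) / (yy + ny),
        "CMR": ny / (ny + nn),
        "CSI": yy / (yy + ny + yn),  # same as the Threat Score
        "FARate": farate,
        "FAR": yn / (yy + yn),
        "PHK": pod - farate,
        "PODy": pod,
        "PA": yy / (yy + yn),        # equals 1 - FAR
    }


for name, value in ratio_scores(30, 10, 20, 140).items():
    print(f"{name}: {value:.3f}")
```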

