Performance Metrics for Weather Images Forecasting

In a typical machine learning project, we find out how good or bad a model is by measuring its performance on a test dataset with statistical metrics.

Different performance metrics are used for different problems, depending on what the model needs to optimize. In this blog post, we focus on the evaluation metrics used in weather forecasting based on radar images.

The major problem that our forecasting model needs to overcome is to quickly, precisely, and accurately predict the movement of rain clouds over a short period. If heavy rain is predicted as showers or, even worse, as cloudy weather without rain, the consequences could be serious for the users of our prediction model. If the rain is going to stop in the next few minutes, an incorrect forecast that predicts continued high-intensity rainfall may cause little to no harm, but it makes the prediction model much less useful.

A good model should tackle as many of these issues as possible. We believe that the following measures may help us identify which model is better.

Performance Metrics

  • Root Mean Square Error (RMSE): This is a broad measure of accuracy, expressed as an average error across forecast-observation pairs. Formally, it is defined as follows:

RMSE = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (f_i - o_i)^2}

where f_i is the forecast value, o_i is the observed value, and N is the number of forecast-observation pairs. This measure tells us how much the predicted intensity differs from the ground-truth observation.

Figure 1: RMSE of 2 models across a 60-minute forecast.

Figure 1 shows an example of the RMSE of 2 different models over a 60-minute forecast.

The RMSE of Model 1 increases over time. Model 2, on the other hand, appears to have a smaller RMSE, which makes it the better of the two models on this measure.
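To make a comparison like the one in Figure 1 concrete, here is a minimal sketch of computing the RMSE separately for each forecast lead time. The array layout (lead time, height, width) and the function name `rmse_per_lead_time` are assumptions made for this illustration, not the pipeline used for the figure.

```python
import numpy as np

def rmse_per_lead_time(forecast, observed):
    """Return one RMSE value per lead time.

    forecast, observed: arrays of shape (lead_times, height, width)
    holding rain intensity. The layout is an assumption of this sketch.
    """
    forecast = np.asarray(forecast, dtype=float)
    observed = np.asarray(observed, dtype=float)
    squared_error = (forecast - observed) ** 2
    # Average over the two spatial axes and keep the lead-time axis.
    return np.sqrt(squared_error.mean(axis=(1, 2)))

# Toy data: 3 lead times of 2x2 "radar images".
observed = np.zeros((3, 2, 2))
forecast = np.stack([np.full((2, 2), 1.0),
                     np.full((2, 2), 2.0),
                     np.full((2, 2), 3.0)])
print(rmse_per_lead_time(forecast, observed))  # [1. 2. 3.]
```

Plotting the returned array against the lead time for each model would give a curve per model, which is one way to produce a chart of this kind.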

Before defining the next metric, we need to recall the Confusion Matrix (Figure 2). Each column of the matrix represents the instances in an actual class while each row represents the instances in a predicted class, or vice versa [1]. Using the Confusion Matrix, we can count the number of False Positives (FP), False Negatives (FN), True Positives (TP), and True Negatives (TN).

Figure 2: Definition of a Confusion Matrix.
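For radar images, the four counts can be obtained pixel by pixel once both images are turned into rain / no-rain masks. The sketch below assumes a hypothetical intensity cutoff `threshold`; the value 0.5 is chosen only for illustration.

```python
import numpy as np

def confusion_counts(forecast, observed, threshold=0.5):
    """Return (TP, FP, FN, TN) for rain / no-rain pixels.

    threshold is a hypothetical rain-intensity cutoff used only in this sketch.
    """
    pred_rain = np.asarray(forecast) >= threshold
    obs_rain = np.asarray(observed) >= threshold
    tp = int(np.sum(pred_rain & obs_rain))    # rain predicted, rain observed
    fp = int(np.sum(pred_rain & ~obs_rain))   # rain predicted, no rain observed
    fn = int(np.sum(~pred_rain & obs_rain))   # no rain predicted, rain observed
    tn = int(np.sum(~pred_rain & ~obs_rain))  # no rain predicted, none observed
    return tp, fp, fn, tn

observed = np.array([[1.0, 0.0], [2.0, 0.0]])
forecast = np.array([[0.8, 0.9], [0.1, 0.0]])
print(confusion_counts(forecast, observed))  # (1, 1, 1, 1)
```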
  • Hit Rate (H): The fraction of observed events that are forecast correctly, also known as the Probability of Detection. It tells us what proportion of the points that actually had rain were predicted by the algorithm as having rain. It ranges from 0 to 1 and is calculated as follows:

H = TP / (TP + FN)

Figure 3: Hit rate of 2 models across 60 forecast times.

From Figure 3, the Hit Rate of both models is good in the first 20 minutes. Model 2 has a higher value than Model 1 (a higher probability of detecting rain). Therefore, Model 2 is the better model based on this measure.
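A curve like the one in Figure 3 can be built by computing the Hit Rate, H = TP / (TP + FN), once per lead time. This is a minimal sketch on hypothetical boolean rain masks. Returning NaN when no rain was observed is a design choice of this example, since the ratio is undefined in that case.

```python
import numpy as np

def hit_rate(pred_rain, obs_rain):
    """H = TP / (TP + FN) for rain masks; NaN if no rain was observed."""
    pred_rain = np.asarray(pred_rain, dtype=bool)
    obs_rain = np.asarray(obs_rain, dtype=bool)
    tp = int(np.sum(pred_rain & obs_rain))
    fn = int(np.sum(~pred_rain & obs_rain))
    if tp + fn == 0:
        return float("nan")
    return tp / (tp + fn)

# Hypothetical masks for two lead times (1 = rain, 0 = no rain).
obs = np.array([[1, 1, 0, 0], [1, 1, 1, 0]])
pred = np.array([[1, 1, 1, 0], [1, 0, 0, 0]])
hit_rates = [hit_rate(p, o) for p, o in zip(pred, obs)]
print(hit_rates)
```

In this toy example the first lead time detects all observed rain, while the second detects only one of three rainy points, so the Hit Rate drops with lead time.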

  • False Alarm Ratio (FAR): The fraction of “yes” forecasts that were wrong. It is calculated as follows:

FAR = FP / (TP + FP)

In weather forecasting, false alarms do not lead to serious consequences. Even so, a model with a high FAR is not ideal.
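As a minimal sketch, the False Alarm Ratio, FAR = FP / (TP + FP), can be computed from the same kind of boolean rain masks. The function name and the NaN convention for "rain was never forecast" are assumptions of this example.

```python
import numpy as np

def false_alarm_ratio(pred_rain, obs_rain):
    """FAR = FP / (TP + FP); NaN if rain was never forecast."""
    pred_rain = np.asarray(pred_rain, dtype=bool)
    obs_rain = np.asarray(obs_rain, dtype=bool)
    tp = int(np.sum(pred_rain & obs_rain))
    fp = int(np.sum(pred_rain & ~obs_rain))
    if tp + fp == 0:
        return float("nan")
    return fp / (tp + fp)

# Hypothetical masks: 3 points forecast as rain, 2 of them actually had rain.
pred = np.array([1, 1, 1, 0])
obs = np.array([1, 1, 0, 0])
far = false_alarm_ratio(pred, obs)
print(far)
```

Here one of the three "yes" forecasts is wrong, so the FAR is one third.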

  • Bias (B): This measure compares the number of points predicted as having rain with the total number of points that actually had rain. Specifically,

B = (TP + FP) / (TP + FN)
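Following the description above, Bias can be sketched as the count of points forecast as rain divided by the count of points observed as rain, that is (TP + FP) / (TP + FN). The function name `frequency_bias` is chosen for this illustration only.

```python
import numpy as np

def frequency_bias(pred_rain, obs_rain):
    """B = (TP + FP) / (TP + FN); NaN if no rain was observed."""
    pred_rain = np.asarray(pred_rain, dtype=bool)
    obs_rain = np.asarray(obs_rain, dtype=bool)
    forecast_yes = int(np.sum(pred_rain))  # TP + FP
    observed_yes = int(np.sum(obs_rain))   # TP + FN
    if observed_yes == 0:
        return float("nan")
    return forecast_yes / observed_yes

# Hypothetical masks: 3 points forecast as rain, 2 points observed as rain.
pred = np.array([1, 1, 1, 0])
obs = np.array([1, 1, 0, 0])
print(frequency_bias(pred, obs))  # 1.5
```

A value above 1 means rain is forecast more often than it is observed, and a value below 1 means it is forecast less often. Note that a value of exactly 1 does not by itself mean the forecast points are in the right places.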


Monte Carlo Simulation

On a nice day two years ago, when I was working in the financial field, my boss sent our team an email asking us to propose some machine learning techniques to predict stock prices.

So, after accepting the assignment from our manager, our team began to research and apply several approaches to prediction. When we talk about machine learning, we often think of supervised and unsupervised learning. But one of the algorithms we applied is often forgotten, yet equally effective: Monte Carlo simulation.

What is Monte Carlo simulation?

The Monte Carlo method is a technique that uses random numbers and probability to solve complex problems. The Monte Carlo simulation, or probability simulation, is a technique used to understand the impact of risk and uncertainty in financial sectors, project management, costs, and other forecasting machine learning models.[1]

Now let’s jump into a Python implementation to see how it is applied.

Python Implementation

In this task, we used the DXG stock dataset from 2017/01/01 to 2018/08/24, and we would like to know the stock price after 10 days, 1 month, and 3 months, respectively.


We will simulate the daily returns of the stock, and each next price will be calculated from the previous one, starting from the last observed price P(0):

P(t) = P(t-1) * (1 + return_simulate(t))

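Calculate the mean and standard deviation of the stock returns

import numpy as np

# stock_returns holds the historical daily returns of the stock
miu = np.mean(stock_returns, axis=0)  # mean daily return
dev = np.std(stock_returns)  # standard deviation of daily returns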
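The article does not show how `stock_returns` was built. One common way, shown here purely as an assumption, is to take simple daily returns from a series of closing prices with pandas. The prices below are made up and are not the DXG data.

```python
import numpy as np
import pandas as pd

# Hypothetical closing prices; the article's DXG data is not reproduced here.
close = pd.Series([100.0, 102.0, 99.96, 101.9592])

# Simple daily return: P(t) / P(t-1) - 1. The first value is NaN and is dropped.
stock_returns = close.pct_change().dropna().to_numpy()

miu = np.mean(stock_returns, axis=0)  # mean daily return
dev = np.std(stock_returns)           # population standard deviation (ddof=0)
print(stock_returns, miu, dev)
```

On a NumPy array, `np.std` uses `ddof=0` by default. Passing `ddof=1` would give the sample standard deviation instead, which matters little for long price histories.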

Simulation process


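import pandas as pd

# Each column of simulation_df is one simulated price path of length train_days.
simulation_df = pd.DataFrame()
last_price = init_price
for x in range(mc_rep):
    daily_vol = dev
    # First simulated price, starting from the last observed price.
    price = last_price * (1 + np.random.normal(miu, daily_vol))
    price_series = [price]
    # Compound one randomly drawn daily return for each remaining day.
    for y in range(train_days - 1):
        price = price_series[-1] * (1 + np.random.normal(miu, daily_vol))
        price_series.append(price)
    simulation_df[x] = price_series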
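The same process can also be written without explicit loops. The sketch below is an alternative formulation, not the code used for the article's figures: it draws all daily returns at once and compounds them with a cumulative product. The function name `simulate_paths`, the `seed` parameter, and the example values are assumptions for illustration.

```python
import numpy as np
import pandas as pd

def simulate_paths(init_price, miu, dev, train_days, mc_rep, seed=None):
    """Simulate mc_rep price paths of train_days steps each.

    Rows are days and columns are simulation runs, matching the layout
    of simulation_df in the loop-based version.
    """
    rng = np.random.default_rng(seed)
    # One normally distributed daily return per (day, run).
    daily_returns = rng.normal(miu, dev, size=(train_days, mc_rep))
    # P(t) = P(0) * (1 + r_1) * ... * (1 + r_t)
    prices = init_price * np.cumprod(1 + daily_returns, axis=0)
    return pd.DataFrame(prices)

# Hypothetical inputs, not the DXG estimates.
paths = simulate_paths(init_price=100.0, miu=0.001, dev=0.02,
                       train_days=10, mc_rep=500, seed=0)
print(paths.shape)  # (10, 500)
```

Fixing `seed` makes a run reproducible, which is useful when comparing forecasts for different horizons.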

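Visualizing the Monte Carlo Simulation

import matplotlib.pyplot as plt

fig = plt.figure()
fig.suptitle('Monte Carlo Simulation')
plt.plot(simulation_df)  # one line per simulated price path
plt.axhline(y = last_price, color = 'r', linestyle = '-')  # last observed price
plt.show()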

Now, let’s compare with the actual stock price after 10 days, 1 month, and 3 months.

# Row 9 of simulation_df holds the simulated prices for the 10th day
plt.hist(simulation_df.iloc[9,:],bins=15,label ='histogram')
plt.axvline(x = test_simulate.iloc[10], color = 'r', linestyle = '-',label ='Price at 10th')
plt.title('Histogram simulation and last price of 10th day')
plt.legend()
plt.show()


We can see that the most frequently simulated price is pretty close to the actual price after 10 days.
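Instead of reading the comparison off the histogram, it can be summarized numerically. The sketch below is an illustration on synthetic data, with a hypothetical function name and a hypothetical actual price. It reports the most populated histogram bin and checks whether the actual price falls inside the central 90% of the simulated prices.

```python
import numpy as np

def summarize_simulation(simulated_prices, actual_price, bins=15):
    """Compare one actual price with the simulated prices for the same day."""
    simulated_prices = np.asarray(simulated_prices, dtype=float)
    counts, edges = np.histogram(simulated_prices, bins=bins)
    top = int(np.argmax(counts))  # index of the most populated bin
    low, high = np.percentile(simulated_prices, [5, 95])
    return {
        "modal_bin": (float(edges[top]), float(edges[top + 1])),
        "interval_90": (float(low), float(high)),
        "actual_in_interval": bool(low <= actual_price <= high),
    }

# Synthetic stand-in for one row of simulation_df.
rng = np.random.default_rng(0)
simulated = 100 * (1 + rng.normal(0.0, 0.05, size=1000))
summary = summarize_simulation(simulated, actual_price=101.0)
print(summary)
```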

As the forecast period gets longer, the results gradually get worse.
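One way to see why longer horizons are harder is to look at how the spread of the simulated prices grows with the number of days. This is a sketch with hypothetical parameters. It assumes independent normal daily returns, and it uses 21 and 63 trading days as rough stand-ins for 1 month and 3 months.

```python
import numpy as np

rng = np.random.default_rng(0)
miu, dev = 0.0005, 0.02     # hypothetical daily mean and standard deviation
mc_rep, horizon = 2000, 63  # number of simulated paths, number of days

# Rows are days, columns are simulated paths, starting from a price of 100.
returns = rng.normal(miu, dev, size=(horizon, mc_rep))
paths = 100.0 * np.cumprod(1 + returns, axis=0)

for day in (10, 21, 63):
    spread = paths[day - 1].std()
    print(f"day {day}: std of simulated prices = {spread:.2f}")
```

Under these assumptions the spread keeps widening as the horizon grows, so the histogram of simulated prices becomes flatter and the most frequent simulated price is a weaker guide to the actual price.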

Figure: simulation for the next 1 month.

Figure: simulation after 3 months.


Monte Carlo simulation is used a lot in finance. Although it has some weaknesses, we hope this article gives you a new perspective on applying simulation to forecasting.


[1] Pratik Shukla, Roberto Iriondo, “Monte Carlo Simulation An In-depth Tutorial with Python”, medium,
