When you look at a weather forecast, you typically see one prediction for the time you’re interested in: temperatures of 12.3 degrees at 1PM, wind force 9 at 8 o’clock, and so forth. This suggests that the forecast is highly certain. However, a complex world is hidden behind that single value, which is produced by weather models that may contain errors. Furthermore, certain weather events – such as storm systems – add additional uncertainty to weather forecasts. In other words: can we trust that single value?
Sometimes, the answer is no. In these cases, providing the best forecast to our customers does not mean stating a single value, but rather communicating the uncertainties. In other words, this means communicating the range of possibilities for these weather elements, typically generated by ensemble models.
In this blog in the AI in weather forecasting series, we dive into these models. First, we’ll investigate why we need ensemble models in the first place – regardless of whether we’re using physics-based numerical models or AI-based models. This is followed by a range of approaches used to generate AI-based ensemble forecasts, to highlight developments in that area.
NWP and AI: the need for ensemble weather forecasting
If we recall how weather models work, we find that each forecast starts with an initial state, known as an analysis. For an arbitrary numerical weather model, an analysis is typically a previous forecast for the analysis time step T, adjusted mathematically to observations measured since that forecast was issued. Many of today’s AI-based weather models still rely on NWP-based analyses, although developments in AI-based data assimilation are on the horizon. However, the way the analysis is generated already reveals a problem: being an adjusted forecast, it is always a bit off with respect to actual weather conditions. This is inevitable, because we cannot take measurements at every location and at every altitude. However, it leads to uncertainty in the weather forecast. One form of uncertainty is therefore:
- Uncertainty from initial conditions: NWP models use physics to compute the analysis forward in time; AI models use their learned mapping for doing so. If the initial conditions slightly deviate from actual weather, what does that mean for predictions 3 days from now? And 9 days from now? And 30 days from now?
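The effect of slightly deviating initial conditions can be illustrated with a toy chaotic system. The sketch below uses the Lorenz-63 equations (a classic stand-in for atmospheric dynamics, not an actual weather model) to show how two nearly identical starting states drift apart as they are stepped forward in time:

```python
import numpy as np

def lorenz63_step(state, dt=0.01, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """One forward-Euler step of the chaotic Lorenz-63 system."""
    x, y, z = state
    return state + dt * np.array([
        sigma * (y - x),
        x * (rho - z) - y,
        x * y - beta * z,
    ])

def run(state, n_steps):
    for _ in range(n_steps):
        state = lorenz63_step(state)
    return state

# Two "analyses" that differ by a tiny perturbation in a single variable.
a = run(np.array([1.0, 1.0, 1.0]), 2000)
b = run(np.array([1.0 + 1e-6, 1.0, 1.0]), 2000)

print(np.abs(a - b))  # the tiny initial difference has grown dramatically
```

The same mechanism is at work in real weather models: a small analysis error that is negligible at forecast hour 0 can dominate the forecast a week later.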
Now let’s look at the actual forecasting done by weather models: forecasting future weather based on current conditions. Both NWP- and AI-based weather models are computer programs. In NWP models, physics has been programmed into the model, while smaller-scale phenomena are resolved by parameterizations. However, to reduce computational cost, weather predictions are discretized: the model predicts for a grid, with grid points that are typically between 1 and 25 kilometers apart. In other words, for many models, all temperatures in a 1x1 kilometer box (and in some cases, even larger boxes) for one time step are summarized into one value. Since AI-based models are mostly trained on NWP analyses, that applies to them too. And even worse – they use learned parameters. What if they perform well in, let’s say, 98% of the cases, but not in the other 2%? Another source of uncertainty is:
- Uncertainty from the model itself: all models are wrong, but some are useful. Trade-offs for computational cost in NWP models and potential errors from training AI-based weather models mean that errors are introduced in predictions. Since these errors are used for generating the next predictions (be it through the application of physics or the learned parameters), error accumulates over time, leading to uncertainty.
Not only can models and their initialization produce uncertainty; the weather itself can too! In the case of complex and unpredictable weather events, such as thunderstorms or winter weather, even a small difference in conditions can make a difference of night and day. This leads us to the third source of uncertainty in a deterministic weather forecast:
- Uncertainty from weather events: relatively small-scale, complex and unpredictable weather events can add significant uncertainty to a weather forecast, especially when they are predicted a bit further ahead in time.
Figure 1: Small errors in initial model conditions, the ways in which weather models reduce computational complexity, and the weather itself may lead to entirely different forecasts.
Introducing AI-based ensemble weather models
Let’s now introduce ensemble weather models: a forecasting approach that uses multiple simulations to predict weather conditions. Rather than relying on a single forecast, an ensemble model generates a set of predictions, each slightly varied. By producing a range of possible scenarios, ensemble models offer a more comprehensive view of potential weather patterns.
Let’s examine how ensemble forecasts are generated. First, we’ll look at using slightly different input data – an approach that is common in NWP-based ensemble models but has also been tested for AI-based ensembles. Then, we’ll explore approaches specifically developed for AI-based weather models, such as post-hoc processing, using generative techniques and developing AI/NWP based hybrids for ensemble forecasting.
Figure 2: Because ensemble models effectively combine multiple forecasts into one, it is possible to say something about the most likely weather (in this case, around the ensemble mean) but also something about the uncertainty. In the case above, after about 3 days, uncertainty increases, as can be seen from the widening probability ranges.
Using slightly different input data
Recall that the initial conditions are used to generate the weather forecast, regardless of the model being an NWP model or an AI model. In other words, if we slightly change the initial conditions in a reasonable way, this has a direct impact on the forecast. This is the basis of ensemble models.
For example, in NWP, the ECMWF ensemble model generates 50 individual forecasts with slightly different initial conditions. These forecasts, also known as members of the ensemble model, are then statistically aggregated into values like the ensemble mean and probability ranges via the percentiles. Verification of ensemble data suggests that ensemble means tend to perform well for forecasting, as they cancel out outliers in weather forecasts.
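The aggregation step is straightforward statistics over the member dimension. The sketch below fakes 50 members with random draws (real members would come from a model run) and computes the ensemble mean and a 10–90% probability range:

```python
import numpy as np

rng = np.random.default_rng(42)

# Toy stand-in for ensemble output: 50 members x 10 forecast lead times,
# e.g. 2-meter temperature in degrees Celsius.
members = 12.0 + rng.normal(0.0, 1.5, size=(50, 10))

# Aggregate across the member dimension (axis 0).
ens_mean = members.mean(axis=0)
p10, p90 = np.percentile(members, [10, 90], axis=0)

for t, (lo, m, hi) in enumerate(zip(p10, ens_mean, p90)):
    print(f"lead {t}: {lo:5.1f} .. {m:5.1f} .. {hi:5.1f}")
```

The same pattern generalizes to gridded fields: the member axis is reduced while the spatial and temporal axes are kept.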
Figure 3: Two members from the ECMWF ensemble model give entirely different 2-meter temperature forecasts 13 days into the future. Quite an uncertain forecast!
NWP-based ensemble models have been around for quite some time. AI-based ensemble models are something new, like the field of AI-based weather forecasting itself. One way of generating AI-based ensembles is by using slightly different initial conditions. In fact, recent works investigated multiple ways of generating these different initial conditions (Bülte et al., 2024). To summarize, it is possible to generate AI-based ensembles with different initial conditions by:
- Using Gaussian noise perturbations, which means “[adding] random noise to the [initial conditions] (...) from which the (...) model would be initialized”. This noise is sampled from a Gaussian distribution, in this case with a zero mean and a (tuned) standard deviation based on the variable at hand.
- Using random field perturbations, which means “[using] the scaled difference of two independent, randomly selected atmospheric states from the past as perturbation”. It can be the case that the randomness involved with Gaussian noise does not capture the underlying dynamics of the weather very well, while other atmospheric states from the past do. By selecting these relatively randomly, then adding their scaled difference to the initial conditions, one still gets an ensemble of initial conditions – and thus various forecasts.
- Using initial conditions of the ECMWF ensemble model, which is as simple as taking the initial conditions from the ECMWF ensemble model run and using them to generate the AI-based weather forecast. Because these initial conditions are perturbed by the NWP model, they provide the set of forecasts needed to estimate uncertainty.
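The first two perturbation strategies above can be sketched in a few lines. This is an illustrative toy version (the fields, scales, and standard deviations are made up, not taken from Bülte et al.):

```python
import numpy as np

rng = np.random.default_rng(0)

def gaussian_perturbations(analysis, n_members, std):
    """Gaussian-noise perturbations: add zero-mean noise with a (tuned)
    per-variable standard deviation to the analysis."""
    noise = rng.normal(0.0, std, size=(n_members,) + analysis.shape)
    return analysis + noise

def random_field_perturbations(analysis, past_states, n_members, scale=0.1):
    """Random-field perturbations: add the scaled difference of two randomly
    selected past atmospheric states to the analysis."""
    members = []
    for _ in range(n_members):
        i, j = rng.choice(len(past_states), size=2, replace=False)
        members.append(analysis + scale * (past_states[i] - past_states[j]))
    return np.stack(members)

# Toy 2D "analysis" field plus an archive of past states.
analysis = np.full((4, 4), 280.0)  # e.g. temperature in Kelvin
archive = 280.0 + rng.normal(0.0, 2.0, size=(20, 4, 4))

ens_a = gaussian_perturbations(analysis, n_members=5, std=0.5)
ens_b = random_field_perturbations(analysis, archive, n_members=5)
print(ens_a.shape, ens_b.shape)  # (5, 4, 4) (5, 4, 4)
```

Each of the five perturbed states would then be fed to the AI-based model as a separate initial condition, yielding five forecasts.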
Post-hoc adaptation of ensemble forecasts
Rather than generating ensemble forecasts with slightly different initial conditions, it is also possible to generate them via postprocessing. These techniques use the deterministic forecast generated by a weather model – in our case, an AI-based weather model – and generate an ensemble out of it (Bülte et al., 2024).
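One simple post-hoc technique – shown here purely as an illustration, and not necessarily the method used by Bülte et al. – is to “dress” the deterministic forecast with resampled historical forecast errors:

```python
import numpy as np

rng = np.random.default_rng(1)

def dress_forecast(deterministic, past_errors, n_members):
    """Turn a single deterministic forecast into an ensemble by adding
    resampled historical forecast errors ("ensemble dressing")."""
    sampled = rng.choice(past_errors, size=n_members, replace=True)
    return deterministic + sampled

# Hypothetical archive of past forecast-minus-observation errors (degrees C).
past_errors = rng.normal(0.0, 1.2, size=500)

# Dress a single 14.5-degree deterministic forecast into 50 members.
ensemble = dress_forecast(14.5, past_errors, n_members=50)
print(ensemble.mean(), ensemble.std())
```

The spread of the resulting ensemble reflects how wrong the deterministic model has historically been, which is exactly the uncertainty a single value cannot communicate.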
Using generative AI for ensemble modelling
Another interesting approach for generating AI-based ensembles is to use generative AI. As we’ve seen in our article on MLWP technology, techniques used in tools like ChatGPT can also be used for weather forecasting. Specifically, a class of models called diffusion models is very interesting for this purpose. Effectively image reconstruction models, they are trained by first destroying the input data with cumulative noise, then teaching the model how to reconstruct it. In recent works, this has been used for generating weather predictions and for making forecasts sharper.
Figure 4: The learned diffusion decoder: give it noise, get an image. Image from Ho et al. (2020).
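The “destroying with cumulative noise” step has a convenient closed form in Ho et al. (2020): the noised sample at step t is a weighted mix of the clean input and Gaussian noise. A minimal sketch on a toy field (the field and noise levels are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(2)

def forward_diffuse(x0, alpha_bar_t):
    """Closed-form forward noising q(x_t | x_0) from Ho et al. (2020):
    x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * noise."""
    noise = rng.normal(size=x0.shape)
    return np.sqrt(alpha_bar_t) * x0 + np.sqrt(1.0 - alpha_bar_t) * noise

x0 = np.ones((8, 8))  # toy "weather field"
nearly_clean = forward_diffuse(x0, alpha_bar_t=0.99)  # little noise added
nearly_noise = forward_diffuse(x0, alpha_bar_t=0.01)  # almost pure noise
print(nearly_clean.std(), nearly_noise.std())
```

Training teaches the model to undo this process step by step; sampling then runs it in reverse, starting from pure noise.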
What’s interesting about diffusion models is that they can be conditioned. In other words, in the reconstruction phase, it is possible to provide extra information that the model can use. In the case of GenCast, an AI-based ensemble weather model developed by Google DeepMind, predicting the state t hours ahead of some time step T (thus the state at T + t) follows this process:
- For the time step, noise is sampled and converted into the residual by the diffusion model. In this case, the residual is what must be added to the previous time step T to arrive at the forecast for T + t.
- In each reconstruction step, the previous time step T is used as conditioning for generating the residual value of the forecast with respect to the previous time step.
- Finally, the residual is added to the previous time step to generate the forecast.
- The process starts again at step 1, but now for time step T + 2t, with the forecast for T + t as conditioning. By repeating this until some forecast horizon, an AI-based forecast is generated. By doing this multiple times, e.g. by changing input conditions, an AI-based ensemble weather forecast is generated.
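The autoregressive loop above can be sketched as follows. Note that `sample_residual` is a placeholder: in GenCast it would be the learned conditional diffusion sampler, while here it is faked with conditioned random noise purely to show the rollout structure:

```python
import numpy as np

rng = np.random.default_rng(3)

def sample_residual(previous_state, rng):
    """Placeholder for the conditional diffusion sampler: in GenCast, noise is
    denoised into a residual, conditioned on the previous state. Here we fake
    it with state-dependent random noise for illustration."""
    noise = rng.normal(0.0, 0.3, size=previous_state.shape)
    return 0.1 * np.sin(previous_state) + noise

def rollout(analysis, n_steps, n_members):
    """Autoregressive ensemble rollout: each member repeatedly adds a sampled
    residual to the previous state until the forecast horizon is reached."""
    members = []
    for _ in range(n_members):
        state, trajectory = analysis.copy(), []
        for _ in range(n_steps):
            state = state + sample_residual(state, rng)  # T + t, T + 2t, ...
            trajectory.append(state.copy())
        members.append(np.stack(trajectory))
    return np.stack(members)  # shape: (members, steps, *field shape)

forecast = rollout(np.zeros((4, 4)), n_steps=10, n_members=8)
print(forecast.shape)  # (8, 10, 4, 4)
```

Because every member draws different residual samples, the members naturally diverge as the rollout proceeds, giving the growing spread we expect from an ensemble.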
Figure 5: The GenCast AI-based ensemble weather model. Image from Price et al. (2023).
GenCast is not the only model using generative AI techniques. For example, the SEEDS weather model – also developed by Google – uses a type of diffusion model for generating ensemble forecasts as well (Li et al., 2024).
Hybrid AI/physics ensemble models
Many of today’s AI-based weather models are entirely data driven, meaning that no physics is involved in generating the weather forecast. Rather, physical consistency is embedded in the input and output data the model uses during training. A valid question, despite the large volume of data such models are trained on, is whether AI-based weather models are consistent with the laws of physics. That question paves the way for hybrids between AI and physics-based models, of which NeuralGCM is an interesting example (Kochkov et al., 2024). Combining an atmospheric solver with machine learning approaches, the model can generate forecasts that are competitive with other NWP- and AI-based weather models for up to 15 days. Interestingly, it does so at a fraction of the computational cost required for NWP-based weather models.
What’s next: AI-based limited-area modelling
Today’s AI-based weather models are global. Can we also extend AI-based weather forecasting to regional models with higher resolution? The answer is yes, as the first attempts to do this have emerged in research communities. In the next article, we will look at AI-based limited-area modelling.
This is followed by a recap article, which ends the series on AI and weather forecasting. We provide an overview of the many developments we have seen so far and use that broad understanding to outline directions where future developments can be expected.
References
Bülte, C., Horat, N., Quinting, J., & Lerch, S. (2024). Uncertainty quantification for data-driven weather models. arXiv preprint arXiv:2403.13458.
Ho, J., Jain, A., & Abbeel, P. (2020). Denoising diffusion probabilistic models. Advances in neural information processing systems, 33, 6840-6851.
Kochkov, D., Yuval, J., Langmore, I., Norgaard, P., Smith, J., Mooers, G., ... & Hoyer, S. (2024). Neural general circulation models for weather and climate. Nature, 1-7.
Li, L., Carver, R., Lopez-Gomez, I., Sha, F., & Anderson, J. (2024). Generative emulation of weather forecast ensembles with diffusion models. Science Advances, 10(13), eadk4489.
Price, I., Sanchez-Gonzalez, A., Alet, F., Ewalds, T., El-Kadi, A., Stott, J., ... & Willson, M. (2023). GenCast: Diffusion-based ensemble forecasting for medium-range weather. arXiv preprint arXiv:2312.15796.