When you look at a weather forecast, you typically see one prediction for the time you’re interested in: a temperature of 12.3 degrees at 1 PM, wind force 9 at 8 o’clock, and so forth. This suggests that the forecast is highly certain. However, a complex world hides behind that single value, which is produced by weather models that may contain errors. Furthermore, certain weather events – such as storm systems – introduce additional uncertainty into weather forecasts. In other words: can we trust that single value?
Sometimes, the answer is no. In those cases, providing the best forecast to our customers does not mean stating a single value, but rather communicating the uncertainties: the range of possibilities for these weather elements, typically generated by ensemble models.
In this blog in the AI in weather forecasting series, we dive into these models. First, we’ll investigate why we need ensemble models in the first place – regardless of whether we’re using physics-based numerical models or AI-based models. We then cover a range of approaches used to generate AI-based ensemble forecasts, highlighting developments in that area.
If we recall how weather models work, we find that each forecast starts with an initial state, known as an analysis. For a numerical weather model, the analysis is typically a previous forecast for the analysis time, mathematically adjusted to match the observations measured since that forecast was issued. Many of today’s AI-based weather models still rely on NWP-based analyses, although developments in AI-based data assimilation are on the horizon. The way the analysis is generated already reveals a problem: being an adjusted forecast, it is always a bit off with respect to actual weather conditions. This is inevitable, because we cannot take measurements at every location and at every altitude, but it does introduce uncertainty into the weather forecast. One form of uncertainty is therefore:

> Uncertainty in the initial conditions: the analysis never matches the actual state of the atmosphere exactly.
Now let’s look at the actual forecasting done by weather models: predicting future weather from current conditions. Both NWP- and AI-based weather models are computer programs. In NWP models, the physics has been programmed into the model, while smaller-scale phenomena are resolved by parameterizations. To reduce computational cost, weather predictions are discretized: the model predicts on a grid, with grid points that are typically between 1 and 25 kilometers apart. In other words, for many models, all temperatures in a 1x1 kilometer box (and in some cases, an even larger box) are summarized into one value per time step. Since AI-based models are mostly trained on NWP analyses, this applies to them too. And even worse – they use learned parameters. What if they perform well in, say, 98% of the cases, but not in the other 2%? Another source of uncertainty is:

> Uncertainty in the model itself: discretization, parameterizations, and learned parameters all approximate reality.
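To make the effect of discretization concrete, here is a small sketch – using a made-up temperature field, not real model output – of how many fine-scale values collapse into a single grid-point value:

```python
import numpy as np

rng = np.random.default_rng(42)

# Toy temperature field at 1 km resolution over a 100 x 100 km domain.
fine = 15.0 + rng.normal(scale=2.0, size=(100, 100))  # degrees Celsius

# Discretize to a 25 km grid: every 25 x 25 km box becomes one value.
box = 25
coarse = fine.reshape(100 // box, box, 100 // box, box).mean(axis=(1, 3))

# The whole domain is now just 4 x 4 grid points; the sub-grid
# variability inside each box has been averaged away.
```

Every value in `coarse` stands in for 625 underlying values – whatever happens between grid points is invisible to the model.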
Not only models and their initialization produce uncertainty; the weather itself can do so too! For complex and unpredictable weather events, such as thunderstorms or winter weather, even a small difference in conditions can make the difference between night and day. This leads us to the third source of uncertainty in a deterministic weather forecast:

> Uncertainty inherent to the weather itself: some situations are intrinsically hard to predict.
Let’s now introduce ensemble weather models: a forecasting approach that uses multiple simulations to predict weather conditions. Rather than relying on a single forecast, an ensemble model generates a set of predictions, each slightly varied. By producing a range of possible scenarios, ensemble models offer a more comprehensive view of potential weather patterns.
Let’s examine how ensemble forecasts are generated. First, we’ll look at using slightly different input data – an approach that is common in NWP-based ensemble models but has also been tested for AI-based ensembles. Then, we’ll explore approaches specifically developed for AI-based weather models, such as post-hoc processing, using generative techniques, and developing AI/NWP hybrids for ensemble forecasting.
Recall that the initial conditions are used to generate the weather forecast, whether the model is an NWP model or an AI model. In other words, if we slightly change the initial conditions in a physically reasonable way, this has a direct impact on the forecast. This is the basis of ensemble models.
For example, in NWP, the ECMWF ensemble model generates 50 individual forecasts with slightly different initial conditions. These forecasts, also known as the members of the ensemble, are then statistically aggregated into values like the ensemble mean and probability ranges derived from percentiles. Verification of ensemble data suggests that the ensemble mean tends to perform well, as it cancels out outliers among the individual forecasts.
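As an illustration, the perturb-and-aggregate idea can be sketched as follows. Note that `toy_forecast` is a made-up stand-in for a real weather model, and all numbers are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(0)

def toy_forecast(initial_temp, hours=48):
    """Stand-in for a real weather model: a drift plus growing error."""
    drift = -0.05 * hours                   # slow cooling trend
    chaos = rng.normal(scale=0.02) * hours  # error grows with lead time
    return initial_temp + drift + chaos

analysis = 12.3  # the single deterministic initial state (deg C)

# 50 members, each started from a slightly perturbed initial state.
members = [toy_forecast(analysis + rng.normal(scale=0.1)) for _ in range(50)]

ensemble_mean = float(np.mean(members))
p10, p90 = np.percentile(members, [10, 90])  # an 80% probability range
```

The mean and the percentile range are exactly the kind of aggregated products that end up in probabilistic forecasts.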
NWP-based ensemble models have been around for quite some time. AI-based ensemble models are new, like the field of AI-based weather forecasting itself. One way of generating AI-based ensembles is, again, to use slightly different initial conditions. In fact, recent work has investigated multiple ways of generating these perturbed initial conditions, for instance by adding small random perturbations to the analysis fields (Bülte et al., 2024).
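The simplest of these approaches – adding small random noise to the analysis and running the model once per perturbed copy – can be sketched as follows. The `model` function below is a placeholder for any trained AI weather model, not a real one:

```python
import numpy as np

def model(state):
    """Placeholder for one forward pass of a trained AI weather model."""
    return state * 0.99 + 0.1  # any deterministic state -> forecast mapping

rng = np.random.default_rng(1)
analysis = np.full((8, 8), 280.0)  # toy temperature field in Kelvin

# One ensemble member per randomly perturbed copy of the analysis.
members = np.stack([
    model(analysis + rng.normal(scale=0.5, size=analysis.shape))
    for _ in range(20)
])

spread = members.std(axis=0)  # per-grid-point ensemble spread
```

Because the deterministic model only runs once per member, the cost of such an AI ensemble scales linearly with the number of members – and a single forward pass is cheap compared to an NWP integration.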
Rather than generating ensemble forecasts from slightly different initial conditions, it is also possible to generate them via postprocessing. These techniques take the deterministic forecast generated by a weather model – in our case, an AI-based weather model – and derive an ensemble from it (Bülte et al., 2024).
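One simple postprocessing recipe – not necessarily the one studied by Bülte et al. – is to “dress” the deterministic forecast with a distribution of historical forecast errors. A minimal sketch, with made-up error statistics:

```python
import numpy as np

rng = np.random.default_rng(7)

# Hypothetical archive of past 48-hour forecast errors (forecast - observed).
past_errors = rng.normal(loc=0.3, scale=1.2, size=500)

deterministic = 12.3  # today's single-valued AI forecast (deg C)

# Dress the forecast: each resampled error yields one pseudo-ensemble member.
members = deterministic - rng.choice(past_errors, size=50)
p10, p90 = np.percentile(members, [10, 90])  # uncertainty band around 12.3
```

The appeal of this family of methods is that the weather model itself runs only once; all the ensemble machinery happens afterwards, on the model output.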
Another interesting approach for generating AI-based ensembles is to use generative AI. As we’ve seen in our article on MLWP technology, techniques used in tools like ChatGPT can also be used for weather forecasting. Specifically, a class of models called diffusion models is very interesting for this purpose. Diffusion models are effectively reconstruction models: during training, the input data is first destroyed by cumulatively adding noise, after which the model learns to reconstruct it (Ho et al., 2020). In recent works, this has been used for generating weather predictions and for making forecasts sharper.
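The noising half of this process is easy to sketch. Following the schedule from Ho et al. (2020), an input x₀ at diffusion step t becomes √ᾱ_t·x₀ + √(1−ᾱ_t)·ε; early steps barely perturb the field, while late steps destroy it almost completely:

```python
import numpy as np

rng = np.random.default_rng(3)

# Linear noise schedule over T diffusion steps (Ho et al., 2020).
T = 1000
betas = np.linspace(1e-4, 0.02, T)
alpha_bar = np.cumprod(1.0 - betas)  # cumulative signal fraction

x0 = rng.normal(size=(16, 16))       # a toy "weather field"

def noised(x, t):
    """Sample x_t ~ q(x_t | x_0): the input after t steps of added noise."""
    eps = rng.normal(size=x.shape)
    return np.sqrt(alpha_bar[t]) * x + np.sqrt(1.0 - alpha_bar[t]) * eps

# Early step: the field is still highly correlated with the original.
corr_early = np.corrcoef(x0.ravel(), noised(x0, 10).ravel())[0, 1]
# Final step: essentially pure noise, correlation near zero.
corr_late = np.corrcoef(x0.ravel(), noised(x0, 999).ravel())[0, 1]
```

Training then amounts to showing the model such noised fields and teaching it to predict the noise that was added – which is what makes the reverse, generative direction possible.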
What’s interesting about diffusion models is that they can be conditioned: during the reconstruction phase, it is possible to provide extra information that the model can use. In the case of GenCast, an AI-based ensemble weather model developed by Google DeepMind (Price et al., 2023), predicting the atmospheric state t hours ahead of some time step T (thus at T + t) works as a conditioned denoising process: the model starts from pure noise and iteratively reconstructs the future state, conditioned on the current and previous atmospheric states. Repeating this with different noise samples yields different ensemble members.
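The reverse, conditioned sampling loop can be sketched in strongly simplified form. The `denoiser` below is a toy stand-in rather than GenCast’s actual network, and the schedule is arbitrary; only the overall structure – start from noise, denoise step by step while conditioning on the current state – reflects the real procedure:

```python
import numpy as np

rng = np.random.default_rng(5)

# Arbitrary short noise schedule (real models use many more steps).
n_steps = 50
betas = np.linspace(1e-4, 0.05, n_steps)
alphas = 1.0 - betas
alpha_bar = np.cumprod(alphas)

def denoiser(x, t, condition):
    """Toy stand-in for a trained network that predicts the noise in x
    at step t, conditioned on the current atmospheric state."""
    return x - condition * np.sqrt(alpha_bar[t])

def sample_member(condition):
    """Draw one ensemble member: start from pure noise and denoise
    step by step (DDPM-style update), conditioned on `condition`."""
    x = rng.normal(size=condition.shape)
    for t in reversed(range(n_steps)):
        eps_hat = denoiser(x, t, condition)
        x = (x - betas[t] / np.sqrt(1.0 - alpha_bar[t]) * eps_hat) / np.sqrt(alphas[t])
        if t > 0:  # no fresh noise at the very last step
            x += np.sqrt(betas[t]) * rng.normal(size=x.shape)
    return x

current_state = np.full((4, 4), 1.0)   # toy "current weather" condition
member = sample_member(current_state)  # one sample; repeat for more members
```

Because each run starts from a different noise sample, calling `sample_member` repeatedly yields distinct but plausible future states – exactly the ensemble behavior we are after.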
GenCast is not the only model using generative AI techniques. For example, the SEEDS weather model – also developed by Google – uses a type of diffusion model for generating ensemble forecasts as well (Li et al., 2024).
Many of today’s AI-based weather models are entirely data-driven, meaning that no physics is involved in generating the weather forecast; physical consistency is only embedded implicitly in the input and output data the model sees during training. A valid question, despite the large volume of data such models are trained on, is whether AI-based weather models are consistent with the laws of physics. That question paves the path for hybrids between AI and physics-based models, of which NeuralGCM is an interesting example (Kochkov et al., 2024). Combining an atmospheric solver with machine learning approaches, the model can generate forecasts that are competitive with other NWP- and AI-based weather models for up to 15 days. Interestingly, it does so at a fraction of the computational cost required for NWP-based weather models.
Today’s AI-based weather models are global. Can we also extend AI-based weather forecasting to regional models with higher resolution? The answer is yes, as the first attempts to do this have emerged in research communities. In the next article, we will look at AI-based limited-area modelling.
The series then concludes with a recap article, in which we provide an overview of the many developments we have seen so far and sketch the directions where future developments can be expected.
Bülte, C., Horat, N., Quinting, J., & Lerch, S. (2024). Uncertainty quantification for data-driven weather models. arXiv preprint arXiv:2403.13458.
Ho, J., Jain, A., & Abbeel, P. (2020). Denoising diffusion probabilistic models. Advances in neural information processing systems, 33, 6840-6851.
Kochkov, D., Yuval, J., Langmore, I., Norgaard, P., Smith, J., Mooers, G., ... & Hoyer, S. (2024). Neural general circulation models for weather and climate. Nature, 1-7.
Li, L., Carver, R., Lopez-Gomez, I., Sha, F., & Anderson, J. (2024). Generative emulation of weather forecast ensembles with diffusion models. Science Advances, 10(13), eadk4489.
Price, I., Sanchez-Gonzalez, A., Alet, F., Ewalds, T., El-Kadi, A., Stott, J., ... & Willson, M. (2023). GenCast: Diffusion-based ensemble forecasting for medium-range weather. arXiv preprint arXiv:2312.15796.