AI and weather forecasting: all developments in a nutshell

With advances in AI-based weather modelling, the adoption of AI technology in the field has increased steadily over the last couple of years. In fact, some AI weather models are now running operationally, producing forecasts that can even be competitive with more traditional numerical weather models.

In Infoplaza’s series on AI and weather forecasting, we provided in-depth articles on recent developments in this space. In this final article, we summarize all developments in a nutshell. This allows you to grasp the developments in one go and to dive into the referenced specifics where you are interested. Firstly, we’ll focus on the differences between numerical weather prediction (NWP) and AI-based or machine learning-based weather prediction (MLWP). We then introduce the first few generations of AI-based weather models, as well as their strengths and weaknesses. This covers the initial developments towards AI-based weather models.

Typically, once initial successes and areas for improvement have been identified, a large body of research follows to broaden and deepen the initial results. This is also the case for AI-based weather modelling, and what follows are developments in ensemble forecasting, data assimilation, reducing smoothing in forecasts and regional modelling. Finally, we look at the way forward. Building on all these developments, future projects are working towards an ambitious goal: the development of foundation models of the Earth System.

Differences between NWP and AI-based weather forecasting

Today’s weather forecasts are typically generated with numerical weather models. This paradigm, also called NWP, involves complex models that step an initial state, called the analysis, forward in time using programmed atmospheric physics. The physics equations that must be solved are computationally intensive, requiring large clusters of powerful computers to be deployed in NWP centers across the world. Typically, only a limited number of model runs can be generated every day: for example, once per hour for smaller models, or four times per day for global ones. NWP models are mostly processor intensive and have thus benefited greatly from advances in computing power in the last 15 to 20 years. However, since these advances are slowing down at the technological level, these models are reaching their upper performance limits, too.

AI-based weather models, on the other hand, work differently. Firstly, their underlying principles differ: they learn from vast numbers of examples (40 years of weather situations is not uncommon) instead of explicitly computing the laws of physics. This also requires different technology. Rather than software built specifically for regular processors (CPUs), they use neural network architectures, including the family of models that also underlies ChatGPT, which are designed to run fast on graphics processors (GPUs). Because this hardware is optimized for parallelism, and because only relatively simple computations are needed to generate a new forecast once training is complete, making new forecasts with AI-based weather models is extremely fast compared to NWP approaches. Only training the AI model takes significant resources, and even that is less than what is spent on running an NWP model over its entire lifetime.

Figure 1: training an AI-based weather model involves large amounts of data. For training, typically, the state of the atmosphere at some time T is passed through the AI (also called ML) model. The model is expected to produce the state of the atmosphere at time T + h, where h is typically 3, 6, 12 or 24 hours forward in time. Differences between the model output and the expected output lead to a loss value, which indicates how poorly the model performs. This loss value can be propagated backwards through the model (backpropagation), allowing it to improve slightly. By doing this many times over, it is possible to arrive at a model that can perform weather forecasting.
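The training loop described in Figure 1 can be sketched in a few lines of Python. This is a toy illustration under strong assumptions, not how any operational model is trained: the "atmosphere" is a made-up vector of four numbers, the true dynamics are a hypothetical fixed linear map, and the model is a single weight matrix trained with plain gradient descent. The names train and true_map are ours.

```python
import numpy as np

def train(n_steps=500, lr=0.1, seed=0):
    """Fit a linear 'model' that maps the state at T to the state at T + h."""
    rng = np.random.default_rng(seed)
    n = 4
    # Hypothetical stand-in for the real atmosphere: a fixed linear map.
    true_map = 0.9 * np.eye(n) + 0.1 * np.roll(np.eye(n), 1, axis=1)
    weights = np.zeros((n, n))               # the model starts out knowing nothing
    losses = []
    for _ in range(n_steps):
        x = rng.normal(size=(32, n))         # batch of states at time T
        y = x @ true_map.T                   # expected states at time T + h
        pred = x @ weights.T                 # model output
        err = pred - y
        losses.append(np.mean(err ** 2))     # loss: how poorly the model performs
        grad = 2.0 * err.T @ x / err.size    # gradient of the loss w.r.t. the weights
        weights -= lr * grad                 # small improvement step
    return weights, true_map, losses

weights, true_map, losses = train()
print(f"first loss: {losses[0]:.3f}, last loss: {losses[-1]:.2e}")
```

Real models replace the single matrix with a deep neural network and the random vectors with decades of historical weather data, but the loop (predict, compare, adjust) has the same shape.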

The first few generations of AI-based weather models

By now, several generations of AI-based weather models have been released, and some are even used operationally. The first models that produced operationally useful results were Pangu-Weather, GraphCast and FourCastNet. Interestingly, they use different underlying technologies, but a similar paradigm for training (as illustrated above). Pangu-Weather uses a Transformer architecture adapted to the Earth, which is like those used for large language models such as the one underlying ChatGPT. In effect, it treats the state of the atmosphere as a set of images, cuts them into pieces, lets the model compute the pieces of the next time step, and then recombines those. The process is then repeated to compute another time step, and another one, and so forth. It is thus close to a computer vision approach.
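The cut, compute, recombine and repeat cycle can be sketched as follows. This is a minimal illustration, not Pangu-Weather's code: the function names are ours, the field is a single small 2D array, and the hypothetical step_fn stands in for the trained network (in a real Transformer the patches also exchange information with each other).

```python
import numpy as np

def to_patches(field, p):
    """Cut a 2D field of shape (H, W) into patches of shape (H/p, W/p, p, p)."""
    h, w = field.shape
    return field.reshape(h // p, p, w // p, p).swapaxes(1, 2)

def from_patches(patches):
    """Recombine patches into the full 2D field."""
    nh, nw, p, _ = patches.shape
    return patches.swapaxes(1, 2).reshape(nh * p, nw * p)

def rollout(field, step_fn, n_steps, p=4):
    """Repeatedly feed the model its own output to reach longer lead times."""
    states = [field]
    for _ in range(n_steps):
        patches = to_patches(states[-1], p)
        states.append(from_patches(step_fn(patches)))
    return states

field = np.arange(64.0).reshape(8, 8)
# Placeholder "model": simply damps every patch by half per time step.
states = rollout(field, lambda patches: 0.5 * patches, n_steps=2)
print(len(states), states[2][0, 1])
```

Because every forecast step consumes the previous output, small errors can accumulate over the rollout, which is one reason long lead times are harder.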

GraphCast, on the other hand, works by representing the atmospheric grids as mathematical graphs. This borrows from how NWP models structure the atmosphere: by means of grids with grid points. Grids are needed because the atmosphere, as a fluid, is continuous, and a continuous medium cannot be represented exactly in a numerical model. Put simply, representing the whole atmosphere would require a model with an infinite number of points, making it impossible to produce a forecast in finite time. Hence, the atmosphere is divided into a grid, which is a collection of grid points structured in a logical way in both the horizontal and vertical dimensions. The distance between neighbouring points, typically a number of kilometers, is what we call the model resolution. These grids also tend to have most grid points near the surface and fewer in the upper atmosphere. Now, if you add connections (also known as edges) between these points in the horizontal and vertical dimensions, you construct what is known in mathematics as a graph. GraphCast uses this concept to represent weather states and uses a technology named graph neural networks to transform the input into a weather state some hours forward in time. Like Pangu-Weather, it uses that state to compute the next time step, and so forth. Because graph neural networks involve message passing, contrary to image-based learning, one could say that GraphCast effectively learns to send messages that weather is coming your way.
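To make the graph idea concrete, here is a minimal sketch that turns a small horizontal grid into a graph and performs one round of message passing. It is an illustration under our own assumptions, not GraphCast's implementation: real graph neural networks learn what message to send along each edge and use a more elaborate mesh, whereas this sketch simply averages each node with its neighbours. The function names are ours.

```python
import numpy as np

def grid_edges(n_rows, n_cols):
    """Connect every grid point to its right and lower neighbour."""
    edges = []
    for r in range(n_rows):
        for c in range(n_cols):
            node = r * n_cols + c
            if c + 1 < n_cols:
                edges.append((node, node + 1))
            if r + 1 < n_rows:
                edges.append((node, node + n_cols))
    return edges

def message_passing_step(values, edges):
    """Each node receives its neighbours' values and averages them with its own."""
    total = values.astype(float).copy()
    count = np.ones_like(total)
    for a, b in edges:                      # edges are undirected: send both ways
        total[a] += values[b]
        count[a] += 1
        total[b] += values[a]
        count[b] += 1
    return total / count

edges = grid_edges(3, 3)
values = np.zeros(9)
values[0] = 9.0                             # a "weather signal" in one corner
step1 = message_passing_step(values, edges)
step2 = message_passing_step(step1, edges)
print(step1[4], step2[4])                   # centre node: untouched, then reached
```

After one round the signal has only reached the direct neighbours of the corner; after two rounds it arrives at the centre of the grid. Information travels one edge per round, which is the sense in which the model "sends messages" about approaching weather.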

Finally, the technology behind FourCastNet involves learning neural operators. Instead of the model learning the mapping between one gridded input state and one gridded output state, the goal is to learn the underlying function (the operator) that maps between these states. This brings us closer to physical consistency when using AI for weather modelling: if the training data was generated with physically consistent functions, the AI model attempts to learn these functions implicitly by means of neural operators. The first version of FourCastNet uses the Fourier Neural Operator (FNO), which uses the Fourier Transform to extract patterns from the input data. A subsequent version uses a spherical harmonics approach, where the neural operator is adapted to better handle the fact that Earth is a sphere.
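The core building block of a Fourier-based operator can be sketched as: transform the input to frequency space, multiply a limited number of low-frequency modes by weights, and transform back. The sketch below is a simplified one-dimensional illustration, not FourCastNet's code: the weights are fixed to 1 rather than learned, there are no nonlinearities, and the name fourier_layer is ours.

```python
import numpy as np

def fourier_layer(signal, weights):
    """Apply per-mode weights to the lowest Fourier modes and drop the rest."""
    coeffs = np.fft.rfft(signal)                 # to frequency space
    n_modes = len(weights)
    out = np.zeros_like(coeffs)
    out[:n_modes] = coeffs[:n_modes] * weights   # in a real model these are learned
    return np.fft.irfft(out, n=len(signal))      # back to grid space

weights = np.ones(4)                             # keep modes 0 to 3 unchanged
for n_points in (128, 256):                      # same weights, two resolutions
    x = np.arange(n_points) / n_points
    large_scale = np.sin(2 * np.pi * x)
    small_scale = 0.5 * np.sin(2 * np.pi * 20 * x)
    out = fourier_layer(large_scale + small_scale, weights)
    print(n_points, np.allclose(out, large_scale))
```

Because the weights act on wave numbers rather than on individual grid points, the same layer can be applied at different resolutions, which hints at why this approach is described as learning a function rather than a fixed grid-to-grid mapping.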

Typically, after a breakthrough in AI technology, many follow-up research projects are undertaken to try and resolve the limitations of the then state of the art. The same can be observed with AI weather models. A series of newer-generation models has appeared, such as FengWu, FuXi, AIFS and others. They mostly use variants of the approaches mentioned above but focus on a few optimizations to try and make forecasts better.

Figure 2: three grids nested into each other. Because computational constraints prevent numerical weather models from modelling the atmosphere continuously, they discretize the atmosphere into a grid with grid points that are a few or more kilometers apart. Global models often have low-resolution grids, whereas local grids typically have higher resolution.

Strengths and weaknesses of AI weather models

Creating a weather model is only one part of the process. To understand how well it performs, it must be verified. Preferably, this is done in the same way as other weather models are verified, to avoid comparing apples and oranges. For this reason, WeatherBench was introduced by the AI weather modelling community. It provides testing data, a testing method and various tests to ensure that models are compared in the same way. What’s more, it gives us insights into the performance of AI-based weather models. A few of them:

Claims that AI weather models outperform NWP models typically rely on verification results for upper-level atmospheric variables, such as the geopotential at 500 hPa. These variables are relevant for synoptic, i.e. large-scale, weather.
Some AI-based weather models also report improved accuracy of predicted storm tracks, particularly for tropical systems. If this boost in performance can be exploited structurally, it can potentially lead to considerable cost savings and, in the end, the preservation of human life.
Modelling weather near the surface is more complex, for example due to interactions of the wind with the Earth’s surface and buildings. What’s more, these are typically small-scale phenomena. AI weather models, as a consequence of how they are trained, typically suffer from smoothing effects and thus struggle with these phenomena. Hence, while AI weather models can be competitive with NWP models on these variables, in many cases they are still a bit worse.
There are significant differences between the performance of AI weather models throughout the year and throughout the world. In WeatherBench, these can be analyzed via temporal scores and bias maps.
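To give a feel for what such a verification score looks like, the sketch below computes a latitude-weighted root mean squared error (RMSE), a metric commonly used in this kind of benchmark. It is our own minimal version, not WeatherBench's code, and the function name and the tiny example grid are made up. The weighting is needed because on a regular latitude-longitude grid, grid boxes near the poles cover far less area than boxes at the equator.

```python
import numpy as np

def lat_weighted_rmse(forecast, truth, lats_deg):
    """RMSE over a (lat, lon) grid, weighting each row by the cosine of its latitude."""
    weights = np.cos(np.deg2rad(lats_deg))
    weights = weights / weights.mean()            # weights average to 1
    sq_err = (forecast - truth) ** 2              # shape (n_lat, n_lon)
    return float(np.sqrt(np.mean(weights[:, None] * sq_err)))

lats = np.array([-80.0, 0.0, 80.0])
truth = np.zeros((3, 4))

uniform = truth + 2.0                             # off by 2 everywhere
polar = truth.copy()
polar[2, :] = 2.0                                 # off by 2 near the pole only
equator = truth.copy()
equator[1, :] = 2.0                               # off by 2 at the equator only

print(lat_weighted_rmse(uniform, truth, lats))
print(lat_weighted_rmse(polar, truth, lats) < lat_weighted_rmse(equator, truth, lats))
```

The same error counts for less near the pole than at the equator, which keeps the many small polar grid boxes from dominating the score.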

Areas of active research: ensemble forecasting, data assimilation, smoothing and regional modelling

Since the introduction of the mentioned AI weather models, an area of active research has emerged that attempts to integrate these models with many parts of the weather forecasting chain. In earlier articles, we highlighted many of these developments.

For example, let’s consider the initial conditions or analyses that we highlighted before. An analysis used for a forecast starting at time step T is a previous forecast of the same model for T, adapted to best reflect the observations gathered since that forecast was generated. Effectively, the forecast is adjusted to best fit the whole set of observations in a process called data assimilation. This does not give a perfect representation of current weather, but a best guess that weather models can work with. However, like generating weather forecasts, creating analyses is a computationally intensive process. Interestingly, we are now in the early days of performing data assimilation with AI models. This has the potential to significantly speed up the analysis generation process, allowing AI weather models to benefit from new observations continuously instead of only a few times a day (when they are run with analyses generated by NWP models).
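The principle of blending a previous forecast with an observation can be shown with a single number. The sketch below is the textbook one-variable update (weight each source by how much we trust it), with made-up values and our own function name. Operational data assimilation applies the same idea to millions of variables at once with far more sophisticated methods.

```python
def assimilate(background, obs, background_var, obs_var):
    """Blend a previous forecast (background) with an observation.

    The variances express how uncertain each source is; the less
    uncertain one gets more weight in the resulting analysis.
    """
    gain = background_var / (background_var + obs_var)
    analysis = background + gain * (obs - background)
    analysis_var = (1.0 - gain) * background_var
    return analysis, analysis_var

# Forecast said 10.0 degrees, a thermometer reads 12.0, both equally trusted.
print(assimilate(10.0, 12.0, background_var=1.0, obs_var=1.0))   # (11.0, 0.5)
# Same situation, but with a much less reliable thermometer.
print(assimilate(10.0, 12.0, background_var=1.0, obs_var=3.0))   # (10.5, 0.75)
```

Note that the analysis is more certain than either input (its variance is smaller), but never perfectly certain, which is why the text calls it a best guess.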

In another example, we need to look at the uncertainty created by the fact that analyses are imperfect. Put very simply: if you start with conditions that are slightly different from reality, the forecast will be slightly different from reality. While this does not lead to significant differences in the short term, the problem increases with longer time horizons. What’s more, model representation (e.g., a grid with boxes of many by many kilometers) adds extra uncertainty. Especially in the case of extreme weather, such uncertainty is undesired. For this reason, ensemble forecasts are created by running the model many times (e.g., 50 times) with slightly different initial conditions. Typically, these are obtained by slightly perturbing the analysis. However, because AI models start from the same imperfect initial conditions and are never perfect themselves, uncertainty is a problem for them, too. Active research is undertaken in the field of AI-based ensemble modelling, which we discussed in a dedicated article.
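The growth of small initial differences, and the way an ensemble exposes it, can be demonstrated with a toy system. The sketch below uses the Lorenz-63 equations, a classic three-variable chaotic system that we swap in purely for illustration; it is not an atmospheric model, and the step size, noise level and function names are our own choices.

```python
import numpy as np

def lorenz_step(state, dt=0.01, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Advance all ensemble members by one simple (Euler) time step."""
    x, y, z = state[..., 0], state[..., 1], state[..., 2]
    deriv = np.stack([sigma * (y - x), x * (rho - z) - y, x * y - beta * z], axis=-1)
    return state + dt * deriv

def run_ensemble(analysis, n_members=50, noise=1e-3, n_steps=1500, seed=0):
    """Run the model from many slightly perturbed copies of the analysis."""
    rng = np.random.default_rng(seed)
    members = analysis + noise * rng.normal(size=(n_members, 3))
    spread = []
    for _ in range(n_steps):
        members = lorenz_step(members)
        spread.append(members.std(axis=0).mean())   # disagreement between members
    return members, np.array(spread)

members, spread = run_ensemble(np.array([1.0, 1.0, 1.0]))
print(f"spread after 1 step: {spread[0]:.4f}, after 1500 steps: {spread[-1]:.2f}")
```

The members start out almost identical and end up far apart. In a real ensemble, that spread is the useful product: where members agree, the forecast is trustworthy; where they diverge, it is not.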

Another active research area focuses on making sharper forecasts, or, in other words, reducing the smoothing effects that were introduced above. A variety of approaches is available, including further training of existing AI weather models, using separate models for different time horizons, using adapters for unique time steps and using diffusion models like the ones used in AI image generators. Initial results show promising developments, demonstrating that significantly sharper AI-based forecasts are possible.
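One common explanation for the smoothing effect is that a model trained to minimize the average squared error is rewarded for hedging. The sketch below illustrates this with made-up numbers and our own function names: a sharp feature (say, a rain band) could end up at any of 21 positions, and the forecast with the lowest expected squared error turns out to be the blurry average of all of them rather than any single sharp scenario.

```python
import numpy as np

def bump(center, x, width=2.0):
    """A sharp feature, e.g. a narrow band of rain, centred at `center`."""
    return np.exp(-0.5 * ((x - center) / width) ** 2)

def expected_mse(forecast, scenarios):
    """Average squared error of one forecast over all possible outcomes."""
    return float(np.mean((scenarios - forecast) ** 2))

x = np.arange(100.0)
centers = np.linspace(40.0, 60.0, 21)              # equally likely positions
scenarios = np.stack([bump(c, x) for c in centers])

blurry = scenarios.mean(axis=0)                    # average of all outcomes
sharp = scenarios[10]                              # commit to the middle position

print(f"peak value   sharp: {sharp.max():.2f}  blurry: {blurry.max():.2f}")
print(f"expected MSE sharp: {expected_mse(sharp, scenarios):.4f}  "
      f"blurry: {expected_mse(blurry, scenarios):.4f}")
```

The blurry forecast scores better even though its peak is only a fraction of the true intensity. Approaches such as diffusion models aim to produce individual sharp, plausible scenarios instead of this average.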

Another recently successful research area is the generation of regional weather forecasts with AI-based weather models. The existing generation of AI weather models is global and was trained with global weather data. They typically generate forecasts at 0.25-degree resolution, which corresponds to grid boxes of approximately 25 by 25 to 30 by 30 kilometers. In simple terms: that’s one value for temperature for a 25 by 25-kilometer box! Higher-resolution NWP models are available that produce forecasts on grids of a few by a few kilometers, but for smaller domains. Recently, for the Nordics, a regional AI-based weather model was created by adapting a global model, adding a regional grid while continuing training. Only limited additional data was required to let the model generate weather forecasts at 2.5 by 2.5 kilometer resolution for that part of the world.
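The conversion from degrees to kilometers is simple arithmetic, sketched below. The assumptions are ours: a spherical Earth with a mean radius of 6371 km and a regular latitude-longitude grid that includes both poles. The function names are made up for this example.

```python
import math

EARTH_RADIUS_KM = 6371.0   # mean radius, assuming a spherical Earth

def grid_spacing_km(resolution_deg, lat_deg):
    """North-south and east-west size of one grid box at a given latitude."""
    km_per_deg = 2 * math.pi * EARTH_RADIUS_KM / 360.0
    north_south = resolution_deg * km_per_deg
    east_west = north_south * math.cos(math.radians(lat_deg))
    return north_south, east_west

def n_grid_points(resolution_deg):
    """Points in a regular global lat-lon grid that includes both poles."""
    n_lat = round(180 / resolution_deg) + 1
    n_lon = round(360 / resolution_deg)
    return n_lat * n_lon

print(grid_spacing_km(0.25, 0.0))    # about 27.8 km in both directions at the equator
print(grid_spacing_km(0.25, 60.0))   # east-west shrinks to about 13.9 km at 60 degrees
print(n_grid_points(0.25))           # 721 * 1440 = 1038240 points per level
```

This also shows why a resolution in degrees does not translate into one fixed number of kilometers: the east-west size of a box shrinks towards the poles.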

Figure 3: Left: global output of the GFS weather model at 12-kilometer resolution. Right: regional output of the KNMI Harmonie weather model at 2-kilometer resolution. Source: I’m Weather

The way forward: towards foundation models of the Earth System

As you've read, the last few years have introduced many AI-based developments to the field of weather forecasting. The pace of development keeps increasing, now that both the private and public sectors are investing significant amounts of money in research. So what is the way forward? This is still an open question, but some direction is already becoming visible.

One potential answer to this question can be found in the development of foundation models. Those models are trained on broad data and are not task-oriented, unlike the current generation of models, which are tailored to weather forecasting. Rather, the goal of training them with a large variety of datasets is to build models with a broad ‘understanding’ of atmospheric dynamics that can subsequently be fine-tuned to specific tasks.

Large language models like the ones underlying ChatGPT are a good example of foundation models. In the early days of language modelling, models were typically task-oriented, for example by focusing on text summarization or sentiment analysis. Large language models, however, are trained on large amounts of text without focusing on a specific task. This allows them to grasp language in ways that often feel natural to users, effectively creating the illusion of technology having general language capabilities.

One of the key targets in weather forecasting for the next few years is the development of foundation models for Earth System modelling (which includes weather forecasting). In other words, that would be the development of one model that ‘understands’ the Earth system. The Aurora model, for example, is one of the first foundation models of the atmosphere. It was first trained on a large amount of varying atmospheric datasets. Only then was it fine-tuned for weather forecasting. Similarly, the WeatherGenerator project undertaken in Europe is a highly ambitious project aiming to develop a foundation model as well. It is driven by public meteorological institutions that work together with the private sector. If the project is successful, a foundation model like WeatherGenerator can be fine-tuned to a large variety of weather-related tasks, including but not limited to weather forecasting, decadal forecasting, data assimilation, nowcasting, predicting impacts of extreme weather events, and so forth.

At Infoplaza, we keep monitoring developments in AI-based weather forecasting closely. Once such models add value for our clients, we will add them to our forecasting chain as quickly as possible. In addition, as significant new results appear, we will continue our AI in weather forecasting series on a per-theme basis. In other words, stay tuned, and feel free to reach out via the contact page if you would like to know more!

Christian Versloot

Stay up to date:guiding you to the decision point

Stay up-to-date

Infoplaza now SEQUAL certified

The evolution of weather forecasting: From ancient signs to AI-driven predictions

Stay up to date:
guiding you to the decision point