CAMS global atmospheric composition forecasts#

This notebooks provides an introduction to the CAMS global atmospheric composition forecasts and shows you how the variable Dust Aerosol Optical Depth can be used to monitor and predict dust events.

CAMS produces global forecasts for atmospheric composition for the next five days twice a day. The forecasts consist of more than 50 chemical species (e.g. ozone, nitrogen dioxide, carbon dioxide) and seven different types of aerosol (desert dust, sea salt, organic matter, black carbon, sulphate, nitrate and ammonium aerosol).

The initial conditions of each forecast are obtained by combining a previous forecast with current satellite observations through a process called data assimilation. This best estimate of the state of the atmosphere at the initial forecast time step, called the analysis, provides a globally complete and consistent dataset allowing for estimates at locations where observation data coverage is low or for atmospheric pollutants for which no direct observations are available.

The notebook examines the Saharan Dust event which occured over Europe in April 2024.

Basic facts

Spatial resolution: 0.4° x 0.4°
Spatial coverage: Global
Temporal resolution: 1-hourly up to leadtime hour 120
Temporal coverage: since 2015 to present
Data format: GRIB or zipped NetCDF

How to access the data

CAMS global atmospheric composition forecasts are available for download via the Copernicus Atmosphere Data Store (ADS). You will need to create an ADS account here.

Data from the ADS can be downloaded in two ways:


Load required libraries

import xarray as xr

import matplotlib.pyplot as plt
import matplotlib.colors
from matplotlib.cm import get_cmap
from matplotlib import animation
from matplotlib.axes import Axes

import cartopy.crs as ccrs
from cartopy.mpl.gridliner import LONGITUDE_FORMATTER, LATITUDE_FORMATTER
import cartopy.feature as cfeature
from cartopy.mpl.geoaxes import GeoAxes
GeoAxes._pcolormesh_patched = Axes.pcolormesh

from IPython.display import HTML

import warnings
warnings.simplefilter(action = "ignore", category = RuntimeWarning)

Load helper functions

%run ../functions.ipynb

Optional: Retrieve CAMS global atmospheric composition forecasts programmatically#

The CDS Application Program Interface (CDS API) is a Python library which allows you to access data from the ADS programmatically. In order to use the CDS API, follow the steps below:

  • Self-register at the ADS registration page (if you do not have an account yet)

  • Login to the ADS portal and go to the api-how-to-page.

  • Copy the CDS API key displayed in the black terminal window and replace the ###### of the KEY variable below with your individual CDS API key

Note: You find your CDS API key displayed in the black terminal box under the section Install the CDS API key. If you do not see a URL or key appear in the black terminal box, please refresh your browser tab.

URL='https://ads.atmosphere.copernicus.eu/api/v2'
KEY='#########################'

The next step is then to request the data with a so called API request. Via the ADS web interface, you can specificy the characteristics of the CAMS global atmospheric composition forecasts and at the end of the web interface, you can open the ADS request via Show API request. Copy the request displayed there in the cell below. Once you execute the cell, the download of the data starts automatically.

In the web interface we specified the following characteristics:

  • Variable: Dust aerosol optical depth at 550 nm

  • Date: 2024-04-07 to 2024-04-07

  • time: 12:00 (Start of the model run)

  • Lead time hour: every 3 hours up to forecast hour 90

  • Format: netcdf_zip

import cdsapi

c = cdsapi.Client(url=URL, key=KEY)

c.retrieve(
    'cams-global-atmospheric-composition-forecasts',
    {
        'variable': 'dust_aerosol_optical_depth_550nm',
        'date': '2024-04-07/2024-04-07',
        'time': '12:00',
        'leadtime_hour': [
            '0', '12', '15',
            '18', '21', '24',
            '27', '3', '30',
            '33', '36', '39',
            '42', '45', '48',
            '51', '54', '57',
            '6', '60', '63',
            '66', '69', '72',
            '75', '78', '81',
            '84', '87', '9',
            '90',
        ],
        'type': 'forecast',
        'format': 'netcdf_zip',
    },
    '../20240407_duaod.netcdf_zip')

CAMS global atmospheric composition forecasts can be retrieved either in GRIB or in a zipped NetCDF. Above, we requested the data in a zipped NetCDF and for this reason, we have to unzip the file before we can open it. You can unzip zip archives in Python with the Python package zipfile and the function extractall().

import zipfile
with zipfile.ZipFile('../20240407_duaod.netcdf_zip', 'r') as zip_ref:
    zip_ref.extractall('../')

Load and browse CAMS global atmospheric composition forecasts#

CAMS global near-real-time forecast data is available either in GRIB or netCDF. The data for the present example has been downloaded as netCDF.

You can use xarray’s function xr.open_dataset() to open the netCDF file as xarray.Dataset.

file = xr.open_dataset('../eodata/3_model/cams/global_forecast/20240407_data.nc')
file
<xarray.Dataset> Size: 101MB
Dimensions:    (longitude: 900, latitude: 451, time: 31)
Coordinates:
  * longitude  (longitude) float32 4kB 0.0 0.4 0.8 1.2 ... 358.8 359.2 359.6
  * latitude   (latitude) float32 2kB 90.0 89.6 89.2 88.8 ... -89.2 -89.6 -90.0
  * time       (time) datetime64[ns] 248B 2024-04-07T12:00:00 ... 2024-04-11T...
Data variables:
    duaod550   (time, latitude, longitude) float64 101MB ...
Attributes:
    Conventions:  CF-1.6
    history:      2024-05-30 10:00:01 GMT by grib_to_netcdf-2.25.1: /opt/ecmw...

The data above has three dimensions (latitude, longitude, time) and one data variable:

  • duaod550: Dust Aerosol Optical Depth at 550nm

Let us inspect the coordinates of the file more in detail.

Below, you see that the data set consists of 31 time steps, ranging from 7 April 2024 12:00 UTC to 11 April 06:00 UTC in a 3-hour timestep.

file.time
<xarray.DataArray 'time' (time: 31)> Size: 248B
array(['2024-04-07T12:00:00.000000000', '2024-04-07T15:00:00.000000000',
       '2024-04-07T18:00:00.000000000', '2024-04-07T21:00:00.000000000',
       '2024-04-08T00:00:00.000000000', '2024-04-08T03:00:00.000000000',
       '2024-04-08T06:00:00.000000000', '2024-04-08T09:00:00.000000000',
       '2024-04-08T12:00:00.000000000', '2024-04-08T15:00:00.000000000',
       '2024-04-08T18:00:00.000000000', '2024-04-08T21:00:00.000000000',
       '2024-04-09T00:00:00.000000000', '2024-04-09T03:00:00.000000000',
       '2024-04-09T06:00:00.000000000', '2024-04-09T09:00:00.000000000',
       '2024-04-09T12:00:00.000000000', '2024-04-09T15:00:00.000000000',
       '2024-04-09T18:00:00.000000000', '2024-04-09T21:00:00.000000000',
       '2024-04-10T00:00:00.000000000', '2024-04-10T03:00:00.000000000',
       '2024-04-10T06:00:00.000000000', '2024-04-10T09:00:00.000000000',
       '2024-04-10T12:00:00.000000000', '2024-04-10T15:00:00.000000000',
       '2024-04-10T18:00:00.000000000', '2024-04-10T21:00:00.000000000',
       '2024-04-11T00:00:00.000000000', '2024-04-11T03:00:00.000000000',
       '2024-04-11T06:00:00.000000000'], dtype='datetime64[ns]')
Coordinates:
  * time     (time) datetime64[ns] 248B 2024-04-07T12:00:00 ... 2024-04-11T06...
Attributes:
    long_name:  time

The latitude values have a 0.4 degrees resolution and have a global N-S coverage.

file.latitude
<xarray.DataArray 'latitude' (latitude: 451)> Size: 2kB
array([ 90. ,  89.6,  89.2, ..., -89.2, -89.6, -90. ], dtype=float32)
Coordinates:
  * latitude  (latitude) float32 2kB 90.0 89.6 89.2 88.8 ... -89.2 -89.6 -90.0
Attributes:
    units:      degrees_north
    long_name:  latitude

The longitude values have a 0.4 degrees resolution as well, and are disseminated in a [0, 360] grid.

file.longitude
<xarray.DataArray 'longitude' (longitude: 900)> Size: 4kB
array([  0. ,   0.4,   0.8, ..., 358.8, 359.2, 359.6], dtype=float32)
Coordinates:
  * longitude  (longitude) float32 4kB 0.0 0.4 0.8 1.2 ... 358.8 359.2 359.6
Attributes:
    units:      degrees_east
    long_name:  longitude

Above, you see that the longitude variables are in the range of [0, 359.6]. Per default, ECMWF data are on a [0, 360] grid. Let us bring the longitude coordinates to a [-180, 180] grid. You can use the xarray function assign_coords() to assign new values to coordinates of a xarray data arra. The code below shifts your longitude coordinates from [0, 359.6] to [-180, 179.6].

file = file.assign_coords(longitude=(((file.longitude + 180) % 360) - 180)).sortby('longitude')
file
<xarray.Dataset> Size: 101MB
Dimensions:    (latitude: 451, time: 31, longitude: 900)
Coordinates:
  * latitude   (latitude) float32 2kB 90.0 89.6 89.2 88.8 ... -89.2 -89.6 -90.0
  * time       (time) datetime64[ns] 248B 2024-04-07T12:00:00 ... 2024-04-11T...
  * longitude  (longitude) float32 4kB -180.0 -179.6 -179.2 ... 179.2 179.6
Data variables:
    duaod550   (time, latitude, longitude) float64 101MB ...
Attributes:
    Conventions:  CF-1.6
    history:      2024-05-30 10:00:01 GMT by grib_to_netcdf-2.25.1: /opt/ecmw...

Retrieve Dust Aerosol Optical Depth at 550nm as data array#

Let us extract from the dataset above the data variable Dust Aerosol Optical Depth (AOD) at 550nm as xarray.DataArray with the name du_aod. You can load a data array from a xarray dataset by specifying the name of the variable (duaod550) in square brackets.

du_aod = file['duaod550']
du_aod
<xarray.DataArray 'duaod550' (time: 31, latitude: 451, longitude: 900)> Size: 101MB
[12582900 values with dtype=float64]
Coordinates:
  * latitude   (latitude) float32 2kB 90.0 89.6 89.2 88.8 ... -89.2 -89.6 -90.0
  * time       (time) datetime64[ns] 248B 2024-04-07T12:00:00 ... 2024-04-11T...
  * longitude  (longitude) float32 4kB -180.0 -179.6 -179.2 ... 179.2 179.6
Attributes:
    units:      ~
    long_name:  Dust Aerosol Optical Depth at 550nm

Above, you see that the variable du_aod has two attributes, units and long_name. Let us define variables for those attributes. The variables can be used later for visualizing the data.

long_name = du_aod.long_name
units = du_aod.units

Let us do the same for the coordinates longitude and latitude.

latitude = du_aod.latitude
longitude = du_aod.longitude

Visualize Dust Aerosol Optical Depth at 550nm#

The next step is to visualize the dataset. You can use the function visualize_pcolormesh, which makes use of matploblib’s function pcolormesh and the Cartopy library.

With ?visualize_pcolormesh you can open the function’s docstring to see what keyword arguments are needed to prepare your plot.

?visualize_pcolormesh

You can make use of the variables we defined above:

  • units

  • long_name

  • latitude

  • longitude

Additionally, you can specify the color scale and minimum (vmin) and maxium (vmax) data values.

time_index = 10
visualize_pcolormesh(data_array=du_aod[time_index,:,:],
                     longitude=longitude,
                     latitude=latitude,
                     projection=ccrs.PlateCarree(),
                     color_scale='hot_r',
                     unit=units,
                     long_name=long_name + ' ' + str(du_aod[time_index,:,:].time.data)[0:10],
                     vmin=0, 
                     vmax=1.5)
(<Figure size 2000x1000 with 2 Axes>,
 <GeoAxes: title={'center': 'Dust Aerosol Optical Depth at 550nm 2024-04-08'}>)
../_images/02e0ebfc0bd057d82f2e5c7d9e3beb1ae3c9fca49c1531cd3e94e3b72893a302.png

Create a geographical subset for Europe#

The map above shows Dust Aerosol Optical Depth at 550nm globally. Let us create a geographical subset for Europe, in order to better analyse the Saharan dust event which impacts parts of central and southern Europe.

For geographical subsetting, you can use the function generate_geographical_subset. You can use ?generate_geographical_subset to open the docstring in order to see the function’s keyword arguments.

?generate_geographical_subset

Define the bounding box information for Europe

latmin = 28.
latmax = 71.
lonmin = -22.
lonmax = 43

Now, let us apply the function generate_geographical_subset to subset the du_aod xarray.DataArray. Let us call the new xarray.DataArray du_aod_subset.

du_aod_subset = generate_geographical_subset(xarray=du_aod, 
                                             latmin=latmin, 
                                             latmax=latmax, 
                                             lonmin=lonmin, 
                                             lonmax=lonmax)
du_aod_subset
<xarray.DataArray 'duaod550' (time: 31, latitude: 107, longitude: 162)> Size: 4MB
array([[[5.16533464e-03, 5.26098837e-03, 5.30881523e-03, ...,
         1.57831991e-03, 1.53049305e-03, 1.43483932e-03],
        [5.45229582e-03, 5.40446896e-03, 5.30881523e-03, ...,
         1.53049305e-03, 1.48266619e-03, 1.43483932e-03],
        [5.11750778e-03, 5.06968092e-03, 4.92620033e-03, ...,
         1.48266619e-03, 1.43483932e-03, 1.38701246e-03],
        ...,
        [4.78302063e-04, 6.69609515e-04, 9.08743831e-04, ...,
         2.67304371e-01, 3.71901720e-01, 4.89221015e-01],
        [1.00439756e-03, 1.24353187e-03, 1.38701246e-03, ...,
         9.00101897e-02, 1.21049824e-01, 1.63280944e-01],
        [1.67397364e-03, 1.81745423e-03, 1.91310795e-03, ...,
         9.02493240e-02, 9.81407564e-02, 1.46206754e-01]],

       [[4.83054660e-03, 4.92620033e-03, 4.92620033e-03, ...,
         1.05222442e-03, 1.00439756e-03, 9.56570694e-04],
        [5.26098837e-03, 5.16533464e-03, 5.06968092e-03, ...,
         1.43483932e-03, 1.38701246e-03, 1.33918560e-03],
        [5.26098837e-03, 5.11750778e-03, 4.97402719e-03, ...,
         1.67397364e-03, 1.62614678e-03, 1.62614678e-03],
...
        [1.16601926e-01, 1.83750841e-01, 2.56543327e-01, ...,
         1.92933599e-01, 2.09625174e-01, 2.27416767e-01],
        [1.77820310e-01, 2.46930127e-01, 2.94804817e-01, ...,
         2.11825210e-01, 2.22634081e-01, 2.32582068e-01],
        [2.30382033e-01, 2.82895928e-01, 3.25653144e-01, ...,
         2.13020881e-01, 2.21534063e-01, 2.31721185e-01]],

       [[1.53049305e-03, 1.57831991e-03, 1.72180050e-03, ...,
         2.96529894e-03, 2.77399149e-03, 2.53485717e-03],
        [1.57831991e-03, 1.57831991e-03, 1.62614678e-03, ...,
         7.26971661e-03, 6.55231367e-03, 5.78708386e-03],
        [1.62614678e-03, 1.62614678e-03, 1.62614678e-03, ...,
         1.10480388e-02, 1.16697880e-02, 1.08089045e-02],
        ...,
        [2.81221988e-01, 2.90356919e-01, 2.91743898e-01, ...,
         1.58067816e-01, 1.68876687e-01, 1.80976883e-01],
        [2.85048137e-01, 2.82130699e-01, 3.00878829e-01, ...,
         1.66533171e-01, 1.77246388e-01, 1.88724835e-01],
        [2.83135063e-01, 2.82369833e-01, 3.10826816e-01, ...,
         1.77581176e-01, 1.89346584e-01, 1.99772840e-01]]])
Coordinates:
  * latitude   (latitude) float32 428B 70.8 70.4 70.0 69.6 ... 29.2 28.8 28.4
  * time       (time) datetime64[ns] 248B 2024-04-07T12:00:00 ... 2024-04-11T...
  * longitude  (longitude) float32 648B -21.6 -21.2 -20.8 ... 42.0 42.4 42.8
Attributes:
    units:      ~
    long_name:  Dust Aerosol Optical Depth at 550nm

Let us visualize the subsetted xarray.DataArray again. This time, you set the set_global kwarg to False and you specify the longitude and latitude bounds specified above.

Additionally, in order to have the time information as part of the title, we add the string of the datetime information to the long_name variable: long_name + ' ' + str(du_aod_subset[##,:,:].time.data)[0:10].

time_index = 14
visualize_pcolormesh(data_array=du_aod_subset[time_index,:,:],
                     longitude=du_aod_subset.longitude,
                     latitude=du_aod_subset.latitude,
                     projection=ccrs.PlateCarree(),
                     color_scale='hot_r',
                     unit=units,
                     long_name=long_name + ' - ' + str(du_aod_subset[time_index,:,:].time.data)[0:10],
                     vmin=0, 
                     vmax=1,
                     lonmin=lonmin,
                     lonmax=lonmax,
                     latmin=latmin,
                     latmax=latmax,
                     set_global=False)
(<Figure size 2000x1000 with 2 Axes>,
 <GeoAxes: title={'center': 'Dust Aerosol Optical Depth at 550nm - 2024-04-09'}>)
../_images/2718ee7e7f2886e35c364c1500b8ed9e69f1233a9b8c16be4862a6bc2bd8b179.png

Animate changes of Dust AOD at 550nm over time#

In the last step, you can animate the Dust Aerosol Optical Depth at 550nm in order to see how the trace gas develops over a period of 5 days, from 7th to 11th April 2024.

You can do animations with matplotlib’s function animation. Jupyter’s function HTML can then be used to display HTML and video content.

The animation function consists of 4 parts:

  • Setting the initial state:
    Here, you define the general plot your animation shall use to initialise the animation. You can also define the number of frames (time steps) your animation shall have.

  • Functions to animate:
    An animation consists of three functions: draw(), init() and animate(). draw() is the function where individual frames are passed on and the figure is returned as image. In this example, the function redraws the plot for each time step. init() returns the figure you defined for the initial state. animate() returns the draw() function and animates the function over the given number of frames (time steps).

  • Create a animate.FuncAnimation object:
    The functions defined before are now combined to build an animate.FuncAnimation object.

  • Play the animation as video:
    As a final step, you can integrate the animation into the notebook with the HTML class. You take the generate animation object and convert it to a HTML5 video with the to_html5_video function

# Setting the initial state:
# 1. Define figure for initial plot
fig, ax = visualize_pcolormesh(data_array=du_aod_subset[0,:,:],
                               longitude=du_aod_subset.longitude, 
                               latitude=du_aod_subset.latitude,
                               projection=ccrs.PlateCarree(), 
                               color_scale='hot_r', 
                               unit='-',
                               long_name=long_name + ' - '+ str(du_aod_subset.time[0].data)[0:10],
                               vmin=0,
                               vmax=1, 
                               lonmin=lonmin, 
                               lonmax=lonmax, 
                               latmin=latmin, 
                               latmax=latmax,
                               set_global=False)

frames = 30

def draw(i):
    img = plt.pcolormesh(du_aod_subset.longitude, 
                         du_aod_subset.latitude, 
                         du_aod_subset[i,:,:], 
                         cmap='hot_r', 
                         transform=ccrs.PlateCarree(),
                         vmin=0,
                         vmax=1,
                         shading='auto')
    
    ax.set_title(long_name + ' - '+ str(du_aod_subset.time[i].data)[0:10], fontsize=20, pad=20.0)
    return img

def init():
    return fig

def animate(i):
    return draw(i)

ani = animation.FuncAnimation(fig, animate, frames, interval=800, blit=False,
                              init_func=init, repeat=True)

HTML(ani.to_html5_video())
plt.close(fig)

Play the animation video as HTML5 video

HTML(ani.to_html5_video())