CAMS global atmospheric composition forecasts#
This notebooks provides an introduction to the CAMS global atmospheric composition forecasts and shows you how the variable Dust Aerosol Optical Depth
can be used to monitor and predict dust events.
CAMS produces global forecasts for atmospheric composition for the next five days twice a day. The forecasts consist of more than 50 chemical species (e.g. ozone, nitrogen dioxide, carbon dioxide) and seven different types of aerosol (desert dust, sea salt, organic matter, black carbon, sulphate, nitrate and ammonium aerosol).
The initial conditions of each forecast are obtained by combining a previous forecast with current satellite observations through a process called data assimilation. This best estimate of the state of the atmosphere at the initial forecast time step, called the analysis, provides a globally complete and consistent dataset allowing for estimates at locations where observation data coverage is low or for atmospheric pollutants for which no direct observations are available.
The notebook examines the Saharan Dust event which occured over Europe in April 2024.
Basic facts
Spatial resolution: 0.4° x 0.4°
Spatial coverage: Global
Temporal resolution: 1-hourly up to leadtime hour 120
Temporal coverage: since 2015 to present
Data format: GRIB
or zipped NetCDF
How to access the data
CAMS global atmospheric composition forecasts are available for download via the Copernicus Atmosphere Data Store (ADS). You will need to create an ADS account here.
Data from the ADS can be downloaded in two ways:
manually
via the ADS web interfaceprogrammatically
with a Python package called cdsapi (more information)
Load required libraries
import xarray as xr
import matplotlib.pyplot as plt
import matplotlib.colors
from matplotlib.cm import get_cmap
from matplotlib import animation
from matplotlib.axes import Axes
import cartopy.crs as ccrs
from cartopy.mpl.gridliner import LONGITUDE_FORMATTER, LATITUDE_FORMATTER
import cartopy.feature as cfeature
from cartopy.mpl.geoaxes import GeoAxes
GeoAxes._pcolormesh_patched = Axes.pcolormesh
from IPython.display import HTML
import warnings
warnings.simplefilter(action = "ignore", category = RuntimeWarning)
Load helper functions
%run ../functions.ipynb
Optional: Retrieve CAMS global atmospheric composition forecasts programmatically#
The CDS Application Program Interface (CDS API)
is a Python library which allows you to access data from the ADS programmatically
. In order to use the CDS API, follow the steps below:
Self-register at the ADS registration page (if you do not have an account yet)
Login to the ADS portal and go to the api-how-to-page.
Copy the CDS API key displayed in the black terminal window and replace the
######
of theKEY
variable below with your individual CDS API key
Note: You find your CDS API key displayed in the black terminal box under the section Install the CDS API key
. If you do not see a URL or key appear in the black terminal box, please refresh your browser tab.
URL='https://ads.atmosphere.copernicus.eu/api/v2'
KEY='#########################'
The next step is then to request the data with a so called API request
. Via the ADS web interface, you can specificy the characteristics of the CAMS global atmospheric composition forecasts and at the end of the web interface, you can open the ADS request via Show API request
. Copy the request displayed there in the cell below. Once you execute the cell, the download of the data starts automatically.
In the web interface we specified the following characteristics:
Variable:
Dust aerosol optical depth at 550 nm
Date:
2024-04-07
to2024-04-07
time:
12:00
(Start of the model run)Lead time hour:
every 3 hours up to forecast hour 90
Format:
netcdf_zip
import cdsapi
c = cdsapi.Client(url=URL, key=KEY)
c.retrieve(
'cams-global-atmospheric-composition-forecasts',
{
'variable': 'dust_aerosol_optical_depth_550nm',
'date': '2024-04-07/2024-04-07',
'time': '12:00',
'leadtime_hour': [
'0', '12', '15',
'18', '21', '24',
'27', '3', '30',
'33', '36', '39',
'42', '45', '48',
'51', '54', '57',
'6', '60', '63',
'66', '69', '72',
'75', '78', '81',
'84', '87', '9',
'90',
],
'type': 'forecast',
'format': 'netcdf_zip',
},
'../20240407_duaod.netcdf_zip')
CAMS global atmospheric composition forecasts can be retrieved either in GRIB
or in a zipped NetCDF
. Above, we requested the data in a zipped NetCDF and for this reason, we have to unzip the file before we can open it. You can unzip zip archives
in Python with the Python package zipfile
and the function extractall()
.
import zipfile
with zipfile.ZipFile('../20240407_duaod.netcdf_zip', 'r') as zip_ref:
zip_ref.extractall('../')
Load and browse CAMS global atmospheric composition forecasts#
CAMS global near-real-time forecast data is available either in GRIB
or netCDF
. The data for the present example has been downloaded as netCDF
.
You can use xarray’s function xr.open_dataset()
to open the netCDF file as xarray.Dataset
.
file = xr.open_dataset('../eodata/3_model/cams/global_forecast/20240407_data.nc')
file
<xarray.Dataset> Size: 101MB Dimensions: (longitude: 900, latitude: 451, time: 31) Coordinates: * longitude (longitude) float32 4kB 0.0 0.4 0.8 1.2 ... 358.8 359.2 359.6 * latitude (latitude) float32 2kB 90.0 89.6 89.2 88.8 ... -89.2 -89.6 -90.0 * time (time) datetime64[ns] 248B 2024-04-07T12:00:00 ... 2024-04-11T... Data variables: duaod550 (time, latitude, longitude) float64 101MB ... Attributes: Conventions: CF-1.6 history: 2024-05-30 10:00:01 GMT by grib_to_netcdf-2.25.1: /opt/ecmw...
The data above has three dimensions (latitude
, longitude
, time
) and one data variable:
duaod550
: Dust Aerosol Optical Depth at 550nm
Let us inspect the coordinates of the file more in detail.
Below, you see that the data set consists of 31 time steps, ranging from 7 April 2024 12:00 UTC to 11 April 06:00 UTC in a 3-hour timestep.
file.time
<xarray.DataArray 'time' (time: 31)> Size: 248B array(['2024-04-07T12:00:00.000000000', '2024-04-07T15:00:00.000000000', '2024-04-07T18:00:00.000000000', '2024-04-07T21:00:00.000000000', '2024-04-08T00:00:00.000000000', '2024-04-08T03:00:00.000000000', '2024-04-08T06:00:00.000000000', '2024-04-08T09:00:00.000000000', '2024-04-08T12:00:00.000000000', '2024-04-08T15:00:00.000000000', '2024-04-08T18:00:00.000000000', '2024-04-08T21:00:00.000000000', '2024-04-09T00:00:00.000000000', '2024-04-09T03:00:00.000000000', '2024-04-09T06:00:00.000000000', '2024-04-09T09:00:00.000000000', '2024-04-09T12:00:00.000000000', '2024-04-09T15:00:00.000000000', '2024-04-09T18:00:00.000000000', '2024-04-09T21:00:00.000000000', '2024-04-10T00:00:00.000000000', '2024-04-10T03:00:00.000000000', '2024-04-10T06:00:00.000000000', '2024-04-10T09:00:00.000000000', '2024-04-10T12:00:00.000000000', '2024-04-10T15:00:00.000000000', '2024-04-10T18:00:00.000000000', '2024-04-10T21:00:00.000000000', '2024-04-11T00:00:00.000000000', '2024-04-11T03:00:00.000000000', '2024-04-11T06:00:00.000000000'], dtype='datetime64[ns]') Coordinates: * time (time) datetime64[ns] 248B 2024-04-07T12:00:00 ... 2024-04-11T06... Attributes: long_name: time
The latitude values have a 0.4 degrees resolution and have a global N-S coverage.
file.latitude
<xarray.DataArray 'latitude' (latitude: 451)> Size: 2kB array([ 90. , 89.6, 89.2, ..., -89.2, -89.6, -90. ], dtype=float32) Coordinates: * latitude (latitude) float32 2kB 90.0 89.6 89.2 88.8 ... -89.2 -89.6 -90.0 Attributes: units: degrees_north long_name: latitude
The longitude values have a 0.4 degrees resolution as well, and are disseminated in a [0, 360] grid.
file.longitude
<xarray.DataArray 'longitude' (longitude: 900)> Size: 4kB array([ 0. , 0.4, 0.8, ..., 358.8, 359.2, 359.6], dtype=float32) Coordinates: * longitude (longitude) float32 4kB 0.0 0.4 0.8 1.2 ... 358.8 359.2 359.6 Attributes: units: degrees_east long_name: longitude
Above, you see that the longitude
variables are in the range of [0, 359.6]
. Per default, ECMWF data are on a [0, 360] grid. Let us bring the longitude coordinates to a [-180, 180]
grid. You can use the xarray function assign_coords()
to assign new values to coordinates of a xarray data arra. The code below shifts your longitude coordinates from [0, 359.6]
to [-180, 179.6]
.
file = file.assign_coords(longitude=(((file.longitude + 180) % 360) - 180)).sortby('longitude')
file
<xarray.Dataset> Size: 101MB Dimensions: (latitude: 451, time: 31, longitude: 900) Coordinates: * latitude (latitude) float32 2kB 90.0 89.6 89.2 88.8 ... -89.2 -89.6 -90.0 * time (time) datetime64[ns] 248B 2024-04-07T12:00:00 ... 2024-04-11T... * longitude (longitude) float32 4kB -180.0 -179.6 -179.2 ... 179.2 179.6 Data variables: duaod550 (time, latitude, longitude) float64 101MB ... Attributes: Conventions: CF-1.6 history: 2024-05-30 10:00:01 GMT by grib_to_netcdf-2.25.1: /opt/ecmw...
Retrieve Dust Aerosol Optical Depth at 550nm as data array#
Let us extract from the dataset above the data variable Dust Aerosol Optical Depth (AOD) at 550nm
as xarray.DataArray
with the name du_aod
. You can load a data array from a xarray dataset by specifying the name of the variable (duaod550
) in square brackets.
du_aod = file['duaod550']
du_aod
<xarray.DataArray 'duaod550' (time: 31, latitude: 451, longitude: 900)> Size: 101MB [12582900 values with dtype=float64] Coordinates: * latitude (latitude) float32 2kB 90.0 89.6 89.2 88.8 ... -89.2 -89.6 -90.0 * time (time) datetime64[ns] 248B 2024-04-07T12:00:00 ... 2024-04-11T... * longitude (longitude) float32 4kB -180.0 -179.6 -179.2 ... 179.2 179.6 Attributes: units: ~ long_name: Dust Aerosol Optical Depth at 550nm
Above, you see that the variable du_aod
has two attributes, units
and long_name
. Let us define variables for those attributes. The variables can be used later for visualizing the data.
long_name = du_aod.long_name
units = du_aod.units
Let us do the same for the coordinates longitude
and latitude
.
latitude = du_aod.latitude
longitude = du_aod.longitude
Visualize Dust Aerosol Optical Depth at 550nm#
The next step is to visualize the dataset. You can use the function visualize_pcolormesh, which makes use of matploblib’s function pcolormesh
and the Cartopy library.
With ?visualize_pcolormesh
you can open the function’s docstring to see what keyword arguments are needed to prepare your plot.
?visualize_pcolormesh
You can make use of the variables we defined above:
units
long_name
latitude
longitude
Additionally, you can specify the color scale
and minimum (vmin
) and maxium (vmax
) data values.
time_index = 10
visualize_pcolormesh(data_array=du_aod[time_index,:,:],
longitude=longitude,
latitude=latitude,
projection=ccrs.PlateCarree(),
color_scale='hot_r',
unit=units,
long_name=long_name + ' ' + str(du_aod[time_index,:,:].time.data)[0:10],
vmin=0,
vmax=1.5)
(<Figure size 2000x1000 with 2 Axes>,
<GeoAxes: title={'center': 'Dust Aerosol Optical Depth at 550nm 2024-04-08'}>)
Create a geographical subset for Europe#
The map above shows Dust Aerosol Optical Depth at 550nm globally. Let us create a geographical subset for Europe, in order to better analyse the Saharan dust event which impacts parts of central and southern Europe.
For geographical subsetting, you can use the function generate_geographical_subset
. You can use ?generate_geographical_subset
to open the docstring in order to see the function’s keyword arguments.
?generate_geographical_subset
Define the bounding box information for Europe
latmin = 28.
latmax = 71.
lonmin = -22.
lonmax = 43
Now, let us apply the function generate_geographical_subset
to subset the du_aod
xarray.DataArray. Let us call the new xarray.DataArray
du_aod_subset
.
du_aod_subset = generate_geographical_subset(xarray=du_aod,
latmin=latmin,
latmax=latmax,
lonmin=lonmin,
lonmax=lonmax)
du_aod_subset
<xarray.DataArray 'duaod550' (time: 31, latitude: 107, longitude: 162)> Size: 4MB array([[[5.16533464e-03, 5.26098837e-03, 5.30881523e-03, ..., 1.57831991e-03, 1.53049305e-03, 1.43483932e-03], [5.45229582e-03, 5.40446896e-03, 5.30881523e-03, ..., 1.53049305e-03, 1.48266619e-03, 1.43483932e-03], [5.11750778e-03, 5.06968092e-03, 4.92620033e-03, ..., 1.48266619e-03, 1.43483932e-03, 1.38701246e-03], ..., [4.78302063e-04, 6.69609515e-04, 9.08743831e-04, ..., 2.67304371e-01, 3.71901720e-01, 4.89221015e-01], [1.00439756e-03, 1.24353187e-03, 1.38701246e-03, ..., 9.00101897e-02, 1.21049824e-01, 1.63280944e-01], [1.67397364e-03, 1.81745423e-03, 1.91310795e-03, ..., 9.02493240e-02, 9.81407564e-02, 1.46206754e-01]], [[4.83054660e-03, 4.92620033e-03, 4.92620033e-03, ..., 1.05222442e-03, 1.00439756e-03, 9.56570694e-04], [5.26098837e-03, 5.16533464e-03, 5.06968092e-03, ..., 1.43483932e-03, 1.38701246e-03, 1.33918560e-03], [5.26098837e-03, 5.11750778e-03, 4.97402719e-03, ..., 1.67397364e-03, 1.62614678e-03, 1.62614678e-03], ... [1.16601926e-01, 1.83750841e-01, 2.56543327e-01, ..., 1.92933599e-01, 2.09625174e-01, 2.27416767e-01], [1.77820310e-01, 2.46930127e-01, 2.94804817e-01, ..., 2.11825210e-01, 2.22634081e-01, 2.32582068e-01], [2.30382033e-01, 2.82895928e-01, 3.25653144e-01, ..., 2.13020881e-01, 2.21534063e-01, 2.31721185e-01]], [[1.53049305e-03, 1.57831991e-03, 1.72180050e-03, ..., 2.96529894e-03, 2.77399149e-03, 2.53485717e-03], [1.57831991e-03, 1.57831991e-03, 1.62614678e-03, ..., 7.26971661e-03, 6.55231367e-03, 5.78708386e-03], [1.62614678e-03, 1.62614678e-03, 1.62614678e-03, ..., 1.10480388e-02, 1.16697880e-02, 1.08089045e-02], ..., [2.81221988e-01, 2.90356919e-01, 2.91743898e-01, ..., 1.58067816e-01, 1.68876687e-01, 1.80976883e-01], [2.85048137e-01, 2.82130699e-01, 3.00878829e-01, ..., 1.66533171e-01, 1.77246388e-01, 1.88724835e-01], [2.83135063e-01, 2.82369833e-01, 3.10826816e-01, ..., 1.77581176e-01, 1.89346584e-01, 1.99772840e-01]]]) Coordinates: * latitude (latitude) float32 428B 70.8 70.4 70.0 69.6 ... 29.2 28.8 28.4 * time (time) datetime64[ns] 248B 2024-04-07T12:00:00 ... 2024-04-11T... * longitude (longitude) float32 648B -21.6 -21.2 -20.8 ... 42.0 42.4 42.8 Attributes: units: ~ long_name: Dust Aerosol Optical Depth at 550nm
Let us visualize the subsetted xarray.DataArray
again. This time, you set the set_global
kwarg to False
and you specify the longitude and latitude bounds specified above.
Additionally, in order to have the time information as part of the title, we add the string of the datetime information to the long_name
variable: long_name + ' ' + str(du_aod_subset[##,:,:].time.data)[0:10]
.
time_index = 14
visualize_pcolormesh(data_array=du_aod_subset[time_index,:,:],
longitude=du_aod_subset.longitude,
latitude=du_aod_subset.latitude,
projection=ccrs.PlateCarree(),
color_scale='hot_r',
unit=units,
long_name=long_name + ' - ' + str(du_aod_subset[time_index,:,:].time.data)[0:10],
vmin=0,
vmax=1,
lonmin=lonmin,
lonmax=lonmax,
latmin=latmin,
latmax=latmax,
set_global=False)
(<Figure size 2000x1000 with 2 Axes>,
<GeoAxes: title={'center': 'Dust Aerosol Optical Depth at 550nm - 2024-04-09'}>)
Animate changes of Dust AOD at 550nm over time#
In the last step, you can animate the Dust Aerosol Optical Depth at 550nm
in order to see how the trace gas develops over a period of 5 days, from 7th to 11th April 2024.
You can do animations with matplotlib’s function animation
. Jupyter’s function HTML
can then be used to display HTML and video content.
The animation function consists of 4 parts:
Setting the initial state:
Here, you define the general plot your animation shall use to initialise the animation. You can also define the number of frames (time steps) your animation shall have.Functions to animate:
An animation consists of three functions:draw()
,init()
andanimate()
.draw()
is the function where individual frames are passed on and the figure is returned as image. In this example, the function redraws the plot for each time step.init()
returns the figure you defined for the initial state.animate()
returns thedraw()
function and animates the function over the given number of frames (time steps).Create a
animate.FuncAnimation
object:
The functions defined before are now combined to build ananimate.FuncAnimation
object.Play the animation as video:
As a final step, you can integrate the animation into the notebook with theHTML
class. You take the generate animation object and convert it to a HTML5 video with theto_html5_video
function
# Setting the initial state:
# 1. Define figure for initial plot
fig, ax = visualize_pcolormesh(data_array=du_aod_subset[0,:,:],
longitude=du_aod_subset.longitude,
latitude=du_aod_subset.latitude,
projection=ccrs.PlateCarree(),
color_scale='hot_r',
unit='-',
long_name=long_name + ' - '+ str(du_aod_subset.time[0].data)[0:10],
vmin=0,
vmax=1,
lonmin=lonmin,
lonmax=lonmax,
latmin=latmin,
latmax=latmax,
set_global=False)
frames = 30
def draw(i):
img = plt.pcolormesh(du_aod_subset.longitude,
du_aod_subset.latitude,
du_aod_subset[i,:,:],
cmap='hot_r',
transform=ccrs.PlateCarree(),
vmin=0,
vmax=1,
shading='auto')
ax.set_title(long_name + ' - '+ str(du_aod_subset.time[i].data)[0:10], fontsize=20, pad=20.0)
return img
def init():
return fig
def animate(i):
return draw(i)
ani = animation.FuncAnimation(fig, animate, frames, interval=800, blit=False,
init_func=init, repeat=True)
HTML(ani.to_html5_video())
plt.close(fig)
Play the animation video as HTML5 video
HTML(ani.to_html5_video())