{ "cells": [ { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "# Using the Copernicus Seasonal Forecast Tools package to create a hazard object" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "## Introduction" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "The [copernicus-seasonal-forecast-tools](https://github.com/DahyannAraya/copernicus-seasonal-forecast-tools) package was developed to manage seasonal forecast data from the [Copernicus Climate Data Store](https://cds.climate.copernicus.eu) (CDS) for the [U-CLIMADAPT project](https://www.copernicus-user-uptake.eu/user-uptake/details/responding-to-the-impact-of-climate-change-u-climadapt-488).\n", "It offers comprehensive tools for downloading and processing seasonal forecast datasets, computing climate indices, and generating hazard objects, particularly from [Seasonal forecast daily and subdaily data on single levels](https://cds.climate.copernicus.eu/datasets/seasonal-original-single-levels?tab=overview).\n", "The package is tailored to integrate seamlessly with CLIMADA, supporting climate risk assessment and the development of effective adaptation strategies.\n", "\n", "Features:\n", "- Automated download of high-dimensional seasonal forecast data via the Copernicus API\n", "- Preprocessing of sub-daily forecast data into daily formats\n", "- Calculation of heat-related climate indices (e.g., heatwave days, tropical nights)\n", "- Conversion of processed indices into CLIMADA hazard objects ready for impact modelling\n", "- Flexible modular architecture to accommodate additional indices or updates to datasets\n", "\n", "In this tutorial, you will see a simple example of how to retrieve and process data from Copernicus, calculate a heat-related index, and create a hazard object. 
For more detailed documentation and advanced examples, please visit the [repository](https://github.com/DahyannAraya/copernicus-seasonal-forecast-tools) or the [documentation](https://copernicus-seasonal-forecast-tools.readthedocs.io/en/latest/?badge=latest)." ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "Prerequisites:\n", "\n", "1. CDS account and API key:\n", " Register at https://cds.climate.copernicus.eu\n", "\n", "2. CDS API client installation:\n", " pip install cdsapi\n", "\n", "3. CDS API configuration:\n", " Create a .cdsapirc file in your home directory with your API key and URL.\n", " For instructions, visit: https://cds.climate.copernicus.eu/how-to-api#install-the-cds-api-client\n", "\n", "4. Dataset Terms and Conditions: After selecting the dataset to download, make\n", " sure to accept the terms and conditions on the corresponding dataset webpage in the CDS portal before running this notebook; for this tutorial, that is https://cds.climate.copernicus.eu/datasets/seasonal-original-single-levels?tab=download.\n", "\n", "For more information, visit the comprehensive [CDS API setup guide](https://copernicus-seasonal-forecast-tools.readthedocs.io/en/latest/cds_api.html), which walks you through each step of the process. Once configured, you'll be ready to explore and analyze seasonal forecast data.\n", "\n", "**Note**:\n", "Ensure you have the **necessary permissions** and comply with the CDS data usage policies when using this package. The terms and conditions are listed at the bottom of the download page: https://cds.climate.copernicus.eu/datasets/seasonal-original-single-levels?tab=download."
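Before running the notebook, it can help to sanity-check the CDS API configuration. Below is a minimal, self-contained sketch: the `parse_cdsapirc` helper and the sample credentials are illustrative assumptions (not part of `cdsapi` or this package), and a real check would read `~/.cdsapirc` instead of the sample string.

```python
from pathlib import Path


def parse_cdsapirc(text: str) -> dict:
    """Parse 'key: value' lines, as found in a .cdsapirc file, into a dict."""
    config = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # keep only lines that actually contain a colon
            config[key.strip()] = value.strip()
    return config


# Illustrative contents; replace the key with your own from the CDS portal.
sample = "url: https://cds.climate.copernicus.eu/api\nkey: 00000000-0000-0000-0000-000000000000"
cfg = parse_cdsapirc(sample)

# A real check would parse Path.home() / ".cdsapirc" instead of the sample string.
has_rc = (Path.home() / ".cdsapirc").exists()
```

If `has_rc` is `False`, create the file following the setup guide linked above before continuing.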
] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "# Import packages\n", "\n", "import warnings\n", "import datetime as dt\n", "\n", "warnings.filterwarnings(\"ignore\")\n", "from seasonal_forecast_tools import SeasonalForecast, ClimateIndex\n", "from seasonal_forecast_tools.utils.coordinates_utils import bounding_box_from_countries\n", "from seasonal_forecast_tools.utils.time_utils import month_name_to_number" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "### Set up parameters\n", "\n", "To configure the package for working with Copernicus forecast data and converting it into a hazard object for CLIMADA, you will need to define several essential parameters. These settings are crucial as they specify the type of data to be retrieved, the format, the forecast period, and the geographical area of interest. These parameters influence how the forecast data is processed and transformed into a hazard object.\n", "\n", "Below, we outline these parameters and use an example for the `Tmax` – Maximum Temperature index to demonstrate the seasonal forecast functionality.\n", "\n", "To learn more about what these parameters entail and their significance, please refer to the [documentation on the CDS webpage](https://cds.climate.copernicus.eu/datasets/seasonal-original-single-levels?tab=overview).\n", "\n", "#### Overview of parameters\n", "\n", "**index_metric**: Defines the type of index to be calculated. 
There are currently **12 predefined options** available, including temperature-based indices (`Tmean` – Mean Temperature, `Tmin` – Minimum Temperature, `Tmax` – Maximum Temperature), heat stress indicators (`HIA` – Heat Index Adjusted, `HIS` – Heat Index Simplified, `HUM` – Humidex, `AT` – Apparent Temperature, `WBGT` – Wet Bulb Globe Temperature (Simple)), and extreme event indices (`HW` – Heat Wave, `TR` – Tropical Nights, `TX30` – Hot Days).\n", "\n", " - **Heat Waves (\"HW\")**: \n", " If `index_metric` is set to 'HW' for heat wave calculations, additional parameters can be specified to fine-tune the heat wave detection:\n", "\n", " - **threshold**: Temperature threshold above which days are considered part of a heat wave. Default is 27°C.\n", " - **min_duration**: Minimum number of consecutive days above the threshold required to define a heat wave event. Default is 3 days.\n", " - **max_gap**: Maximum allowable gap (in days) between two heat wave events to consider them as a single event. Default is 0 days.\n", "\n", " - **Tropical Nights (\"TR\")**: \n", " If `index_metric` is set to 'TR' for tropical nights, an additional parameter can be specified to set the threshold:\n", "\n", " - **threshold**: Nighttime temperature threshold, above which a night is considered \"tropical.\" Default is 20°C.\n", "\n", "- ⚠️ **Flexibility:** Users can define and integrate their own indices into the pipeline to extend the analysis according to their specific needs.\n", "\n", "\n", "**format**: Specifies the format of the data to be downloaded, \"grib\" or \"netcdf\". Copernicus does **NOT** recommend the netCDF format for operational workflows, since conversion to netCDF is considered experimental. [More information here](https://confluence.ecmwf.int/display/CKB/GRIB+to+netCDF+conversion+on+new+CDS+and+ADS+systems).\n", "\n", "**originating_centre**: Identifies the source of the data. 
A standard choice is \"dwd\" (German Weather Service), one of eight providers including ECMWF, UK Met Office, Météo France, CMCC, NCEP, JMA, and ECCC.\n", "\n", "**system**: Refers to a specific model or configuration used for forecasts. In this tutorial, the default value is \"21\", which corresponds to the GCFS (German Climate Forecast System) [version 2.1](https://agupubs.onlinelibrary.wiley.com/doi/full/10.1029/2020MS002101). More details can be found in the [CDS documentation](https://cds.climate.copernicus.eu/datasets/seasonal-original-single-levels?tab=documentation).\n", "\n", "**year_list**: A list of years for which data should be downloaded and processed.\n", "\n", "**initiation_month**: A list of the months in which the forecasts are initiated. Example: [\"March\", \"April\"].\n", "\n", "**forecast_period**: Specifies the months relative to the forecast's initiation month for which the data is forecasted. Example: [\"June\", \"July\", \"August\"] indicates forecasts for these months. The maximum available lead time is 7 months.\n", "\n", " - **⚠️ Important**: When an initiation month is in one year and the forecast period in the next, the system recognizes that the forecast extends beyond the initial year. Data is retrieved based on the initiation month, with lead times covering the following year. The forecast is stored under the initiation year’s directory, ensuring consistency while spanning both years.\n", "\n", "**area_selection**: This determines the geographical area for which the data should be downloaded. It can be set to:\n", "- Global coverage:\n", " - Use the predefined function bounding_box_global() to select the entire globe.\n", "- Custom geographical bounds (cardinal coordinates):\n", " - Input explicit latitude/longitude limits (in EPSG:4326). 
\n", " - *bounds = bounding_box_from_cardinal_bounds(northern=49, eastern=20, southern=40, western=10)*\n", "- Country codes (ISO alpha-3):\n", " - Provide a list of ISO 3166-1 alpha-3 country codes (e.g., \"DEU\" for Germany, \"CHE\" for Switzerland). The bounding box is constructed as the union of all selected countries. See this [Wikipedia page](https://en.wikipedia.org/wiki/ISO_3166-1_alpha-3) for the country codes.\n", " - *bounds = bounding_box_from_countries([\"CHE\", \"DEU\"])*\n", "\n", "\n", "**overwrite**: Boolean flag that, when set to True, forces the system to redownload and reprocess existing files.\n" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "# We define the above parameters for an example on Tmax\n", "index_metric = ClimateIndex.Tmax.name\n", "data_format = \"grib\" # 'grib' or 'netcdf'\n", "originating_centre = \"dwd\"\n", "system = \"21\"\n", "forecast_period = [\n", " \"December\",\n", " \"February\",\n", "] # from December to February including January\n", "year_list = [2022]\n", "initiation_month = [\"November\"]\n", "overwrite = False\n", "bounds = bounding_box_from_countries([\"URY\"])\n", "\n", "# Parameters for Heat Waves\n", "hw_threshold = 27\n", "hw_min_duration = 3\n", "hw_max_gap = 0\n", "\n", "# Parameters for Tropical Nights\n", "threshold_tr = 20\n", "\n", "# Describe the selected climate index and the associated input data\n", "forecast = SeasonalForecast(\n", " index_metric=index_metric,\n", " year_list=year_list,\n", " forecast_period=forecast_period,\n", " initiation_month=initiation_month,\n", " bounds=bounds,\n", " data_format=data_format,\n", " originating_centre=originating_centre,\n", " system=system,\n", ")" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "The variables required for your selected index will be printed below. This allows you to see which data will be accessed and helps estimate the data volume."
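To make the `area_selection` conventions concrete, the sketch below assembles a CDS-style `[north, west, south, east]` area list and shows how a union over several countries' boxes could work. The helper names and the coordinate values are illustrative assumptions, not the package's actual implementation.

```python
def cardinal_bounds(northern, eastern, southern, western):
    """Return a CDS-style [north, west, south, east] area list."""
    return [northern, western, southern, eastern]


def union_bounds(boxes):
    """Smallest [N, W, S, E] box covering all the given boxes."""
    norths, wests, souths, easts = zip(*boxes)
    return [max(norths), min(wests), min(souths), max(easts)]


# Rough, illustrative boxes for Switzerland and Germany (EPSG:4326 degrees).
che = cardinal_bounds(northern=47.8, eastern=10.5, southern=45.8, western=5.9)
deu = cardinal_bounds(northern=55.1, eastern=15.0, southern=47.3, western=5.9)
box = union_bounds([che, deu])
```

A larger union means more grid points per request, so keep the bounding box as tight as your analysis allows to reduce download volume.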
] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'Explanation for Maximum Temperature: Maximum Temperature: Tracks the highest temperature recorded over a specified period. Required variables: 2m_temperature'" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "forecast.explain_index()" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "### Download and Process Data\n", "\n", "You can now call the `forecast.download_and_process_data` method, which efficiently retrieves and organizes Copernicus forecast data. It checks for existing files to avoid redundant downloads and stores data by format (grib or netCDF), year, and month. The files are then processed for further analysis, such as calculating climate indices or creating hazard objects within CLIMADA. The key aspects of this process are:\n", "\n", "- **Data Download**: The method downloads the forecast data for the selected years, months, and regions. The data is retrieved in **grib** or **netCDF** formats, which are commonly used for storing meteorological data. If the required files already exist in the specified directories, the system will skip downloading them, as indicated by log messages such as: \n", " *\"Corresponding grib file SYSTEM_DIR/copernicus_data/seasonal_forecasts/dwd/sys21/2023/init03/valid06_08/downloaded_data/grib/TX30_boundsW4_S44_E11_N48.grib already exists.\"* \n", "\n", "- **Data Processing**: After downloading (or confirming the existence of) the files, the system converts them into daily **netCDF** files. Each file contains gridded, multi-ensemble data for daily mean, maximum, and minimum, structured by forecast step, ensemble member, latitude, and longitude. 
The log messages confirm the existence or creation of these files, for example: \n", " *\"Daily file SYSTEM_DIR/copernicus_data/seasonal_forecasts/dwd/sys21/2023/init03/valid06_08/processed_data/TX30_boundsW4_S44_E11_N48.nc already exists.\"*\n", "\n", "- **Geographic and Temporal Focus**: The files are generated for a specific time frame (e.g., June and July 2022) and a predefined geographic region, as specified by parameters such as `bounds`, `initiation_month`, and `year_list`. This ensures that only the selected data for your analysis is downloaded and processed.\n", "\n", "- **Data Completeness**: Messages like \"already exists\" indicate that data is not redundantly downloaded or processed, saving time and computing resources. However, if the data files are missing, they will be downloaded and processed as necessary." ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'downloaded_data': {'2022_init11_valid12_02': PosixPath('/Users/daraya/climada/data/copernicus_data/seasonal_forecasts/dwd/sys21/2022/init11/valid12_02/downloaded_data/grib/Tmax_boundsN-59_S-35_E-52_W-29.grib')},\n", " 'processed_data': {'2022_init11_valid12_02': PosixPath('/Users/daraya/climada/data/copernicus_data/seasonal_forecasts/dwd/sys21/2022/init11/valid12_02/processed_data/Tmax_boundsN-59_S-35_E-52_W-29.nc')}}" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Download and process data\n", "forecast.download_and_process_data()" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "From here, you can inspect the created data with xarray. This will display the structure of the dataset, including dimensions such as time (here called steps), latitude, longitude, and ensemble members, as well as coordinates, data variables such as the processed daily values of temperature at two meters (mean, max, and min), and associated metadata and attributes. 
\n", "\n", "This already processed daily data can be used as needed; or you can now also calculate a heat-related index as in the following cells. " ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<xarray.Dataset> Size: 3MB\n",
"Dimensions: (number: 50, step: 90, latitude: 7, longitude: 8)\n",
"Coordinates:\n",
" * number (number) int64 400B 0 1 2 3 4 5 6 7 ... 42 43 44 45 46 47 48 49\n",
" time datetime64[ns] 8B ...\n",
" * step (step) timedelta64[ns] 720B 30 days 09:00:00 ... 119 days 09:...\n",
" surface float64 8B ...\n",
" * latitude (latitude) float64 56B -29.95 -30.95 -31.95 ... -34.95 -35.95\n",
" * longitude (longitude) float64 64B -59.43 -58.43 -57.43 ... -53.43 -52.43\n",
" valid_time (step) datetime64[ns] 720B ...\n",
"Data variables:\n",
" t2m_mean (number, step, latitude, longitude) float32 1MB ...\n",
" t2m_max (number, step, latitude, longitude) float32 1MB ...\n",
" t2m_min (number, step, latitude, longitude) float32 1MB ...<xarray.Dataset> Size: 832B\n",
"Dimensions: (latitude: 7, longitude: 8)\n",
"Coordinates:\n",
" number int64 8B 0\n",
" time datetime64[ns] 8B ...\n",
" step timedelta64[ns] 8B 30 days 09:00:00\n",
" surface float64 8B ...\n",
" * latitude (latitude) float64 56B -29.95 -30.95 -31.95 ... -34.95 -35.95\n",
" * longitude (longitude) float64 64B -59.43 -58.43 -57.43 ... -53.43 -52.43\n",
" valid_time datetime64[ns] 8B ...\n",
"Data variables:\n",
" t2m_mean (latitude, longitude) float32 224B ...\n",
" t2m_max (latitude, longitude) float32 224B ...\n",
" t2m_min (latitude, longitude) float32 224B ...<xarray.Dataset> Size: 1MB\n",
"Dimensions: (number: 50, latitude: 7, longitude: 8, step: 90)\n",
"Coordinates:\n",
" * number (number) int64 400B 0 1 2 3 4 5 6 7 8 ... 42 43 44 45 46 47 48 49\n",
" time datetime64[ns] 8B ...\n",
" surface float64 8B ...\n",
" * latitude (latitude) float64 56B -29.95 -30.95 -31.95 ... -34.95 -35.95\n",
" * longitude (longitude) float64 64B -59.43 -58.43 -57.43 ... -53.43 -52.43\n",
" * step (step) timedelta64[ns] 720B 30 days 09:00:00 ... 119 days 09:0...\n",
"Data variables:\n",
" Tmax (number, step, latitude, longitude) float32 1MB ...<xarray.Dataset> Size: 10kB\n",
"Dimensions: (latitude: 7, longitude: 8, step: 3)\n",
"Coordinates:\n",
" time datetime64[ns] 8B ...\n",
" surface float64 8B ...\n",
" * latitude (latitude) float64 56B -29.95 -30.95 ... -34.95 -35.95\n",
" * longitude (longitude) float64 64B -59.43 -58.43 ... -53.43 -52.43\n",
" * step (step) <U7 84B '2022-12' '2023-01' '2023-02'\n",
" quantile float64 8B ...\n",
"Data variables:\n",
" ensemble_mean (step, latitude, longitude) float32 672B ...\n",
" ensemble_median (step, latitude, longitude) float32 672B ...\n",
" ensemble_max (step, latitude, longitude) float32 672B ...\n",
" ensemble_min (step, latitude, longitude) float32 672B ...\n",
" ensemble_std (step, latitude, longitude) float32 672B ...\n",
" ensemble_p5 (step, latitude, longitude) float64 1kB ...\n",
" ensemble_p25 (step, latitude, longitude) float64 1kB ...\n",
" ensemble_p50 (step, latitude, longitude) float64 1kB ...\n",
" ensemble_p75 (step, latitude, longitude) float64 1kB ...\n",
" ensemble_p95 (step, latitude, longitude) float64 1kB ...