DEMs in the OGGM workflow

The main reason for this short blog post is to document how and why we split some digital elevation models (DEMs) into multiple smaller pieces. But let us start with a small introduction to the DEM problem:

For each glacier we want to simulate in the OGGM framework we need two topographical items: The outlines of the glacier and the underlying topography. The first we get from the Randolph Glacier Inventory (RGI), a global inventory of individual glacier shapefiles. For the latter we need a suitable DEM and this is where things get more complicated. OGGM of course allows the user to provide a specific (e.g. publicly not available) DEM. A tutorial about this can be found here. But in general the OGGM workflow is such that the user does not have to choose a DEM her/himself. By default OGGM chooses between several freely available DEM sources depending on the region of the glacier.

DEM requirements

For a DEM to be automatically usable within the OGGM workflow it needs to fulfill some criteria: First and foremost the DEM must be freely available, at least for scientific purpose.
Next important point is the quality of the DEM. Data gaps are always a problem and this is especially true for steep mountainous terrain, which is unfortunately where most glaciers are located. Worse than data gaps (which can be identified and interpolated if small) are wrong values within the DEM. Wrong values are often hard to detect and such a DEM should therefor be avoided in an automatic process.
Another criteria is the resolution of the DEM. For OGGM it should be high enough to represent all important features of a small mountain glacier. But it should not be so high that the additional resolution causes performance issues on global scale runs due to intensified file handling. Our rule of thumb choice would be a resolution between 50 and 150m.
Last point but rather important when analysing the current state of glaciers is the date when the DEM was acquired. The RGI outlines are a snapshot of the glaciers state at a certain time. Most outlines in the current version RGIv6 are from around 2003. But this is not consistent through the inventory and is likely to change with future RGI versions. The DEM date and the RGI date should be as close as possible.

A list of the different DEMs which are currently available within OGGM, their problems, and our standard choices can be found here.

Data sources and acknowledgments

If possible OGGM will download the DEM directly from the original data provider. In some cases we decided to mirror the DEM on our own data server and let OGGM get it from there. This might be necessary if there is no usable API or the network performance of the original source is extremely poor.

Every OGGM user is responsible to thoroughly attribute and acknowledge the original data provider if required by their license terms. In order to support this, OGGM provides a text file called dem_source.txt in every GlacierDirectory which contains the requested acknowledgments and citations of the DEM used for this individual glacier. But we recommend to always double check the license terms with the original data source.

Downloading the DEMs

OGGM stores all automatically downloaded files (DEMs, outlines, climate data, …) in a local cache directory. This directory will be checked before any download to avoid downloading the same file twice.

To additionally minimize network usage and file handling OGGM only downloads necessary DEM tiles and not the whole DEM. This is possible for most but not all DEMs:
ArcticDEM and REMA (both provided by the Polar Geospatial Center (PGC)) are only available as single GeoTiff files (at 100m resolution) and cover the whole Arctic or Antarctic respectively. For a regional run with one of these DEMs the single GeoTiff would only have to be downloaded once. But it would be necessary to read the large file and extract the actual glacier area for every glacier within that run.
To avoid this costly file I/O we decided to cut the original 100m REMA and ArcticDEM into multiple DEM tiles and provide these to OGGM from our own data server. Both DEMs were cut in the pattern of the respective Tile Index which are provided by the PGC for the higher resolution versions of the two DEMs.

To enable others to reproduce this task we here document the few necessary Python commands as an example for REMA:

import geopandas as gpd
from subprocess import Popen
import os
from oggm import utils

# read the full REMA GeoTiff
dem_full = 'REMA_100m_dem.tif'

# read the REMA Tile Index
gdf = gpd.read_file('REMA_Tile_Index_Rel1.1.shp')

for i in range(len(gdf)):
    # separate one tile from the tile index
    tile = gdf.iloc[[i]]

    # create one directory for every tile
    name = '{}_100m_v1.1'.format(tile.iloc[0].tile)
    out_dir = os.path.join('REMA_100m_v1.1', name)
    utils.mkdir(out_dir, reset=True)

    # make and write one shapefile for every tile, necessary for gdalwarp  
    index_dir = os.path.join(out_dir, 'index')
    shp_file = os.path.join(index_dir, name + '_index.shp')
    utils.mkdir(index_dir)
    tile.to_file(shp_file)

    # name of the GeoTiff tile  
    dem_tile = os.path.join(out_dir, name + '_reg_dem.tif')

    # Use gdalwarp -cutline as shell command to cut one tile at a time
    command = 'gdalwarp -cutline {} -crop_to_cutline -of GTiff {} {}'.format(shp_file, dem_full, dem_tile)
    p = Popen(command, shell=True)
    p.wait()

For the Alaska DEM we did the same, except that we had to create the tile file ourselves. Here is the code that we used, and the tile shapefile is available in the oggm-sample-data repository.