Getting started with OGGM: a real case study, step by step#

The OGGM workflow is best explained with an example. In the following, we will show how to apply the standard OGGM workflow to a list of glaciers. This example is meant to guide you through a first-time setup step-by-step. If you prefer not to install OGGM on your computer, you can always run this notebook in OGGM-Edu instead!

Set-up#

Input data folders#

If you are using your own computer: before you start, make sure that you have set up the input data configuration file to your liking.

In the course of this tutorial, we will need to download data for each glacier (a few MB at most, depending on the chosen glaciers), so make sure you have an internet connection.

cfg.initialize() and cfg.PARAMS#

An OGGM simulation script will always start with the following commands:

from oggm import cfg, utils
cfg.initialize(logging_level='WARNING')
2023-03-07 12:26:01: oggm.cfg: Reading default parameters from the OGGM `params.cfg` configuration file.
2023-03-07 12:26:01: oggm.cfg: Multiprocessing switched OFF according to the parameter file.
2023-03-07 12:26:01: oggm.cfg: Multiprocessing: using all available processors (N=2)

A call to cfg.initialize() will read the default parameter file (or any user-provided file) and make the parameters available to all other OGGM tools via the cfg.PARAMS dictionary. Here are some examples of these parameters:

cfg.PARAMS['prcp_scaling_factor'], cfg.PARAMS['ice_density'], cfg.PARAMS['continue_on_error']
(2.5, 900.0, False)

See here for the default parameter file and a description of each parameter's role and default value.

# You can try with or without multiprocessing: with two glaciers, OGGM could run on two processors
cfg.PARAMS['use_multiprocessing'] = True
2023-03-07 12:26:02: oggm.cfg: Multiprocessing switched ON after user settings.

Workflow#

In this section, we will explain the fundamental concepts of the OGGM workflow:

  • Working directories

  • Glacier directories

  • Tasks

from oggm import workflow

Working directory#

Each OGGM run needs a single folder where to store the results of the computations for all glaciers. This is called a “working directory” and needs to be specified before each run. Here we create a temporary folder for you:

cfg.PATHS['working_dir'] = utils.gettempdir(dirname='OGGM-GettingStarted', reset=True)
cfg.PATHS['working_dir']
'/tmp/OGGM/OGGM-GettingStarted'

We use a temporary directory for this example, but in practice you will set this working directory yourself (for example: /home/john/OGGM_output). The size of this directory will depend on how many glaciers you simulate!

This working directory is meant to be persistent, i.e. you can stop your processing workflow after any task, and restart from an existing working directory at a later stage.

You can create a persistent OGGM working directory at a specific path via path = utils.mkdir(path). Beware! If you use reset=True in utils.mkdir, ALL DATA in this folder will be deleted! Use it with caution!
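What utils.mkdir(path, reset=True) does can be mimicked with a small sketch (plain Python, not OGGM's actual implementation), which also shows why reset=True is dangerous:

```python
import os
import shutil
import tempfile

# Hedged sketch of what utils.mkdir(path, reset=True) does conceptually:
# (re)create the folder, wiping ALL of its existing content when reset=True.
def mkdir(path, reset=False):
    if reset and os.path.exists(path):
        shutil.rmtree(path)  # deletes everything under path!
    os.makedirs(path, exist_ok=True)
    return path

base = tempfile.mkdtemp()
wd = mkdir(os.path.join(base, 'OGGM_output'))
open(os.path.join(wd, 'result.txt'), 'w').close()  # pretend this is a run output
mkdir(wd, reset=True)  # wipes the previous results
print(os.listdir(wd))  # → []
```

In other words: the folder survives, the data inside it does not.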

Define the glaciers for the run#

rgi_ids = ['RGI60-11.01328', 'RGI60-11.00897']

You can provide any number of glacier identifiers to OGGM. In this case, we chose:

  • RGI60-11.01328: Unteraar Glacier in the Swiss Alps

  • RGI60-11.00897: Hintereisferner in the Austrian Alps

Here is a list of other glaciers you might want to try out:

  • RGI60-18.02342: Tasman Glacier in New Zealand

  • RGI60-11.00787: Kesselwandferner in the Austrian Alps

  • … or any other glacier identifier! You can find other glacier identifiers by exploring the GLIMS viewer.

For an operational run on an RGI region, you might want to download the Randolph Glacier Inventory dataset instead, and start a run from it. This case is covered in the working with the RGI tutorial.

Glacier directories#

The OGGM workflow is organized as a list of tasks that have to be applied to a list of glaciers. The vast majority of tasks are called entity tasks: they are standalone operations to be realized on one single glacier entity. These tasks are executed sequentially (one after another): they often need input generated by the previous task(s): for example, the climate calibration needs the glacier flowlines, which can be only computed after the topography data has been processed, and so on.

To handle this situation, OGGM uses a workflow based on data persistence on disk: instead of passing data as python variables from one task to another, each task will read the data from disk and then write the computation results back to the disk, making these new data available for the next task in the queue.
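This pattern can be illustrated with a toy example (plain Python, not OGGM's actual code): each "task" communicates with the next one only through files in the glacier's folder on disk.

```python
import json
import os
import tempfile

# Hypothetical sketch of the "data persistence on disk" pattern:
# each task reads its input from the glacier's directory and writes
# its own result back for the next task in the queue.

def task_a(gdir_path):
    # first task: writes its result to disk
    with open(os.path.join(gdir_path, 'a.json'), 'w') as f:
        json.dump({'value': 2}, f)

def task_b(gdir_path):
    # second task: reads task_a's output from disk, writes its own
    with open(os.path.join(gdir_path, 'a.json')) as f:
        data = json.load(f)
    with open(os.path.join(gdir_path, 'b.json'), 'w') as f:
        json.dump({'value': data['value'] * 10}, f)

gdir_path = tempfile.mkdtemp()
for task in [task_a, task_b]:  # the order matters!
    task(gdir_path)

with open(os.path.join(gdir_path, 'b.json')) as f:
    result = json.load(f)['value']
print(result)  # → 20
```

A side benefit of this design is restartability: if the process stops after task_a, task_b can still run later from the same folder.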

These glacier specific data are located in glacier directories. In the model, these directories are initialized with the following command (this can take a little while on the first call, as OGGM needs to download some data):

# Where to fetch the pre-processed directories
gdirs = workflow.init_glacier_directories(rgi_ids, from_prepro_level=3, prepro_border=80)
2023-03-07 12:26:02: oggm.workflow: init_glacier_directories from prepro level 3 on 2 glaciers.
2023-03-07 12:26:02: oggm.workflow: Execute entity tasks [gdir_from_prepro] on 2 glaciers
  • the keyword from_prepro_level indicates that we will start from pre-processed directories, i.e. data that are already pre-processed for the model users. In many cases you will want to start from level 3, 4 or 5. Here we start from level 3 and rerun some of the processing in order to demonstrate the OGGM workflow.

  • the prepro_border keyword indicates the number of DEM grid points which we’d like to add to each side of the glacier for the local map: the larger the glacier will grow, the larger the border parameter should be. The available pre-processed border values are: 10, 80, 160 (depending on the model set-ups there might be more or less options). These are the fixed map sizes we prepared for you - any other map size will require a full processing (see the alternative DEM example for a tutorial).

The init_glacier_directories task will always be the very first task to call for all your OGGM experiments. Let’s see what it gives us back:

type(gdirs), type(gdirs[0])

gdirs is a list of GlacierDirectory objects (one for each glacier). Glacier directories are used by OGGM as “file and attribute manager” for single glaciers. For example, the model now knows where to find the topography data file for this glacier:

gdir = gdirs[0]  # take Unteraar glacier
print('Path to the DEM:', gdir.get_filepath('dem'))

And we can also access some attributes of this glacier:

gdir
gdir.rgi_date  # date at which the outlines are valid

The advantage of this Glacier Directory data model is that it greatly simplifies the data transfer between tasks. The single mandatory argument of all entity tasks will always be a glacier directory. With the glacier directory, each task will find the input it needs: for example, both the glacier’s topography and outlines are needed for the next plotting function, and both are available via the gdir argument:

from oggm import graphics
graphics.plot_domain(gdir, figsize=(8, 7))

Another advantage of glacier directories is their persistence on disk: once created, they can be recovered from the same location by using init_glacier_directories again, but without keyword arguments:

# Fetch the LOCAL pre-processed directories - note that no arguments are used!
gdirs = workflow.init_glacier_directories(rgi_ids)

See the store_and_compress_glacierdirs tutorial for more information on glacier directories.

Tasks#

There are two different types of “tasks”:

Entity Tasks: Standalone operations to be realized on one single glacier entity, independently from the others. The majority of OGGM tasks are entity tasks. They are parallelisable: the same task can run on several glaciers in parallel.

Global Tasks: Tasks which need to work on several glacier entities at the same time. Model parameter calibration or the compilation of several glaciers’ output are examples of global tasks.
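Why entity tasks parallelise so well can be sketched with plain Python (a toy illustration, not OGGM's actual machinery): since each call only needs its own glacier, the same function can be mapped over the list of glaciers concurrently.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for an entity task such as tasks.glacier_masks:
# it depends only on its own glacier, nothing else.
def entity_task(rgi_id):
    return rgi_id, len(rgi_id)

rgi_ids = ['RGI60-11.01328', 'RGI60-11.00897']
with ThreadPoolExecutor() as pool:
    # map the same task over independent glaciers; results keep input order
    results = list(pool.map(entity_task, rgi_ids))
print(results)  # → [('RGI60-11.01328', 14), ('RGI60-11.00897', 14)]
```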

OGGM implements a simple mechanism to run a specific task on a list of GlacierDirectory objects:

from oggm import tasks
# run the glacier_masks task on all gdirs
workflow.execute_entity_task(tasks.glacier_masks, gdirs);

The task we just applied to our list of glaciers is glacier_masks. It wrote a new file in our glacier directory, providing raster masks of the glacier (among other things):

print('Path to the masks:', gdir.get_filepath('gridded_data'))

It is also possible to apply several tasks sequentially (i.e. one after another) on our glacier list:

list_tasks = [
         tasks.compute_centerlines,
         tasks.initialize_flowlines,
         tasks.compute_downstream_line,
         ]
for task in list_tasks:
    # The order matters!
    workflow.execute_entity_task(task, gdirs)

The function execute_entity_task can run a task on several glaciers at the same time, if the use_multiprocessing option is set to True in the configuration file.

Among other things, we computed the glacier flowlines and the glacier’s downstream line. We can now plot them:

graphics.plot_centerlines(gdir, figsize=(8, 7), use_flowlines=True, add_downstream=True)

As a result, the glacier directories now store many more files. If you are interested, you can have a look:

import os
print(os.listdir(gdir.dir))

For a short explanation of what these files are, see the glacier directory documentation. In practice, however, you will only rarely need to access these files yourself.

Other preprocessing tasks#

Let’s continue with the other preprocessing tasks:

list_tasks = [
         tasks.catchment_area,
         tasks.catchment_width_geom,
         tasks.catchment_width_correction,
         tasks.compute_downstream_bedshape
         ]
for task in list_tasks:
    # The order matters!
    workflow.execute_entity_task(task, gdirs)

We just computed the catchment areas of each flowline (the colors are arbitrary):

graphics.plot_catchment_areas(gdir, figsize=(8, 7))

Each flowline now knows what area will contribute to its surface mass-balance and ice flow. Accordingly, it is possible to compute each glacier cross-section’s width, and correct it so that the total glacier area and elevation distribution is conserved:

graphics.plot_catchment_width(gdir, corrected=True, figsize=(8, 7))
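The idea behind the area-conservation step can be sketched in a strongly simplified form (illustrative values, not OGGM's actual algorithm, which also conserves the area per elevation band): scale the geometric widths so that the flowline area matches the glacier's outline area.

```python
import numpy as np

# Simplified sketch of the width correction idea:
dx = 100.0                                               # grid spacing along the flowline (m)
widths = np.array([300.0, 500.0, 800.0, 600.0, 200.0])   # geometric widths (m)
true_area = 0.26e6                                       # glacier area from the outlines (m2)

# scale all widths by one factor so that sum(width * dx) == true_area
factor = true_area / (widths * dx).sum()
widths_corr = widths * factor

print(np.isclose((widths_corr * dx).sum(), true_area))  # → True
```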

Climate tasks#

The glacier directories we downloaded already contain the climate timeseries for each glacier (from_prepro_level=3). Let’s have a look at them:

import xarray as xr
fpath = gdir.get_filepath('climate_historical')
ds = xr.open_dataset(fpath)
# Data is in hydrological years
# -> let's just ignore the first and last calendar years
ds.temp.resample(time='AS').mean()[1:-1].plot();

This climate data is called the “baseline climate” for this glacier. It is used for the mass-balance model calibration and, at the end of this tutorial, to generate the random climate that drives a simulation. When running OGGM with GCM data, the GCM timeseries are computed as anomalies with respect to this baseline timeseries, hence the name.
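The anomaly ("delta") idea can be illustrated with a short sketch (illustrative numbers, not OGGM's code; OGGM uses the mean over a common reference period rather than the whole series):

```python
import numpy as np

# Toy illustration of the anomaly method: re-center a (biased) GCM
# series on the baseline climatology, so only its anomalies are used.
rng = np.random.default_rng(0)
baseline = 2.0 + rng.normal(0, 1, 31)   # e.g. observed temperatures, 31 years
gcm = 4.5 + rng.normal(0, 1, 100)       # GCM series with a warm bias

# subtract the GCM's own mean, add the baseline mean
gcm_corrected = gcm - gcm.mean() + baseline.mean()

print(np.isclose(gcm_corrected.mean(), baseline.mean()))  # → True
```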

Here we are using CRU, but OGGM-Shop also allows using ERA5 and CERA as baselines.

Now, let’s calibrate the mass-balance model for this glacier. The calibration procedure of OGGM is … original, but is also quite powerful. Read the doc page or the GMD paper for more details, and you can also follow the mass-balance calibration tutorial explaining some of the model internals.

The default calibration process is automated (see also local_t_star):

# Fetch the reference t* list and associated model parameters
params_url = 'https://cluster.klima.uni-bremen.de/~oggm/ref_mb_params/oggm_v1.4/RGIV62/CRU/centerlines/qc3/pcp2.5'
workflow.download_ref_tstars(base_url=params_url)

# Now calibrate
workflow.execute_entity_task(tasks.local_t_star, gdirs);
workflow.execute_entity_task(tasks.mu_star_calibration, gdirs);

Important! The calibration of the mass-balance model is automated only for certain parameter combinations of the model - any change in the mass-balance model settings (e.g. the melt threshold, the precipitation correction factor, etc.) will require a re-calibration of the model (see the mass-balance calibration tutorial for an introduction to this topic).

From there, OGGM can now compute the mass-balance for these glaciers. For example:

from oggm.core.massbalance import MultipleFlowlineMassBalance
gdir_hef = gdirs[1]
mbmod = MultipleFlowlineMassBalance(gdir_hef, use_inversion_flowlines=True)
import numpy as np
import matplotlib.pyplot as plt
years = np.arange(1902, 2017)
mb_ts = mbmod.get_specific_mb(year=years)
plt.plot(years, mb_ts); plt.ylabel('SMB (mm yr$^{-1}$)');

For the Hintereisferner (not for Unteraar, where no observational data is available), we can also compare our computed mass-balance to the measured one:

mbdf = gdir_hef.get_ref_mb_data()
mbdf['OGGM'] = mbmod.get_specific_mb(year=mbdf.index)
mbdf[['ANNUAL_BALANCE', 'OGGM']].plot(); plt.ylabel('SMB (mm yr$^{-1}$)');

This graphic is interesting because it shows an effect often observed when comparing the computed mass balance to the observed one: since (in this case) the OGGM geometry is fixed with time, the modelled specific mass-balance series are likely to have a stronger trend than the observed ones.
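This effect can be reproduced with a toy model (hypothetical numbers, not OGGM's mass-balance model): a real glacier retreats under warming and loses its lowest, most negative-mass-balance areas, which damps the trend of its specific mass balance.

```python
import numpy as np

years = np.arange(50)
z = np.linspace(2000, 3000, 101)   # elevation bands (m)
grad, ela0 = 0.007, 2400.0         # MB gradient (m w.e. per m) and initial ELA (m)

def specific_mb(zmin_per_year):
    # mean mass balance over the glacier's elevation range,
    # with the ELA rising 2 m per year (warming)
    out = []
    for t, zmin in zip(years, zmin_per_year):
        zb = z[z >= zmin]
        out.append(np.mean(grad * (zb - (ela0 + 2.0 * t))))
    return np.array(out)

mb_fixed = specific_mb(np.full(len(years), 2000.0))     # geometry frozen in time
mb_retreat = specific_mb(2000.0 + 4.0 * years)          # terminus rises with time

trend_fixed = np.polyfit(years, mb_fixed, 1)[0]
trend_retreat = np.polyfit(years, mb_retreat, 1)[0]
print(trend_fixed < trend_retreat)  # → True: fixed geometry gives the stronger trend
```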

To assess the results of the OGGM mass-balance model for all WGMS glaciers worldwide, visit the score summary for these particular model settings.

Computing the ice thickness (“inversion”)#

With the computed mass-balance and the flowlines, OGGM can now compute the ice thickness, based on the principles of mass conservation and ice dynamics.
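The principle can be sketched in a few lines (illustrative values, not OGGM's implementation): in the shallow-ice approximation, the ice flux per unit width is q = 2A/(n+2) · (ρgS)^n · h^(n+2), so once the flux q is known from mass conservation (integrating the mass balance upstream) and the surface slope S is known from the DEM, the relation can be inverted for the thickness h.

```python
import numpy as np

# Illustrative shallow-ice thickness inversion for one cross-section:
rho, g, n = 900.0, 9.81, 3    # ice density (kg m-3), gravity (m s-2), Glen's n
A = 2.4e-24                   # Glen's creep parameter (Pa-3 s-1), illustrative
slope = 0.1                   # surface slope (-)
q = 1.0e-4                    # ice flux per unit width (m2 s-1), from mass balance

# invert q = 2A/(n+2) * (rho*g*slope)^n * h^(n+2) for h
h = (q * (n + 2) / (2 * A * (rho * g * slope) ** n)) ** (1.0 / (n + 2))
print(f'{h:.0f} m')  # a plausible thickness of order 100 m
```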

list_tasks = [
         tasks.prepare_for_inversion,  # This is a preprocessing task
         tasks.mass_conservation_inversion,  # This does the actual job
         tasks.filter_inversion_output  # This smoothes the thicknesses at the tongue a little
         ]
for task in list_tasks:
    workflow.execute_entity_task(task, gdirs)

The ice thickness is computed for all sections along the flowline, and can be displayed with the help of OGGM’s graphics module:

graphics.plot_inversion(gdir, figsize=(8, 7))

The inversion is run with the default parameter settings. Note that the model is sensitive to some of these choices, most notably the creep parameter A:

cfg.PARAMS['inversion_glen_a']
a_factor = np.linspace(0.1, 10., 100)
volume = []
for f in a_factor:
    # Recompute the volume without overwriting the previous computations
    v = tasks.mass_conservation_inversion(gdir, glen_a=f * cfg.PARAMS['inversion_glen_a'], write=False)
    volume.append(v * 1e-9)
plt.plot(a_factor, volume); plt.title('Unteraar total volume');
plt.ylabel('Volume (km$^3$)'); plt.xlabel('Glen A factor (1 = default)'); 

There is no simple way to find the best A for each individual glacier. It can easily vary by a factor of 10 (or more) from one glacier to another. At the global scale, the “best” A is close to the default value (possibly between 1 and 1.5 times larger). The default parameter is a good choice as a first step, but be aware that reconstructions based on it might be very uncertain! See our ice thickness inversion tutorial for a more in-depth discussion.
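The shape of the volume curve follows from the shallow-ice flux relation: at fixed flux and slope, the thickness scales as A^(−1/(n+2)), i.e. softer ice (larger A) flows faster and needs less thickness to evacuate the same flux. A quick scaling check (illustrative, not OGGM code):

```python
import numpy as np

n = 3                                    # Glen's flow-law exponent
a_factor = np.linspace(0.1, 10.0, 100)   # multiplier on the default A
rel_thickness = a_factor ** (-1.0 / (n + 2))  # thickness relative to default

# thicker than default for stiff ice (factor < 1), thinner for soft ice
print(rel_thickness[0] > 1 > rel_thickness[-1])  # → True
```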

Simulations#

For most applications, this is where the fun starts! With climate data and an estimate of the ice thickness, we can now start transient simulations. For this tutorial, we will show how to realize idealized experiments based on the baseline climate only, but it is also possible to drive OGGM with real GCM data.

# Convert the flowlines to a "glacier" for the ice dynamics module
workflow.execute_entity_task(tasks.init_present_time_glacier, gdirs);

Let’s start a run driven by the climate of the last 31 years, shuffled randomly for 200 years. This can be seen as a “commitment” simulation, i.e. how much glaciers will change even without further climate change:

cfg.PARAMS['store_model_geometry'] = True  # add additional outputs for the maps below
workflow.execute_entity_task(tasks.run_random_climate, gdirs, nyears=200,
                             y0=2000, output_filesuffix='_2000');

The output of this simulation is stored in two separate files: a diagnostic file (which contains time series of variables such as length, volume, ELA, etc.) and a full model output file, which is larger but makes it possible to reproduce the full glacier geometry changes during the run.

In practice, the diagnostic files are often compiled for the entire list of glaciers:

ds2000 = utils.compile_run_output(gdirs, input_filesuffix='_2000')

This dataset is also stored on disk (in the working directory) as a NetCDF file for later use. Here we can access it directly:

ds2000

We opened the file with xarray, a very useful data analysis library based on pandas. For example, we can plot the volume and length evolution of both glaciers with time:

f, (ax1, ax2) = plt.subplots(1, 2, figsize=(13, 4))
ds2000.volume.plot.line(ax=ax1, hue='rgi_id');
ds2000.length.plot.line(ax=ax2, hue='rgi_id');

The full model output files can be used for plots:

f, (ax1, ax2, ax3) = plt.subplots(1, 3, figsize=(14, 6))
graphics.plot_modeloutput_map(gdir, filesuffix='_2000', modelyr=0, ax=ax1, vmax=350)
graphics.plot_modeloutput_map(gdir, filesuffix='_2000', modelyr=50, ax=ax2, vmax=350)
graphics.plot_modeloutput_map(gdir, filesuffix='_2000', modelyr=150, ax=ax3, vmax=350)
plt.tight_layout();

Sensitivity to temperature#

We now repeat our simulations with +0.5°C and -0.5°C temperature biases, which for a glacier is quite a lot!

workflow.execute_entity_task(tasks.run_random_climate, gdirs, nyears=200,
                             temperature_bias=0.5,
                             y0=2000, output_filesuffix='_p05');
workflow.execute_entity_task(tasks.run_random_climate, gdirs, nyears=200,
                             temperature_bias=-0.5,
                             y0=2000, output_filesuffix='_m05');
dsp = utils.compile_run_output(gdirs, input_filesuffix='_p05')
dsm = utils.compile_run_output(gdirs, input_filesuffix='_m05')
f, (ax1, ax2, ax3) = plt.subplots(1, 3, figsize=(16, 4))
rgi_id = 'RGI60-11.01328'
ds2000.sel(rgi_id=rgi_id).volume.plot.line(ax=ax1, hue='rgi_id', label='Commitment');
ds2000.sel(rgi_id=rgi_id).area.plot.line(ax=ax2, hue='rgi_id');
ds2000.sel(rgi_id=rgi_id).length.plot.line(ax=ax3, hue='rgi_id');
dsp.sel(rgi_id=rgi_id).volume.plot.line(ax=ax1, hue='rgi_id', label='$+$ 0.5°C');
dsp.sel(rgi_id=rgi_id).area.plot.line(ax=ax2, hue='rgi_id');
dsp.sel(rgi_id=rgi_id).length.plot.line(ax=ax3, hue='rgi_id');
dsm.sel(rgi_id=rgi_id).volume.plot.line(ax=ax1, hue='rgi_id', label='$-$ 0.5°C');
dsm.sel(rgi_id=rgi_id).area.plot.line(ax=ax2, hue='rgi_id');
dsm.sel(rgi_id=rgi_id).length.plot.line(ax=ax3, hue='rgi_id');
ax1.legend();

From this experiment, we learn that a climate 0.5°C colder than today’s (i.e. the mean climate of 1985-2015) would barely be enough to maintain the Unteraar glacier in its present-day geometry.

What’s next?#