A quick look into the mass-balance calibration procedure#

The default mass-balance model of OGGM is a very standard temperature index melt model. What’s … unusual about it, is the calibration procedure. We have explained it in quite some length in the documentation, in our paper, in the original publication that introduced it for the first time, and we have created a website to monitor its performance. The method is not perfect (by far), but it is quite powerful: I’ve often said that it is the best idea that Ben ever had, and he has a lot of good ideas.

However, experience shows that most people (including us sometimes ;) don’t understand how it works. Let’s add a new tutorial to the list, and jump on the opportunity to play around with the calibration procedure (don’t do this at home!).

Set-up#

import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
import os

import oggm
from oggm import cfg, utils, workflow, tasks, graphics
from oggm.core import massbalance, climate
cfg.initialize(logging_level='WARNING')
cfg.PATHS['working_dir'] = utils.gettempdir(dirname='OGGM-ref-mb', reset=True)
cfg.PARAMS['border'] = 10
2023-03-07 12:44:07: oggm.cfg: Reading default parameters from the OGGM `params.cfg` configuration file.
2023-03-07 12:44:07: oggm.cfg: Multiprocessing switched OFF according to the parameter file.
2023-03-07 12:44:07: oggm.cfg: Multiprocessing: using all available processors (N=2)
2023-03-07 12:44:07: oggm.cfg: PARAMS['border'] changed from `40` to `10`.

We start from two well known glaciers in the Austrian Alps, Kesselwandferner and Hintereisferner:

gdirs = workflow.init_glacier_directories(['RGI60-11.00787', 'RGI60-11.00897'], from_prepro_level=3)
2023-03-07 12:44:07: oggm.workflow: init_glacier_directories from prepro level 3 on 2 glaciers.
2023-03-07 12:44:07: oggm.workflow: Execute entity tasks [gdir_from_prepro] on 2 glaciers

The two glaciers are neighbors but have very different geometries:

f, ax = plt.subplots(figsize=(8, 8))
graphics.plot_googlemap(gdirs, ax=ax)
../_images/massbalance_calibration_8_0.png

Calibration on glaciers with data#

Like most models, they are always “perfect” where observations are available (both glaciers are reference glaciers in the WGMS database):

# Get HEF
gdir = gdirs[1]
# Get the reference mass-balance from the WGMS
ref_df = gdir.get_ref_mb_data()
# Get the calibrated mass-balance model
mbmod = massbalance.PastMassBalance(gdir)
# Compute the specific MB for this glacier
fls = gdir.read_pickle('inversion_flowlines')
ref_df['OGGM (calib)'] = mbmod.get_specific_mb(fls=fls, year=ref_df.index.values)
ref_df[['ANNUAL_BALANCE', 'OGGM (calib)']].plot();
/usr/local/pyenv/versions/3.10.10/lib/python3.10/site-packages/oggm/utils/_workflow.py:2939: UserWarning: Unpickling a shapely <2.0 geometry object. Please save the pickle again; shapely 2.1 will not have this compatibility.
  out = pickle.load(f)
../_images/massbalance_calibration_11_1.png

Perfect? Not really: the model is calibrated so that the bias over the calibration period is zero, but other uncertainties remain: the variability in our timeseries is too high (this has something to do with precipitation amounts) and we have a stronger trend in the model than observations (this is due to the changing glacier geometry, that our model doesn’t know about here: so this is more a “feature” than a bug).

What’s important for OGGM here is the bias, which is zero:

ref_df[['ANNUAL_BALANCE', 'OGGM (calib)']].mean()
ANNUAL_BALANCE   -648.223881
OGGM (calib)     -675.892979
dtype: float64

The calibration procedure did following: it found a year \(t^*\) in the past which is selected so that the glacier, with its present day geometry, would be in equilibrium with a 31-yr climate centered on \(t^*\) (see the documentation for more info). The year \(t^*\) and the associated temperature sensitivity \(\mu^*\) (units: mm K\(^{-1}\) yr\(^{-1}\)) for this glacier are:

gdir.read_json('local_mustar')['t_star'], gdir.read_json('local_mustar')['mu_star_glacierwide']
(1974, 221.28378354824062)

What about Kesselwandferner?

gdir = gdirs[0]
gdir.read_json('local_mustar')['t_star'], gdir.read_json('local_mustar')['mu_star_glacierwide']
(1987, 343.0814484116781)

These are very different values! Not surprising: the two glaciers are in the same climate (from the forcing data) but are very different in size, orientation and geometry. The resulting mass-balance values are also different:

ref_df = gdir.get_ref_mb_data()
mbmod = massbalance.PastMassBalance(gdir)
ref_df['OGGM (calib)'] = mbmod.get_specific_mb(fls=gdir.read_pickle('inversion_flowlines'), year=ref_df.index.values)
ref_df[['ANNUAL_BALANCE', 'OGGM (calib)']].mean()
/usr/local/pyenv/versions/3.10.10/lib/python3.10/site-packages/oggm/utils/_workflow.py:2939: UserWarning: Unpickling a shapely <2.0 geometry object. Please save the pickle again; shapely 2.1 will not have this compatibility.
  out = pickle.load(f)
ANNUAL_BALANCE   -167.955224
OGGM (calib)     -208.495382
dtype: float64

Kesselwandferner has a mass-balance less negative than its neighbor, probably because it is smaller in size and spans less altitude.

Calibration on glaciers without data#

Let’s trick OGGM a little. Let’s make it appear that no data is available at Kesselwandferner for calibration:

# Set an empty reference mass-balance
gdir.set_ref_mb_data(ref_df.iloc[[]])

And re-run the mass-balance calibration:

cfg.PARAMS['run_mb_calibration'] = True
climate.compute_ref_t_stars(gdirs)
workflow.execute_entity_task(tasks.local_t_star, gdirs);
workflow.execute_entity_task(tasks.mu_star_calibration, gdirs);
2023-03-07 12:44:14: oggm.cfg: PARAMS['run_mb_calibration'] changed from `False` to `True`.
2023-03-07 12:44:14: oggm.core.climate: Applying global task compute_ref_t_stars on 2 glaciers
2023-03-07 12:44:14: oggm.utils: Applying global task get_ref_mb_glaciers on 2 glaciers
2023-03-07 12:44:14: oggm.workflow: Execute entity tasks [t_star_from_refmb] on 1 glaciers
/usr/local/pyenv/versions/3.10.10/lib/python3.10/site-packages/oggm/utils/_workflow.py:2939: UserWarning: Unpickling a shapely <2.0 geometry object. Please save the pickle again; shapely 2.1 will not have this compatibility.
  out = pickle.load(f)
2023-03-07 12:44:14: oggm.workflow: Execute entity tasks [local_t_star] on 2 glaciers
/usr/local/pyenv/versions/3.10.10/lib/python3.10/site-packages/oggm/utils/_workflow.py:2939: UserWarning: Unpickling a shapely <2.0 geometry object. Please save the pickle again; shapely 2.1 will not have this compatibility.
  out = pickle.load(f)
/usr/local/pyenv/versions/3.10.10/lib/python3.10/site-packages/oggm/utils/_workflow.py:2939: UserWarning: Unpickling a shapely <2.0 geometry object. Please save the pickle again; shapely 2.1 will not have this compatibility.
  out = pickle.load(f)
2023-03-07 12:44:14: oggm.workflow: Execute entity tasks [mu_star_calibration] on 2 glaciers
/usr/local/pyenv/versions/3.10.10/lib/python3.10/site-packages/oggm/utils/_workflow.py:2939: UserWarning: Unpickling a shapely <2.0 geometry object. Please save the pickle again; shapely 2.1 will not have this compatibility.
  out = pickle.load(f)

Now, what’s the new \(\mu^*\) of Kesselwandferner?

gdir.read_json('local_mustar')['t_star'], gdir.read_json('local_mustar')['mu_star_glacierwide']
(1975, 424.74830246056376)

In the absence of other glaciers to interpolate from (usually the 10 closest), OGGM simply assigned the value of \(t^*\) from Hintereisferner to Kesselwandferner (1963). From this, \(\mu^*\) could be estimated. This “blind” mass-balance model is of course not as good as if we calibrated it using observations:

mbmod = massbalance.PastMassBalance(gdir)
ref_df['OGGM (blind)'] = mbmod.get_specific_mb(fls=gdir.read_pickle('inversion_flowlines'), year=ref_df.index.values)
ref_df[['ANNUAL_BALANCE', 'OGGM (calib)', 'OGGM (blind)']].mean()
ANNUAL_BALANCE    -167.955224
OGGM (calib)      -208.495382
OGGM (blind)     -1044.091649
dtype: float64

Quite a negative bias! This is as good as OGGM can be in the absence of observations and without any other reference glacier in vicinity (OGGM needs at least one).

What else could we do to calibrate \(\mu^*\) for this glacier? We tried for example to simply interpolate \(\mu^*\), and found that on average, this leads to worse results. Here for example:

mbmod = massbalance.PastMassBalance(gdir, mu_star=221.275)  # Apply mu* from Hintereisferner
ref_df['OGGM (μ∗ from HEF)'] = mbmod.get_specific_mb(fls=gdir.read_pickle('inversion_flowlines'), year=ref_df.index.values)
ref_df[['ANNUAL_BALANCE', 'OGGM (calib)', 'OGGM (blind)', 'OGGM (μ∗ from HEF)']].mean()
ANNUAL_BALANCE        -167.955224
OGGM (calib)          -208.495382
OGGM (blind)         -1044.091649
OGGM (μ∗ from HEF)     941.432404
dtype: float64

Interpolating \(\mu^*\) instead of \(t^*\) is quite quite worse! And we found out that this is generally the case globally:

img Benefit of spatially interpolating \(t^*\) instead of \(\mu^*\) as shown by leave-one-glacier-out cross-validation (N = 255). Left: error distribution of the computed mass-balance if determined by the interpolated \(t^*\). Right: error distribution of the mass-balance if determined by interpolation of \(\mu^*\). From Maussion et al., 2019.

If you come up with a great idea to make the calibration procedure better, please reach out! We have some ideas, but haven’t come up to try it yet. There are so many things we want to try out with OGGM!

Prepared reference t* lists#

As of OGGM v1.4, we offer various t* lists for download and use in your workflow. You will find them here:

https://cluster.klima.uni-bremen.de/~oggm/ref_mb_params

The easiest is to let OGGM download them for you with, for example:

# Reference data for the recalibration of the mass-balance
ref_tstars_url = 'https://cluster.klima.uni-bremen.de/~oggm/ref_mb_params/oggm_v1.4/RGIV62/CRU/centerlines/qc3/pcp2.5'
workflow.download_ref_tstars(base_url=ref_tstars_url)

See also: local_t_star documentation.

Bonus: additional MB data for OGGM#

This is a rare, but possible use case: you might have mass-balance data that is not yet available in the WGMS database, or reanalyzed data that you would like to use in place of the default WGMS. Doing this is quite simple: let’s assume we have some secret data for a third glacier, here the Oberer Grindelwald Glacier in the Swiss Alps (this is an example of course):

gdirs = workflow.init_glacier_directories(['RGI60-11.00787', 'RGI60-11.00897', 'RGI60-11.01270'], from_prepro_level=3, reset=True, force=True)
# Let's make sure that OGG is not a reference glacier: this line would throw an error:
# gdirs[2].get_ref_mb_data()
2023-03-07 12:44:15: oggm.workflow: init_glacier_directories from prepro level 3 on 3 glaciers.
2023-03-07 12:44:15: oggm.workflow: Execute entity tasks [gdir_from_prepro] on 3 glaciers
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[15], line 1
----> 1 gdirs = workflow.init_glacier_directories(['RGI60-11.00787', 'RGI60-11.00897', 'RGI60-11.01270'], from_prepro_level=3, reset=True, force=True)
      2 # Let's make sure that OGG is not a reference glacier: this line would throw an error:
      3 # gdirs[2].get_ref_mb_data()

File /usr/local/pyenv/versions/3.10.10/lib/python3.10/site-packages/oggm/workflow.py:533, in init_glacier_directories(rgidf, reset, force, from_prepro_level, prepro_border, prepro_rgi_version, prepro_base_url, from_tar, delete_tar)
    531     if cfg.PARAMS['dl_verify']:
    532         utils.get_dl_verify_data('cluster.klima.uni-bremen.de')
--> 533     gdirs = execute_entity_task(gdir_from_prepro, entities,
    534                                 from_prepro_level=from_prepro_level,
    535                                 prepro_border=prepro_border,
    536                                 prepro_rgi_version=prepro_rgi_version,
    537                                 base_url=prepro_base_url)
    538 else:
    539     # We can set the intersects file automatically here
    540     if (cfg.PARAMS['use_intersects'] and
    541             len(cfg.PARAMS['intersects_gdf']) == 0 and
    542             not from_tar):

File /usr/local/pyenv/versions/3.10.10/lib/python3.10/site-packages/oggm/workflow.py:191, in execute_entity_task(task, gdirs, **kwargs)
    187     if ng > 3:
    188         log.workflow('WARNING: you are trying to run an entity task on '
    189                      '%d glaciers with multiprocessing turned off. OGGM '
    190                      'will run faster with multiprocessing turned on.', ng)
--> 191     out = [pc(gdir) for gdir in gdirs]
    193 return out

File /usr/local/pyenv/versions/3.10.10/lib/python3.10/site-packages/oggm/workflow.py:191, in <listcomp>(.0)
    187     if ng > 3:
    188         log.workflow('WARNING: you are trying to run an entity task on '
    189                      '%d glaciers with multiprocessing turned off. OGGM '
    190                      'will run faster with multiprocessing turned on.', ng)
--> 191     out = [pc(gdir) for gdir in gdirs]
    193 return out

File /usr/local/pyenv/versions/3.10.10/lib/python3.10/site-packages/oggm/workflow.py:108, in _pickle_copier.__call__(self, arg)
    106 for func in self.call_func:
    107     func, kwargs = func
--> 108     res = self._call_internal(func, arg, kwargs)
    109 return res

File /usr/local/pyenv/versions/3.10.10/lib/python3.10/site-packages/oggm/workflow.py:102, in _pickle_copier._call_internal(self, call_func, gdir, kwargs)
     99     gdir, gdir_kwargs = gdir
    100     kwargs.update(gdir_kwargs)
--> 102 return call_func(gdir, **kwargs)

File /usr/local/pyenv/versions/3.10.10/lib/python3.10/site-packages/oggm/workflow.py:251, in gdir_from_prepro(entity, from_prepro_level, prepro_border, prepro_rgi_version, base_url)
    248 tar_base = utils.get_prepro_gdir(prepro_rgi_version, rid, prepro_border,
    249                                  from_prepro_level, base_url=base_url)
    250 from_tar = os.path.join(tar_base.replace('.tar', ''), rid + '.tar.gz')
--> 251 return oggm.GlacierDirectory(entity, from_tar=from_tar)

File /usr/local/pyenv/versions/3.10.10/lib/python3.10/site-packages/oggm/utils/_workflow.py:2529, in GlacierDirectory.__init__(self, rgi_entity, base_dir, reset, from_tar, delete_tar)
   2525     raise RuntimeError('RGI Version not supported: '
   2526                        '{}'.format(self.rgi_version))
   2528 # remove spurious characters and trailing blanks
-> 2529 self.name = filter_rgi_name(name)
   2531 # region
   2532 reg_names, subreg_names = parse_rgi_meta(version=self.rgi_version[0])

File /usr/local/pyenv/versions/3.10.10/lib/python3.10/site-packages/oggm/utils/_funcs.py:734, in filter_rgi_name(name)
    728 def filter_rgi_name(name):
    729     """Remove spurious characters and trailing blanks from RGI glacier name.
    730 
    731     This seems to be unnecessary with RGI V6
    732     """
--> 734     if name is None or len(name) == 0:
    735         return ''
    737     if name[-1] in ['À', 'È', 'è', '\x9c', '3', 'Ð', '°', '¾',
    738                     '\r', '\x93', '¤', '0', '`', '/', 'C', '@',
    739                     'Å', '\x06', '\x10', '^', 'å', ';']:

TypeError: object of type 'numpy.float64' has no len()

First, let’s start by adding some data to this glacier. We are just creating some fake data and giving it to the glacier directory object.

# This dataframe just needs some index and an 'ANN'
df = pd.DataFrame(index=range(1980, 2010))
# Get random but repeatable results
np.random.seed(0)
df['ANNUAL_BALANCE'] = np.random.randn(len(df.index)) * 800 - 200
df.plot();

# Give it to the object
gdir = gdirs[2]  # get OGG
gdir.set_ref_mb_data(df)

Second, we have to tell OGGM that this glacier is a reference glacier. We do this by adding it’s ID to the list:

# Make sure the default list is generated
utils.get_ref_mb_glaciers_candidates(rgi_version='6');
# Append our glacier
cfg.DATA['RGI60_ref_ids'].append(gdir.rgi_id)
# Make sure all three glaciers are now reference glaciers
assert len(utils.get_ref_mb_glaciers(gdirs)) == 3

Third, re-run the calibration!

cfg.PARAMS['run_mb_calibration'] = True
climate.compute_ref_t_stars(gdirs)
workflow.execute_entity_task(tasks.local_t_star, gdirs);
workflow.execute_entity_task(tasks.mu_star_calibration, gdirs);

Note: what just happened here? We asked OGGM to recalibrate its mass-balance model. In essence, it is as simple as computing a list of so-called “reference \(t^*\)” from which the model can interpolate from. This list is then written as CSV file in the working directory:

df_tstar = pd.read_csv(os.path.join(cfg.PATHS['working_dir'], 'ref_tstars.csv'), index_col=0)
df_tstar

This list has to be generated once, and has to be copied in each working directory you use if you want OGGM to take it into account. In the absence of this file in the working directory, OGGM will use its default pre-calibrated file, located in the online sample data repository.

After copying the file in a fresh working directory, the task compute_ref_t_stars should not be called. Only local_t_star and mu_star_calibration matter.

Let’s see how well our new glacier’s mass-balance can be reproduced:

ref_df = gdir.get_ref_mb_data()
mbmod = massbalance.PastMassBalance(gdir)
ref_df['OGGM (fake calib)'] = mbmod.get_specific_mb(fls=gdir.read_pickle('inversion_flowlines'), 
                                                    year=ref_df.index.values)
ref_df[['ANNUAL_BALANCE', 'OGGM (fake calib)']].mean()

A bias of zero! Works perfectly, right? ;-)

ref_df[['ANNUAL_BALANCE', 'OGGM (fake calib)']].plot();

Take home points#

  • We illustrated with a simplified example how to do a simple cross-validation of the mass-balance calibration of OGGM. In reality, the \(t^*\) are interpolated, not assigned like this. The results of the full cross-validation are therefore different for Kesselwandferner.

  • It is possible to play around with OGGM mass-balance data for sensitivity experiments.

  • It is also possible to add custom mass-balance data to OGGM glaciers and re-calibrate the model based on this data

  • The mass-balance model of OGGM can lead to substantial biases at the local scale. We show that on average the bias is low, but individual glaciers can have substantial biases.

  • Future improvements will have to deal with this bias, e.g. by making use of the increasingly available geodetic mass-balance measurements.

What’s next?#