A small overview of PyViz capabilities for data exploration

This notebook gives a small overview of PyViz and its capabilities for data exploration with interactive plots, and shows the difference between Matplotlib and Bokeh. Many parts are based on or copied from the official PyViz Tutorial, which is highly recommended for a more extensive overview of what PyViz can do.

PyViz Packages used for this notebook


Exploring Pandas Dataframes

If your data is in a Pandas dataframe, it’s natural to explore it using the .plot() method (based on Matplotlib). Let’s have a look at some automatic weather station data from Langenferner:

import pandas as pd
url = 'https://cluster.klima.uni-bremen.de/~oggm/tutorials/aws_data_Langenferner_UTC+2.csv'
df = pd.read_csv(url, index_col=0, parse_dates=True)
df.head()
                         TEMP         RH  SWIN  SWOUT        LWIN       LWOUT  WINDSPEED     WINDDIR    PRESSURE
2013-07-13 00:00:00  1.634333  67.595753   0.0    0.0  212.744817  303.656833   4.436833  211.533333  692.622250
2013-07-13 01:00:00  1.388667  68.150512   0.0    0.0  209.781683  302.588717   5.544000  206.166667  692.395683
2013-07-13 02:00:00  1.064500  66.853977   0.0    0.0  207.234933  300.872133   5.573167  210.750000  692.200800
2013-07-13 03:00:00  0.985167  55.827547   0.0    0.0  207.913533  295.684267   3.970167  203.250000  692.163967
2013-07-13 04:00:00  1.155333  43.371014   0.0    0.0  211.513517  292.688400   3.267000  203.366667  692.001667
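To see what index_col=0 and parse_dates=True do without downloading the file, here is a minimal sketch that reads a small in-memory excerpt. The values are copied from the first rows shown above; the three-column selection and the empty header for the first column are assumptions about the CSV layout, not taken from the real file.

```python
import io

import pandas as pd

# In-memory excerpt; the header layout is an assumption, the values come from df.head()
csv_text = """,TEMP,RH,PRESSURE
2013-07-13 00:00:00,1.634333,67.595753,692.622250
2013-07-13 01:00:00,1.388667,68.150512,692.395683
2013-07-13 02:00:00,1.064500,66.853977,692.200800
"""

# index_col=0 turns the first column into the index,
# parse_dates=True converts that index to timestamps
sample = pd.read_csv(io.StringIO(csv_text), index_col=0, parse_dates=True)

print(type(sample.index).__name__)
print(sample.dtypes)
```

Having a DatetimeIndex is what later makes time-based operations such as resampling possible.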

Just calling .plot() won’t give anything meaningful, because of the different magnitudes of the parameters:

df.plot();
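One possible way to put variables of different magnitudes on a comparable scale is to standardize each column before plotting. The sketch below uses a small hand-made frame (the name frame and its rounded values are illustrative, not the station data) and is only one option among several.

```python
import pandas as pd

# Two illustrative columns with very different magnitudes (not the real station data)
frame = pd.DataFrame(
    {
        "TEMP": [1.6, 1.4, 1.1, 1.0, 1.2],
        "PRESSURE": [692.6, 692.4, 692.2, 692.2, 692.0],
    },
    index=pd.date_range("2013-07-13", periods=5, freq=pd.Timedelta(hours=1)),
)

# Subtract each column's mean and divide by its standard deviation (z-score)
standardized = (frame - frame.mean()) / frame.std()

print(standardized.round(2))
```

Calling standardized.plot() would then draw both columns in the same range. Alternatively, .plot(subplots=True) gives every column its own axis and keeps the original units.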

Of course we can have a look at one variable only:

df.TEMP.plot();

This creates a static plot using Matplotlib. With this approach we can also explore the data further, for example by calculating the monthly mean temperature:

dfm = df.resample('ME').mean()  # month-end bins; use 'M' on pandas older than 2.2
dfm.TEMP.plot();

We can see how the temperature evolves over time, but we cannot read off the exact temperature in January, and we cannot zoom in either.
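The exact value that a static plot hides can still be looked up in the data itself. The following sketch assumes a synthetic hourly series in place of df.TEMP (the cosine curve and its date range are made up for illustration) and uses the 'MS' alias, which labels each monthly mean with the first day of the month.

```python
import numpy as np
import pandas as pd

# Synthetic hourly temperatures standing in for df.TEMP (not the real station data)
index = pd.date_range("2013-07-13", "2014-06-30 23:00", freq=pd.Timedelta(hours=1))
day_of_year = index.dayofyear.to_numpy()
temp = pd.Series(
    -5.0 + 8.0 * np.cos(2 * np.pi * (day_of_year - 200) / 365.0),
    index=index,
    name="TEMP",
)

# Monthly means, labelled with the first day of each month
monthly = temp.resample("MS").mean()

# Look up the January 2014 mean by its label
january = monthly.loc[pd.Timestamp("2014-01-01")]
print(f"Mean January temperature: {january:.2f}")
```

This answers the question numerically, but it still requires knowing in advance which value to look for. An interactive plot lets you hover over any point instead.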

Exploring Data with hvPlot and Bokeh

If we use hvPlot instead, we can create interactive plots with the same plotting API:

You might first need to install hvPlot, e.g. via conda install -c pyviz hvplot

import hvplot.pandas

df.TEMP.hvplot()