Pastas and Metran example

This notebook shows how output from Pastas time series models can be analyzed using Metran.

[1]:

import os

import hydropandas as hpd
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import pastas as ps

import metran

ps.logger.setLevel("ERROR")

metran.show_versions()

Python version: 3.10.12 | packaged by conda-forge | (main, Jun 23 2023, 22:40:32) [GCC 12.3.0]
numpy version: 1.26.4
scipy version: 1.12.0
pandas version: 2.0.3
matplotlib version: 3.8.3
pastas version: 1.4.0
numba version: 0.59.0
lmfit version: 1.2.2

Read data

Load the observed heads from piezometers at different depths at location B21B0214. The outliers (values outside of \(5 \sigma\) (std. dev.)) are removed from the time series.

[2]:

oc = hpd.read_dino("./data", subdir=".")

[3]:

oc

[3]:

	x	y	filename	source	unit	monitoring_well	tube_nr	screen_top	screen_bottom	ground_level	tube_top	metadata_available	obs
name
B21B0214-001	198085.0	518413.0	B21B0214001_1	dino	m NAP	B21B0214	1.0	-7.3	-9.3	-0.29	0.24	True	GroundwaterObs B21B0214-001 -----metadata-----...
B21B0214-003	198085.0	518413.0	B21B0214003_1	dino	m NAP	B21B0214	3.0	-24.3	-26.3	-0.29	0.25	True	GroundwaterObs B21B0214-003 -----metadata-----...
B21B0214-002	198085.0	518413.0	B21B0214002_1	dino	m NAP	B21B0214	2.0	-15.3	-17.3	-0.29	0.28	True	GroundwaterObs B21B0214-002 -----metadata-----...
B21B0214-004	198085.0	518413.0	B21B0214004_1	dino	m NAP	B21B0214	4.0	-36.3	-38.3	-0.29	0.23	True	GroundwaterObs B21B0214-004 -----metadata-----...
B21B0214-005	198085.0	518413.0	B21B0214005_1	dino	m NAP	B21B0214	5.0	-51.3	-53.3	-0.29	0.21	True	GroundwaterObs B21B0214-005 -----metadata-----...

[4]:

oseries = {}

for o in oc.obs:
    name = o.name
    o = o["stand_m_tov_nap"].rename(o.name)

    # remove outliers outside 5*std
    mean = o.median()
    std = o.std()
    mask_outliers = (o - mean).abs() > 5 * std

    ts = o.copy()
    ts.loc[mask_outliers] = np.nan

    # store time series
    oseries[name] = ts

[5]:

# sort the names
sorted_names = list(oseries.keys())
sorted_names.sort()
sorted_names

[5]:

['B21B0214-001',
 'B21B0214-002',
 'B21B0214-003',
 'B21B0214-004',
 'B21B0214-005']

Plot the heads:

[6]:

fig, ax = plt.subplots(1, 1, figsize=(10, 4))

for name in sorted_names:
    o = oseries[name]
    ftop = oc.loc[name, "screen_top"]
    fbot = oc.loc[name, "screen_bottom"]
    lbl = f"{name} (filter: NAP{ftop:+.1f} - {fbot:+.1f} m)"
    ax.plot(o.index, o, label=lbl)

ax.set_ylabel("stijghoogte (m NAP)")
ax.legend(loc=(0, 1), ncol=2, frameon=False)
ax.set_ylim(top=-0.5)
ax.grid(True)

../_images/examples_ex02_pastas_metran_example_9_0.png

Load the precipitation and evaporation data from two nearby weather stations

[7]:

p = pd.read_csv(
    "./data/RD_338.csv", index_col=[0], parse_dates=True, usecols=["YYYYMMDD", "RD"]
)
e = pd.read_csv(
    "./data/EV24_260.csv", index_col=[0], parse_dates=True, usecols=["YYYYMMDD", "EV24"]
)

Plot precipitation and evaporation time series

[8]:

fig, ax = plt.subplots(1, 1, figsize=(10, 4))
ax.plot(p.index, p, label="precipitation")
ax.plot(e.index, e, label="evaporation")
ax.set_ylabel("m/day")
ax.legend(loc="best")

[8]:

<matplotlib.legend.Legend at 0x7fe032f56770>

../_images/examples_ex02_pastas_metran_example_13_1.png

Build time series models

The time series models attempt to simulate the heads using recharge as stress. The recharge is calculated using \(R = P - f \cdot E\), where \(R\) is recharge, \(P\) is precipitation, \(E\) is evaporation and \(f\) is factor that is optimized. The model fit results are printed to the console. The model residuals are stored for analysis with Metran.

[9]:

# Normalize the index (reset observation time to midnight (the end of the day)).
p.index = p.index.normalize()
e.index = e.index.normalize()

# set tmin/tmax
tmin = "1988-10-14"
tmax = "2005-11-28"

# store models and residuals
models = []
residuals = []

for name in sorted_names:
    # create model
    ml = ps.Model(oseries[name])
    rm = ps.RechargeModel(prec=p, evap=e)
    ml.add_stressmodel(rm)

    # solve model
    ml.solve(tmin=tmin, tmax=tmax, report=False)

    # print fit statistic
    print(name, f"EVP = {ml.stats.evp():.1f}%")

    # store model
    models.append(ml)

    # get residuals
    r = ml.residuals()
    r.name = name
    residuals.append(r)

B21B0214-001 EVP = 68.4%
B21B0214-002 EVP = 58.9%
B21B0214-003 EVP = 66.8%
B21B0214-004 EVP = 61.3%
B21B0214-005 EVP = 51.0%

Build Metran model

A Metran model is created using the residuals of the time series models. By analyzing the model residuals we can determine for example, whether there is a common pattern in the residuals, which could indicate a missing influence, or a shortcoming in the model structure. Additionally we might be able to analyze whether there are still outliers left in our time series.

[10]:

mt = metran.Metran(residuals)
mt.solve()

INFO: Number of factors according to Velicer's MAP test: 1


Fit report Cluster                  Fit Statistics
======================================================
tmin     None                       obj       3223.12
tmax     None                       nfev            7
freq     D                          AIC       3235.12
solver   ScipySolve

Parameters (6 were optimized)
======================================================
                        optimal   stderr initial  vary
B21B0214-001_sdf_alpha       10  ±10.00%      10  True
B21B0214-002_sdf_alpha       10  ±10.00%      10  True
B21B0214-003_sdf_alpha       10  ±10.00%      10  True
B21B0214-004_sdf_alpha       10  ±10.00%      10  True
B21B0214-005_sdf_alpha       10  ±10.00%      10  True
cdf1_alpha                   10  ±10.00%      10  True

Parameter correlations |rho| > 0.5
======================================================
None

Metran report Cluster          Factor Analysis
==============================================
tmin     None                nfct      1
tmax     None                fep     81.32%
freq     D

Communality
==============================================

B21B0214-001      84.36%
B21B0214-002      67.77%
B21B0214-003      88.55%
B21B0214-004      90.71%
B21B0214-005      63.38%

State parameters
==============================================
                       phi         q
B21B0214-001_sdf  0.904837  0.028355
B21B0214-002_sdf  0.904837  0.058418
B21B0214-003_sdf  0.904837  0.020764
B21B0214-004_sdf  0.904837  0.016846
B21B0214-005_sdf  0.904837  0.066383
cdf1              0.904837  0.181269

Observation parameters
==============================================
                    gamma1     scale      mean
B21B0214-001      0.918464  0.052053 -0.000691
B21B0214-002      0.823243  0.060746 -0.000529
B21B0214-003      0.940984  0.050673 -0.000204
B21B0214-004      0.952399  0.056584  0.000345
B21B0214-005      0.796109  0.070582  0.000577

State correlations |rho| > 0.5
==============================================
None

Plot the specific and common dynamic components

[11]:

axes = mt.plots.state_means()

../_images/examples_ex02_pastas_metran_example_19_0.png

Plot a simulation, including a confidence interval for B21B0214003.

[12]:

ax = mt.plots.simulation(mt.snames[2])

../_images/examples_ex02_pastas_metran_example_21_0.png