bmorph¶
bmorph is a package for streamflow bias correction. bmorph is designed to work in conjunction with the mizuRoute streamflow routing model. bmorph provides methods for performing bias corrections that are spatially consistent as well as offers methods which can account for process-dependent biases. We discuss bmorph’s methodological details on the bias correction page. For an overview of the structure of bmorph, see the package overview, below.
Installation¶
We recommend installing bmorph using mamba using the command:
mamba create -n bmorph -c conda-forge bmorph
If you want to install from source, we provide an environment in environment.yml
to manage the dependencies. You can build the environment by running:
mamba env create -f environment.yml
Then, to install bmorph run,
conda activate bmorph
python setup.py develop
python -m ipykernel install --user --name bmorph
Getting started¶
You can run through our interactive tutorial here. A static version of the tutorial can be found here.
bmorph Overview¶

core
¶
The core
directory is where the functions for performing bias correction are located.
The bmorph.py
module contains the functions for individual bias corrections.
workflows.py
contains the functions that define some helper workflows that make it easier to apply bias corrections across a stream network.
More on the specifics of how bias correction is performed can be found in the Bias Correction page.
util
¶
The util
directory contains the mizuroute_utils.py
module for organizing data exported by mizuRoute into an easily accessible form for bmorph
.
More on how data is handled can be found on the Data Overview page.
evaluation
¶
The evaluation
directory provides tools for plotting and analyzing results.
More on plotting functions and implemented statistics can be found on the Evaluation of Bias Correction page.
More on the Simple River Network tool can be found on the Simple River Network (SRN) page.
Sitemap¶
Bias Correction¶
This page describes the implementation of bmorph bias correction for streamflow.
bmorph Functionality¶

The figure above shows the full workflow and options available in bmorph. The input, in the upper left, shows that bmorph takes in routed streamflows. Then, if necessary, we perform some pre-processing steps as implemented in the bmorph.utils.mizuroute_utils.to_bmorph function. These preprocessing steps are necessary if either spatially consistent bias correction (through blending) and/or process-aware bias correction (through conditioning) are selected. Following the pre-processing steps bias correction is performed according to the selected options. If spatially consistent bias correction is performed the bias corrected local flows are then rerouted through mizuRoute.
PresRat and EDCDFm¶
The basic underlying bias correction technique that bmorph implements is the PresRat bias correction from Pierce et al. (2015), which is an extension of Equidistant quantile matching (EDCDFm) technique of Li et al. (2010). bmorph uses the amended EDCDFm to compute multiplicative changes in the quantiles of a Cumulative Distribution Function (CDF). bmorph also modifies these underlying techniques by taking a rolling window over successive bias correction intervals, which eliminates statistical artifacts at the edges of these intervals.
Conditional Quantile Mapping (CQM)¶
Conditional Quantile Mapping, (CQM), incorporates meteorologic data into the bmorph
bias correction process to condition flow time series to other hydrologically relevant information. By creating a series of CDFs based on meteorologic data, (such as minimum daily temperature), bmorph
can select a CDF that will not only correct the time series, but most closely match the meteorologic conditions simulated.

Spatially Consistent Bias correction (SCBC)¶
To address the artifacts that traditional bias correction techniques can introduce into distributed streamflow predictions we have developed a regionalization technique which blends the target distribution between reference flow sites. This makes use of the topology of the river network by selecting target distributions which are nearby and interpolating between them as a function of distance. A schematic representation of this blending is shown below.
There are two edge cases which require special handling. In the case where there are no gauge sites to select either up or down stream we choose the reference site according to some user-definable criteria.
There are a few different options for doing this: leaving the sites empty (leave_null
), using xarray’s forward_fill, or selecting based on some measure of similarity (r2
, kldiv
, kge
).
When a site has multiple upstream gauge sites as tributaries we simply choose the closest.
When we make use of this blended bias correction we do not simply bias correct total flows to produce bias corrected total flows, but rather use the ratio of the bias corrected total flow to the raw total flow to develop a correction factor which is then applied to the local (incremental) flow along each river reach in the network.
These corrected local flows are then re-routed through mizuRoute to produce a spatially-consistent bias-corrected streamflow, thus we refer to this method as SCBC.

The blend factor describes how upstream and downstream flows should be interpolated, or “blended” together. It is used in the following way,
Selecting bias correction techniques¶
Below we show how to implement and combine the the various options described above.
Independent Bias Correction: Univariate (IBC_U)¶
Univariate Independent Bias Correction (IBC_U) is considered the traditional bias correction method implemented here as described in the EDCDFm section above. This method can only be performed at sites with reference data, which is useful when gauge sites can measure flows but does not guarantee spatially consistent corrections amongst a series of gauge sites.
Workflow function : `bmorph.core.workflows.apply_bmorph`_
raw_ts = basin_met_seg.sel(seg=seg)['IRFroutedRunoff'].to_series()
train_ts = basin_met_seg.sel(seg=seg)['IRFroutedRunoff'].to_series()
obs_ts = basin_met_seg.sel(seg=seg)['up_ref_flow'].to_series()
ibc_u_flows[site], ibc_u_mults[site] = bmorph.workflows.apply_interval_bmorph(
raw_ts, train_ts, obs_ts,
apply_window, train_window, reference_window,
interval=interval, overlap=overlap)
Independent Bias Correction: Conditioned (IBC_C)¶
Similar to IBC_U, Conditioned Independent Bias Correction (IBC_C) can only apply corrections at gauge sites where there is reference flow data. IBC_C integrates meteorologic data into the bmorph
bias correction process as described in bmorph.core.bmorph.cqm. Conditioning allows hydrologic process based knowledge to be included in the bias correction process that can help to root bias corrections in meteorologic trends.
Workflow function : `bmorph.core.workflows.apply_bmorph`_
raw_ts = basin_met_seg.sel(seg=seg)['IRFroutedRunoff'].to_series()
train_ts = basin_met_seg.sel(seg=seg)['IRFroutedRunoff'].to_series()
ref_ts = basin_met_seg.sel(seg=seg)['up_ref_flow'].to_series()
cond_var = basin_met_seg.sel(seg=seg)[f'up_{condition_var}'].to_series()
ibc_c_flows[site], ibc_c_mults[site] = bmorph.workflows.apply_bmorph(
raw_ts, train_ts, ref_ts,
apply_window, train_window, reference_window,
condition_ts=cond_var,
interval=interval, overlap=overlap)
Notice that in order to use conditioning, the *_y
variables are needed to specify which meteorological time series to use in conditioning.
Spatially Consistent Bias Correction: Univariate (SCBC_U)¶
Univariate Spatially Consistent Bias Correction (SCBC_U) aims to address IBC’s inability to correct flows at non-gauge sites where reference timeseries do not exist. Spatial consistency is conserved by performing bias corrections at every river segment, or seg, and then rerouting the corrected flows through mizuRoute. Reference data for each seg that is not a gauge site is done by creating proxy reference data for each seg from upstream and downstream proxy gauge flows that can be combined, or blended, together to create what the reference flow data for that seg should look like, as described in Spatial Consistency: Reference Site Selection & CDF Blend Factor.
Workflow functions : `bmorph.core.workflows.apply_blendmorph`_, `bmorph.core.workflows.run_parallel_scbc`_
univariate_config = {
'train_window': train_window,
'apply_window': apply_window,
'reference_window': reference_window,
'interval': interval,
'overlap': overlap,
}
unconditioned_seg_totals = bmorph.workflows.run_parallel_scbc(
basin_met_seg, client, output_prefix, mizuroute_exe, univariate_config)
Spatially Consistent Bias Correction: Conditioned (SCBC_C)¶
Conditioned Spatially Consistent Bias Correction (SCBC_C) combines the meteorologic conditioning elements of IBC_C with the spatial consistency of SCBC_U. This implementation of SCBC factors in meteorologic variables given into the formulation of reference flows for each seg to be corrected to. Defined by the hydrologic response units, or hru’s, they impact, meteorologic data is mappable to each seg within the watershed topology. In IBC_C, only the data mapped to gauge sites would be used in bias correction, whereas SCBC_C can utilize meteorologic data across the watershed as it incorporates all defined segs.
Workflow functions : `bmorph.core.workflows.apply_blendmorph`_, `bmorph.core.workflows.run_parallel_scbc`_
conditonal_config = {
'train_window': train_window,
'apply_window': apply_window,
'reference_window': reference_window,
'interval': interval,
'overlap': overlap,
'condition_var': condition_var
}
conditioned_seg_totals = bmorph.workflows.run_parallel_scbc(
basin_met_seg, client, output_prefix, mizuroute_exe, conditonal_config)
Again, because we are conditioning our bias corrections, condition_var
must be included in running this script.
Citations¶
Pierce, D. W., Cayan, D. R., Mauerer, E. P., Abatzoglou J. T., & Hegewisch, K. C. (2015). Improved Bias Correction Techniques for Hydrological Simulations of Climate Change. *Journal of Hydrometeorology, 16*(6), 2421-2442. http://dx.doi.org/10.1175/JHM-D-14-0236.1
Li, H., Sheffield, J., & Wood, E. F. (2010). Bias correction of monthly precipitation and temperature fields from Intergovernmental Panel on Climate Change AR4 models using equidistant quantile matching. *Journal of Geophysical Research: Atmospheres, 115*(D10), 1-20. https://doi.org/10.1029/2009JD012882
bmorph Tutorial: Getting your first bias corrections¶
This notebook demonstrates how to setup data for and bias correct it through bmorph, it contains the same information as the tutorial page. In this notebook, we will demonstrate how to perform four variations of bias correction. For more information about the details of each method refer to the bias correction page page. The rest of the documentation for bmorph can be found here.
Independent Bias Correction: Univariate (IBC_U) : IBC_U is the traditional bias correction method. This method can only be performed at sites with reference data.
Independent Bias Correction: Conditioned (IBC_C) : IBC_C allows for correcting for specific biases that are process-dependent. This method can only be performed at sites with reference data.
Spatially Consistent Bias Correction: Univariate (SCBC_U): SCBC_U corrects local flows at each river reach in the network, and then reroutes them to aggregate, producing bias corrected flows at locations of interest where reference data does and does not exist.
Spatially Consistent Bias Correction: Conditioned (SCBC_C): SCBC_C also corrects the local flows like SCBC_U, but allows for conditioning on dependent processes.
Expectations using this Tutorial¶
Before we begin, we are expecting that anyone using this notebook is already familiar with the following:
Python 3 syntax and logic
This tutorial also makes use of a number of other open source libraries & programs which, although no familiarity is required, may be of interest: - geopandas - tqdm - dask - networkx - scipy - scikit-learn
While proficiency with each item is not required to understand the notebook, the details of these items will not be included in the tutorial itself.
Learning objectives & goals¶
By the end of this tutorial you will learn
The data requirements to perform streamflow bias corrections with bmorph
Input and output formats that bmorph expects
The meaning of parameters for algorithmic control of bias corrections
How to perform independent bias correction at locations with reference flows
How to perform spatially consistent bias corrections across a river network, as well as rerouting the corrected flows with mizuRoute
How to use the analysis, evaluation, and visualization tools built into bmorph
We expect that this tutorial will take approximately half an hour for someone familiar with Python.
Import Packages and Load Data¶
We start by importing the necessary packages for the notebook. This
tutorial mainly shows how to use bmorph.core.workflows
and
bmorph.core.mizuroute_utils
to bias correct streamflow data in the
Yakima River Basin.
%pylab inline
%load_ext autoreload
%autoreload 2
%reload_ext autoreload
import warnings
warnings.filterwarnings('ignore')
import os
import sys
import numpy as np
import xarray as xr
import pandas as pd
import geopandas as gpd
import matplotlib as mpl
import matplotlib.pyplot as plt
from tqdm.notebook import tqdm
from dask.distributed import Client, progress
# Set a bigger default plot size
mpl.rcParams['figure.figsize'] = (10, 8)
mpl.rcParams['font.size'] = 22
# Import bmorph, and mizuroute utilities
import bmorph
from bmorph.util import mizuroute_utils as mizutil
from bmorph.evaluation.simple_river_network import SimpleRiverNetwork
from bmorph.evaluation import plotting as bplot
# Set the environment directory, this is a workaround for portability
envdir = os.path.dirname(sys.executable)
# Import pyproj and set the data directory, this is a workaround for portability
import pyproj
pyproj.datadir.set_data_dir(f'{envdir}/../share/proj')
Populating the interactive namespace from numpy and matplotlib
Getting the sample data¶
The following code cell has two commands preceded by !
, which
indicates that they are shell commands. They will download the sample
data and unpackage it. The sample data can be viewed as a HydroShare
resource
here.
This cell may take a few moments.
! wget https://www.hydroshare.org/resource/fd2a347d34f145b4bfa8b6bff39c782b/data/contents/bmorph_testdata.tar.gz
! tar xvf bmorph_testdata.tar.gz
--2021-07-02 12:20:36-- https://www.hydroshare.org/resource/fd2a347d34f145b4bfa8b6bff39c782b/data/contents/bmorph_testdata.tar.gz
Resolving www.hydroshare.org (www.hydroshare.org)... 152.54.2.89
Connecting to www.hydroshare.org (www.hydroshare.org)|152.54.2.89|:443... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: /resource/fd2a347d34f145b4bfa8b6bff39c782b/data/contents/bmorph_testdata.tar.gz/ [following]
--2021-07-02 12:20:37-- https://www.hydroshare.org/resource/fd2a347d34f145b4bfa8b6bff39c782b/data/contents/bmorph_testdata.tar.gz/
Reusing existing connection to www.hydroshare.org:443.
HTTP request sent, awaiting response... 500 Internal Server Error
2021-07-02 12:20:37 ERROR 500: Internal Server Error.
yakima_workflow/
yakima_workflow/mizuroute_configs/
yakima_workflow/mizuroute_configs/.ipynb_checkpoints/
yakima_workflow/mizuroute_configs/.ipynb_checkpoints/reroute_deschutes_univariate-checkpoint.control
yakima_workflow/mizuroute_configs/.ipynb_checkpoints/reroute_deschutes_conditional-checkpoint.control
yakima_workflow/mizuroute_configs/__init__.py
yakima_workflow/input/
yakima_workflow/input/nrni_reference_flows.nc
yakima_workflow/input/yakima_raw_flows.nc
yakima_workflow/input/yakima_met.nc
yakima_workflow/output/
yakima_workflow/output/__init__.py
yakima_workflow/.ipynb_checkpoints/
yakima_workflow/topologies/
yakima_workflow/topologies/.ipynb_checkpoints/
yakima_workflow/topologies/.ipynb_checkpoints/param.nml-checkpoint.default
yakima_workflow/topologies/param.nml.default
yakima_workflow/topologies/yakima_huc12_topology.nc
yakima_workflow/topologies/yakima_huc12_topology_scaled_area.nc
yakima_workflow/gis_data/
yakima_workflow/gis_data/crcc_pointlist.txt
yakima_workflow/gis_data/yakima_hru.shp
yakima_workflow/gis_data/yakima_seg.prj
yakima_workflow/gis_data/yakima_seg.cpg
yakima_workflow/gis_data/yakima_hru.dbf
yakima_workflow/gis_data/yakima_seg.shx
yakima_workflow/gis_data/yakima_seg.shp
yakima_workflow/gis_data/yakima_seg.dbf
yakima_workflow/gis_data/yakima_hru.cpg
yakima_workflow/gis_data/yakima_hru.prj
yakima_workflow/gis_data/yakima_hru.shx
yakima_workflow/README.md
Test dataset: The Yakima River Basin¶
Before getting into how to run bmorph, let’s look at what is in the
sample data. You will note that we now have a yakima_workflow
directory downloaded in the same directory as this notebook (found by
clicking on the 1tutorial/
tab left by Binder or navigating to
this directory in the file explorer of your choice). This contains
all of the data that you need to run the tutorial. There are a few
subdirectories:
gis_data
: contains shapefiles, this is mainly used for plotting, not for analysisinput
: this is the input meteorologic data, simulated streamflow to be corrected, and the reference flow datasetmizuroute_configs
: this is an empty directory that will automatically be populated with mizuroute configuration files during the bias correction processoutput
: this is an empty directory that will be where the bias corrected flows will be written out totopologies
: this contains the stream network topologies that will be used for routing flows via mizuroute
The Yakima River Basin is a tributary of the Columbia river basin in the Pacific northwestern United States. It’s western half is situated in the Cascade mountains and receives seasonal snowpack. The eastern half is lower elevation and is semi-arid. Let’s load up the shapefiles for the sub-basins and stream network and plot it. In this discretization we have 285 sub-basins (HRU) and 143 stream segments.
Setting up some metadata¶
Next we set up the gauge site names and their respective river segment
identification numbers, or site
’s and seg
’s. This will be
used throughout to ensure the data does not get mismatched. bmorph uses
the convention:
site_to_seg = { site_0_name : site_0_seg, ..., site_n_name, site_n_seg}
Since it is convenient to be able to access this data in different
orders we also set up some other useful forms of these gauge site
mappings for later use. We will show you on the map where each of these
sites are on the stream network in the next section.
site_to_seg = {'KEE' : 4175, 'KAC' : 4171, 'EASW': 4170,
'CLE' : 4164, 'YUMW': 4162, 'BUM' : 5231,
'AMRW': 5228, 'CLFW': 5224, 'RIM' : 5240,
'NACW': 5222, 'UMTW': 4139, 'AUGW': 594,
'PARW': 588, 'YGVW': 584, 'KIOW': 581}
seg_to_site = {seg: site for site, seg in site_to_seg.items()}
ref_sites = list(site_to_seg.keys())
ref_segs = list(site_to_seg.values())
Mapping the Yakima River Basin¶
With our necessary metadata defined let’s make a couple of quick plots
orienting you to the Yakima River Basin. To do so we will read in a
network topology file, and shapefiles for the region. We will make one
plot which has the Yakima River Basin, along with stream network,
subbasins, and gauged sites labeled. We will also plot a network diagram
which displays in an abstract sense how each stream segment is
connected. For the latter we use the
SimpleRiverNetwork
that we’ve implemented in bmorph. To set up the SimpleRiverNetwork
we need the topology of the watershed (yakima_topo
). The river
network and shapefiles that we use to draw the map, and perform
simulations on is the Geospatial Fabric.
In the Geospatial Fabric, rivers and streams are broken into
segments, each with a unique identifier, as illustrated above in the
site_to_seg
definition. The locations of the gauged sites are
shown in red, while all of the un-gauged stream segments are shown in
darker grey. The sub-basins for each stream segment are shown in
lighter grey.
yakima_topo = xr.open_dataset('yakima_workflow/topologies/yakima_huc12_topology.nc').load()
yakima_hru = gpd.read_file('./yakima_workflow/gis_data/yakima_hru.shp').to_crs("EPSG:4326")
yakima_seg = gpd.read_file('./yakima_workflow/gis_data/yakima_seg.shp').to_crs("EPSG:4326")
fig, axes = plt.subplots(1, 2, figsize=(14, 9), gridspec_kw={'width_ratios': [1.5, 1]})
axes[1].invert_xaxis() # flip makes nodes line up better with map
# Plot the subbasins and stream segments
ax = yakima_hru.plot(color='gainsboro', edgecolor='white', ax=axes[0])
yakima_seg.plot(ax=ax, color='grey')
# Plot the reference flow sites
ref_lats = yakima_seg[yakima_seg['seg_id'].isin(ref_segs)]['end_lat']
ref_lons = yakima_seg[yakima_seg['seg_id'].isin(ref_segs)]['end_lon']
ref_names = [seg_to_site[s] for s in yakima_seg[yakima_seg['seg_id'].isin(ref_segs)]['seg_id']]
ax.scatter(ref_lons, ref_lats, color='crimson', zorder=100, marker='s', label='Gauged locations')
for name, lat, lon in zip(ref_names, ref_lats, ref_lons):
if name in ['AUGW', 'EASW']:
# Set labels at a slightly different position so we don't have overlaps
offset_x, offset_y = -0.16, -0.04
else:
offset_x, offset_y = 0.02, 0.02
ax.text(lon+offset_x, lat+offset_y, name, fontsize=10, color='white', weight='bold',
bbox=dict(boxstyle="round", ec='crimson', fc='crimson', ),)
ax.legend()
ax.set_xlabel('Longitude')
ax.set_ylabel('Latitude')
# Now plot the abstracted river network, with gauged sites highlighted
yakima_srn = SimpleRiverNetwork(yakima_topo)
yakima_srn.draw_network(color_measure=yakima_srn.generate_node_highlight_map(ref_segs),
cmap=mpl.cm.get_cmap('Set1_r'), ax=axes[1], node_size=60)
plt.tight_layout(pad=0)

Loading in the streamflow data and associated meteorological data¶
Now we load in the meteorological data that will be used for conditional
bias correction: daily minimum temperature (tmin
), seasonal
precipitation (prec
), and daily maximum temperature (tmax
).
In principle, any type of data can be used for conditioning, (i.e.
Snow-Water Equivalent (SWE), groundwater storage, landscape slope
angle, etc.). This data is initially arranged on the sub-basins,
rather than stream segments. We will remap these onto the stream
segments in a moment, so that they can be used in the bias correction
process.
Finally, we load the simulated flows and reference flows. bmorph is designed to bias correct streamflow simulated with mizuroute. We denote the simulated flows as the “raw” flows when they are uncorrected, and the flows that will be used to correct the raw flows as the “reference” flows. During the bias correction process bmorph will map the raw flow values to the reference flow values by matching their quantiles. In our case the reference flows are estimated no-reservoir-no-irrigation (NRNI) flows taken from the River Management Joint Operating Committee (RMJOC).
All of the datasets discussed are in the xarray
Dataset
format,
which contains the metadata associated with the original NetCDF files.
You can inspect the data simply by printing it out. For instance, here
you can see that both the reference flows and raw flows (named
IRFroutedRunoff
, for “Impulse Response Function routed runoff” from
mizuRoute) are in cubic meters per second.
# Meteorologic data
yakima_met = xr.open_dataset('yakima_workflow/input/yakima_met.nc').load()
# Remove the 17* prefix, which was used to denote the domain covers the region 17 of the Hydrologic Unit Maps
yakima_met['hru'] = (yakima_met['hru'] - 1.7e7).astype(np.int32)
# Raw streamflows
yakima_raw = xr.open_dataset('yakima_workflow/input/yakima_raw_flows.nc')[['IRFroutedRunoff', 'dlayRunoff', 'reachID']].load()
# Update some metadata
yakima_raw['seg'] = yakima_raw.isel(time=0)['reachID'].astype(np.int32)
# Reference streamflows - this contains sites from the entire Columbia river basin, but we will select out only the `ref_sites`
yakima_ref = xr.open_dataset('yakima_workflow/input/nrni_reference_flows.nc').rename({'outlet':'site'})[['seg', 'seg_id', 'reference_flow']]
# Pull out only the sites in the Yakima basin
yakima_ref = yakima_ref.sel(site=ref_sites).load()
print('Reference flow units: ', yakima_ref['reference_flow'].units)
print('Raw flow units: ', yakima_raw['IRFroutedRunoff'].units)
Reference flow units: m3/s
Raw flow units: m3/s
Convert from mizuroute
output to bmorph
format¶
mizuroute_utils
is our utility module that will handle converting
mizuroute outputs to the format that we need for bmorph
. We will use
the mizutil.to_bmorph
function to merge together all of the data we
previously loaded, and calculate some extra pieces of information to
perform spatially consistent bias corrections (SCBC). For more
information about how we perform SCBC see the SCBC page in the
documentation.
Now we pass our data in to to_bmorph
, the primary utility function
for automating bmorph
pre-processing.
yakima_met_seg = mizutil.to_bmorph(yakima_topo, yakima_raw, yakima_ref, yakima_met, fill_method='r2')
Setting up bmorph
configuration and parameters¶
Before applying bias correction we need to specify some parameters and configuration for correction. Returning to these steps can help fine tune your bias corrections to the basin you are analyzing.
The train_window
is what we will use to train the bias correction
model. This is the time range that is representative of the basin’s
expected behavior that bmorph
should mirror.
The bmorph_window
is when bmorph
should be applied to the series
for bias correction.
Lastly the reference_window
is when the reference flows should be
used to smooth the Cumulative Distribution function (CDF) of the bias
corrected flows. This is recommended to be set as equivalent to the
train_window
.
train_window = pd.date_range('1981-01-01', '1990-12-30')[[0, -1]]
reference_window = train_window
apply_window= pd.date_range('1991-01-01', '2005-12-30')[[0, -1]]
interval
is the length of bmorph
‘s application intervals,
typically a factor of years to preserve hydrologic relationships. Note
that for pandas.DateOffset
, ’year’ and ‘years’ are different and an
‘s’ should always be included here for bmorph
to run properly, even
for a single year.
overlap
describes how many days the bias correction cumulative
distribution function windows should overlap in total with each other.
overlap
is evenly distributed before and after this window. This is
used to reduce discontinuities between application periods. Typical
values are between 60 and 120 days.
The two “smoothing” parameters are used to smooth the timeseries before
the CDFs are computed and have two different uses. THe
n_smooth_short
is used in the actual calculation of the CDFs which
are used to perform the quantile mapping. Smoothing is used to ensure
that the CDFs are smooth. Setting a very low value here may cause
noisier bias corrected timeseries. Setting a very high value may cause
the bias corrections to not match extreme flows. Typical values are from
7-48 days.
n_smooth_long
on the other hand is used to preserve long-term trends
in mean flows from the raw flows. Typical values are 270 to 720 days.
Using very low values may cause bias corrections to be degraded. This
feature can be turned off by setting n_smooth_long
to None
.
condition_var
names the variable to use in conditioning, such as
maximum temperature (tmax), 90 day rolling total precipitation
(seasonal_precip), or daily minimum temperature (tmin). At this time,
only one conditioning meteorological variable can be used per bmorph
execution. In this example, tmax
and seasonal_precip
have been
commented out to select tmin
as the conditioning variable. If you
wish to change this, be sure to either change which variables are
commented out or change the value of condition_var
itself. For
now we will just use tmin
, which is the daily minimum
temperature. Our hypothesis on choosing tmin
is that it will be a
good indicator for errors in snow processes, which should provide a
good demonstration for how conditional bias correction can modify
flow timing in desirable ways.
Further algorithmic controls can be used to tune the conditional bias
correction as well. Here we use the histogram method for estimating the
joint PDF, which is provided as hist
as the method
. We also have
implemented a kernel density estimator which will be used if you set the
method
to kde
. While kde
tends to make smoother PDFs it
comes with a larger computational cost. For both methods we specify the
number of xbins
and ybins
which control how fine grained the
joint PDFs should be calculated as. Setting a very high number here can
potentially cause jumpy artifacts in the bias corrected timeseries.
# bmorph parameter values
interval = pd.DateOffset(years=5)
overlap = 90
n_smooth_long = 365
n_smooth_short = 21
# Select from the various available meteorologic fields for conditioning
#condition_var = 'tmax'
#condition_var = 'seasonal_precip'
condition_var = 'tmin'
Here we name some configuration parameters for bmorph
’s
conditional and univariate bias correction methods, respectively.
output_prefix
will be used to write and load files according to the
basin’s name, make certain to update this with the actual name of the
basin you are analyzing so you can track where different files are
written.
conditional_config = {
'data_path': './yakima_workflow',
'output_prefix': "yakima",
'raw_train_window': train_window,
'ref_train_window': reference_window,
'apply_window': apply_window,
'interval': interval,
'overlap': overlap,
'n_smooth_long': n_smooth_long,
'n_smooth_short': n_smooth_short,
'condition_var': condition_var,
'method': 'hist',
'xbins': 100,
'ybins': 100,
}
univariate_config = {
'data_path': './yakima_workflow',
'output_prefix': "yakima",
'raw_train_window': train_window,
'ref_train_window': reference_window,
'apply_window': apply_window,
'interval': interval,
'overlap': overlap,
'n_smooth_long': n_smooth_long,
'n_smooth_short': n_smooth_short,
}
You made it! Now we can actually bias correct with bmorph
!
First off we run the Independent Bias Corrections, which are completely contained in the cell below.
Here we run through each of the gauge sites and correct them individually. Since independent bias correction can only be performed at locations with reference data, corrections are only performed at the gauge sites here.
Independent bias correction¶
ibc_u_flows = {}
ibc_u_mults = {}
ibc_c_flows = {}
ibc_c_mults = {}
cond_vars = {}
raw_flows = {}
ref_flows = {}
for site, seg in tqdm(site_to_seg.items()):
raw_ts = yakima_met_seg.sel(seg=seg)['IRFroutedRunoff'].to_series()
train_ts = yakima_met_seg.sel(seg=seg)['IRFroutedRunoff'].to_series()
obs_ts = yakima_met_seg.sel(seg=seg)['up_ref_flow'].to_series()
cond_var = yakima_met_seg.sel(seg=seg)[f'up_{condition_var}'].to_series()
ref_flows[site] = obs_ts
raw_flows[site] = raw_ts
cond_vars[site] = cond_var
## IBC_U (Independent Bias Correction: Univariate)
ibc_u_flows[site], ibc_u_mults[site] = bmorph.workflows.apply_bmorph(
raw_ts, train_ts, obs_ts, **univariate_config)
## IBC_C (Independent Bias Correction: Conditioned)
ibc_c_flows[site], ibc_c_mults[site] = bmorph.workflows.apply_bmorph(
raw_ts, train_ts, obs_ts, condition_ts=cond_var, **conditional_config)
0%| | 0/15 [00:00<?, ?it/s]
Spatially consistent bias correction¶
Here we specify where the mizuroute
executable is installed on your
system.
mizuroute_exe = f'{envdir}/route_runoff.exe'
Now we use run_parallel_scbc
to do the rest. In the first cell we
will run the spatially-consistent bias correction without any
conditioning. The second cell will run the spatially-consistent bias
correction with conditioning. This produced bias corrected flows at all
143 stream segments in the Yakima River Basin. Finally, we select out
the corrected streamflows for both cases (with and without conditioning)
to only contain the gauged sites. Selecting out only the gauged
locations allows us to compare the spatially-consistent methods with the
independent bias corrections. Finally we combine all the data into a
single xarray Dataset
to make analysis easier.
# SCBC without conditioning
unconditioned_seg_totals = bmorph.workflows.apply_scbc(yakima_met_seg, mizuroute_exe, univariate_config)
0%| | 0/143 [00:00<?, ?it/s]
# SCBC with conditioning
conditioned_seg_totals = bmorph.workflows.apply_scbc(yakima_met_seg, mizuroute_exe, conditional_config)
0%| | 0/143 [00:00<?, ?it/s]
# Here we select out our rerouted gauge site modeled flows.
unconditioned_site_totals = {}
conditioned_site_totals = {}
for site, seg in tqdm(site_to_seg.items()):
unconditioned_site_totals[site] = unconditioned_seg_totals['IRFroutedRunoff'].sel(seg=seg).to_series()
conditioned_site_totals[site] = conditioned_seg_totals['IRFroutedRunoff'].sel(seg=seg).to_series()
0%| | 0/15 [00:00<?, ?it/s]
Merging together the results¶
# Merge everything together
yakima_analysis = xr.Dataset(coords={'site': list(site_to_seg.keys()), 'time': unconditioned_seg_totals['time']})
yakima_analysis['scbc_c'] = bmorph.workflows.bmorph_to_dataarray(conditioned_site_totals, 'scbc_c')
yakima_analysis['scbc_u'] = bmorph.workflows.bmorph_to_dataarray(unconditioned_site_totals, 'scbc_u')
yakima_analysis['ibc_u'] = bmorph.workflows.bmorph_to_dataarray(ibc_u_flows, 'ibc_u')
yakima_analysis['ibc_c'] = bmorph.workflows.bmorph_to_dataarray(ibc_c_flows, 'ibc_c')
yakima_analysis['raw'] = bmorph.workflows.bmorph_to_dataarray(raw_flows, 'raw')
yakima_analysis['ref'] = bmorph.workflows.bmorph_to_dataarray(ref_flows, 'ref')
yakima_analysis.to_netcdf(f'./yakima_workflow/output/{univariate_config["output_prefix"]}_data_processed.nc')
# And also output it as some CSV files
yakima_analysis['scbc_c'].to_pandas().to_csv(f'./yakima_workflow/output/{univariate_config["output_prefix"]}_data_processed_scbc_c.csv')
yakima_analysis['scbc_u'].to_pandas().to_csv(f'./yakima_workflow/output/{univariate_config["output_prefix"]}_data_processed_scbc_u.csv')
yakima_analysis['ibc_u'].to_pandas().to_csv(f'./yakima_workflow/output/{univariate_config["output_prefix"]}_data_processed_ibc_u.csv')
yakima_analysis['ibc_c'].to_pandas().to_csv(f'./yakima_workflow/output/{univariate_config["output_prefix"]}_data_processed_ibc_u.csv')
yakima_analysis['raw'].to_pandas().to_csv(f'./yakima_workflow/output/{univariate_config["output_prefix"]}_data_processed_raw.csv')
yakima_analysis['ref'].to_pandas().to_csv(f'./yakima_workflow/output/{univariate_config["output_prefix"]}_data_processed_ref.csv')
Now let’s take a look at our results¶
If you look closely, the following plots are the same ones included in
Plotting! Because the plotting functions
expect the variable seg
, we will need to rename site
to seg
for them to properly run.
yakima_ds = xr.open_dataset(f'yakima_workflow/output/{univariate_config["output_prefix"]}_data_processed.nc')
yakima_ds = yakima_ds.rename({'site':'seg'})
Let’s pick a few sites and colors to plot for consistency. To simplify
our plots, we will only focus on scbc_c
in the dataset we just
created. The methods do allow for multiple methods to be compared at
once however, so we will still need to store the singular scbc_c
in
a list.
select_sites = ['KIOW','YUMW','BUM']
select_sites_2 = ['KIOW','CLFW','BUM','UMTW']
bcs = ['scbc_c', 'scbc_u', 'ibc_c', 'ibc_u']
colors = ['grey', 'black', 'red', 'orange', 'purple', 'blue']
Time Series¶
Here we plot the mean weekly flows for some of the sites in Yakima River Basin. You can change or add sites above, but we will start with a small number of sites to make the plots more tractable. In the following function call you
As mentioned, these averages are computed on weekly intervals to
simplify the figure, but can also be plotted on daily or monthly
intervals for more or less granularity. You can also change the
reduce_func
to calculate any other statistic over the dataset (you
might try np.median
or np.var
for instance). Don’t forget to
change the statistic_label
for other measures!
bplot.plot_reduced_flows(
flow_dataset=yakima_ds,
plot_sites=select_sites_2,
interval='week',
reduce_func=np.mean,
statistic_label='Mean',
raw_var='raw', raw_name="Uncorrected",
ref_var='ref', ref_name="Reference",
bc_vars=bcs, bc_names=[bc.upper() for bc in bcs],
plot_colors=colors
)
(<Figure size 1440x864 with 4 Axes>, <AxesSubplot:title={'center':'UMTW'}>)

From the plot above we can see that the conditional corrections (x_C
methods) have more accurate flow timings, particularly during the
falling limb of the hydrograph. This hints that our hypothesis on
correcting on daily minimum temperature would provide a good proxy for
correcting snowmelt biases. We will explore this a little bit more
later.
We also see that generally the SCBC_x
and IBC_x
methods are
fairly similar in the mean, with an exception at CLFW. This indicates
that the spatially consistent bias correction produces useful bias
corrections. The advantage of the SCBC method is that we produce bias
corrections on every river reach, as well as produce bias corrected
incremental flows which are consistent across the network.
Scatter¶
This compares how absolute error changes through each bias correction with Q being stream discharge. 1 to 1 and -1 to 1 lines are plotted for reference, as points plotted vertically between the lines demonstrates a reduction in absolute error while points plotted horizontally between the lines demonstrates an increase in absolute error for each flow time.
bplot.compare_correction_scatter(
flow_dataset= yakima_ds,
plot_sites = select_sites,
raw_var = 'raw',
ref_var = 'ref',
bc_vars = bcs,
bc_names = [bc.upper() for bc in bcs],
plot_colors = list(colors[2:]),
pos_cone_guide = True,
neg_cone_guide = True,
symmetry = False,
title = '',
fontsize_legend = 120,
alpha = 0.3
)

Probabilitiy Distribtutions¶
Since probability distributions are used to predict extreme flow events
and are what bmorph
directly corrects, looking at them will give us
greater insight to the changes we made.
bplot.compare_mean_grouped_CPD(
flow_dataset= yakima_ds,
plot_sites = select_sites,
grouper_func = bplot.calc_water_year,
figsize = (60,40),
raw_var = 'raw', raw_name = 'Uncorrected',
ref_var = 'ref', ref_name = 'Reference',
bc_vars = bcs, bc_names = [bc.upper() for bc in bcs],
plot_colors = colors,
linestyles = 2 * ['-','-','-'],
markers = ['o', 'X', 'o', 'o', 'o', 'o'],
fontsize_legend = 90,
legend_bbox_to_anchor = (1.9,1.0)
);

This function is also capable of subsetting data by month should you
want to compare only January flows for example. Because bmorph
makes
changes based on flow distributions, this plot is the closest to
directly analyzing how the different methods correct flows.
Simple River Network¶
Finally, we can plot information of the SCBC across the simple river
network. Let’s look at the difference in the average percent difference
for both SCBC_U
and SCBC_C
. From the timeseries plots created
earlier you might have noticed that the conditional bias corrections
produced lower flows in the spring months. We will start by looking only
at those months. You might try changing the season if you’re interested.
season = 'MAM' # Choose from DJF, MAM, JJA, SON
scbc_c = conditioned_seg_totals['IRFroutedRunoff']
scbc_u = unconditioned_seg_totals['IRFroutedRunoff']
raw = yakima_met_seg['IRFroutedRunoff']
scbc_c_percent_diff = 100 * ((scbc_c-raw)/raw).groupby(scbc_c['time'].dt.season).mean().sel(season=season)
scbc_u_percent_diff = 100 * ((scbc_u-raw)/raw).groupby(scbc_u['time'].dt.season).mean().sel(season=season)
mainstream_map = yakima_srn.generate_mainstream_map()
scbc_u_percent_diff = pd.Series(data=scbc_u_percent_diff.to_pandas().values, index=mainstream_map.index)
scbc_c_percent_diff = pd.Series(data=scbc_c_percent_diff.to_pandas().values, index=mainstream_map.index)
fig, axes = plt.subplots(1, 2, figsize=(14,10))
yakima_srn.draw_network(color_measure=scbc_u_percent_diff, cmap=mpl.cm.get_cmap('coolwarm_r'), node_size=40,
with_cbar=True, cbar_labelsize=20, ax=axes[0], cbar_title='')
axes[0].set_title('SCBC_U')
yakima_srn.draw_network(color_measure=scbc_c_percent_diff, cmap=mpl.cm.get_cmap('coolwarm_r'), node_size=40,
with_cbar=True, cbar_labelsize=20, ax=axes[1], cbar_title='Mean percent difference from raw flows(%)')
axes[1].set_title('SCBC_C')
Text(0.5, 1.0, 'SCBC_C')

From the plot above we can see that the main differences between the two methods was in modifying the headwater flows, which are at higher elevations and receive more precipitation. This aligns with our hypothesis that the daily minimum temperature would provide a good proxy for erros in snow processes.
Moving forward¶
In this tutorial you have learned how to set up, perform bias
corrections, and analyze them with bmorph. While this tutorial is meant
to cover the essentials there are quite a few diversions/alternatives
that you could try out before leaving. If you’d like to mess around a
bit before moving on. For instance: - What happens if you conditionally
bias correct on a different variable? Try seasonal_precip
, or even
implement a bias correction conditional on the month if you’re feeling
adventurous! - How do the smoothing parameters affect the bias corrected
flows? Try a wide range of n_smooth_short
, or try setting
n_smooth_long
to None
to turn off the correction of the mean
trend. - Try removing half of the gauged sites to see how it affects the
spatially-consistent bias correction. You can do this by commenting out
(or deleting) some of the entries in site_to_seg
up at the top.
Data Overview¶
Directory Setup¶
The following directories and names are required to properly execute bmorph
:
input contains uncorrected and reference flows and meteorological data. mizuroute_configs is where the configuration file for mizuRoute will be written. notebooks contains the python notebooks that will execute bmorph
. output is where the bmorph
bias corrected flows will be written to. topologies contains topographical data for the watershed to connect river segments.
Variable Naming Conventions¶
Configuration Utilities¶
to mizuRoute¶
bmorph
is designed to bias correct simulated streamflow as modeled by mizuRoute. bmorph.util.mizuroute_utils.write_mizuroute_config automates writing a valid mizuroute configuration file for the purposes of bmorph
.
Running mizuRoute
during bias correction is only necessary for spatially consistent methods were local flows need to be routed. As a result, writing the configuration file is automated through only the bmorph.core.workflows.run_parallel_scbc. The configuration script receives the name of the region, the type of bias correction, and the time window from run_parallel_scbc
, writing the rest of the configuration file assuming the Directory Setup described above.
to bmorph¶
bmorph.utils.mizuroute_utils.to_bmorph handles formatting the output of mizuroute
for bmorph
. While configuring streamflows can be performed without this function, this helps to speed up the whole workflow with a number of customizable options. As detailed in the tutorial, bmorph.utils.mizuroute_utils.to_bmorph
is the primary function for handling streamflow formatting:
from bmorph.util import mizuroute_utils as mizutil
basin_met_seg = mizutil.to_bmorph(
topo = basin_topo,
routed = watershed_raw.copy(),
reference = basin_ref,
met_hru = watershed_met,
route_var = "IRFroutedRunoff",
fill_method = 'r2',
).ffil(dim='seg')
Where topo
is the topology file for the basin, routed
are the uncorrected streamflows, reference
are the reference streamflows from gauge sites, met_hru
are the meteorologic variables used in conditioning, route_var
is the name of the uncorrected flows in routed
, and fill_method
describes spatial consistency determining as described in Spatial Consistency.
Input Specifications¶
Input data will be need identical time indices as pandas.Series without null flow values. So long as the size of the pandas.Series
are the same across all flow data, the magnitude of the length should not impact bias correction. Further information on non-flow parameters can be found in Implementation.
Output Specifications¶
bmorph
outputs a pandas.Series
time series with flows as values indexed by time entires provided by given data. Total length of the output is the number of flow values.
The tutorial stores each of these outputs in a dictionary with their site/seg being their corresponding keys. bmoprh.workflows.bmorph_to_datarray converts such a dictionary into an xarray.DataArray with coordinates site
and time
, corresponding the the dictionary keys and the time of the pandas.Series
that they access. From there, uncorrected and reference flows can be combined with the corrected flows into a singular xarray.Dataset and saved into a netCDF
file if desired, storing it in the output
directory if following the tutorial.
Evaluation of Bias Correction¶
Below are descriptions of statistics implemented in bmorph for evaluating bias correction performance. Let P be predicted values, such as the corrected flows, and O be the observed values, such as reference flows.
Mean Bias Error (MBE)¶
Mean Bias Error is useful for preserving directionality of bias, yet may hide bias if both positive and negative bias exist in a dataset. MBE is therefore useful to determine how well biases balance out if you are only interested in the net or cumulative behavior of a system. This method is recommended against if looking to describe local changes without fine discretization to minimize biases canceling each other out.
Root Mean Square Error (RMSE)¶
Root Mean Square Error preserves magnitudes over directionality, unlike MBE.
Percent Bias (PB)¶
Percent Bias preserves direction like MBE, but aims to describe relative error instead of absolute error. PB is often used when analyzing performance across sites with different magnitudes. Because direction is preserved, the issue of positive and negative biases canceling out arises again here like in MBE.
Kullback-Leibler Divergence (KL Divergence)¶
Kullback-Leibler Divergence, or relative entropy, is used to describe how similar predicted and observation distributions are. Taken from Information Theory, KL Divergence describes the error in using one probability distribution in place of another, turning out to be a strong statistic to analyze how bmorph
corrects probability distributions of stream flows. A KL Divergence value of 0 symbolizes a perfect match between the two probability distributions, or no error in assuming the one distribution in place of the other.
Kling-Gupta Efficiency (KGE)¶
The Kling-Gupta Efficiency compares predicted flows to observed flows by combining linear correlations, flow variability, and bias (Knoben et. al. 2019). A KGE value of 1 represents predicted flows matching observed flows perfectly.
Plotting¶
The Evaluation
package of bmorph comes with a number of plotting tools to analyze bias correction performance.
Whether you want to compare corrected to uncorrected values, contrast different bias correction parameters, or simply compile your results into a tidy plot across multiple sites, there are plenty of plotting functions to choose from.
You can find plotting functions here.
The following plots are created from the results produced by the tutorial, where yakima_ds
is yakima_data_processed.nc
from the sample data before using any of the code provided below, make certain to import plotting
like so:
from bmorph.evaluation import plotting
Scatter¶
plotting.compare_correction_scatter(
flow_dataset= yakima_ds,
plot_sites = ['KIOW','YUMW','BUM'],
raw_var = 'raw',
ref_var = 'ref',
bc_vars = ['scbc_c'],
bc_names = ['SCBC_C'],
plot_colors = list(colors[-1]),
pos_cone_guide = True,
neg_cone_guide = True,
symmetry = False,
title = '',
fontsize_legend = 120,
alpha = 0.3
)

Scatter plots are most useful for comparing absolute error before and after bias correction. The above plot is produced from bmorph.evaluation.plotting.compare_correction_scatter to compare how absolute error changes with SCBC_C bias correction with Q being stream discharge. 1 to 1 and -1 to 1 lines are plotted for reference, as points plotted vertically between the lines demonstrates a reduction in absolute error while points plotted horizontally between the lines demonstrates an increase in absolute error for each flow time.
Time Series¶
plotting.plot_reduced_flows(
flow_dataset= yakima_ds,
plot_sites = ['KIOW','YUMW','BUM','KEE'],
interval = 'week',
raw_var = 'raw', raw_name = "Uncorrected",
ref_var = 'ref', ref_name = "Reference",
bc_vars = ['scbc_c'], bc_names = ['SCBC_C'],
plot_colors = ['grey', 'black', 'red']
);

The above plot is produced from bmorph.evaluation.plotting.plot_reduced_flows to compare a statistical representation of the flows at each site, (Mean in this case), for raw, reference, and bias corrected flows according to SCBC_C. Here, averages are computed on weekly intervals to simplify the figure, but can also be plotted on daily or monthly intervals for more or less granularity. Comparing this with median flows can describe how much the mean is impacted by extreme flows.
Probability Distributions¶
plotting.compare_mean_grouped_CPD(
flow_dataset= yakima_ds,
plot_sites = ['KIOW','YUMW','BUM'],
grouper_func = plotting.calc_water_year,
figsize = (60,40),
raw_var = 'raw', raw_name = 'Uncorrected',
ref_var = 'ref', ref_name = 'Reference',
bc_vars = ['scbc_c'], bc_names = ['SCBC_C'],
plot_colors = ['grey', 'black', 'red'],
linestyles = ['-','-','-'],
markers = ['o', 'X', 'o'],
fontsize_legend = 90,
legend_bbox_to_anchor = (1.9,1.0)
);

The above plot is produced from bmorph.evaluation.plotting.compare_mean_grouped_CPD to compare cumulative percentile distributions of mean annual flow at each site for raw, reference, and bias corrected flows according to SCBC_C. This function is also capable of subsetting data by month should you want to compare only January flows for example. Because bmorph
makes changes based on flow distributions, this plot is the closest to directly analyzing how the different methods correct flows.
Box & Whisker¶
plotting.kl_divergence_annual_compare(
flow_dataset= yakima_ds,
sites = ['KIOW','YUMW','BUM'],
fontsize_legend = 60, title = '',
raw_var = 'raw', raw_name = 'Uncorrected',
ref_var = 'ref', ref_name = 'Reference',
bc_vars = ['scbc_c'], bc_names = ['SCBC_C'],
plot_colors = ['grey','red']
);

The above plot is produced from bmorph.evaluation.plotting.kl_divergence_annual_compare to compare KL Divergence with respect to reference flows for raw and SCBC_C. Being able to view KL Divergence for different scenarios side-by-side helps to provide a better understanding of how well probability distributions are being fitted across the entire time provided.
Citations¶
Knoben, W. J. M., Freer, J. E., & Woods, R. A. (2019). Technical note: Inherent benchmark or not? Comparing Nash-Sutcliffe and Kling-Gupta efficiency scores. Hydrology and Earth System Sciences, 23, 4323-4331. https://doi.org/10.5194/hess-23-4323-2019
Simple River Network (SRN)¶
Overview¶
The Simple River Network, or SRN, is a graphical, pseudo-physical diagnostic tool used to visualize watershed models. Utilizing NetworkX’s nodal network structure, SRNs represent each river segment, or seg, as a singular SegNode and connects them according to the watershed’s topology. Each SRN is color-codable to assigned data values, such as percent bias, so you can visualize where issues may appear in the watershed during bmorph
bias correction to more easily understand spatial patterns of bias correction in the network.

SRN SegNode’s contain identifying information that allow the network to be partitioned according to Pfaffstetter Codes (Verdin & Verdin 1999, Arge et. al. 2006). Pfaffstetter enconding not only allows the networks to be partitioned, but also to be “rolled up”, effectively reducing the granularity of the network to simplify large watersheds. Data can also be subsetted and split into new SRN’s for simple manipulation.
SRN does not aim to supplant geographically accurate drawings of watershed networks. Instead it aims to provide a quicker, intermediate tool that allows for easy identification of spatial patterns within the network without having to configure spatial data.
Construction¶
from bmorph.evaluation import simple_river_network as srn
srn_basin = srn.SimpleRiverNetwork(
topo=basin_topo, pfaf_seed='', outlet_index=0, max_level_pfaf=42)
All we need to setup the SRN is the topology of the basin, basin_topo
, as an xarray.Dataset
. If you wish to include the Pfaffstetter digits of the larger watershed that contains the basin, you can do so with pfaf_seed
. While the constructor will assume that the outlet of the basin is the first index of the topology file provided, you can specify a different index to build the SRN off of with outlet_index
. Lastly, because the SRN is constructed recursively, a maximum number of Pfaffstetter digits is specifiable in max_level_pfaf
, which defaults as 42. Note that the larger the basin SRN, the longer construction may take.
Plotting on the SRN¶
Plotting the SRN is fully describe in bmorph.evaluation.simple_river_network.SimpleRiverNetwork.draw_network, so let’s cover some of the basics.
Plotting data on the SRN is done through draw_network
’s color_measure
that requires a pandas.Series
with and index that contains the indices of basin_topo
and corresponding values to be plotted on a linear colorbar. Several functions automate this process for highlighting different sections of the SRN, such as bmorph.evaluation.simple_river_network.SimpleRiverNetwork.generate_mainstream_map to plot the mainstream, bmorph.evaluation.simple_river_network.SimpleRiverNetwork.generate_pfaf_color_map to color code Pfaffstetter basins, and bmorph.evaluation.simple_river_network.SimpleRiverNetwork.generate_node_highlight_map to highlight specific river segments within the drawing.
mainstream_map = srn_basin.generate_mainstream_map()
fig, ax = plt.subplots(figsize=(30,40))
srn_yak.draw_network(color_measure=mainstream_map, cmap=mpl.cm.get_cmap('cividis'), ax=ax)
ax.invert_xaxis() # inverting the axis can be applied to increase resemblance with real watershed
For an example of how to construct your own color_measure
, we can look at how the tutorial plots percent difference across the SRN:
scbc_c = conditioned_seg_totals['IRFroutedRunoff']
raw = yakima_met_seg['IRFroutedRunoff']
p_diff = ((scbc_c-raw)/raw).mean(dim='time')*100
percent_diff = pd.Series(data=p_diff.to_pandas().values,index=mainstream_map.index)
fig, ax = plt.subplots(figsize=(30,40))
srn_yak.draw_network(color_measure=percent_diff, cmap=mpl.cm.get_cmap('coolwarm_r'),
with_cbar=True, cbar_labelsize=40, ax=ax, cbar_title='Percent Difference (%)')
The SRN is compatible with subplotting, but may require a large subplot space to spread out.
Citations¶
Arge, L., Danner, A., Haverkort, H., & Zeh, N. (2006). I/O-Efficient Hierarchial Watershed Decomposition og Grid Terrain Models. In A. Riedl, W. Kainz, G.A. Elmes (Eds.), Progress in Spatial Data Handling (pp. 825-844). Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-35589-8_51
Verdin, K.L., & Verdin, J. P. (1999). A topological system for delineation and codification of the Earth’s river basins. Elsevier Journal of Hydrology, 218, 1-12.
For Developers¶
This file contains helpful tips & tricks found in the process of developing this project.
Converting notebooks to restructured text¶
jupyter nbconvert mynotebook.ipynb --to rst
Any images in the notebook will be saved as png files within a newly created mynotebook_files
and automatically referenced within the rst
file.
For more information, check out here
Helper functions from bmorph
¶
log10_1p
¶
Similar to numpy’s log1p, bmorph.evaluation.plotting.log10_1p has been added to address wanting to perform log10 computations on a dataset that contains zeros. It effectively adds 1 to the data, element-wise, and then takes the log10. This is useful in scientific plots where a log10 scale is desired yet zeros reside in the dataset.
determine_row_col
¶
Tired of having to constantly reformat you subplots whenever you want to tack on one more plot or scratch off something you didn’t think you wanted? Well bmorph.plotting.evaluation.determine_row_col automates that process for you by calculating the tightest possible square/rectangular dimensions for your subplots. There may be some extra subplots leftover (and therefore we recommend turning off axis past the number you wish to plot), there will be at least enough subplots to fit all that you ask for.
Creating Documentation¶
Documentation for this project was done with Sphinx’s reStructuredText. Compiling of the documentation into an HTMl format was performed by Read the Docs. Thanks to through software, documentation was made easily updatable through GitHub version control without needing to develop a website from scratch in HTML itself.
The tutorial is made runnable by binder while data is stored on Hydroshare for simple online access.
API Reference¶
This page provides an auto-generated summary of the bmorph API. For more details and examples, refer to the relevant chapters in the main part of the documentation.
Core¶
Workflows¶
- bmorph.core.workflows.apply_blendmorph(raw_upstream_ts, raw_downstream_ts, train_upstream_ts, train_downstream_ts, ref_upstream_ts, ref_downstream_ts, apply_window, raw_train_window, ref_train_window, interval, overlap, blend_factor, raw_upstream_y=None, raw_downstream_y=None, train_upstream_y=None, train_downstream_y=None, ref_upstream_y=None, ref_downstream_y=None, n_smooth_long=None, n_smooth_short=5, bw=3, xbins=200, ybins=10, rtol=1e-06, atol=1e-08, method='hist', train_cdf_min=0.0001, **kwargs)[source]¶
Bias correction performed by blending bmorphed flows on user defined intervals.
Blendmorph is used to perform spatially consistent bias correction, this function does so on a user-defined interval. This is done by performing bmorph bias correction for each site’s timeseries according to upstream and downstream gauge sites (or proxies) where true flows are known. The upstream and downstream corrected timeseries are then multiplied by fractional weights, blend_factor, that sum to 1 between them so the corrected flows can be combined, or “blended,” into one, representative corrected flow series for the site. It is thereby important to specify upstream and downstream values so bias corrections are performed with values that most closely represent each site being corrected.
- Parameters
raw_upstream_ts (pandas.Series) – Raw flow timeseries corresponding to the upstream flows.
raw_downstream_ts (pandas.Series) – Raw flow timerseries corresponding to the downstream flows.
train_upstream_ts (pandas.Series) – Flow timeseries to train the bias correction model with for the upstream flows.
train_downstream_ts (pandas.Series) – Flow timeseries to train the bias correction model with for the downstream flows.
ref_upstream_ts (pandas.Series) – Observed/reference flow timeseries corresponding to the upstream flows.
ref_downstream_ts (pandas.Series) – Observed/reference flow timeseries corresponding to the downstream flows.
raw_train_window (pandas.date_range) – Date range to train the bias correction model.
apply_window (pandas.date_range) – Date range to apply bmorph onto flow timeseries.
ref_train_window (pandas.date_range) – Date range to smooth elements in ‘raw_ts’ and ‘bmorph_ts’.
interval (pandas.DateOffset) – Difference between bmorph application intervals.
overlap (int) – Total overlap in number of days the CDF windows have with each other, distributed evenly before and after the application window.
blend_factor (numpy.array) – An array determining how upstream and downstream bmorphing is proportioned. This is determined by the fill_method used in mizuroute_utils. The blend_factor entries are the proportion of upstream multiplers and totals added with 1-blend_factor of downstream multipliers and totals.
n_smooth_long (int, optional) – Number of elements that will be smoothed in raw_ts and bmorph_ts. The nsmooth value in this case is typically much larger than the one used for the bmorph function itself. For example, 365 days.
n_smooth_short (int) – Number of elements that will be smoothed when determining CDFs used for the bmorph function itself.
raw_upstream_y (pandas.Series, optional) – Raw time series of the second time series variable for conditioning corresponding to upstream flows.
raw_downstream_y (pandas.Series, optional) – Raw time series of the second time series variable for conditioning corresponding to downstream flows.
train_upstream_y (pandas.Series, optional) – Training second time series variable for conditioning correpsonding to downstream flows.
train_downstream_y (pandas.Series, optional) – Training second time series variable for conditioning correpsonding to upstream flows.
ref_upstream_y (pandas.Series, optional) – Target second time series variable for conditioning corresponding to upstream flows.
ref_downstream_y (pandas.Series, optional) – Target second time series variable for conditioning corresponding to downtream flows.
bw (int, optional) – Bandwidth for KernelDensity. This should only be used if method=’kde’.
xbins (int, optional) – Bins for the first time series. This should only be used if method=’hist’.
ybins (int, optional) – Bins for the second time series. This should only be used if method=’hist’.
rtol (float, optional) – The desired relatie tolerance of the result for KernelDensity. This should only be used if method=’kde’.
atol (float, optional) – The desired absolute tolerance of the result for KernelDensity. This should only be used if method=’kde’.
method (str, optional) – Method to use for conditioning. Currently ‘hist’ using hist2D and ‘kde’ using kde2D are the only supported methods.
**kwargs – Additional keyword arguments. Mainly implemented for cross-compatibility with other methods so that a unified configuration can be used
- Returns
bc_totals (pandas.Series) – Returns a time series of length of an interval in the bmoprh window with bmorphed values.
bc_multipliers (pandas.Series) – Returns a time series of equal length to bc_totals used to scale the raw flow values into the bmorphed values returned in bc_totals.
- bmorph.core.workflows.apply_bmorph(raw_ts, train_ts, ref_ts, apply_window, raw_train_window, ref_train_window, condition_ts=None, raw_y=None, train_y=None, ref_y=None, interval=<DateOffset: years=1>, overlap=60, n_smooth_long=None, n_smooth_short=5, bw=3, xbins=200, ybins=10, rtol=1e-06, atol=1e-08, method='hist', train_cdf_min=0.0001, **kwargs)[source]¶
Bias correction is performed by bmorph on user-defined intervals.
- Parameters
raw_ts (pandas.Series) – Raw flow timeseries.
train_ts (pandas.Series) – Flow timeseries to train the bias correction model with.
ref_ts (pandas.Series) – Observed/reference flow timeseries.
condition_ts (pandas.Series) – A timeseries with a variable to condition on. This will be used in place of raw_y, train_y, and ref_y. It is mainly added as a convenience over specifying the same timeseries for each of those variables.
apply_window (pandas.date_range) – Date range to apply bmorph onto flow timeseries.
raw_train_window (pandas.date_range) – Date range to train the bias correction model.
ref_train_window (pandas.date_range) – Date range to smooth elements in ‘raw_ts’ and ‘bmorph_ts’.
raw_y (pandas.Series, optional) – Raw time series of the second time series variable for conditioning.
train_y (pandas.Series, optional) – Training second time series.
ref_y (pandas.Series, optional) – Target second time series.
interval (pandas.DateOffset) – Difference between bmorph application intervals.
overlap (int) – Total number of days overlap CDF windows have with each other, distributed evenly before and after the application window.
n_smooth_long (int, optional) – Number of elements that will be smoothed in raw_ts and bmorph_ts. The nsmooth value in this case is typically much larger than the one used for the bmorph function itself. For example, 365 days.
n_smooth_short (int, optional) – Number of elements that will be smoothed when determining CDFs used for the bmorph function itself.
bw (int, optional) – Bandwidth for KernelDensity. This should only be used if method=’kde’.
xbins (int, optional) – Bins for the first time series. This should only be used if method=’hist’.
ybins (int, optional) – Bins for the second time series. This should only be used if method=’hist’.
rtol (float, optional) – The desired relatie tolerance of the result for KernelDensity. This should only be used if method=’kde’.
atol (float, optional) – The desired absolute tolerance of the result for KernelDensity. This should only be used if method=’kde’.
method (str) – Method to use for conditioning. Currently ‘hist’ using hist2D and ‘kde’ using kde2D are the only supported methods.
**kwargs – Additional keyword arguments. Mainly implemented for cross-compatibility with other methods so that a unified configuration can be used
- Returns
bmorph_corr_ts (pandas.Series) – Returns a time series of length of an interval in the bmoprh window with bmorphed values.
bmorph_multipliers (pandas.Series) – Returns a time series of equal length to bc_totals used to scale the raw flow values into the bmorphed values returned in bc_totals.
- bmorph.core.workflows.apply_scbc(ds, mizuroute_exe, bmorph_config, client=None, save_mults=False)[source]¶
Applies Spatially Consistent Bias Correction (SCBC) by bias correcting local flows and re-routing them through mizuroute. This method can be run in parallel by providing a dask client.
- Parameters
ds (xr.Dataset) – An xarray dataset containing time and seg dimensions and variables to be bias corrected. This will mostly likely come from the provided preprocessing utility, mizuroute_utils.to_bmorph
mizuroute_exe (str) – The path to the mizuroute executable
bmorph_config (dict) – The configuration for the bias correction. See the documentation on input specifications and selecting bias correction techniques for descriptions of the options and their choices.
client (dask.Client (optional)) – A client object to manage parallel computation.
save_mults (boolean (optional)) – Whether to save multipliers from bmorph for diagnosis. If True, multipliers are saved in the same directory as local flows. Defaults as False to not save multipliers.
- Returns
region_totals – The rerouted, total, bias corrected flows for the region
- Return type
xr.Dataset
BMORPH¶
- bmorph: modify a time series by removing elements of persistent differences
(aka bias correction)
Persistent differences are inferred by comparing a ‘ref’ sample with a ‘training’ sample. These differences are then used to correct a ‘raw’ sample that is presumed to have the same persistent differences as the ‘training’ sample. The resulting ‘bmorph’ sample should then be consistent with the ‘ref’ sample.
- bmorph.core.bmorph.bmorph(raw_ts, train_ts, ref_ts, raw_apply_window, raw_train_window, ref_train_window, raw_cdf_window, raw_y=None, ref_y=None, train_y=None, nsmooth=12, bw=3, xbins=200, ybins=10, rtol=1e-07, atol=0, method='hist', smooth_multipliers=True, train_cdf_min=0.0001)[source]¶
Morph raw_ts based on differences between ref_ts and train_ts
bmorph is an adaptation of the PresRat bias correction procedure from Pierce et al. (2015; http://dx.doi.org/10.1175/JHM-D-14-0236.1), which is itself an extension of the Equidistant quantile matching (EDCDFm) technique of Li et al. (2010; http://dx.doi.org/10.1029/94JD00483). The method as implemented here uses a multiplicative change in the quantiles of a CDF, followed by a simple correction to preserve changes in the long-term mean. No further frequency-based corrections are applied.
The method differs from PresRat in that it is not applied for fixed periods (but uses a moving window) to prevent discontinuities in the corrected time series and it does not apply a frequency-based correction.
The method also allows changes to be made through an adapted version of the EDCDFm technique or through the multiDimensional ConDitional EquiDistant CDF matching function if a second timeseries variable is passed.
- Parameters
raw_ts (pandas.Series) – Raw time series that will be bmorphed
raw_cdf_window (slice) – Slice used to determine the CDF for raw_ts
raw_bmorph_window (slice) – Slice of raw_ts that will be bmorphed
ref_ts (pandas.Series) – Target time series. This is the time series with ref values that overlaps with train_ts and is used to calculated ref_cdf
train_ts (pandas.Series) – Training time series. This time series is generated by the same process as raw_ts but overlaps with ref_ts. It is used to calculate train_cdf
training_window (slice) – Slice used to subset ref_ts and train_ts when the mapping between them is created
nsmooth (int) – Number of elements that will be smoothed when determining CDFs
raw_y (pandas.Series) – Raw time series of the second time series variable for cqm
ref_y (pandas.Series) – Target second time series
train_y (pandas.Series) – Training second time series
bw (int) – bandwidth for KernelDensity
xbins (int) – Bins for the flow time series
ybins (int) – Bins for the second time series
train_cdf_min (float (optional)) – Minimum percentile allowed for train cdf. Defaults as 1e-4 to help handle data spikes in corrections caused by multipliers being too large from the ratio between reference and training flows being large.
- Returns
bmorph_ts – Returns a time series of length raw_bmorph_window with bmorphed values
- Return type
pandas.Series
- bmorph.core.bmorph.bmorph_correct(raw_ts, bmorph_ts, correction_window, ref_mean, train_mean, nsmooth)[source]¶
Correct bmorphed values to preserve the ratio of change
Apply a correction to bmorphed values to preserve the mean change over a correction_window. This is similar to teh correction applied in the original PresRat algorithm; Pierce et al. 2015; http://dx.doi.org/10.1175/JHM-D-14-0236.1), except that we use a rolling mean to determine the correction to avoid discontinuities on the boundaries.
- Parameters
raw_ts (pandas.series) – Series of raw values that have not been bmorphed
bmorph_ts (Bmorphed version of raw_ts) – Series of bmorphed values
correction_window (slice) – Slice of raw_ts and bmorph_ts over which the correction is applied
ref_mean (float) – Mean of target time series (ref_ts) for the base period
train_mean (float) – Mean of training time series (train_ts) for the base period
nsmooth (int) – Number of elements that will be smoothed in raw_ts and bmorph_ts. The nsmooth value in this case is typically much larger than the one used for the bmorph function itself. For example, 365 days.
- Returns
bmorph_corrected_ts – Corrected series of length correction_window.
- Return type
pandas.Series
- bmorph.core.bmorph.cqm(raw_x: pandas.core.series.Series, train_x: pandas.core.series.Series, ref_x: pandas.core.series.Series, raw_y: pandas.core.series.Series, train_y: pandas.core.series.Series, ref_y: Optional[pandas.core.series.Series] = None, method='hist', xbins=200, ybins=10, bw=3, rtol=1e-07, atol=0, nsmooth=5, train_cdf_min=0.0001) pandas.core.series.Series [source]¶
Conditional Quantile Mapping
- Multidimensional conditional equidistant CDF matching function:
- ilde{x_{mp}} = x_{mp} + F^{-1}_{oc}(F_{mp}(x_{mp}|y_{mp})|y_{oc})
F^{-1}_{mc}(F_{mp}(x_{mp}|y_{mp})|y_{mc})
- bmorph.core.bmorph.edcdfm(raw_x, raw_cdf, train_cdf, ref_cdf, train_cdf_min=0.0001)[source]¶
Calculate multipliers using an adapted version of the EDCDFm technique
This routine implements part of the PresRat bias correction method from Pierce et al. (2015; http://dx.doi.org/10.1175/JHM-D-14-0236.1), which is itself an extension of the Equidistant quantile matching (EDCDFm) technique of Li et al. (2010; http://dx.doi.org/10.1029/94JD00483). The part that is implemented here is the amended form of EDCDFm that determines multiplicative changes in the quantiles of a CDF.
In particular, if the value raw_x falls at quantile u_t (in raw_cdf), then the bias-corrected value is the value in ref_cdf at u_t (ref_x) multiplied by the model-predicted change at u_t evaluated as a ratio (i.e., model future (or raw_x) / model historical (or ref_x)). Thus, the bias-corrected value is raw_x multiplied by ref_x/train_x. Here we only return the multiplier ref_x/train_x. This method preserves the model-predicted median (not mean) change evaluated multiplicatively. Additional corrections are required to preserve the mean change. Inclusion of these additional corrections constitutes the PresRat method.
- Parameters
raw_x (pandas.Series) – Series of raw values that will be used to determine the quantile u_t
raw_cdf (pandas.Series) – Series of raw values that represents the CDF that is used to determine the non-parametric quantile of raw_x
train_cdf (pandas.Series) – Series of training values that represents the CDF based on the same process as raw_cdf, but overlapping in time with ref_cdf
ref_cdf (pandas.Series) – Series of ref values that represents the ref CDF and that overlaps in time with train_cdf
- Returns
multiplier – Multipliers for raw_x. The pandas.Series has the same index as raw_x
- Return type
pandas.Series
- bmorph.core.bmorph.hist2D(x, y, xbins, ybins, **kwargs)[source]¶
Create a 2 dimensional pdf vias numpy histogram2d
Local Flows¶
- bmorph.core.local_flows.estimate_local_flow(ref_total_flow: pandas.core.series.Series, sim_total_flow: pandas.core.series.Series, sim_local_flow: pandas.core.series.Series, how: str = 'flow_fraction', method_kwargs: Dict[str, Any] = {}) pandas.core.series.Series [source]¶
Estimate the local flow for a single site.
- Parameters
ref_total_flow – Reference total flow to calculate the reference_local_flow from
sim_total_flow – Simulated total flow
sim_local_flow – Simulated reference flow
how – How to estimate the local reference flow. Available options are
flow_fraction
method_kwargs – Additional arguments to pass to the estimator.
- Returns
- Return type
the estimated local reference flow
- bmorph.core.local_flows.find_local_segment(lats, lons, target_latlon, n_return=10, metric='euclidean', gridsearch=False) Dict[str, numpy.ndarray] [source]¶
Finds the closest coordinates to a given target. Can return multiple coordinates, in ascending order from closest to furthest.
- Parameters
lats – Latitudes to search through
lons – Longitudes to search through
target_latlon – Tuple of (lat, lon) that is being searched for
n_return – Number of closest coordinates to return
metric – Distance metric. Can be any valid metric from
scipy.spatial.distance.cdist
gridsearch – Whether to create a meshgrid from the given
lats
andlons
- Returns
coords
: Coordinates ofn_return
nearest coordinates as (lat, lon)distances
: Distances according tometric
fromtarget_latlon
indices
: Indices fromlats
andlons
to then_return
nearest- Return type
dictionary containing
- bmorph.core.local_flows.flow_fraction_multiplier(total_flow: pandas.core.series.Series, local_flow: pandas.core.series.Series, nsmooth: int = 30) pandas.core.series.Series [source]¶
Calculate a the ratio of local flow to total flow from timeseries after applying a rolling mean.
- Parameters
total_flow – Total accumulated flow at the given location
local_flow – Portion of flow that is directly from the sub-basin (excludes upstream contributions)
nsmooth – Number of timesteps to use
Utilities¶
mizuRoute Utilities¶
- bmorph.util.mizuroute_utils.calculate_blend_vars(routed: xarray.core.dataset.Dataset, topology: xarray.core.dataset.Dataset, reference: xarray.core.dataset.Dataset, gauge_sites=None, route_var='IRFroutedRunoff', fill_method='kldiv', min_kge=- 0.41)[source]¶
Calculates a number of variables used in blendmorph and map_var_to_seg.
- Parameters
routed (xr.Dataset) – The dataset that will be modified and returned ready for map_var_to_seg.
topology (xr.Dataset) – Contains the network topology with a “seg” dimension that identifies reaches, matching the routed dataset.
reference (xr.Dataset) – Contains reaches used for reference with dimension “site” and coordinate “seg”.
gauge_sites (list, optional) – Contains the gauge site names from the reference dataset to be used that are automatically pulled from reference if None are given.
route_var (str) – Variable name of flows used for fill_method purposes within routed. This is defaulted as ‘IRFroutedRunoff’.
fill_method (str) – While finding some upstream/downstream reference segs may be simple, (segs with ‘is_gauge’ = True are their own reference segs, others may be easy to find looking directly up or downstream), some river networks may have multiple options to select gauge sites and may fail to have upstream/downstream reference segs designated. ‘fill_method’ specifies how segs should be assigned upstream/downstream reference segs for bias correction if they are missed walking upstream or downstream.
- Currently supported methods:
- ‘leave_null’
nothing is done to fill missing reference segs, np.nan values are replaced with a -1 seg designation and that’s it
- ‘forward_fill’
xarray’s ffill method is used to fill in any np.nan values
- ‘r2’
reference segs are selected based on which reference site that seg’s flows has the greatest r2 value with
- ‘kldiv’ (default)
reference segs are selected based on which reference site that seg’s flows has the smallest KL Divergence value with
- ‘kge’
reference segs are selected based on which reference site that seg’s flows has the greatest KGE value with
min_kge (float) – If not None, all upstream/downstream reference seg selections will be filtered according to the min_kge criteria, where seg selections that have a kge with the current seg that is less that min_kge will be set to -1 and determined unsuitable for bias correction. This is defaulted as -0.41.
- Returns
routed – with the following added: ‘is_headwaters’ ‘is_gauge’ ‘down_seg’ ‘distance_to_up_gauge’ ‘distance_to_down_gauge’ ‘cdf_blend_factor’ ‘up_seg’ ‘up_ref_seg’ ‘down_ref_seg’
- Return type
xr.Dataset
- bmorph.util.mizuroute_utils.calculate_cdf_blend_factor(routed: xarray.core.dataset.Dataset, gauge_reference: xarray.core.dataset.Dataset, gauge_sites=None, fill_method='kldiv', min_kge=- 0.41)[source]¶
Calculates the cumulative distribution function blend factor based on distance to a seg’s nearest up gauge site with respect to the total distance between the two closest guage sites to the seg.
- Parameters
routed (xr.Dataset) – Contains flow timeseries data.
gauge_reference (xr.Dataset) – Contains reference flow timeseries data for the same watershed as the routed dataset.
gauge_sites (list, optional) – If None, gauge_sites will be taken as all those listed in gauge_reference.
fill_method (str) – See map_ref_sites for full description of how fill_method works.
Because each fill_method selects reference segs differently, calculate_blend_vars needs to know how they were selected to create blend factors. Note that ‘leave_null’ is not supported for this method because there is no filling for this method. Currently supported:
- ‘forward_fill’
- cdf_blend_factor = distance_to_upstream /
(distance_to_upstream + distance_to_downstream)
- ‘kldiv’
cdf_blend_factor = kldiv_upstream / (kldiv_upstream + kldiv_downstream)
- ‘r2’
cdf_blend_factor = r2_upstream / (r2_upstream + r2_downstream)
- Returns
routed – The original routed dataset updated with ‘cdf_blend_factors’ used to combine upstream and downstream relative bias corrections. Each fill_method will also add or use upstream and downstream statistical measures calculated in map_ref_sites.
- Return type
xr.Dataset
- bmorph.util.mizuroute_utils.find_max_kge(ds, curr_seg_flow)[source]¶
Searches through ds to find which seg has the larges Kling-Gupta Efficiency (KGE) value with respect to curr_seg_flow. If no seg is found, max_kge = -np.inf and max_kge_ref_seg = -1.
- Parameters
ds (xr.Dataset) – Contains the variable ‘reference_flow’ to compare curr_seg_flow against and the coordinate ‘seg’.
curr_seg_flow (int) – A numpy array containing flow values that KGE is to be maximized with respect to.
- Returns
max_kge (float) – Maximum KGE value found.
max_kge_ref_seg – River segment designation corresponding to max_kge.
- bmorph.util.mizuroute_utils.find_max_r2(ds, curr_seg_flow)[source]¶
Searches through ds to find which seg has the greatest r2 value with respect to curr_seg_flow. If no seg is found, max_r2 = 0 and max_r2_ref_seg = -1.
- Parameters
ds (xr.Dataset) – Contains the variable ‘reference_flow’ to compare curr_seg_flow against and the coordinate ‘seg’.
curr_seg_flow (np.array) – A numpy array containing flow values that r2 is to be maximized with respect to.
- Returns
max_r2 (float) – Magnitude of the maximum R squared value found.
max_r2_ref_seg (int) – River segement designation corresponding to the max_r2.
- bmorph.util.mizuroute_utils.find_min_kldiv(ds, curr_seg_flow)[source]¶
Searches through ds to find which seg has the smallest Kullback-Leibler Divergence value with respect to curr_seg_flow. If no seg is found, min_kldiv = -1 and min_kldiv_ref_seg = -1. https://en.wikipedia.org/wiki/Kullback%E2%80%93Leibler_divergence
- Parameters
ds (xr.Dataset) – contains the variable ‘reference_flow’ to compare curr_seg_flow against and the coordinate ‘seg’.
curr_seg_flow (np.array) – a numpy array containing flow values that KL Divergence is to be maximized with respect to.
- Returns
min_kldiv (float) – Magnitude of the minimum KL Divergence found.
min_kldiv_ref_seg (int) – River segment designation corresponding to min_kldiv.
- bmorph.util.mizuroute_utils.find_up(ds, seg, sel_method='first', sel_var='IRFroutedRunoff')[source]¶
Finds the segment directly upstream of seg given seg is not a headwater segment, (in which case np.nan is returned).
- Parameters
ds (xr.Dataset) – Dataset containing river segments as ‘seg’, headwater segments by ‘is_headwaters’, and what is downstream of each seg in ‘down_seg’.
seg (int) – River segment designation to search from.
sel_method (str) – Method to use when selecting among multiple upstream segments.
sel_var (str) – Variable used when comparing segments amonth multiple upstream segments. Can be ‘forward_fill’, ‘r2’, or ‘kge’.
- Returns
up_seg – Upstream segment designation found, or np.nan if seg is a headwater segement.
- Return type
int
- bmorph.util.mizuroute_utils.kling_gupta_efficiency(sim, obs)[source]¶
Calculates the Kling-Gupta Efficiency (KGE) between two flow arrays. https://agrimetsoft.com/calculators/Kling-Gupta%20efficiency
- Parameters
sim (array-like) – Simulated flow array.
obs (array-like) – Observed flow array.
- Returns
kge – Kling-Gupta Efficiency calculated between the two arrays.
- Return type
float
- bmorph.util.mizuroute_utils.map_headwater_sites(routed: xarray.core.dataset.Dataset)[source]¶
Boolean identifies whether a river segement is a headwater with ‘is_headwater’.
- Parameters
routed (xr.Dataset) – Contains watershed river segments designated as the dimension ‘seg’. River segments are connected by referencing immediate downstream segments as ‘down_seg’ for each ‘seg’.
- Returns
routed – The original routed dataset updated with which sites are headwaters.
- Return type
xr.Dataset
- bmorph.util.mizuroute_utils.map_met_hru_to_seg(met_hru, topo)[source]¶
Maps meterological data from hru to seg.
- Parameters
met_hru (xr.Dataset) – A dataset of meteorological data to be mapped onto the stream segments to facilitate conditioning. All variables in this dataset will automatically be mapped onto the stream segments and returned.
topo (xr.Dataset) – Topology dataset for running mizuRoute. We expect this to have
seg
andhru
dimensions.
- Returns
met_seg – A dataset of meterological data mapped onto the stream segments to facilitate conditioning.
- Return type
xr.Dataset
- bmorph.util.mizuroute_utils.map_ref_sites(routed: xarray.core.dataset.Dataset, gauge_reference: xarray.core.dataset.Dataset, gauge_sites=None, route_var='IRFroutedRunoff', fill_method='r2', min_kge=- 0.41)[source]¶
Assigns segs within routed boolean ‘is_gauge’ “identifiers” and what each seg’s upstream and downstream reference seg designations are.
- Parameters
routed (xr.Dataset) – Contains the input flow timeseries data.
gauge_reference (xr.Dataset) – Contains reference flow timeseries data for the same watershed as the routed dataset.
gauge_sites (list, optional) – If None, gauge_sites will be taken as all those listed in gauge_reference.
route_var (str) – Variable name of flows used for fill_method purposes within routed. This is defaulted as ‘IRFroutedRunoff’.
fill_method (str) – While finding some upstream/downstream reference segs may be simple, (segs with ‘is_gauge’ = True are their own reference segs, others may be easy to find looking directly up or downstream), some river networks may have multiple options to select gauge sites and may fail to have upstream/downstream reference segs designated. ‘fill_method’ specifies how segs should be assigned upstream/downstream reference segs for bias correction if they are missed walking upstream or downstream.
- Currently supported methods:
- ‘leave_null’
nothing is done to fill missing reference segs, np.nan values are replaced with a -1 seg designation and that’s it
- ‘forward_fill’
xarray’s ffill method is used to fill in any np.nan values
- ‘r2’
reference segs are selected based on which reference site that seg’s flows has the greatest r2 value with
- ‘kldiv’
reference segs are selected based on which reference site that seg’s flows has the smallest KL Divergence value with
- ‘kge’
reference segs are selected based on which reference site that seg’s flows has the greatest KGE value with
- Returns
routed – Routed timeseries with reference gauge site river segments assigned to each river segement in the original routed.
- Return type
xr.Dataset
- bmorph.util.mizuroute_utils.map_segs_topology(routed: xarray.core.dataset.Dataset, topology: xarray.core.dataset.Dataset)[source]¶
Adds contributing_area, average elevation, length, and down_seg to routed from topology.
- Parameters
routed (xr.Dataset) – Contains streamflow timeseries mapped to river segments denoted as ‘seg’.
topology (xr.Dataset) – Contains topological data of the watershed that routed’s streamflow timeseries describe. River segment designations, lengths, and immeditate downstream segments are expected as ‘seg’, ‘Length’, and ‘Tosegment’.
- Returns
routed – The input dataset routed updated with the topological data.
- Return type
xr.Dataset
- bmorph.util.mizuroute_utils.map_var_to_segs(routed: xarray.core.dataset.Dataset, map_var: xarray.core.dataarray.DataArray, var_label: str, var_key: str, gauge_segs=None)[source]¶
Splits the variable into its up and down components to be used in blendmorph.
- Parameters
routed (xr.Dataset) – the dataset that will be modified and returned having been prepared by calculate_blend_vars with the dimension ‘seg’
map_var (xr.DataArray) – contains the variable to be split into up and down components and can be the same as routed, (must also contain the dimension ‘seg’)
var_label (str) – suffix of the up and down parts of the variable
var_key (str) – variable name to access the variable to be split in map_var
gauge_segs (list, optional) – List of the gauge segs that identify the reaches that are gauge sites, pulled from routed if None.
—-
- Returns
routed – with the following added: f’down_{var_label}’ f’up_{var_label}’
- Return type
xr.Dataset
- bmorph.util.mizuroute_utils.to_bmorph(topo: xarray.core.dataset.Dataset, routed: xarray.core.dataset.Dataset, reference: xarray.core.dataset.Dataset, met_hru: Optional[xarray.core.dataset.Dataset] = None, route_var: str = 'IRFroutedRunoff', fill_method='r2', min_kge=None)[source]¶
Prepare mizuroute output for bias correction via the blendmorph algorithm. This allows an optional dataset of hru meteorological data to be given for conditional bias correction.
- Parameters
topo (xr.Dataset) – Topology dataset for running mizuRoute. We expect this to have
seg
andhru
dimensions.routed (xr.Dataset) – The initially routed dataset from mizuRoute.
reference (xr.Dataset) – A dataset containing reference flows for bias correction. We expect this to have
site
andtime
dimensions with flows being stored inreference_flow
.met_hru (xr.Dataset, optional) – A dataset of meteorological data to be mapped onto the stream segments to facilitate conditioning. All variables in this dataset will automatically be mapped onto the stream segments and returned.
route_var (str) – Name of the variable of the routed runoff in the
routed
dataset. Defaults toIRFroutedRunoff
.fill_method (str) – While finding some upstream/downstream reference segs may be simple, (segs with ‘is_gauge’ = True are their own reference segs, others may be easy to find looking directly up or downstream), some river networks may have multiple options to select gauge sites and may fail to have upstream/downstream reference segs designated. ‘fill_method’ specifies how segs should be assigned upstream/downstream reference segs for bias correction if they are missed walking upstream or downstream.
- Currently supported methods:
- ‘leave_null’
nothing is done to fill missing reference segs, np.nan values are replaced with a -1 seg designation and that’s it
- ‘forward_fill’
xarray’s ffill method is used to fill in any np.nan values
- ‘r2’
reference segs are selected based on which reference site that seg’s flows has the greatest r2 value with
- ‘kldiv’ (default)
reference segs are selected based on which reference site that seg’s flows has the smallest KL Divergence value with
- ‘kge’
reference segs are selected based on which reference site that seg’s flows has the greatest KGE value with
min_kge (float, optional) – See calculate_blend_vars for more information defaults None unless fill_method = ‘kge’.
- Returns
A dataset with the required data for applying the
blendmorph
routines. See theblendmorph
documentation for further information.- Return type
met_seg
- bmorph.util.mizuroute_utils.trim_time(dataset_list: list)[source]¶
Trims all times of the xr.Datasets in the list to the shortest timeseries.
- Parameters
dataset_list (List[xr.Dataset]) – Contains a list of xr.Datasets
- Returns
Contains a list in the same order as dataset_list except with all items in the list having the same start and end time.
- Return type
list
- bmorph.util.mizuroute_utils.walk_down(ds, start_seg)[source]¶
Finds the nearest downstream gauge site and returns the distance traveled to reach it from start_seg.
- Parameters
ds (xr.Dataset) – Dataset containing river segments, downstream segs, the length of the river segments, and which segs are gauge sites as ‘seg’, ‘down_seg’, ‘lenght’, and ‘is_gauge’, respectively.
start_seg (int) – River segment designation to start walking from to a downstream gauge site.
- Returns
tot_length (float) – Total length traveled during walk, (e.g. cumulative river distance from start_seg to the downstream gauge site).
cur_seg (int) – River segment designation of the gauge site reached.
- bmorph.util.mizuroute_utils.walk_up(ds, start_seg)[source]¶
Finds the nearest upstream gauge site and returns the distance traveled to reach it from start_seg.
- Parameters
ds (xr.Dataset) – Dataset containing river segments, upstream segs, the length of the river segments, and which segs are gauge sites as ‘seg’, ‘up_seg’, ‘lenght’, and ‘is_gauge’, respectively.
start_seg (int) – River segment designation to start walking from to an upstream gauge site.
- Returns
tot_length (float) – Total length traveled during walk, (e.g. cumulative river distance from start_seg to the downstream gauge site).
cur_seg (int) – River segment designation of the gauge site reached.
Evaluation¶
Plotting¶
- bmorph.evaluation.plotting.anomaly_scatter2D(computations: dict, baseline_key: str, vert_key: str, horz_key: str, sites=[], multi=True, colors=['Maroon', 'Navy', 'Black', 'Orange', 'Yellow', 'Blue', 'Grey', 'Pink', 'Lavender'], show_legend=True)[source]¶
Plots two correction models against each other after Raw is subracted from each.
- Parameters
computations (dict) – Expecting {“Correction Name”: correction pandas.DataFrame}.
baseline_key (str) – Dictionary key for the computations dictionary that accesses what baseline the corrections should be compared to. This is typically observations.
vert_key (str) – Dictionary key for the computations dictionary that accesses the model to be plotted on the vertical axis.
horz_key (str) – Dictionary key for the computations dictionary that accesses the model to be plotted on the horizontal axis.
sites (list) – Site(s) to be compared in the plot, can have a size of 1. If multi is set to False and this is not changed to a single site, then the first value in the list will be chosen.
multi (boolean, optional) – Whether the plot uses data from multiple sites or a single site.
colors (list, optional) – Colors as strings to be plotted from. Plotting colors are different for each correction DataFrame, but same across sites for a singular correction. An error will be thrown if there are more cor_keys then colors.
show_legend (boolean, optional) – Whether or not to display the legend, defaults as True.
- bmorph.evaluation.plotting.calc_water_year(df: pandas.core.frame.DataFrame)[source]¶
Calculates the water year.
- Parameters
df (pandas.DataFrame) – Flow timeseries with a DataTimeIndex.
- Returns
A pandas.DataFrame index grouped by water year.
- Return type
pandas.DataFrame.index
- bmorph.evaluation.plotting.color_code_nxgraph(graph: <module 'networkx.classes.graph' from '/home/docs/checkouts/readthedocs.org/user_builds/bmorph/envs/master/lib/python3.7/site-packages/networkx/classes/graph.py'>, measure: pandas.core.series.Series, cmap=<matplotlib.colors.LinearSegmentedColormap object>, vmin=None, vmax=None) dict [source]¶
Creates a dictionary mapping of nodes to color values.
- Parameters
graph (networkx.graph) – Graph to be color coded
measure (pandas.Series) – Contains river segment ID’s as the index and desired measures as values.
cmap (matplotlib.colors.LinearSegmentedColormap, optional) – Colormap to be used for coloring the SimpleRiverNewtork plot. This defaults as ‘coolwarm_r’.
vmin (float, optional) – Minimum value for coloring
vmax (float, optional) – Maximum value for coloring
- Returns
Dictionary of {i:color} where i is the index of the river segment.
- Return type
dict
- bmorph.evaluation.plotting.compare_CDF(flow_dataset: xarray.core.dataset.Dataset, plot_sites: list, raw_var: str, raw_name: str, ref_var: str, ref_name: str, bc_vars: list, bc_names: list, plot_colors: list, logit_scale=True, logarithm_base='10', units='Q [$m^3/s$]', markers=['o', 'x', '*', '*'], figsize=(20, 20), sharex=False, sharey=True, fontsize_title=40, fontsize_labels=30, fontsize_tick=20, markersize=1, alpha=0.3)[source]¶
Compare probability distribution functions on a logit scale.
Plots the CDF’s of the raw, reference, and bias corrected flows.
- Parameters
flow_dataset (xarray.Dataset) – Contatains raw, reference, and bias corrected flows.
gauges_sites (list) – Gauge sites to be plotted contained in flow_dataset.
raw_var (str) – Accesses the raw flows in flow_dataset for flows before bias correction.
raw_name (str) – Label for the raw flows in the legend, corresponding to raw_var.
ref_var (str) – Accesses the reference flows in flow_dataset for true flows.
ref_name (str) – Label for the reference flows in the legend, corresponding to ref_var.
bc_vars (list) – Accesses the bias corrected flows in flow_dataset, where each element accesses its own bias corrected flows. Can be a size of 1.
bc_names (list) – Label for the bias corrected flows in the legend for each entry in bc_var, assumed to be in the same order. Can be a size of 1.
logit_scale (boolean, optional) – Whether to plot the vertical scale on a logit axis (True) or not (False). Defaults as True.
logarithm_base (str, optional) – The logarthimic base to use for the horizontal scale. Only the following are currently supported:
‘10’ to use a log10 horizontal scale (default) ‘e’ to use a natural log horizontal scale
units (str, optional) – The horizontal axis’s label for units, defaults as r’Q [$m^3/s$]’.
plot_colors (list, optional) – Colors to be plotted for the flows corresponding to raw_var, ref_var, and bc_var, defaulting as [‘grey’, ‘black’, ‘blue’, ‘red’], assuming there are two entries in bc_var.
markers (list, optional) – Markers to be plotted for the flows corresponding to raw_var, ref_var, and bc_var, defaulting as [‘o’, ‘x’, ‘*’, ‘*’], assuming there are two entries in bc_var.
figsize (tuple, optional) – Figure size following matplotlib notation, defaults as (20, 20).
sharex (boolean or str, optional) – Whether horizontal axis should be shared amongst subplots, defaulting as False.
sharey (boolean or str, optional) – Whether vertical axis should be shared amongst subplots, defaulting as True.
fontsize_title (int, optional) – Font size of the title, defaults as 40.
fontsize_labels (int, optional) – Font size of the labels, defaults as 30.
fontsize_tick (int, optional) – Font size of the ticks, defaults as 20.
markersize (float, optional) – Size of the markers plotted, defaults as 1.
alpha (float, optional) – Transparancy of the markers plotted, defaults as 0.3 to help see where markers clump together.
- Returns
- Return type
matplotlib.figure, matplotlib.axes
- bmorph.evaluation.plotting.compare_CDF_all(flow_dataset: xarray.core.dataset.Dataset, plot_sites: list, raw_var: str, raw_name: str, ref_var: str, ref_name: str, bc_vars: list, bc_names: list, plot_colors: list, logit_scale=True, logarithm_base='10', units='Q [$m^3/s$]', figsize=(20, 20), fontsize_title=40, fontsize_labels=40, fontsize_tick=40, markersize=1, alpha=0.3)[source]¶
Compare probability distribution functions as a summary statistic.
Plots the CDF’s of the raw, reference, and bias corrected flows with data from all sites in plot_sites combined for a summary statistic.
- Parameters
flow_dataset (xarray.Dataset) – Contains raw (uncorrected), reference (true), and bias corrected flows.
plot_sites (list) – Gauge sites to be plotted.
raw_var (str) – Accesses the raw (uncorrected) flows in flow_dataset.
raw_name (str) – Label for the raw flows in the legend, corresponding to raw_var.
ref_var (str) – Accesses the reference (true) flows in flow_dataset.
ref_name (str) – Label for the reference flows in the legend, corresponding to ref_var.
bc_vars (list) – Accesses the bias corrected flows in flow_dataset. Can be a size of 1.
bc_names (list) – Label for the bias corrected flows in the legend, corresponding to bc_var. Can be a size of 1.
plot_colors (list, optional) – Colors to be plotted for the flows corresponding to raw_var, ref_var, and bc_var, defaulting as [‘grey’, ‘black’, ‘blue’, ‘red’], assuming there are two entries in bc_var.
logit_scale (True, optional) – Whether to plot the vertical scale on a logit axis (True) or not (False). Defaults as True.
logarithm_base (str, optional) – The logarthimic base to use for the horizontal scale. Only the following are currently supported:
‘10’ to use a log10 horizontal scale (default) ‘e’ to use a natural log horizontal scale
units (str, optional) – The horizontal axis’s label for units, defaults as r’Q [$m^3/s$]’.
figsize (tuple, optional) – Figure size following matplotlib connventions, defaults as (20, 20).
fontsize_title (int, optional) – Font size of the title, defaults as 40.
fontsize_labels (int, optional) – Font size of the labels, defaults as 40.
fontsize_tick (int, optional) – Font size of the ticks, defaults as 40.
markersize (float, optional) – Size of the markers plotted, defaults as 1. Linewidth is half of this value.
alpha (float, optional) – Transparancy of the lines and markers, defaults as 0.3.
- Returns
- Return type
matplotlib.figure, matplotlib.axes
- bmorph.evaluation.plotting.compare_PDF(flow_dataset: xarray.core.dataset.Dataset, gauge_sites=<class 'list'>, raw_var='raw_flow', ref_var='reference_flow', bc_var='bias_corrected_total_flow', raw_name='Mizuroute Raw', ref_name='NRNI Reference', bc_name='BMORPH BC', fontsize_title=40, fontsize_labels=30, fontsize_tick=20)[source]¶
Compare probability distribution functions.
Plots the PDF’s of the raw, reference, and bias corrected flows for each gauge site.
- Parameters
flow_dataset (xarray.Dataset) – Contatains raw, reference, and bias corrected flows.
gauges_sites (list) – Gauge sites to be plotted as used in the flow_dataset.
raw_var (str, optional) – The string to access the raw flows in flow_dataset for flows before bias correction. Defaults as ‘raw_flow’.
ref_var (str, optional) – The string to access the reference flows in flow_dataset for true flows. Defaults as ‘reference_flow’.
bc_var (str, optional) – The string to access the bias corrected flows in flow_dataset for flows after bias correction. Defaults as ‘bias_corrected_total_flow’.
raw_name (str, optional) – Label for the raw flows before bias correction. Defaults as ‘Mizuroute Raw’.
ref_name (str, optional) – Label for the reference flows. Defaults as ‘NRNI Reference’.
bc_name (str, optional) – Label for the bias corrected flows after bias correction. Defaults as ‘BMORPH BC’.
fontsize_title (int, optional) – Fontsize of the title. Defaults as 40.
fontsize_labels (int, optional) – Fontsize of the lables. Defaults as 30.
fontsize_tick (int, optional) – Fontsize of the ticks. Defaults as 20.
- bmorph.evaluation.plotting.compare_correction_scatter(flow_dataset: xarray.core.dataset.Dataset, plot_sites: list, raw_var='raw', raw_name='Mizuroute Raw', ref_var='ref', ref_name='Reference', bc_vars=[], bc_names=[], plot_colors=['blue', 'purple', 'orange', 'red'], title='Absolute Error in Flow$(m^3/s)$', fontsize_title=80, fontsize_legend=68, alpha=0.05, fontsize_subplot=60, fontsize_tick=45, fontcolor='black', pos_cone_guide=False, neg_cone_guide=False, symmetry=True)[source]¶
Difference from reference flows before and after correction.
Plots differences between the raw and reference flows on the horizontal and differences between the bias corrected and refrerence on the vertical. This compares corrections needed before and after the bias correction method is applied.
- Parameters
flow_dataset (xarray.Dataset) – contains raw, reference, and bias corrected flows.
plot_sites (list) – Sites to be plotted, expected as the seg coordinate in flow_dataset.
raw_var (str, optional) – The string to access the raw flows in flow_dataset, defaults as raw.
raw_name (str, optional) – Label for the raw flows in the legend, defaults as ‘Mizuroute Raw’.
ref_var (str, optional) – The string to access the reference flows in ref, defaults as ‘upstream_ref_flow’.
ref_name (str, optional) – Label for the reference flows in the legend, defaults as ‘Reference’.
bc_vars (list) – The strings to access the bias corrected flows in flow_dataset.
bc_names (list) – Labels for the bias corrected flows in the legend, expected in the same order as bc_vars.
plot_colors (list, optional) – Colors to be plotted for each site in plot sites. Defaults as [‘blue’, ‘purple’, ‘orange, ‘red’].
fontsize_title (int, optional) – Fontsize of the title, defaults as 80.
fontsize_legend (int, optional) – Fontsize of the legend, defaults as 68.
fontsize_subplot (int, optional) – Fontsize of the subplots, defaults as 60.
fontsize_tick (int, optional) – Fontsize of the ticks, defaults as 45.
fontcolor (str, optional) – Color of the font, defaults as ‘black’.
pos_cone_guide (boolean, optional) – If True, plots a postive 1:1 line through the origin for reference.
neg_cone_guide (boolean, optional) – If True, plots a negative 1:1 line through the origin for reference.
symmetry (boolean, optional) – If True, the plot axis are symmetrical about the origin (default). If False, plotting limits will minimize empty space while not losing any data.
- bmorph.evaluation.plotting.compare_mean_grouped_CPD(flow_dataset: xarray.core.dataset.Dataset, plot_sites: list, grouper_func, raw_var: str, raw_name: str, ref_var: str, ref_name: str, bc_vars: list, bc_names: list, plot_colors: list, subset_month=None, units='Mean Annual Flow [$m^3/s$]', figsize=(20, 20), sharex=False, sharey=False, pp_kws={'postype': 'cunnane'}, fontsize_title=80, fontsize_legend=68, fontsize_subplot=60, fontsize_tick=45, fontsize_labels=80, linestyles=['-', '-', '-'], markers=['.', '.', '.'], markersize=30, alpha=1, legend_bbox_to_anchor=(1, 1), fig=None, axes=None, start_ax_index=0, tot_plots=None)[source]¶
- Cumulative Probability Distributions
plots the CPD’s of the raw, reference, and bias corrected flows on a probability axis
- flow_datasetxarray.Dataset
Contatains raw, reference, and bias corrected flows.
- plot_siteslist
A list of sites to be plotted.
- grouper_func
Function to group a pandas.DataFrame index by to calculate the mean of the grouped values.
- raw_varstr
The string to access the raw flows in flow_dataset.
- raw_namestr
Label for the raw flows in the legend.
- ref_varstr
The string to access the reference flows in flow_dataset.
- ref_namestr
Label for the reference flows in the legend.
- bc_varslist
List of strings to access the bias corrected flows in flow_dataset.
- bc_nameslist
List of labels for the bias corrected flows in the legend.
- plot_colorslist
Contains the colors to be plotted for raw_var, ref_var, and bc_vars, respectively.
- subset_month: int, optional
The integer date of a month to subset out for plotting, (ex: if you want to subset out January, enter 1). Defaults as None to avoid subsetting and use all the data in the year.
- unitsstr, optional
Vertical axis’s label for units, defaults as r’Mean Annual Flow [$m^3/s$]’.
- pp_kwsdict, optional
Plotting position computation as specified by https://matplotlib.org/mpl-probscale/tutorial/closer_look_at_plot_pos.html. Defaults as dict(postype=’cunnane’) for cunnane plotting positions.
- fontsize_titleint, optional
Font size for the plot title, defaults as 80.
- fontsize_legendint, optional
Font size for the plot legend, defaults as 68.
- fontsize_subplotint, optional
Font size for the plot subplot text, default as 60.
- fontsize_tickint, optional
Font size for the plot ticks, defaults as 45.
- fontsize_labelsint, optional
Font size for the horizontal and vertical axis labels, defaults as 80.
- linestyleslist, optional
Linestyles for ploting raw_var, ref_var, and bc_vars, respectively. Defaults as [‘-‘,’-‘,’-‘], expecting one of each.
- markerslist, optional
Markers for ploting raw_var, ref_var, and bc_vars, respectively. Defaults as [‘.’,’.’,’.’], expecting one of each.
- markersizeint, optional
Size of the markers for plotting, defaults as 30.
- alphafloat, optional
Alpha transparency value for plotting, where 1 is opaque and 0 is transparent.
- legend_bbox_to_anchortuple, optional
Box that is used to position the legend to the final axes. Defaults as (1,1). Modify this is the legend does not plot where you desire it to be.
- figmatplotlib.figure, optional
matplotlib figure object to plot on, defaulting as None and creating a new object unless otherwise specified.
- axesmatplotlib.axes, optional
Array-like of matplotlib axes objet to plot multiple plots on, defaulting as None and creating a new object unless otherwise specified.
- start_ax_indexint, optional
If the plots should not be plotted starting at the first ax in axes, specifiy the index that plotting should begin on. Defaults as None, assuming plotting should begin from the first ax.
- tot_plotsint, optional
If more plotting is to be done than with the total data to be provided, describe how many total plots there should be. Defalts as None, assuming plotting should begin form the first ax.
- bmorph.evaluation.plotting.correction_scatter(site_dict: dict, raw_flow: pandas.core.frame.DataFrame, ref_flow: pandas.core.frame.DataFrame, bc_flow: pandas.core.frame.DataFrame, colors: list, title='Flow Residuals', fontsize_title=80, fontsize_legend=68, fontsize_subplot=60, fontsize_tick=45, fontcolor='black', pos_cone_guide=False, neg_cone_guide=False)[source]¶
Difference from reference flows before and after correction.
Plots differences between the raw and reference flows on the horizontal and differences between the bias corrected and refrerence on the vertical. This compares corrections needed before and after the bias correction method is applied.
- Parameters
site_dict (dict) – Expects {subgroup name: list of segments in subgroup} how sites are to be seperated.
raw_flow (pandas.DataFrame) – Contains flows before correction.
ref_flow (pandas.DataFrame) – Contains the reference flows to compare raw_flow and bc_flow.
bc_flow (pandas.DataFrame) – Contains flows after correction.
colors (list) – Colors to be plotted for each site in site_dict.
title (str, optional) – Title label for the plot, defaults as ‘Flow Residuals’.
fontsize_title (int, optional) – Fontsize of the title, defaults as 80.
fontsize_legend (int, optional) – Fontsize of the legend, defaults as 68.
fontsize_subplot (int, optional) – Fontsize of the subplots, defaults as 60.
fontsize_tick (int, optional) – Fontsize of the ticks, defaults as 45.
fontcolor (str, optional) – Color of the font, defaults as ‘black’.
pos_cone_guide (boolean, optional) – If True, plots a postive 1:1 line through the origin for reference.
neg_cone_guide (boolean, optional) – If True, plots a negative 1:1 line through the origin for reference.
- bmorph.evaluation.plotting.create_adj_mat(topo: xarray.core.dataset.Dataset) numpy.ndarray [source]¶
Forms the adjacency matrix for the graph of the topography.
Note that this is independent of whatever the segments are called, it is a purely a map of the relative object locations. :Parameters: topo (xarray.Dataset) – Describes the topograph of the river network.
- Returns
An adjacency matrix describing the river network.
- Return type
numpy.ndarray
- bmorph.evaluation.plotting.create_nxgraph(adj_mat: numpy.ndarray) networkx.classes.graph.Graph [source]¶
Creates a NetworkX Graph object given an adjacency matrix.
- Parameters
adj_mat (numpy.ndarray) – Adjacency matrix describing the river network.
- Returns
NetworkX Graph of respective nodes.
- Return type
networkx.graph
- bmorph.evaluation.plotting.custom_legend(names: List, colors=['Maroon', 'Navy', 'Black', 'Orange', 'Yellow', 'Blue', 'Grey', 'Pink', 'Lavender'])[source]¶
Creates a list of patches to be passed in as handles for the plt.legends function.
- Parameters
names (list) – Legend names.
colors (list) – A list of the colors corresponding to names.
- Returns
Handle parameter for matplotlib.legend.
- Return type
handles
- bmorph.evaluation.plotting.determine_row_col(n: int, pref_rows=True)[source]¶
Determines rows and columns for rectangular subplots
Calculates a rectangular subplot layout that contains at least n subplots, some may need to be turned off in plotting. If a square configuration is possible, then a square configuration will be proposed. This helps automate the process of plotting a variable number of subplots.
- Parameters
n (int) – Total number of plots.
pref_rows (boolean) – If True, and only a rectangular arrangment is possible, then put the longer dimension in n_rows. If False, then it is placed in the n_columns.
- Returns
n_rows (int) – Number of rows for matplotlib.subplot.
n_columns (int) – Number of columns for matplotlib.subplot.
- bmorph.evaluation.plotting.diff_maxflow_plotter(observed: pandas.core.frame.DataFrame, names: list, colors: list, *models: pandas.core.frame.DataFrame)[source]¶
Plots box plots of numerous models grouped by site.
- Parameters
observed (pandas.Dataframe) – a dataframe containing observations
names (list) – List of the model names.
colors (list) – List of colors to be plotted.
*models (List[pandas.DataFrame]) – Any number of pandas.DataFrame objects to be evaluated.
- Returns
- Return type
matplotlib.figure, matplotlib.axes
- bmorph.evaluation.plotting.diff_maxflow_sites(observed: pandas.core.frame.DataFrame, predicted: pandas.core.frame.DataFrame)[source]¶
Calculates difference in maximum flows on a hydrologic year and site-by-site basis.
- Parameters
observed (pandas.DataFrame) – Dataframe containing all observations.
predicted (pandas.DataFrame) – Dataframe containing all predictions.
- Returns
DataFrame containing the difference in maximum flows.
- Return type
pandas.DataFrame
- bmorph.evaluation.plotting.draw_dataset(topo: xarray.core.dataset.Dataset, color_measure: pandas.core.series.Series, cmap=<matplotlib.colors.LinearSegmentedColormap object>)[source]¶
“Plots the river network through networkx.
Draws a networkx graph from a topological xrarray.Dataset and color codes it based on a pandas.Series.
- Parameters
topo (xarray.Dataset) – Contains river segment identifications and relationships.
color_measure (pandas.Series) – Indicies are concurrent with the number of segs in topo. Typically this contains statistical information about the flows that will be color coded by least to greatest value.
cmap (matplotlib.colors.LinearSegmentedColormap, optional) – Colormap to be used for coloring the SimpleRiverNewtork plot. This defaults as ‘coolwarm_r’.
- bmorph.evaluation.plotting.find_all_upstream(topo: xarray.core.dataset.Dataset, segID: int, return_segs: list = []) numpy.ndarray [source]¶
Finds all upstream river segments for a given river segment from the xarray.Dataset.
- Parameters
topo (xarray.Dataset)
segID (int)
return_segs (list)
- Returns
- Return type
numpy.ndarray
- bmorph.evaluation.plotting.find_index_water_year(data: pandas.core.frame.DataFrame) int [source]¶
Finds the index of the first hydrologic year.
- Parameters
data (pd.DataFrame) – Flow timeseries with a DateTime index.
- Returns
Index of the first hydrologic year.
- Return type
int
- bmorph.evaluation.plotting.find_upstream(topo: xarray.core.dataset.Dataset, segID: int, return_segs: list = [])[source]¶
Finds what river segment is directly upstream from the xarray.Dataset.
- Parameters
topo (xarray.Dataset) – Contains river network topography. Expecting each river segment’s immeditate downstream river segment is designated by ‘Tosegment’/
segID (int) – Current river segment identification number.
return_segs (list) – River segment identification numbers upstream from segID. This defaults as an empty list to be filled by the method.
- bmorph.evaluation.plotting.kl_divergence_annual_compare(flow_dataset: xarray.core.dataset.Dataset, sites: list, raw_var: str, raw_name: str, ref_var: str, ref_name: str, bc_vars: list, bc_names: list, plot_colors: list, title='Annual KL Diveregence Before/After Bias Correction', fontsize_title=40, fontsize_tick=30, fontsize_labels=40, fontsize_legend=30, showfliers=False, sharex=True, sharey='row', TINY_VAL=1e-06, figsize=(30, 20), show_y_grid=True)[source]¶
Kullback-Liebler Divergence compared before and after bias correction as boxplots.
Plots the KL divergence for each year per site as KL(P_{ref} || P_{raw}) and KL( P_{ref} || P_{bc}).
- Parameters
flow_dataset (xarray.Dataset) – Contains raw (uncorrected), reference (true), and bias corrected flows.
sites (list) – Contains all the sites to be plotted as included in flow_dataset, (note that if the number of sites to be plotted is square or rectangular, the last site will not be plotted to save room for the legend).
raw_var (str) – Accesses the raw flows in flow_dataset.
raw_name (str) – Label for the raw flows in the legend and horizontal labels corresponding to raw_var.
ref_var (str) – Accesses the reference flows in flow_dataset.
ref_name (str) – Label the reference flows in the legend and horizontal labels corresponding to ref_var.
bc_vars (list) – String(s) to access the bias corrected flows in flow_dataset.
bc_names (list) – Label(s) for the bias corrected flows in the legend and horizontal labels, corresponding to each element in bc_vars.
plot_colors (list) – Colors to be plotted for the raw and the bias corrected flows, respectively.
title (str, optional) – Title to be plotted, defaults as “Annual KL Diveregence Before/After Bias Correction”.
fontsize_title (int, optional) – Font size of the title, defaults as 40.
fontsize_tick (int, optional) – Font size of the ticks, defaults as 30.
fontsize_labels (int, optional) – Font size of the labels, defaults as 40.
fontsize_legend (int, optional) – Font size of the legend text, defaults as 30.
showfliers (boolean, optional) – Whether to include fliers in the boxplots, defaults as False.
sharex (boolean or str, optional) – Whether the horizontal axis is shared, defaults as True.
sharey (boolean or str, optional) – Whether the vertical axis is shared, defaults as ‘row’ to share the vertical axis in the same row.
TINY_VAL (float, optional) – Used to ensure there are no zero values in the data because zero values cause unsual behavior in calculating the KL Divergence. Defaults as 1E-6.
figsize (tuple, optional) – Figure size following maptlotlib connventions, defaults as (30,20).
show_y_grid (boolean, optional) – Whether to plot y grid lines, defaults as True.
- Returns
- Return type
maptlotlib.figure, matplotlib.axes
- bmorph.evaluation.plotting.log10_1p(x: numpy.ndarray)[source]¶
Return the log10 of one plus the input array, element-wise.
- Parameters
x (numpy.ndarray) – An array of values greater than -1. If values are less than or equal to -1, then a domain error will occur in computing the logarithm.
- Returns
y – Array of the values having the log10(element+1) computer.
- Return type
numpy.ndarray
- bmorph.evaluation.plotting.norm_change_annual_flow(sites: list, before_bc: pandas.core.frame.DataFrame, after_bc: pandas.core.frame.DataFrame, colors=<class 'list'>, fontsize_title=60, fontsize_labels=40, fontsize_tick=30)[source]¶
Normalized change in annual flow volume.
Plots a series of subplots containing bar charts that depict the differnece in normalized annual flow volume due to bias correction.
- Parameters
sites (list) – String names of all the sites to be plotted, matching sites contained in the DataFrames before_bc and after_bc.
before_bc (pandas.DataFrame) – Contains flows, (not aggregated), before bias correction is applied.
after_bc (pandas.DataFrame) – Contains flows, (not aggregated), after bias correction is applied.
colors (list) – Ccolors to be used for each site’s subplot in the same order as sites, (does not have to be unique).
fontsize_title (int, optional) – Font size of the title. Defaults as 60.
fontsize_labels (int, optional) – Font size of the labels. Defaults as 40.
fontsize_tick (int, optional) – Font size of the ticks. Defaults as 30.
- bmorph.evaluation.plotting.organize_nxgraph(topo: networkx.classes.graph.Graph)[source]¶
Orders the node positions into a hierarchical structure.
Based on the “dot” layout and given topography. :Parameters: topo (xarray.Dataset) – Contains river segment identifications and relationships.
- Returns
- Return type
networkx.positions
- bmorph.evaluation.plotting.pbias_compare_hist(sites: list, raw_flow: pandas.core.frame.DataFrame, ref_flow: pandas.core.frame.DataFrame, bc_flow: pandas.core.frame.DataFrame, grouper=TimeGrouper(freq=<YearEnd: month=12>, axis=0, sort=True, closed='right', label='right', how='mean', convention='e', origin='start_day'), total_bins=None, title_freq='Yearly', fontsize_title=90, fontsize_subplot_title=60, fontsize_tick=40, fontsize_labels=84, x_extreme=150)[source]¶
Histograms comparing percent bias before/after bias correction.
Creates a number of histogram subplots by each given sites that plot percent bias both before and after bias correction.
- Parameters
sites (list) – Sites corresponding to the columns of the flow DataFrames, raw_flow, ref_flow, and bc_flow, to be plotted.
raw_flow (pandas.DataFrame) – Flows before bias correction.
ref_flow (pandas.DataFrame) – Reference flows for comparison as true values.
bc_flow (pandas.DataFrame) – Flows after bias correction.
grouper (pandas.TimeGrouper) – How flows should be grouped for bias correction, defaults as yearly.
total_bins (int, optional) – Number of bins to use in the histogram plots. if none specified, defaults to the floored square root of the number of pbias difference values.
title_freq (str, optional) – An adjective description of the frequency with which the flows are grouped corresponding to grouper. Defaults as ‘Yearly’.
fontsize_title (int, optional) – Font size of the title, defaulting as 90.
fontsize_subplot_title (int, optional) – Font size of the subplot title, defaulting as 60.
fontsize_ticler (int, optional) – Font size of the ticks, defaulting as 40.
fontsize_labels (int, optional) – Font size of the labels, defaulting as 84.
x_extreme (float, optional) – Greatest magnitude on the horizontal axis to specify the range, defaulting as 150, which results in a range of (-150, 150). This is useful if desiring to zoom in closer to the origin and exclued outlying percent biases.
- bmorph.evaluation.plotting.pbias_diff_hist(sites: list, colors: list, raw_flow: pandas.core.frame.DataFrame, ref_flow: pandas.core.frame.DataFrame, bc_flow: pandas.core.frame.DataFrame, grouper=TimeGrouper(freq=<MonthEnd>, axis=0, sort=True, closed='right', label='right', how='mean', convention='e', origin='start_day'), total_bins=None, title_freq='Monthly', fontsize_title=90, fontsize_subplot_title=60, fontsize_tick=40, fontsize_labels=84)[source]¶
Histograms of differences in percent bias before/after bias correction.
Creates a number of histogram subplots by each given site that plot the difference in percent bias before and after bias correction.
- Parameters
sites (list) – Sites that are the columns of the flow DataFrames, raw_flow, ref_flow, and bc_flow.
colors (list) – Colors to plot the sites with, (do not have to be different), that are used in the same order as the list of sites.
raw_flow (pandas.DataFrame) – Contains flows before bias correction.
ref_flow (pandas.DataFrame) – Contains reference flows for comparison as true values.
bc_flow (pandas.DataFrame) – Contains flows after bias correction.
grouper (pandas.TimeGrouper, optional) – How flows should be grouped for bias correction. This defaults as monthly.
total_bins (int, optional) – Number of bins to use in the histogram plots. If none specified, defaults to the floored square root of the number of pbias difference values.
title_freq (str, optional) – An adjective description of the frequency with which the flows are grouped, should align with grouper, although there is no check to verify this. This defaults as ‘Monthly’.
fontsize_title (int, optional) – Font size of the title. Defaults as 90.
fontsize_subplot_title (int, optional) – Font size of the subplots. Defaults as 60.
fontsize_tick (int, optional) – Font size of the ticks. Defaults as 40.
fontsize_labels (int, optional) – Font size of the labels. Defaults as 84.
- bmorph.evaluation.plotting.pbias_plotter(observed: pandas.core.frame.DataFrame, names: list, colors: list, *models: pandas.core.frame.DataFrame)[source]¶
Plots box plots of numerous models grouped by site.
- Parameters
observed (pandas.Dataframe) – Dataframe containing observations.
names (list) – List of the model names.
colors (list) – List of colors to be plotted.
*models (List[pandas.DataFrame]) – Any number of pandas.DataFrame objects to be evaluated.
- Returns
- Return type
matplotlib.figure, matplotlib.axes
- bmorph.evaluation.plotting.pbias_sites(observed: pandas.core.frame.DataFrame, predicted: pandas.core.frame.DataFrame)[source]¶
Calculates percent bias on a hydrologic year and site-by-site basis.
- Parameters
observed (pandas.DataFrame) – Dataframe containing all observations.
predicted (pandas.DataFrame) – Dataframe containing all predictions.
- Returns
Dataframe contain the percent bias computed.
- Return type
pandas.DataFrame
- bmorph.evaluation.plotting.plot_reduced_flows(flow_dataset: xarray.core.dataset.Dataset, plot_sites: list, reduce_func=<function mean>, interval='day', statistic_label='Mean', units_label='$(m^3/s)$', title_label='Annual Mean Flows', raw_var='IRFroutedRunoff', raw_name='Mizuroute Raw', ref_var='upstream_ref_flow', ref_name='upstream_ref_flow', bc_vars=[], bc_names=[], fontsize_title=24, fontsize_legend=20, fontsize_subplot=20, fontsize_tick=20, fontcolor='black', figsize_width=20, figsize_height=12, plot_colors=['grey', 'black', 'blue', 'red'], return_reduced_flows=False)[source]¶
Creates a series of subplots plotting statistical day of year flows per gauge site.
- Parameters
flow_dataset (xarray.Dataset) – Contatains raw, reference, and bias corrected flows.
plot_sites (list) – Sites to be plotted.
reduce_func (function, optional) – A function to apply to flows grouped by interval, defaults as np.mean.
interval (str, optional) – What time interval annual reduce_func should be computed on. Currently supported is day for dayofyear (default), week for weekofyear, and month for monthly.
statistic_label (str, optional) – Label for the statistic representing the reduce_func, defaults as ‘Mean’ to fit reduce_func as np.mean.
units_label (str, optional) – Label for the units of flow, defaults as r`$(m^3/s)$`.
title_label (str) – Lable for the figure title representing the reduce_func, defaults as f’Annual Mean Flows’ to fit reduce_func as np.mean.
raw_var (str, optional) – The string to access the raw flows in flow_dataset, defaults as ‘IRFroutedRunoff’.
raw_name (str, optional) – Label for the raw flows in the legend, defaults as ‘Mizuroute Raw’.
ref_var (str, optional) – The string to access the reference flows in flow_dataset, defaults as ‘upstream_ref_flow’.
ref_name (str, optional) – Label for the reference flows in the legend, defaults as ‘upstream_ref_flow’.
bc_vars (list) – The strings to access the bias corrected flows in flow_dataset.
bc_names (list) – Labels for the bias corrected flows in the legend, expected in the same order as bc_vars.
plot_colors (list, optional) – Colors to be plotted for raw_var, ref_var, bc_vars respectively. Defaults as [‘grey’, ‘black’, ‘blue’, ‘red’].
return_reduced_flows (boolean, optional) – If True, returns the reduced flows as calculated for plotting, defaults as False. This is typically used for debugging purposes.
fontsize_title (int, optional) – Font size of the plot title, defaults as 80.
fontsize_legend (int, optional) – Font size of the plot legend, defaults as 68.
fontsize_subplot (int, optional) – Font size for the subplots, defaults as 60.
fontsize_tick (int, optional) – Font size of the ticks, defaults 45.
fontcolor (str, optional) – Color of the font, defaults as ‘black’
figsize_width (int, optional) – Width of the figure, defaults as 70.
figusize_height (int, optional) – Height of the figure, defaults as 30.
- Returns
If return_reduced_flows is False, matplotlib.figure and matplotlib.axes, otherwise the reduced_flows are returned as the xarray.Dataset.
- Return type
xarray.Dataset or (matplotlib.figure, matplotlib.axes)
- bmorph.evaluation.plotting.plot_residual_overlay(flows: pandas.core.frame.DataFrame, upstream_sites: list, downstream_site: str, start_year: int, end_year: int, ax=None, fontsize_title=40, fontsize_labels=60, fontsize_tick=30, linecolor='k', alpha=0.3)[source]¶
Plots flow upstream/downstream residuals overlayed across one year span.
Plots residuals from each hydrologic year on top of each other with a refence line at zero flow. Residuals are calculated as downstream flows - sum(upstream flows).
- Parameters
flows (pandas.DataFrame) – All flows to be used in plotting.
upstream_sites (list) – Strings of the site names stored in flows to aggregate.
downstream_sites (str) – Name of the downstream site stored in flows that will have the upstream_sites subtracted from it.
start_year (int) – The starting year to plot.
end_year (int) – The year to conclude on.
ax (matplotlib.axes, optional) – Axes to plot on. If none is specified, a new one is created.
fontsize_title (int, optional) – Font size of the title.
fontsize_labels (int, optional) – Font size of the labels.
fontsize_tick (int, optional) – Font size of the ticks.
linecolor (str, optional) – Color of the lines plotted. Defaults as ‘k’ for black.
alpha (float, optional) – Transparency of the lines plotted. Defaults as 0.3 to help see how residuals line up across many years.
- Returns
- Return type
matplotlib.axes
- bmorph.evaluation.plotting.plot_spearman_rank_difference(flow_dataset: xarray.core.dataset.Dataset, gauge_sites: list, start_year: str, end_year: str, relative_locations_triu: pandas.core.frame.DataFrame, basin_map_png, cmap=<matplotlib.colors.LinearSegmentedColormap object>, blank_plot_color='w', fontcolor='black', fontsize_title=60, fontsize_tick=30, fontsize_label=45)[source]¶
Creates a site-to-site rank correlation difference comparison plot with a map of the basin.
- Parameters
flow_dataset (xarray.Dataset) – Contains raw as ‘raw_flow’ and bias corrected as ‘bias_corrected_total_flow’ flow values, times of the flow values, and the names of the sites where those flow values exist
gauge_sites (list) – Gauge sites to be plotted.
start_year (str) – String formatted as ‘yyyy-mm-dd’ to start rank correlation window.
end_year (str) – String formatted as ‘yyyy-mm-dd’ to end rank correlation window.
relative_locations_triu (pandas.DataFrame) – Denotes which sites are connected with a ‘1’ and has the lower triangle set to ‘0’.
basin_map_png (png file) – The basin map with site values marked.
cmap (matplotlib.colors.LinearSegmentedColormap, optional) – Colormap to be used for coloring the SimpleRiverNewtork plot. This defaults as ‘coolwarm_r’.
blank_plot_color (str, optional) – Color to set the lower extremes in cmap should be to keep extreme values from skewing the color map and hiding values. This is defaulted as ‘w’ for white. It should appear that this color matches the background plot color to make it appear as if no value is plotted here.
font_color (str, optional) – Color of the font, defaulted as ‘black’.
fontsize_title (int, optional) – Font size of the title, defaults as 60.
fontsize_tick (int, optional) – Font size of the ticks, defaults as 30.
fontsize_label (int, optional) – Font size of the labels, defaults as 45.
- bmorph.evaluation.plotting.rmseFracPlot(data_dict: dict, obs_key: str, sim_keys: list, sites=[], multi=True, colors=['Maroon', 'Navy', 'Black', 'Orange', 'Yellow', 'Blue', 'Grey', 'Pink', 'Lavender'])[source]¶
Root mean square values calculated by including descending values one-by-one.
- Parameters
data_dict (dict) – Expecting {“Data Name”: data pandas.DataFrame}.
obs_key (str) – Dictionary key for the computations dictionary that accesses the observations to be used as true in calculating root mean squares.
sim_keys (list) – Dictionary keys accessing the simulated DataFrames in computations, used in predictions in calculating root mean squares.
sites (list) – Site(s) to be compared in the plot, can have a size of 1. If multi is set to False and this is not changed to a single site, then the first value in the list will be chosen.
multi (boolean, optional) – Whether the plot uses data from multiple sites or a single site.
colors (list, optional) – Colors as strings to be plotted from. Plotting colors are different for each correction DataFrame, but same across sites for a singular correction. An error will be thrown if there are more sim_keys then colors.
- bmorph.evaluation.plotting.scatter_series_axes(data_x, data_y, label: str, color: str, alpha: float, ax=None) matplotlib.pyplot.axes [source]¶
Creates a scatter axis for plotting.
- Parameters
data_x (array-like) – Data for the x series.
data_y (array-like) – Data for the y series.
label (str) – Name for the axes.
color (str) – Color for the markers.
alpha (float) – Transparency for the markers.
- Returns
- Return type
matplotlib.axes
- bmorph.evaluation.plotting.site_diff_scatter(predictions: dict, raw_key: str, model_keys: list, compare: dict, compare_key: str, site: str, colors=['Maroon', 'Navy', 'Black', 'Orange', 'Yellow', 'Blue', 'Grey', 'Pink', 'Lavender'])[source]¶
Creates a scatter plot of Raw-BC versus some measure.
- Parameters
predictions (dict) – Expects {‘Prediction Names’ : Prediction pandas.DataFrame}. ‘Prediction Names’ will be printed in the legend.
raw_key (str) – The key for the predictions dictionary that directs to the raw data that each model will be subtracting.
model_keys (list) – A list of dictoionary keys pertaining to the correction models that are wanting to be plotted.
compare (dict) – Expecting {‘Measure name’ : measure pandas.DataFrame}. These are what is being plotted against on the horizontal-axis.
compare_key (str) – The dictionary key for the measure desired in the compare dictionary. ‘compare_key’ will be printed on the horizontal axis.
site (str) – A single site designiation to be examined in the plot. This will be listed as the title of the plot.
colors (List[str], optional) – Colors as strings to be plotted from.
- Returns
- Return type
matplotlib.figure, matplotlib.axes
- bmorph.evaluation.plotting.spearman_diff_boxplots_annual(raw_flows: pandas.core.frame.DataFrame, bc_flows: pandas.core.frame.DataFrame, site_pairings, fontsize_title=40, fontsize_tick=30, fontsize_labels=40, subtitle=None, median_plot_color='red')[source]¶
Annual difference in spearman rank as boxplots.
Creates box plots for each stide pairing determing the difference in spearman rank for each year between the raw and bias corrected data.
- Parameters
raw_flows (pandas.DataFrame) – Raw flows before bias correction with sites in the columns and time in the index.
bc_flows (pandas.DataFrame) – Bias corrected flows with sites in the columns and time in the index.
site_pairings (List[List[str]]) – List of list of string site pairs e.g. [[‘downstream_name’,’upstream_name’],…]. This is used to organize which sites should be paired together for computing the spearman rank difference.
fontsize_title (int, optional) – Font size of the title, defaults as 40.
fontsize_tick (int, optional) – Font size of the ticks, defaults as 30.
fontsize_labels (int, optional) – Font size of the labels, defaults as 40.
subtitle (str, optional) – Subtitle to include after “Annual Chnage in Speraman Rank: “. If no subtitle is is specified, none is included and only the title is plotted.
median_plot_color (str) – Color to plot the boxplot’s median as, defaults as ‘red’.
- bmorph.evaluation.plotting.spearman_diff_boxplots_annual_compare(flow_dataset: xarray.core.dataset.Dataset, site_pairings, raw_var: str, bc_vars: list, bc_names: list, plot_colors: list, showfliers=True, fontsize_title=40, fontsize_tick=25, fontsize_labels=30, figsize=(20, 20), sharey='row')[source]¶
Annual difference in spearman rank as boxplots.
Creates box plots for each site pairing determining the difference in spearman rank for each year between the raw and the bias corrected data.
- Parameters
flow_dataset (xarray.Dataset) – Contains raw (uncorrected), reference (true), and bias corrected flows.
site_pairings (List[List[str]]) – List of list of string site pairs e.g. [[‘downstream_name’,’upstream_name’],…]. This is used to organize which sites should be paired together for computing the spearman rank difference.
raw_var (str) – Accesses the raw (uncorrected) flows in flow_dataset.
bc_vars (list) – Strings to access the bias corrected flows in flow_dataset.
bc_names (list) – Labels for the bias corrected flows from flow_dataset, corresponding to each element in bc_vars in the same order.
plot_colors (list) – Colors that are in the same order as the bc_vars and bc_names to be used in plotting.
showfliers (boolean, optional) – Whether to show fliers on the boxplots, defaults as True.
fontsize_title (int, optional) – Font size of the title, defaults as 40.
fontsize_tick (int, optional) – Font size of the ticks, defaults as 25.
fontsize_lables (int, optional) – Font size of the labels, defaults as 30.
figsize (tuple) – Figure size following matplotlib connventions, defaults as (20, 20).
sharey (boolean or str, optional) – Whether or how the vertical axis are to be shared, defaults as ‘row’ to have vertical axis in the same row shared.
- Returns
- Return type
matplotlib.figure, matplotlib.axes
- bmorph.evaluation.plotting.stat_corrections_scatter2D(computations: dict, baseline_key: str, cor_keys: list, uncor_key: str, sites=[], multi=True, colors=['Maroon', 'Navy', 'Black', 'Orange', 'Yellow', 'Blue', 'Grey', 'Pink', 'Lavender'])[source]¶
Creates a scatter plot of the flow before/after corrections relative to observations.
- Parameters
computations (dict) – Expecting {“Correction Name”: correction pandas.DataFrame}.
baseline_key (str) – Contains the dictionary key for the computations dictionary that accesses what baseline the corrections should be compared to. This is typically observations.
cor_keys (list) – Dictionary keys accessing the correction DataFrames in computations. These will be printed in the legend.
uncor_key (str) – The dictionary key that accesses the uncorrected data in computations.
sites (list) – Site(s) to be compared in the plot, can have a size of 1. If multi is set to False and this is not changed to a single site, then the first value in the list will be chosen.
multi (boolean, optional) – Determines whether the plot uses data from multiple sites or a single site, defaults as True.
colors (List[str], optional) – Colors as strings to be plotted from. Plotting colors are different for each correction DataFrame, but same across sites for a singular correction. An error will be thrown if there are more cor_keys then colors.
- Returns
- Return type
matplotlib.figure, matplotlib.axes
Simple River Network¶
- class bmorph.evaluation.simple_river_network.SegNode(seg_id, pfaf_code)[source]¶
River segment node used in SimpleRiverNetwork.
Creates a node of a segment to be used in the simple river network.
- Variables
pfaf_code (int) – Pfafstetter code for this river segment.
seg_id (int) – Idenification for this river segment.
upstream (List[SegNode]) – Containing what is directly upstream from this SegNode.
basin_area (float) – Summative Basin Area for this seg.
_end_marker (boolean) – TRUE if this SegNode marks the end of an interbasin during simple_river_network.ecode_pfaf. This variable is only used during the encoding of pfaffstetter codes for the SimpleRiverNetwork and is set to FALSE when not in use. Changing this variable could interfere with SimpleRiverNetwork operations.
encoded (boolean) – TRUE if this SegNode has been fully given a unique pfaf_code within the SimpleRiverNetwork, otherwise it is FALSE.
- class bmorph.evaluation.simple_river_network.SimpleRiverNetwork(topo: xarray.core.dataset.Dataset, pfaf_seed=<class 'int'>, outlet_index=0, max_pfaf_level=42)[source]¶
Psuedo-physical visualization of watershed models.
The SimpleRiverNetwork maps nodes within a given topography to visualize their arragments and simplify different parts of the network to track statistics propogating through the network. This tree network has the root as outlet, parsing upstream for all operations, opposite of the direction of flow.
- Variables
topo (xarray.Dataset) – Dataset describing the topography of the river network.
seg_id_values (list) – All seg_id’s being used in the network that identify river segments in the watershed.
outlet (SegNode) – The end of the river network and start of the SimpleRiverNetwork, aka “root” of the network.
adj_mat (numpy.array) – A square adjacency matrix size N, where N is the length of seg_id_values, that can be used to graph the SimpleRiverNetwork, where the row/column index i corresponds to seg_id i in seg_id_values.
network_graph (networkx.graph) – A networkx graph part of the visual component of the network. More information about NetworkX can be found at https://networkx.org/
network_positions (dictionary) – A dictionary with networkx nodes as keys and positions as values, using in the plotting of the network.
- draw_multi_measure(color_dict, label_map=[])[source]¶
Overlays multiple network plots to compare multiple measures at once.
- find_node(target_id, node: SegNode)[source]¶
Linear search of SimpleRiverNetwork for a specific SegNode.
- find_like_pfaf(node: SegNode, target_pfaf_digits: list, degree: int)[source]¶
Finds nodes based on pfaffstetter codes.
- generate_weight_map()[source]¶
Creates a list proportional upstream area ratios for each seg_id_values.
- generate_pfaf_color_map()[source]¶
Extracts the first pfaffstetter digit of each code for colorcoding.
- generate_node_highlight_map(seg_ids: list)[source]¶
Highlight specific SegNode’s in a SimpleRiverNetwork.
- spawn_srn(spawn_outlet)[source]¶
Creates a new SimpleRiverNetwork from spawn_outlet and upstream of it.
- aggregate_measure(dataset_seg_ids: numpy.ndarray, variable: numpy.ndarray, aggregation_function) pandas.core.series.Series [source]¶
This is a preliminary function.
Aggregates the measure value for the given variable based on how SimpleRiverNetwork has been aggregated and provides a pandas.Series to plot on the SimpleRiverNetwork.
- Parameters
dataset_seg_ids (numpy.ndarray) – Contains all the seg_id values according to the original topology. This should be in the same order seg order as variable.
variable (numpy.ndarray) – Contains all the variable values according to the original topology. This should be in the same order seg order as dataset_seg_ids.
aggregation_function (numpy function) – A function to be passed in on how the aggregated segs should have this variable combined, recommended as a numpy function like np.sum, np.median, np.mean …
- Returns
a pd.Series formated as
- Return type
(seg_id_values_index, aggregated measure)
- aggregate_measure_max(dataset_seg_ids: numpy.ndarray, variable: numpy.ndarray) pandas.core.series.Series [source]¶
This is a preliminary function.
Determines the maximum measure value for the given variable based on how SimpleRiverNetwork has been aggregated and provides a pandas.Series to plot on the SimpleRiverNetwork.
- Parameters
dataset_seg_ids (numpy.ndarray) – Contains all the seg_id values according to the original topology. This should be in the same order seg order as variable.
variable (numpy.ndarray) – Contains all the variable values according to the original topology. This should be in the same order seg order as dataset_seg_ids.
- Returns
A pandas.Series formated as: (seg_id_values_index, aggregated measure)
- Return type
pandas.Series
- aggregate_measure_mean(dataset_seg_ids: numpy.ndarray, variable: numpy.ndarray) pandas.core.series.Series [source]¶
This is a preliminary function.
Determines the mean measure value for the given variable based on how SimpleRiverNetwork has been aggregated and provides a pandas.Series to plot on the SimpleRiverNetwork.
- Parameters
dataset_seg_ids (numpy.ndarray) – Contains all the seg_id values according to the original topology. This should be in the same order seg order as variable.
variable (numpy.ndarray) – Contains all the variable values according to the original topology. This should be in the same order seg order as dataset_seg_ids.
- Returns
A pandas.Series formated as: (seg_id_values_index, aggregated measure)
- Return type
pandas.Series
- aggregate_measure_median(dataset_seg_ids: numpy.ndarray, variable: numpy.ndarray) pandas.core.series.Series [source]¶
This is a preliminary function.
Determines the median measure value for the given variable based on how SimpleRiverNetwork has been aggregated and provides a pandas.Series to plot on the SimpleRiverNetwork.
- Parameters
dataset_seg_ids (numpy.ndarray) – Contains all the seg_id values according to the original topology. This should be in the same order seg order as variable.
variable (numpy.ndarray) – Contains all the variable values according to the original topology. This should be in the same order seg order as dataset_seg_ids.
- Returns
A pandas.Series formated as: (seg_id_values_index, aggregated measure).
- Return type
pandas.Series
- aggregate_measure_min(dataset_seg_ids: numpy.ndarray, variable: numpy.ndarray) pandas.core.series.Series [source]¶
Determines the minimum measure value for the given variable based on how SimpleRiverNetwork has been aggregated and provides a pandas.Series to plot on the SimpleRiverNetwork.
- Parameters
dataset_seg_ids (numpy.ndarray) – Contains all the seg_id values according to the original topology. This should be in the same order seg order as variable.
variable (numpy.ndarray) – Contains all the variable values according to the original topology. This should be in the same order seg order as dataset_seg_ids.
- Returns
A pandas.Series formated as: (seg_id_values_index, aggregated measure)
- Return type
pandas.Series
- aggregate_measure_sum(dataset_seg_ids: numpy.ndarray, variable: numpy.ndarray) pandas.core.series.Series [source]¶
This is a preliminary function.
Determines the sum measure value for the given variable based on how SimpleRiverNetwork has been aggregated and provides a pandas.Series to plot on the SimpleRiverNetwork.
- Parameters
dataset_seg_ids (numpy.ndarray) – Contains all the seg_id values according to the original topology. This should be in the same order seg order as variable.
variable (numpy.ndarray) – Contains all the variable values according to the original topology. This should be in the same order seg order as dataset_seg_ids.
- Returns
A pandas.Series formated as: (seg_id_values_index, aggregated measure)
- Return type
pandas.Series
- append_pfaf(node: bmorph.evaluation.simple_river_network.SegNode, pfaf_digit: int)[source]¶
Adds a pfaffstetter code digit to all upstream nodes.
- Parameters
node (SegNode) – A SegNode to designate the root of the flow tree.
pfaf_digit (int) – The digit to be added to the pfaffstetter codes.
- append_sequential(sequence, base='')[source]¶
Adds odd digits for unbranching stream segments.
Adds odd digits to the pfaf_codes of SegNode’s in a row, or in sequence. this ensures all SegNode’s within the SimpleRiverNetwork have a unique code.
- Parameters
sequence (list) – A list of SegNode’s in a sequence, typically aggregated from find_branch.
base (str) – An addition to the pfaf_code if needing to append the pfafstetter code being appended.
- check_upstream_end_marking(node: bmorph.evaluation.simple_river_network.SegNode)[source]¶
Checks if any directly upstream nodes are marked by _end_marker.
Checks if any nodes directly upstream are _end_marker’s and returns True if so.
- Parameters
node (SegNode) – A SegNode to check directly upstream from.
- Returns
If any directly upstream nodes are marked by _end_marker.
- Return type
boolean
- clear_end_markers(node)[source]¶
Sets all upstream _end_marker’s to False.
Sets all end_mark in nodes at and upstream of node to False.
- Parameters
node (SegNode) – SegNode to start setting _end_marker to False and moving upstream from.
- clear_network()[source]¶
Sets the network to only the oulet SegNode.
Sets the adjacency matrix to an empty array and sets the upstream designation of the outlet to an empty list, clearing the shape of the network upstream of the outlet. This does not reset the topograpghy or seg_id_values, so the original shape can be rebuilt.
- collect_upstream_nodes(node: bmorph.evaluation.simple_river_network.SegNode)[source]¶
Finds all upstream SegNode’s.
Finds all nodes upstream of a node and returns a list of them.
- Parameters
node (SegNode) – A SegNode to collect upstream nodes from.
- Returns
All upstream nodes of node.
- Return type
list
- color_network_graph(measure, cmap, vmax=None, vmin=None)[source]¶
Creats a dictionary and colorbar depicting measure.
- Parameters
measure (pandas.Series, optional) – Describes how colors for each SegNode should be allocated relative to a linear colormap. The index is expected to match the indicies of seg_id_values as a 0:len(seg_id_values)-1 array. If no measure is specified, then colors will be assigned sequentially in order of seg_id_values.
cmap (matplotlib.colors.LinearSegmentedColormap) – Colormap to be used for coloring the SimpleRiverNewtork plot.
vmin (float, optional) – Minimum value for coloring
vmax (float, optional) – Maximum value for coloring
- Returns
color_dict (dict) – Dictionary of {i:color} where i is the index of the SegNode’s seg_id in seg_id_values.
color_bar (ScalarMappable) – A color bar used to plot color values determined by measure for plotting in draw_network.
- count_net_upstream(node: bmorph.evaluation.simple_river_network.SegNode)[source]¶
Inclusively counts number of upstream nodes.
Counts the number of SegNode’s upstream of node, including the original node.
- Parameters
node (SegNode) – A SegNode to begin counting from.
- Returns
Number of SegNode’s upstream.
- Return type
int
- draw_multi_measure(color_dict, label_map=[], node_size=200, font_size=8, font_weight='bold', node_shape='s', linewidths=2, font_color='w', with_labels=False, with_cbar=False, with_background=True)[source]¶
Overlays multiple network plots to compare multiple measures at once.
Plots several networkx plots of user specified transparency for a single SimpleRiverNetwork to compare mutliple measures at once. For example, if dataset_1 is “Blues” and dataset_2 is “Reds”, then a bivariate colormap can be used where shades of purple would represent the combinations of dataset_1 and dataset_2.
- Parameters
color_dict (dict) – Expected as {name: [pandas.Series, cmap, alpha]} to organize which colormap applies to which data.
label_map (list, optional) – Text to be plotted on top of each node in the same order as seg_id_values. There must be a value for each seg_id in seg_id_values and the values must be unique, otherwise an error will arise in plotting.
node_size (float, optional) – Plotting size the nodes, defaulting at 200.
font_size (float, optional) – Font size of the text from label_map on top of each node, defaulted at 8.
font_weight (str, optional) – Font weight of the text from label_map on top of each node, defaulted as bold.
node_shape (str, optional) – Shape of the plotted nodes, defaults as ‘s’ for square. Networkx uses can use any one of ‘so^>v<dph8’.
linewidths (float, optional) – Width of the connecting lines between nodes, defaults as 2.
font_color (str, optional) – Font color of the text from label_map on top of each node, defaulted as w for white.
with_labels (boolean, optional) – Whether labels should be plotted on top of each node, True, or not, False. This is defaulted as False.
with_cbar (boolean, optional) – Whether a colorbar should be plotted right of the network plot, True, or not, False. This is defaulted as False.
with_background (boolean, optional) – Whether a background should be plotted with the network figure, True, or not, False. This is defaulted as True. If desiring to download the image with a transparent background, such as a PNG, then set this to False.
- draw_network(label_map=[], color_measure=None, cmap=<matplotlib.colors.LinearSegmentedColormap object>, node_size=200, font_size=8, font_weight='bold', node_shape='s', linewidths=2, font_color='w', node_color=None, with_labels=False, with_cbar=False, with_background=True, cbar_labelsize=10, vmin=None, vmax=None, edge_color='k', alpha=1, cbar_title='', cbar_label_pad=40, ax=None)[source]¶
Plots the river network through networkx.
Plots the visual component of the SimpleRiverNetwork where spatial connections between river segments can be seen. This graphical tool may not match the topographical shape of the actual river network, but it should be similar. Visualizng how the river segments are connected virtually can help find errors in the construction of large models or locate where analysis only associated with the seg_id of a river segment corresponds to the pseudo-physical network. Plotting the network with labels, highlighting specific nodes, color coding by pfafstetter basin, and other coloring can help visually connect this plot with a topographical plot.
- Parameters
label_map (list, optional) – Text to be plotted on top of each node in the same order as seg_id_values. There must be a value for each seg_id in seg_id_values and the values must be unique, otherwise an error will arise in plotting.
color_measure (pandas.Series, optional) – Describes how colors for each SegNode should be allocated relative to a linear colormap. The index is expected to match the indicies of seg_id_values as a 0:len(seg_id_values)-1 array.
cmap (matplotlib.colors.LinearSegmentedColormap, optional) – Colormap to be used for coloring the SimpleRiverNewtork plot. This is defaulted as matplotlib.cm.get_cmap(‘hsv’), a vibrant set of colors to alert that a more specific colormap has not been specified.
node_size (float, optional) – Plotting size the nodes, defaulting at 200.
font_size (float, optional) – Font size of the text from label_map on top of each node, defaulted at 8.
font_weight (str, optional) – Font weight of the text from label_map on top of each node, defaulted as bold.
node_shape (str, optional) – Shape of the plotted nodes, defaults as ‘s’ for square. Networkx uses can use any one of ‘so^>v<dph8’.
linewidths (float, optional) – Width of the connecting lines between nodes, defaults as 2.
font_color (str, optional) – Font color of the text from label_map on top of each node, defaulted as w for white.
with_labels (boolean, optional) – Whether labels should be plotted on top of each node, True, or not, False. This is defaulted as False.
with_cbar (boolean, optional) – Whether a colorbar should be plotted right of the network plot, True, or not, False. This is defaulted as False.
with_background (boolean, optional) – Whether a background should be plotted with the network figure, True, or not, False. This is defaulted as True. If desiring to download the image with a transparent background, such as a PNG, then set this to False.
cbar_labelsize (float, optional) – Font size of the labels on the colorbar that can be attached in with_cbar being set to True, defaulted as 10.
vmin (float, optional) – Minimum value for coloring
vmax (float, optional) – Maximum value for coloring
edge_color (str, optional) – Node outline color of each node, defaulted as ‘k’ for black.
alpha (float) – Transparancy of each node, where 1 is perfectly opaque and 0 is perfectly transparent. This is primarly useful in draw_multi_measure, where plots are overlayed on top of each other.
cbar_title (str, optional) – Title of the colorbar that can be attached in with_cbar being set to True. This is defaulted as ‘’ to exclude a title.
cbar_label_pad (float, optional) – Padding for the colorbar labels, defaulted as 40.
ax (Matplotlib Axes object, optional) – Draw the network in the specified Matplotlib axes.
- encode_pfaf(root_node=<class 'bmorph.evaluation.simple_river_network.SegNode'>, level=0, max_level=42)[source]¶
Recursively encodes pfafstetter codes on a SimpleRiverNetwork.
- Parameters
root_node (SegNode) – A SegNode from which to start encoding.
level (int) – How many levels deep into recursion the method already is. By default, this starts at 0.
max_level (int) – The maximum number of levels encode_pfaf will run for before raising a RuntimeError. By default, this is set to the arbitrary number 42 as a safety mechanism.
- find_branch(node: bmorph.evaluation.simple_river_network.SegNode)[source]¶
Locates the nearest upstream branch.
Locates a node that branches into 2+ nodes, returning what node branches and any nodes prior to the branch that where in a row.
- Parameters
node (SegNode) – A SegNode to start searching from.
- Returns
branch (SegNode) – The SegNode found with an upstream branch.
sequential_nodes (list) – The list of nodes preceeding and including branch.
- find_like_pfaf(node: bmorph.evaluation.simple_river_network.SegNode, target_pfaf_digits: list, degree: int)[source]¶
Finds nodes based on pfaffstetter codes.
Finds all nodes with the matching digit at the exact same location in pfaf_code and returns all of them in a list.
- Parameters
node (SegNode) – A SegNode to start searching from.
target_pfaf_digits (list) – A list of pfaf_digit to search for in the SimpleRiverNetwork. This can be a list of one element.
degree (int) – How many pfafstetter levels deep should the function look for i.e. if you have degree 2, and a pfaf_code of 1234, it will examin “3” to check if it is a match.
- Returns
A list of odes with like pfaffstetter codes that match the target_pfaf_digits at the input degree.
- Return type
list
- find_node(target_id, node: bmorph.evaluation.simple_river_network.SegNode)[source]¶
Linear search of SimpleRiverNetwork for a specific SegNode.
Searches for and returns a node with the desired seg_id upstream of node.
- Parameters
target_id (int) – A seg_id to search for within the SimpleRiverNetwork.
node (SegNode) – A SegNode to start searching from. The seg_id of this SegNode is checked against target_id.
- Returns
SegNode with seg_id matching target_id. If the desired SegNode could not be found, None is returned.
- Return type
- find_tributary_basins(tributaries)[source]¶
Finds the four tributaries with the largest drainage areas.
- Parameters
tributaries (list) – A list of tributary SegNode’s to be searched.
- Returns
A list of the largest_tributaries found in the list of tributaries given.
- Return type
list
- force_upstream_area(node: bmorph.evaluation.simple_river_network.SegNode)[source]¶
Aggregates upstream basin area regardless of _end_marker.
Calculates the basin area upstream of node of interest, ignoring _end_marker. This does include the area of the node of interest.
- Parameters
node (SegNode) – A SegNode to start from and calculate both its and all upstream SegNode’s aggregate area.
- Returns
Aggregate upstream area.
- Return type
float
- generate_mainstream_map()[source]¶
Highlights the mainstream for plotting in draw_network.
Creates a list of which nodes are part of the mainstream in order of the seg_id_values.
- Returns
int booleans denoting whether a river segment is part of the mainstream, 1, or off the mainstream, 0, corresponding to each seg_id in seg_id_values.
- Return type
list
- generate_node_highlight_map(seg_ids: list)[source]¶
Highlight specific SegNode’s in a SimpleRiverNetwork.
Takes a list of seg_id’s and creates a pandas.Series that highlights the nodes in the list. This is best used as a diagnostic tool, finding where a specific river segment is located on a network map. Using a colormap that has notably different colors on the extremes, such as matplotlib’s “Reds”, is recommended to make highlighted nodes stand out.
- Parameters
seg_ids (list) – A list of seg_id values to mark specific SegNode’s apart from other SegNode’s.
- Returns
A list that will identify these highlighted nodes for draw_network by int booleans, 1 is to be higlighted while 0 is not.
- Return type
list
- generate_pfaf_codes()[source]¶
Creates a list of pfaf_code values corresponding to seg_id_values.
- Returns
pfaf_code values mapped to seg_id_values.
- Return type
list
- generate_pfaf_color_map()[source]¶
Extracts the first pfaffstetter digit of each code for colorcoding.
This prepares a color_measure for draw_network, where each unique first level pfaffstetter basin can be assigned it’s own color by a Colormap. If a SegNode has the pfaf_code “1234”, then its will have the value “1” in returned map. Using a qualitative colormap of 10 distinct colors, such as matplotlib’s “tab10”, is recommended since there are 10 unique pfafstetter digits, (0 to 9).
- Returns
Map of color codes for pfaf_code of each SegNode to the indicies of each SegNode corresponding to its seg_id in seg_id_values. The index should not be reassigned as it is used to match the correct nodes together in draw_network.
- Return type
pandas.Series
- generate_pfaf_map()[source]¶
List of pfaffstetter codes corresponding to seg_id_values.
Creates a list of pfaf_code values in the order of the seg_id_values, including the `seg_id_values. This is a little more cluttered, reccommended only for debugging purposes.
- Returns
pfaf_code values for each corresponding seg_id in seg_id_values.
- Return type
list
- generate_weight_map()[source]¶
Creates a list proportional upstream area ratios for each seg_id_values.
Creates a list of fractional weights equivalent to the node’s upstream area divided by the overall basin_area of the whole SimpleRiverNetwork. these are in order of the seg_id_values.
- Returns
(river segment’s cumulative upstream area)/(total basin area) for each river segment corresponding to the seg_id’s in seg_id_values.
- Return type
list
- net_upstream_area(node: bmorph.evaluation.simple_river_network.SegNode)[source]¶
Aggregates upstream basin area.
Calculates the basin area upstream of node. This does include the area of the node of interest
- Parameters
node (SegNode) – A SegNode to start from and calculate both its and all upstream nodes aggregate area.
- Returns
Aggregate upstream area.
- Return type
float
- parse_upstream(node: bmorph.evaluation.simple_river_network.SegNode)[source]¶
Constructs and connects SegNodes according to the network.
Recursively constructs network by searching what SegNodes are upstream of the current SegNode and updates the current SegNode’s upstream list while also building the adjacency matrix.
- Parameters
node (SegNode) – A SegNode to start building the network from.
- pfaf_aggregate()[source]¶
Aggregates the flow network by one pfafstetter level.
This “rolls up” a SimpleRiverNetwork to simplify the overall map, similar to decreasing the number of lines in a contour plot to make it more legible. If the longest pfaf_code in the network is four digits long, such as “1234” or “5678”, then all of the SegNodes sharing the first three digits will be replaced by a singular SegNode with the pfaf_code of those first three digits. For example: if you have pfaf_code’s “1231”, “1232”, “1233”, and “1234”, then they become a SegNode with the pfaf_code “123”. Basin area for each SegNode is summed to create the basin area of the new SegNode. Aggregating other properties is still in progress.
- reconstruct_adj_mat(node, adj_mat: numpy.ndarray)[source]¶
Rebuilds the adjacency matrix from an existing flow tree.
- Parameters
node (SegNode) – A SegNode to construct the adjacency matrix from
adj_mat (numpy.ndarray) – A square numpy ndarray of zeros originally, the size equal to len(seg_id_values) by len(seg_id_values), to be filled.
- Returns
The reconstructed adjacenecy matrix that needs to be set as the network’s adjacency matrix to actually alter the flow tree.
- Return type
numpy.ndarray
- sort_by_pfaf(nodes: list, degree=<class 'int'>)[source]¶
Sorts a list of SegNodes by pfaf_code.
Sorts a list of SegNode’s in decreasing order of a pfaffstetter digit at the given degree. For example, if you have degree 2, and a pfaf_code of 1234, it will use “3” in sorting.
- Parameters
nodes (list) – A list of SegNode’s to sort.
degree (int) – Which index in a pfaf_code the nodes are to be sorted by
- Returns
The list of sorted nodes.
- Return type
list
- sort_streams(node=<class 'bmorph.evaluation.simple_river_network.SegNode'>)[source]¶
Sorts the mainstream and tributary branches from each other.
Returns which branches are part of the mainstream and which are part of the tributaries based on aggregate upstream area. This is typically used to determine even and odd pfaffsetter basins for encoding.
- Parameters
node (SegNode) – A SegNode to start tracing the mainstream from. this is the “root” of the flow tree.
- Returns
mainstreams (list) – A list of mainstream SegNode’s determined by having the greatest upstream area.
tributaries (list) – A list of tributaries having upstream_area less than the mainstream but still encountered along the way.
- spawn_srn(spawn_outlet)[source]¶
Creates a new SimpleRiverNetwork from spawn_outlet and upstream of it.
A new SimpleRiverNetwork structure is generated from the current network. This is useful if modeling a large watershed and desire to focus on a specific element of it without having to reselect out all the nodes, for example: the Snake River Basin within the Columbia River Basin dataset.
- Parameters
spawn_outlet (int) – The `seg_id`of a SegNode in the current SimpleRiverNetwork to generate from. This creates an outlet that the new tree is to be spawned from.
- Returns
A new SimpleRiverNetwork with the outlet set to spawn_outlet. Properites are transferred from the pervious SimpleRiverNetwork to this one, but any SegNodes not upstream from spawn_outlet are not included in this new one.
- Return type
- update_node_area(node: bmorph.evaluation.simple_river_network.SegNode)[source]¶
Updates the desired node with basin area information.
- Parameters
node (SegNode) – A SegNode to change it and only its basin_area.