API Reference

This page provides an auto-generated summary of the bmorph API. For more details and examples, refer to the relevant chapters in the main part of the documentation.

Core

Workflows

bmorph.core.workflows.apply_blendmorph(raw_upstream_ts, raw_downstream_ts, train_upstream_ts, train_downstream_ts, ref_upstream_ts, ref_downstream_ts, apply_window, raw_train_window, ref_train_window, interval, overlap, blend_factor, raw_upstream_y=None, raw_downstream_y=None, train_upstream_y=None, train_downstream_y=None, ref_upstream_y=None, ref_downstream_y=None, n_smooth_long=None, n_smooth_short=5, bw=3, xbins=200, ybins=10, rtol=1e-06, atol=1e-08, method='hist', train_cdf_min=1e-06, **kwargs)[source]

Bias correction performed by blending bmorphed flows on user-defined intervals.

Blendmorph performs spatially consistent bias correction; this function applies it on user-defined intervals. It does so by performing bmorph bias correction for each site’s timeseries according to upstream and downstream gauge sites (or proxies) where true flows are known. The upstream and downstream corrected timeseries are then multiplied by fractional weights, set by blend_factor, that sum to 1 between them so the corrected flows can be combined, or “blended,” into one representative corrected flow series for the site. It is therefore important to specify upstream and downstream values so that bias corrections are performed with the values that most closely represent each site being corrected.

Parameters
  • raw_upstream_ts (pandas.Series) – Raw flow timeseries corresponding to the upstream flows.

  • raw_downstream_ts (pandas.Series) – Raw flow timeseries corresponding to the downstream flows.

  • train_upstream_ts (pandas.Series) – Flow timeseries to train the bias correction model with for the upstream flows.

  • train_downstream_ts (pandas.Series) – Flow timeseries to train the bias correction model with for the downstream flows.

  • ref_upstream_ts (pandas.Series) – Observed/reference flow timeseries corresponding to the upstream flows.

  • ref_downstream_ts (pandas.Series) – Observed/reference flow timeseries corresponding to the downstream flows.

  • raw_train_window (pandas.date_range) – Date range to train the bias correction model.

  • apply_window (pandas.date_range) – Date range to apply bmorph onto flow timeseries.

  • ref_train_window (pandas.date_range) – Date range to smooth elements in ‘raw_ts’ and ‘bmorph_ts’.

  • interval (pandas.DateOffset) – Difference between bmorph application intervals.

  • overlap (int) – Total overlap in number of days the CDF windows have with each other, distributed evenly before and after the application window.

  • blend_factor (numpy.array) – An array determining how upstream and downstream bmorphing is proportioned. This is determined by the fill_method used in mizuroute_utils. The blend_factor entries are the proportion of upstream multipliers and totals added with 1-blend_factor of downstream multipliers and totals.

  • n_smooth_long (int, optional) – Number of elements that will be smoothed in raw_ts and bmorph_ts. The nsmooth value in this case is typically much larger than the one used for the bmorph function itself. For example, 365 days.

  • n_smooth_short (int) – Number of elements that will be smoothed when determining CDFs used for the bmorph function itself.

  • raw_upstream_y (pandas.Series, optional) – Raw time series of the second time series variable for conditioning corresponding to upstream flows.

  • raw_downstream_y (pandas.Series, optional) – Raw time series of the second time series variable for conditioning corresponding to downstream flows.

  • train_upstream_y (pandas.Series, optional) – Training second time series variable for conditioning corresponding to upstream flows.

  • train_downstream_y (pandas.Series, optional) – Training second time series variable for conditioning corresponding to downstream flows.

  • ref_upstream_y (pandas.Series, optional) – Target second time series variable for conditioning corresponding to upstream flows.

  • ref_downstream_y (pandas.Series, optional) – Target second time series variable for conditioning corresponding to downstream flows.

  • bw (int, optional) – Bandwidth for KernelDensity. This should only be used if method=’kde’.

  • xbins (int, optional) – Bins for the first time series. This should only be used if method=’hist’.

  • ybins (int, optional) – Bins for the second time series. This should only be used if method=’hist’.

  • rtol (float, optional) – The desired relative tolerance of the result for KernelDensity. This should only be used if method=’kde’.

  • atol (float, optional) – The desired absolute tolerance of the result for KernelDensity. This should only be used if method=’kde’.

  • method (str, optional) – Method to use for conditioning. Currently ‘hist’ using hist2D and ‘kde’ using kde2D are the only supported methods.

  • **kwargs – Additional keyword arguments. Mainly implemented for cross-compatibility with other methods so that a unified configuration can be used

Returns

  • bc_totals (pandas.Series) – Returns a time series of length of an interval in the bmorph window with bmorphed values.

  • bc_multipliers (pandas.Series) – Returns a time series of equal length to bc_totals used to scale the raw flow values into the bmorphed values returned in bc_totals.
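The blending step described for apply_blendmorph can be sketched as follows. This is a minimal illustration using NumPy, not bmorph's implementation: the helper name `blend` and the flow values are hypothetical, and the only thing taken from the description above is that blend_factor weights the upstream result and 1-blend_factor weights the downstream result.

```python
import numpy as np


def blend(upstream, downstream, blend_factor):
    """Combine two corrected series with weights that sum to 1.

    Hypothetical helper: `blend_factor` weights the upstream series and
    `1 - blend_factor` weights the downstream series.
    """
    upstream = np.asarray(upstream, dtype=float)
    downstream = np.asarray(downstream, dtype=float)
    blend_factor = np.asarray(blend_factor, dtype=float)
    return blend_factor * upstream + (1.0 - blend_factor) * downstream


# Made-up corrected flows for three time steps
up_corrected = np.array([10.0, 12.0, 14.0])
down_corrected = np.array([20.0, 16.0, 14.0])
blended = blend(up_corrected, down_corrected, blend_factor=0.25)
```

Because the weights sum to 1, the blended value always lies between the upstream and downstream corrected values, and the two agree wherever the inputs agree.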

bmorph.core.workflows.apply_bmorph(raw_ts, train_ts, ref_ts, apply_window, raw_train_window, ref_train_window, condition_ts=None, raw_y=None, train_y=None, ref_y=None, interval=<DateOffset: years=1>, overlap=60, n_smooth_long=None, n_smooth_short=5, bw=3, xbins=200, ybins=10, rtol=1e-06, atol=1e-08, method='hist', train_cdf_min=1e-06, **kwargs)[source]

Bias correction is performed by bmorph on user-defined intervals.

Parameters
  • raw_ts (pandas.Series) – Raw flow timeseries.

  • train_ts (pandas.Series) – Flow timeseries to train the bias correction model with.

  • ref_ts (pandas.Series) – Observed/reference flow timeseries.

  • condition_ts (pandas.Series) – A timeseries with a variable to condition on. This will be used in place of raw_y, train_y, and ref_y. It is mainly added as a convenience over specifying the same timeseries for each of those variables.

  • apply_window (pandas.date_range) – Date range to apply bmorph onto flow timeseries.

  • raw_train_window (pandas.date_range) – Date range to train the bias correction model.

  • ref_train_window (pandas.date_range) – Date range to smooth elements in ‘raw_ts’ and ‘bmorph_ts’.

  • raw_y (pandas.Series, optional) – Raw time series of the second time series variable for conditioning.

  • train_y (pandas.Series, optional) – Training second time series.

  • ref_y (pandas.Series, optional) – Target second time series.

  • interval (pandas.DateOffset) – Difference between bmorph application intervals.

  • overlap (int) – Total number of days overlap CDF windows have with each other, distributed evenly before and after the application window.

  • n_smooth_long (int, optional) – Number of elements that will be smoothed in raw_ts and bmorph_ts. The nsmooth value in this case is typically much larger than the one used for the bmorph function itself. For example, 365 days.

  • n_smooth_short (int, optional) – Number of elements that will be smoothed when determining CDFs used for the bmorph function itself.

  • bw (int, optional) – Bandwidth for KernelDensity. This should only be used if method=’kde’.

  • xbins (int, optional) – Bins for the first time series. This should only be used if method=’hist’.

  • ybins (int, optional) – Bins for the second time series. This should only be used if method=’hist’.

  • rtol (float, optional) – The desired relative tolerance of the result for KernelDensity. This should only be used if method=’kde’.

  • atol (float, optional) – The desired absolute tolerance of the result for KernelDensity. This should only be used if method=’kde’.

  • method (str) – Method to use for conditioning. Currently ‘hist’ using hist2D and ‘kde’ using kde2D are the only supported methods.

  • **kwargs – Additional keyword arguments. Mainly implemented for cross-compatibility with other methods so that a unified configuration can be used

Returns

  • bmorph_corr_ts (pandas.Series) – Returns a time series of length of an interval in the bmorph window with bmorphed values.

  • bmorph_multipliers (pandas.Series) – Returns a time series of equal length to bmorph_corr_ts used to scale the raw flow values into the bmorphed values returned in bmorph_corr_ts.
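How interval and overlap interact can be illustrated with a short sketch. It assumes, based only on the parameter descriptions above, that each application interval gets a CDF window widened by overlap / 2 days on either side. The helper `application_windows` is hypothetical and is not part of bmorph.

```python
import pandas as pd


def application_windows(start, end, interval, overlap):
    """Return a list of (apply_slice, cdf_slice) pairs.

    Hypothetical helper: each CDF window extends its application
    window by overlap / 2 days before and after.
    """
    half = pd.Timedelta(days=overlap / 2)
    t0 = pd.Timestamp(start)
    end = pd.Timestamp(end)
    windows = []
    while t0 < end:
        # Step forward by one interval, without running past the end
        t1 = min(t0 + interval, end)
        windows.append((slice(t0, t1), slice(t0 - half, t1 + half)))
        t0 = t1
    return windows


windows = application_windows(
    "2000-01-01", "2003-01-01", interval=pd.DateOffset(years=1), overlap=60
)
```

With the defaults shown in the signature (a one-year interval and a 60-day overlap), neighbouring CDF windows share 60 days in this sketch. Label-based slices in pandas include both endpoints, so a real implementation has to take care not to apply the correction twice at the boundary timestamps.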

bmorph.core.workflows.apply_scbc(ds, mizuroute_exe, bmorph_config, client=None, save_mults=False, **tqdm_kwargs)[source]

Applies Spatially Consistent Bias Correction (SCBC) by bias correcting local flows and re-routing them through mizuroute. This method can be run in parallel by providing a dask client.

Parameters
  • ds (xr.Dataset) – An xarray dataset containing time and seg dimensions and variables to be bias corrected. This will most likely come from the provided preprocessing utility, mizuroute_utils.to_bmorph

  • mizuroute_exe (str) – The path to the mizuroute executable

  • bmorph_config (dict) – The configuration for the bias correction. See the documentation on input specifications and selecting bias correction techniques for descriptions of the options and their choices.

  • client (dask.Client (optional)) – A client object to manage parallel computation.

  • save_mults (boolean (optional)) – Whether to save multipliers from bmorph for diagnosis. If True, multipliers are saved in the same directory as local flows. Defaults as False to not save multipliers.

  • **tqdm_kwargs (optional) – Keyword arguments for tqdm loops within apply_scbc.

Returns

region_totals – The rerouted, total, bias corrected flows for the region

Return type

xr.Dataset

BMORPH

bmorph: modify a time series by removing elements of persistent differences

(aka bias correction)

Persistent differences are inferred by comparing a ‘ref’ sample with a ‘training’ sample. These differences are then used to correct a ‘raw’ sample that is presumed to have the same persistent differences as the ‘training’ sample. The resulting ‘bmorph’ sample should then be consistent with the ‘ref’ sample.

bmorph.core.bmorph.bmorph(raw_ts, train_ts, ref_ts, raw_apply_window, raw_train_window, ref_train_window, raw_cdf_window, raw_y=None, ref_y=None, train_y=None, nsmooth=12, bw=3, xbins=200, ybins=10, rtol=1e-07, atol=0, method='hist', smooth_multipliers=True, train_cdf_min=1e-06)[source]

Morph raw_ts based on differences between ref_ts and train_ts

bmorph is an adaptation of the PresRat bias correction procedure from Pierce et al. (2015; http://dx.doi.org/10.1175/JHM-D-14-0236.1), which is itself an extension of the Equidistant quantile matching (EDCDFm) technique of Li et al. (2010; http://dx.doi.org/10.1029/94JD00483). The method as implemented here uses a multiplicative change in the quantiles of a CDF, followed by a simple correction to preserve changes in the long-term mean. No further frequency-based corrections are applied.

The method differs from PresRat in that it is not applied for fixed periods (but uses a moving window) to prevent discontinuities in the corrected time series and it does not apply a frequency-based correction.

The method also allows changes to be made through an adapted version of the EDCDFm technique or through the multiDimensional ConDitional EquiDistant CDF matching function if a second timeseries variable is passed.

Parameters
  • raw_ts (pandas.Series) – Raw time series that will be bmorphed

  • raw_cdf_window (slice) – Slice used to determine the CDF for raw_ts

  • raw_bmorph_window (slice) – Slice of raw_ts that will be bmorphed

  • ref_ts (pandas.Series) – Target time series. This is the time series with ref values that overlaps with train_ts and is used to calculate ref_cdf

  • train_ts (pandas.Series) – Training time series. This time series is generated by the same process as raw_ts but overlaps with ref_ts. It is used to calculate train_cdf

  • training_window (slice) – Slice used to subset ref_ts and train_ts when the mapping between them is created

  • nsmooth (int) – Number of elements that will be smoothed when determining CDFs

  • raw_y (pandas.Series) – Raw time series of the second time series variable for cqm

  • ref_y (pandas.Series) – Target second time series

  • train_y (pandas.Series) – Training second time series

  • bw (int) – bandwidth for KernelDensity

  • xbins (int) – Bins for the flow time series

  • ybins (int) – Bins for the second time series

  • train_cdf_min (float (optional)) – Minimum percentile allowed for train cdf. Defaults to 1e-06 to help handle data spikes in corrections caused by multipliers being too large when the ratio between reference and training flows is large.

Returns

bmorph_ts – Returns a time series of length raw_bmorph_window with bmorphed values

Return type

pandas.Series

bmorph.core.bmorph.bmorph_correct(raw_ts, bmorph_ts, correction_window, ref_mean, train_mean, nsmooth)[source]

Correct bmorphed values to preserve the ratio of change

Apply a correction to bmorphed values to preserve the mean change over a correction_window. This is similar to the correction applied in the original PresRat algorithm (Pierce et al. 2015; http://dx.doi.org/10.1175/JHM-D-14-0236.1), except that we use a rolling mean to determine the correction to avoid discontinuities on the boundaries.

Parameters
  • raw_ts (pandas.Series) – Series of raw values that have not been bmorphed

  • bmorph_ts (pandas.Series) – Series of bmorphed values (the bmorphed version of raw_ts)

  • correction_window (slice) – Slice of raw_ts and bmorph_ts over which the correction is applied

  • ref_mean (float) – Mean of target time series (ref_ts) for the base period

  • train_mean (float) – Mean of training time series (train_ts) for the base period

  • nsmooth (int) – Number of elements that will be smoothed in raw_ts and bmorph_ts. The nsmooth value in this case is typically much larger than the one used for the bmorph function itself. For example, 365 days.

Returns

bmorph_corrected_ts – Corrected series of length correction_window.

Return type

pandas.Series
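One plausible reading of the rolling-mean correction in bmorph_correct is sketched below. It is an assumption about how the pieces fit together, not the library's code: the helper name `mean_preserving_correction` is hypothetical, and the centered rolling window with `min_periods=1` is an illustrative choice.

```python
import numpy as np
import pandas as pd


def mean_preserving_correction(raw_ts, bmorph_ts, correction_window,
                               ref_mean, train_mean, nsmooth):
    """Rescale bmorphed values so the smoothed change matches ref/train."""
    # Target ratio of change between the reference and training means
    scale = ref_mean / train_mean
    # Rolling means avoid abrupt jumps at the window boundaries
    raw_smooth = raw_ts.rolling(nsmooth, min_periods=1, center=True).mean()
    bmorph_smooth = bmorph_ts.rolling(nsmooth, min_periods=1, center=True).mean()
    correction = (scale * raw_smooth.loc[correction_window]
                  / bmorph_smooth.loc[correction_window])
    return bmorph_ts.loc[correction_window] * correction


# Made-up data: the "bmorphed" series overshoots the raw series by a factor of 3
index = pd.date_range("2000-01-01", periods=30, freq="D")
raw = pd.Series(np.linspace(1.0, 4.0, 30), index=index)
bmorphed = 3.0 * raw
window = slice("2000-01-10", "2000-01-20")
corrected = mean_preserving_correction(
    raw, bmorphed, window, ref_mean=2.0, train_mean=1.0, nsmooth=7
)
```

In this toy case the correction pulls the factor-of-3 change back to the factor of 2 implied by ref_mean / train_mean.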

bmorph.core.bmorph.cqm(raw_x: Series, train_x: Series, ref_x: Series, raw_y: Series, train_y: Series, ref_y: Optional[Series] = None, method='hist', xbins=200, ybins=10, bw=3, rtol=1e-07, atol=0, nsmooth=5, train_cdf_min=1e-06) → Series[source]

Conditional Quantile Mapping

Multidimensional conditional equidistant CDF matching function:
\tilde{x_{mp}} = x_{mp} + F^{-1}_{oc}(F_{mp}(x_{mp}|y_{mp})|y_{oc}) - F^{-1}_{mc}(F_{mp}(x_{mp}|y_{mp})|y_{mc})

bmorph.core.bmorph.edcdfm(raw_x, raw_cdf, train_cdf, ref_cdf, train_cdf_min=1e-06)[source]

Calculate multipliers using an adapted version of the EDCDFm technique

This routine implements part of the PresRat bias correction method from Pierce et al. (2015; http://dx.doi.org/10.1175/JHM-D-14-0236.1), which is itself an extension of the Equidistant quantile matching (EDCDFm) technique of Li et al. (2010; http://dx.doi.org/10.1029/94JD00483). The part that is implemented here is the amended form of EDCDFm that determines multiplicative changes in the quantiles of a CDF.

In particular, if the value raw_x falls at quantile u_t (in raw_cdf), then the bias-corrected value is the value in ref_cdf at u_t (ref_x) multiplied by the model-predicted change at u_t evaluated as a ratio (i.e., model future (or raw_x) / model historical (or ref_x)). Thus, the bias-corrected value is raw_x multiplied by ref_x/train_x. Here we only return the multiplier ref_x/train_x. This method preserves the model-predicted median (not mean) change evaluated multiplicatively. Additional corrections are required to preserve the mean change. Inclusion of these additional corrections constitutes the PresRat method.

Parameters
  • raw_x (pandas.Series) – Series of raw values that will be used to determine the quantile u_t

  • raw_cdf (pandas.Series) – Series of raw values that represents the CDF that is used to determine the non-parametric quantile of raw_x

  • train_cdf (pandas.Series) – Series of training values that represents the CDF based on the same process as raw_cdf, but overlapping in time with ref_cdf

  • ref_cdf (pandas.Series) – Series of ref values that represents the ref CDF and that overlaps in time with train_cdf

Returns

multiplier – Multipliers for raw_x. The pandas.Series has the same index as raw_x

Return type

pandas.Series
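The multiplier ref_x/train_x described for edcdfm can be sketched with NumPy. This is a simplified stand-in rather than the library's routine: the helper name `quantile_multipliers` and its `train_floor` argument are hypothetical. The floor only loosely mirrors the role of train_cdf_min by keeping the denominator away from zero.

```python
import numpy as np


def quantile_multipliers(raw_x, raw_cdf, train_cdf, ref_cdf, train_floor=1e-6):
    """Return ref_x / train_x evaluated at the quantile of each raw value."""
    raw_x = np.asarray(raw_x, dtype=float)
    raw_sorted = np.sort(np.asarray(raw_cdf, dtype=float))
    # Non-parametric quantile u_t of each raw value within raw_cdf
    u_t = np.searchsorted(raw_sorted, raw_x, side="right") / raw_sorted.size
    # Values of the training and reference samples at the same quantile
    train_x = np.quantile(np.asarray(train_cdf, dtype=float), u_t)
    ref_x = np.quantile(np.asarray(ref_cdf, dtype=float), u_t)
    # Assumed guard: floor the denominator to avoid very large multipliers
    train_x = np.maximum(train_x, train_floor)
    return ref_x / train_x


# Made-up samples: the reference is exactly twice the training sample
train_sample = np.arange(1.0, 101.0)
ref_sample = 2.0 * train_sample
raw_sample = 1.5 * train_sample
multipliers = quantile_multipliers(
    [3.0, 75.0, 150.0], raw_sample, train_sample, ref_sample
)
```

Since the reference sample is twice the training sample at every quantile, each raw value receives a multiplier of 2 regardless of where it falls in raw_cdf.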

bmorph.core.bmorph.hist2D(x, y, xbins, ybins, **kwargs)[source]

Create a 2 dimensional pdf via numpy histogram2d

bmorph.core.bmorph.kde2D(x, y, xbins=200, ybins=10, **kwargs)[source]

Estimate a 2 dimensional pdf via kernel density estimation

bmorph.core.bmorph.marginalize_cdf(y_raw, z_raw, vals)[source]

Find the marginalized cdf by computing cumsum(P(x|y=val)) for each val
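The idea behind hist2D and marginalize_cdf, a joint histogram turned into a cumulative distribution of x for each bin of y, can be sketched as below. The helper `conditional_cdf` is hypothetical and only illustrates the cumsum(P(x|y)) step. It does not reproduce the signatures or return values of the functions listed above.

```python
import numpy as np


def conditional_cdf(x, y, xbins=200, ybins=10):
    """Return the CDF of x within each y bin, plus the bin edges."""
    counts, x_edges, y_edges = np.histogram2d(x, y, bins=[xbins, ybins])
    # Number of samples falling in each y bin (one total per column)
    totals = counts.sum(axis=0, keepdims=True)
    # Avoid division by zero for empty y bins; their CDF stays at 0
    safe_totals = np.where(totals > 0, totals, 1.0)
    cdf = np.cumsum(counts, axis=0) / safe_totals
    return cdf, x_edges, y_edges


# Made-up data: a skewed "flow" variable and a conditioning variable
rng = np.random.default_rng(0)
flow = rng.gamma(2.0, 1.0, size=5000)
condition = rng.normal(size=5000)
cdf, x_edges, y_edges = conditional_cdf(flow, condition, xbins=50, ybins=5)
```

Each column of `cdf` rises monotonically to 1 for any y bin that contains samples, which is what allows a quantile to be looked up conditionally on y.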

Local Flows

bmorph.core.local_flows.estimate_local_flow(ref_total_flow: Series, sim_total_flow: Series, sim_local_flow: Series, how: str = 'flow_fraction', method_kwargs: Dict[str, Any] = {}) → Series[source]

Estimate the local flow for a single site.

Parameters
  • ref_total_flow – Reference total flow to calculate the reference_local_flow from

  • sim_total_flow – Simulated total flow

  • sim_local_flow – Simulated local flow

  • how – How to estimate the local reference flow. Available options are flow_fraction

  • method_kwargs – Additional arguments to pass to the estimator.

Returns

The estimated local reference flow.

Return type

pandas.Series

bmorph.core.local_flows.find_local_segment(lats, lons, target_latlon, n_return=10, metric='euclidean', gridsearch=False) → Dict[str, ndarray][source]

Finds the closest coordinates to a given target. Can return multiple coordinates, in ascending order from closest to furthest.

Parameters
  • lats – Latitudes to search through

  • lons – Longitudes to search through

  • target_latlon – Tuple of (lat, lon) that is being searched for

  • n_return – Number of closest coordinates to return

  • metric – Distance metric. Can be any valid metric from scipy.spatial.distance.cdist

  • gridsearch – Whether to create a meshgrid from the given lats and lons

Returns

A dictionary containing:

  • coords – Coordinates of the n_return nearest points as (lat, lon)

  • distances – Distances from target_latlon according to metric

  • indices – Indices into lats and lons of the n_return nearest points

Return type

Dict[str, ndarray]
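A nearest-coordinate search of the kind find_local_segment performs can be sketched with plain NumPy. This sketch uses Euclidean distance only, computed with `np.linalg.norm` as a stand-in for scipy.spatial.distance.cdist, and the helper name `nearest_coordinates` is hypothetical.

```python
import numpy as np


def nearest_coordinates(lats, lons, target_latlon, n_return=10):
    """Return the n_return points closest to target_latlon, nearest first."""
    points = np.column_stack([lats, lons]).astype(float)
    target = np.asarray(target_latlon, dtype=float)
    # Euclidean distance in degrees; adequate only for small search areas
    distances = np.linalg.norm(points - target, axis=1)
    order = np.argsort(distances)[:n_return]
    return {
        "coords": points[order],
        "distances": distances[order],
        "indices": order,
    }


# Made-up segment coordinates and a target location
lats = [45.0, 46.0, 47.0]
lons = [-120.0, -121.0, -122.0]
nearest = nearest_coordinates(lats, lons, (46.1, -121.1), n_return=2)
```

The result lists the second point first, followed by the third, because those two are closest to the target.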

bmorph.core.local_flows.flow_fraction_multiplier(total_flow: Series, local_flow: Series, nsmooth: int = 30) → Series[source]

Calculate the ratio of local flow to total flow from timeseries after applying a rolling mean.

Parameters
  • total_flow – Total accumulated flow at the given location

  • local_flow – Portion of flow that is directly from the sub-basin (excludes upstream contributions)

  • nsmooth – Number of timesteps to use for the rolling mean
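The ratio of smoothed local flow to smoothed total flow can be sketched with pandas. The helper `flow_fraction` is hypothetical, `min_periods=1` is an illustrative choice, and the final step of scaling a reference total flow by the fraction is an assumption about how such a multiplier might be used.

```python
import numpy as np
import pandas as pd


def flow_fraction(total_flow, local_flow, nsmooth=30):
    """Ratio of rolling-mean local flow to rolling-mean total flow."""
    total_smooth = total_flow.rolling(nsmooth, min_periods=1).mean()
    local_smooth = local_flow.rolling(nsmooth, min_periods=1).mean()
    return local_smooth / total_smooth


# Made-up simulated flows: the local flow is a quarter of the total
index = pd.date_range("2000-01-01", periods=90, freq="D")
sim_total = pd.Series(np.linspace(50.0, 80.0, 90), index=index)
sim_local = 0.25 * sim_total
fraction = flow_fraction(sim_total, sim_local)

# Assumed usage: scale a reference total flow by the simulated fraction
ref_total = pd.Series(np.linspace(60.0, 90.0, 90), index=index)
ref_local_estimate = ref_total * fraction
```

Smoothing both series before dividing keeps the fraction stable on days when the total flow is small or noisy.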

bmorph.core.local_flows.quantile_regression(total_flow: Series, local_flow: Series, nsmooth: int = 30)[source]

TODO: Implement

Utilities

mizuRoute Utilities

bmorph.util.mizuroute_utils.calculate_blend_vars(routed: Dataset, topology: Dataset, reference: Dataset, gauge_sites=None, route_var='IRFroutedRunoff', fill_method='kldiv', min_kge=-0.41)[source]

Calculates a number of variables used in blendmorph and map_var_to_seg.

Parameters
  • routed (xr.Dataset) – The dataset that will be modified and returned ready for map_var_to_seg.

  • topology (xr.Dataset) – Contains the network topology with a “seg” dimension that identifies reaches, matching the routed dataset.

  • reference (xr.Dataset) – Contains reaches used for reference with dimension “site” and coordinate “seg”.

  • gauge_sites (list, optional) – Gauge site names from the reference dataset to be used. If None, they are pulled automatically from reference.

  • route_var (str) – Variable name of flows used for fill_method purposes within routed. This is defaulted as ‘IRFroutedRunoff’.

  • fill_method (str) – While finding some upstream/downstream reference segs may be simple, (segs with ‘is_gauge’ = True are their own reference segs, others may be easy to find looking directly up or downstream), some river networks may have multiple options to select gauge sites and may fail to have upstream/downstream reference segs designated. ‘fill_method’ specifies how segs should be assigned upstream/downstream reference segs for bias correction if they are missed walking upstream or downstream.

    Currently supported methods:
    ‘leave_null’

    nothing is done to fill missing reference segs, np.nan values are replaced with a -1 seg designation and that’s it

    ‘forward_fill’

    xarray’s ffill method is used to fill in any np.nan values

    ‘r2’

    reference segs are selected based on which reference site that seg’s flows has the greatest r2 value with

    ‘kldiv’ (default)

    reference segs are selected based on which reference site that seg’s flows has the smallest KL Divergence value with

    ‘kge’

    reference segs are selected based on which reference site that seg’s flows has the greatest KGE value with

  • min_kge (float) – If not None, all upstream/downstream reference seg selections will be filtered according to the min_kge criteria, where seg selections that have a kge with the current seg that is less than min_kge will be set to -1 and determined unsuitable for bias correction. This is defaulted as -0.41.

Returns

routed – with the following added: ‘is_headwaters’ ‘is_gauge’ ‘down_seg’ ‘distance_to_up_gauge’ ‘distance_to_down_gauge’ ‘cdf_blend_factor’ ‘up_seg’ ‘up_ref_seg’ ‘down_ref_seg’

Return type

xr.Dataset

bmorph.util.mizuroute_utils.calculate_cdf_blend_factor(routed: Dataset, gauge_reference: Dataset, gauge_sites=None, fill_method='kldiv', min_kge=-0.41)[source]

Calculates the cumulative distribution function blend factor based on distance to a seg’s nearest up gauge site with respect to the total distance between the two closest gauge sites to the seg.

Parameters
  • routed (xr.Dataset) – Contains flow timeseries data.

  • gauge_reference (xr.Dataset) – Contains reference flow timeseries data for the same watershed as the routed dataset.

  • gauge_sites (list, optional) – If None, gauge_sites will be taken as all those listed in gauge_reference.

  • fill_method (str) – See map_ref_sites for full description of how fill_method works.

    Because each fill_method selects reference segs differently, calculate_blend_vars needs to know how they were selected to create blend factors. Note that ‘leave_null’ is not supported for this method because there is no filling for this method. Currently supported:

    ‘forward_fill’
    cdf_blend_factor = distance_to_upstream / (distance_to_upstream + distance_to_downstream)

    ‘kldiv’

    cdf_blend_factor = kldiv_upstream / (kldiv_upstream + kldiv_downstream)

    ‘r2’

    cdf_blend_factor = r2_upstream / (r2_upstream + r2_downstream)

Returns

routed – The original routed dataset updated with ‘cdf_blend_factors’ used to combine upstream and downstream relative bias corrections. Each fill_method will also add or use upstream and downstream statistical measures calculated in map_ref_sites.

Return type

xr.Dataset
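The three blend-factor formulas listed for calculate_cdf_blend_factor share the form upstream / (upstream + downstream). A minimal sketch for the distance-based case is shown below. The helper `distance_blend_factor` is hypothetical, and returning NaN when both distances are zero is an assumption made here for illustration.

```python
import numpy as np


def distance_blend_factor(distance_to_upstream, distance_to_downstream):
    """Compute upstream / (upstream + downstream) element-wise."""
    up = np.asarray(distance_to_upstream, dtype=float)
    down = np.asarray(distance_to_downstream, dtype=float)
    total = up + down
    # Assumed guard: leave the factor undefined where both distances are 0
    with np.errstate(invalid="ignore", divide="ignore"):
        return np.where(total > 0, up / total, np.nan)


# Made-up distances for three segments
cdf_blend_factor = distance_blend_factor([2.0, 5.0, 0.0], [6.0, 5.0, 0.0])
```

A segment halfway between two gauges gets a factor of 0.5. Substituting the KL divergence or r2 values for the distances gives the other two variants.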

bmorph.util.mizuroute_utils.find_max_kge(ds, curr_seg_flow)[source]

Searches through ds to find which seg has the largest Kling-Gupta Efficiency (KGE) value with respect to curr_seg_flow. If no seg is found, max_kge = -np.inf and max_kge_ref_seg = -1.

Parameters
  • ds (xr.Dataset) – Contains the variable ‘reference_flow’ to compare curr_seg_flow against and the coordinate ‘seg’.

  • curr_seg_flow (np.array) – A numpy array containing flow values that KGE is to be maximized with respect to.

Returns

  • max_kge (float) – Maximum KGE value found.

  • max_kge_ref_seg (int) – River segment designation corresponding to max_kge.

bmorph.util.mizuroute_utils.find_max_r2(ds, curr_seg_flow)[source]

Searches through ds to find which seg has the greatest r2 value with respect to curr_seg_flow. If no seg is found, max_r2 = 0 and max_r2_ref_seg = -1.

Parameters
  • ds (xr.Dataset) – Contains the variable ‘reference_flow’ to compare curr_seg_flow against and the coordinate ‘seg’.

  • curr_seg_flow (np.array) – A numpy array containing flow values that r2 is to be maximized with respect to.

Returns

  • max_r2 (float) – Magnitude of the maximum R squared value found.

  • max_r2_ref_seg (int) – River segment designation corresponding to the max_r2.

bmorph.util.mizuroute_utils.find_min_kldiv(ds, curr_seg_flow)[source]

Searches through ds to find which seg has the smallest Kullback-Leibler Divergence value with respect to curr_seg_flow. If no seg is found, min_kldiv = -1 and min_kldiv_ref_seg = -1. https://en.wikipedia.org/wiki/Kullback%E2%80%93Leibler_divergence

Parameters
  • ds (xr.Dataset) – Contains the variable ‘reference_flow’ to compare curr_seg_flow against and the coordinate ‘seg’.

  • curr_seg_flow (np.array) – A numpy array containing flow values that KL Divergence is to be minimized with respect to.

Returns

  • min_kldiv (float) – Magnitude of the minimum KL Divergence found.

  • min_kldiv_ref_seg (int) – River segment designation corresponding to min_kldiv.
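Selecting the reference segment with the smallest KL divergence can be sketched as follows. The histogram-based estimate, the bin count, the `eps` smoothing term, and the helper name `kl_divergence` are all assumptions chosen for illustration. The library may estimate the divergence differently.

```python
import numpy as np


def kl_divergence(p_samples, q_samples, bins=20, eps=1e-10):
    """Histogram-based estimate of KL(P || Q) on a shared set of bins."""
    p_samples = np.asarray(p_samples, dtype=float)
    q_samples = np.asarray(q_samples, dtype=float)
    lo = min(p_samples.min(), q_samples.min())
    hi = max(p_samples.max(), q_samples.max())
    edges = np.linspace(lo, hi, bins + 1)
    p, _ = np.histogram(p_samples, bins=edges)
    q, _ = np.histogram(q_samples, bins=edges)
    # Normalize to probabilities; eps keeps empty bins out of log(0)
    p = p / p.sum() + eps
    q = q / q.sum() + eps
    p /= p.sum()
    q /= q.sum()
    return float(np.sum(p * np.log(p / q)))


# Made-up flows: candidate 102 is nearly identical to the current segment
rng = np.random.default_rng(1)
current = rng.gamma(2.0, 1.0, size=2000)
candidates = {101: current + 3.0, 102: current * 1.01}
best_seg = min(candidates, key=lambda s: kl_divergence(current, candidates[s]))
```

The divergence is zero when the two samples produce identical histograms and grows as the distributions drift apart, so the minimum picks the most similar reference flow.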

bmorph.util.mizuroute_utils.find_up(ds, seg, sel_method='first', sel_var='IRFroutedRunoff')[source]

Finds the segment directly upstream of seg given seg is not a headwater segment, (in which case np.nan is returned).

Parameters
  • ds (xr.Dataset) – Dataset containing river segments as ‘seg’, headwater segments by ‘is_headwaters’, and what is downstream of each seg in ‘down_seg’.

  • seg (int) – River segment designation to search from.

  • sel_method (str) – Method to use when selecting among multiple upstream segments.

  • sel_var (str) – Variable used when comparing segments among multiple upstream segments. Can be ‘forward_fill’, ‘r2’, or ‘kge’.

Returns

up_seg – Upstream segment designation found, or np.nan if seg is a headwater segment.

Return type

int
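Looking up the segment directly upstream amounts to inverting the ‘down_seg’ relationship. The sketch below uses plain NumPy arrays in place of an xr.Dataset and mimics only a ‘first’ selection. The helper names and the four-segment network are hypothetical.

```python
import numpy as np


def upstream_segments(seg_ids, down_seg, seg):
    """All segments whose immediate downstream segment is `seg`."""
    seg_ids = np.asarray(seg_ids)
    down_seg = np.asarray(down_seg)
    return seg_ids[down_seg == seg]


def first_upstream(seg_ids, down_seg, seg):
    """First upstream segment, or np.nan for a headwater segment."""
    ups = upstream_segments(seg_ids, down_seg, seg)
    return ups[0] if ups.size else np.nan


# Made-up network: segments 1 and 2 drain into 3, which drains into 4
seg_ids = [1, 2, 3, 4]
down_seg = [3, 3, 4, -1]
up_of_3 = first_upstream(seg_ids, down_seg, 3)
up_of_1 = first_upstream(seg_ids, down_seg, 1)
```

Segment 3 has two upstream candidates, which is the situation where a selection method is needed to choose among them.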

bmorph.util.mizuroute_utils.kling_gupta_efficiency(sim, obs)[source]

Calculates the Kling-Gupta Efficiency (KGE) between two flow arrays. https://agrimetsoft.com/calculators/Kling-Gupta%20efficiency

Parameters
  • sim (array-like) – Simulated flow array.

  • obs (array-like) – Observed flow array.

Returns

kge – Kling-Gupta Efficiency calculated between the two arrays.

Return type

float
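For reference, the standard KGE definition combines correlation, variability ratio, and bias ratio. The sketch below follows that common formulation with NumPy. The function name `kge` is hypothetical, and whether bmorph uses exactly this variant is an assumption.

```python
import numpy as np


def kge(sim, obs):
    """Kling-Gupta Efficiency using correlation, std ratio, and mean ratio."""
    sim = np.asarray(sim, dtype=float)
    obs = np.asarray(obs, dtype=float)
    r = np.corrcoef(sim, obs)[0, 1]    # linear correlation
    alpha = sim.std() / obs.std()      # variability ratio
    beta = sim.mean() / obs.mean()     # bias ratio
    return 1.0 - np.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


# Made-up observed flows, compared against themselves and a doubled copy
obs_flow = np.array([1.0, 3.0, 2.0, 5.0, 4.0])
kge_perfect = kge(obs_flow, obs_flow)
kge_doubled = kge(2.0 * obs_flow, obs_flow)
```

A perfect match scores 1. Doubling the flows keeps the correlation at 1 but sets both ratios to 2, giving 1 - sqrt(2).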

bmorph.util.mizuroute_utils.map_headwater_sites(routed: Dataset)[source]

Identifies whether each river segment is a headwater, recording the result as the boolean ‘is_headwaters’.

Parameters

routed (xr.Dataset) – Contains watershed river segments designated as the dimension ‘seg’. River segments are connected by referencing immediate downstream segments as ‘down_seg’ for each ‘seg’.

Returns

routed – The original routed dataset updated with which sites are headwaters.

Return type

xr.Dataset
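The headwater test follows from the ‘down_seg’ description: a segment is a headwater if no other segment lists it as its downstream segment. A minimal NumPy sketch, with a made-up four-segment network standing in for the routed dataset:

```python
import numpy as np

# Made-up network: segments 1 and 2 drain into 3, which drains into 4
seg = np.array([1, 2, 3, 4])
down_seg = np.array([3, 3, 4, -1])

# A segment is a headwater when it never appears as a downstream segment
is_headwaters = ~np.isin(seg, down_seg)
```

Here segments 1 and 2 are flagged as headwaters, while 3 and 4 receive flow from upstream.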

bmorph.util.mizuroute_utils.map_met_hru_to_seg(met_hru, topo)[source]

Maps meteorological data from hru to seg.

Parameters
  • met_hru (xr.Dataset) – A dataset of meteorological data to be mapped onto the stream segments to facilitate conditioning. All variables in this dataset will automatically be mapped onto the stream segments and returned.

  • topo (xr.Dataset) – Topology dataset for running mizuRoute. We expect this to have seg and hru dimensions.

Returns

met_seg – A dataset of meteorological data mapped onto the stream segments to facilitate conditioning.

Return type

xr.Dataset

bmorph.util.mizuroute_utils.map_ref_sites(routed: Dataset, gauge_reference: Dataset, gauge_sites=None, route_var='IRFroutedRunoff', fill_method='r2', min_kge=-0.41)[source]

Assigns each seg in routed a boolean ‘is_gauge’ identifier and its upstream and downstream reference seg designations.

Parameters
  • routed (xr.Dataset) – Contains the input flow timeseries data.

  • gauge_reference (xr.Dataset) – Contains reference flow timeseries data for the same watershed as the routed dataset.

  • gauge_sites (list, optional) – If None, gauge_sites will be taken as all those listed in gauge_reference.

  • route_var (str) – Variable name of flows used for fill_method purposes within routed. This is defaulted as ‘IRFroutedRunoff’.

  • fill_method (str) – While finding some upstream/downstream reference segs may be simple, (segs with ‘is_gauge’ = True are their own reference segs, others may be easy to find looking directly up or downstream), some river networks may have multiple options to select gauge sites and may fail to have upstream/downstream reference segs designated. ‘fill_method’ specifies how segs should be assigned upstream/downstream reference segs for bias correction if they are missed walking upstream or downstream.

    Currently supported methods:
    ‘leave_null’

    nothing is done to fill missing reference segs, np.nan values are replaced with a -1 seg designation and that’s it

    ‘forward_fill’

    xarray’s ffill method is used to fill in any np.nan values

    ‘r2’

    reference segs are selected based on which reference site that seg’s flows has the greatest r2 value with

    ‘kldiv’

    reference segs are selected based on which reference site that seg’s flows has the smallest KL Divergence value with

    ‘kge’

    reference segs are selected based on which reference site that seg’s flows has the greatest KGE value with

Returns

routed – Routed timeseries with reference gauge site river segments assigned to each river segment in the original routed.

Return type

xr.Dataset

bmorph.util.mizuroute_utils.map_segs_topology(routed: Dataset, topology: Dataset)[source]

Adds contributing_area, length, and down_seg to routed from topology.

Parameters
  • routed (xr.Dataset) – Contains streamflow timeseries mapped to river segments denoted as ‘seg’.

  • topology (xr.Dataset) – Contains topological data of the watershed that routed’s streamflow timeseries describe. River segment designations, lengths, and immediate downstream segments are expected as ‘seg’, ‘Length’, and ‘Tosegment’.

Returns

routed – The input dataset routed updated with the topological data.

Return type

xr.Dataset

bmorph.util.mizuroute_utils.map_var_to_segs(routed: Dataset, map_var: DataArray, var_label: str, var_key: str, has_hru: DataArray, gauge_segs=None)[source]

Splits the variable into its up and down components to be used in blendmorph.

Parameters
  • routed (xr.Dataset) – the dataset that will be modified and returned having been prepared by calculate_blend_vars with the dimension ‘seg’

  • map_var (xr.DataArray) – contains the variable to be split into up and down components and can be the same as routed, (must also contain the dimension ‘seg’)

  • var_label (str) – suffix of the up and down parts of the variable

  • var_key (str) – variable name to access the variable to be split in map_var

  • has_hru (xr.DataArray) – contains the ‘has_hru’ variable aligned with map_var

  • gauge_segs (list, optional) – List of the gauge segs that identify the reaches that are gauge sites, pulled from routed if None.

Returns

routed – with the following added: f’down_{var_label}’ f’up_{var_label}’

Return type

xr.Dataset

bmorph.util.mizuroute_utils.to_bmorph(topo: Dataset, routed: Dataset, reference: Dataset, met_hru: Optional[Dataset] = None, route_var: str = 'IRFroutedRunoff', fill_method='r2', min_kge=None)[source]

Prepare mizuroute output for bias correction via the blendmorph algorithm. This allows an optional dataset of hru meteorological data to be given for conditional bias correction.

Parameters
  • topo (xr.Dataset) – Topology dataset for running mizuRoute. We expect this to have seg and hru dimensions.

  • routed (xr.Dataset) – The initially routed dataset from mizuRoute.

  • reference (xr.Dataset) – A dataset containing reference flows for bias correction. We expect this to have site and time dimensions with flows being stored in reference_flow.

  • met_hru (xr.Dataset, optional) – A dataset of meteorological data to be mapped onto the stream segments to facilitate conditioning. All variables in this dataset will automatically be mapped onto the stream segments and returned.

  • route_var (str) – Name of the variable of the routed runoff in the routed dataset. Defaults to IRFroutedRunoff.

  • fill_method (str) – Finding upstream/downstream reference segs is simple for some segs (segs with ‘is_gauge’ = True are their own reference segs, and others can be found by looking directly upstream or downstream), but some river networks offer multiple candidate gauge sites, and a walk upstream or downstream may fail to designate reference segs. ‘fill_method’ specifies how segs that are missed in this walk are assigned upstream/downstream reference segs for bias correction.

    Currently supported methods:
    ‘leave_null’

    nothing is done to fill missing reference segs, np.nan values are replaced with a -1 seg designation and that’s it

    ‘forward_fill’

    xarray’s ffill method is used to fill in any np.nan values

    ‘r2’ (default)

    reference segs are selected based on which reference site that seg’s flows has the greatest r2 value with

    ‘kldiv’

    reference segs are selected based on which reference site that seg’s flows has the smallest KL Divergence value with

    ‘kge’

    reference segs are selected based on which reference site that seg’s flows has the greatest KGE value with

  • min_kge (float, optional) – See calculate_blend_vars for more information. Defaults to None unless fill_method = ‘kge’.

Returns

met_seg – A dataset with the required data for applying the blendmorph routines. See the blendmorph documentation for further information.

Return type

xr.Dataset
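
As an illustration of the ‘r2’ fill method described above, the following minimal sketch picks, for a single segment, the reference site whose flows have the greatest squared Pearson correlation with that segment’s flows. The names pearson_r2 and pick_reference_site and the toy flow values are hypothetical and are not part of bmorph; the library’s own selection logic may differ.

```python
def pearson_r2(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series carries no correlation information
    return cov ** 2 / (var_x * var_y)


def pick_reference_site(seg_flow, reference_flows):
    """Return the key of the reference series with the greatest r2 (hypothetical helper)."""
    return max(reference_flows, key=lambda site: pearson_r2(seg_flow, reference_flows[site]))


# Toy data: flows at one ungauged segment and at two candidate reference sites.
seg_flow = [1.0, 2.0, 3.0, 4.0]
reference_flows = {
    "site_a": [2.0, 4.1, 5.9, 8.0],  # nearly proportional to seg_flow
    "site_b": [4.0, 1.0, 3.0, 2.0],  # weakly related to seg_flow
}
best_site = pick_reference_site(seg_flow, reference_flows)
```

The ‘kldiv’ and ‘kge’ methods follow the same pattern with a different score, taking the smallest KL divergence or the greatest KGE instead of the greatest r2.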

bmorph.util.mizuroute_utils.trim_time(dataset_list: list)[source]

Trims all times of the xr.Datasets in the list to the shortest timeseries.

Parameters

dataset_list (List[xr.Dataset]) – Contains a list of xr.Datasets

Returns

Contains a list in the same order as dataset_list except with all items in the list having the same start and end time.

Return type

list
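
The trimming step can be sketched with plain Python stand-ins for the datasets. This sketch assumes that "shortest timeseries" means the window shared by all inputs (latest start to earliest end) and that each series is sorted by date; trim_to_common_window is a hypothetical name, and lists of (date, value) pairs stand in for xr.Datasets.

```python
from datetime import date


def trim_to_common_window(series_list):
    """Trim each sorted list of (date, value) pairs to the window shared by all lists."""
    start = max(series[0][0] for series in series_list)  # latest start date
    end = min(series[-1][0] for series in series_list)   # earliest end date
    return [[(d, v) for d, v in series if start <= d <= end] for series in series_list]


a = [(date(2000, 1, d), float(d)) for d in range(1, 11)]  # 1 Jan to 10 Jan
b = [(date(2000, 1, d), float(d)) for d in range(3, 8)]   # 3 Jan to 7 Jan
trimmed_a, trimmed_b = trim_to_common_window([a, b])
```

The output keeps the order of the input list, matching the behaviour described for trim_time.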

bmorph.util.mizuroute_utils.walk_down(ds, start_seg)[source]

Finds the nearest downstream gauge site and returns the distance traveled to reach it from start_seg.

Parameters
  • ds (xr.Dataset) – Dataset containing river segments, downstream segs, the length of the river segments, and which segs are gauge sites as ‘seg’, ‘down_seg’, ‘length’, and ‘is_gauge’, respectively.

  • start_seg (int) – River segment designation to start walking from to a downstream gauge site.

Returns

  • tot_length (float) – Total length traveled during walk, (i.e. cumulative river distance from start_seg to the downstream gauge site).

  • cur_seg (int) – River segment designation of the gauge site reached.
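
A minimal sketch of the downstream walk, using dictionaries in place of the ‘down_seg’, ‘length’, and ‘is_gauge’ variables of the dataset. The function name walk_down_sketch is hypothetical, and the choice to add the length of each segment as it is left behind is an assumption; bmorph’s own accounting of distance may differ.

```python
def walk_down_sketch(down_seg, length, is_gauge, start_seg):
    """Follow downstream links from start_seg until a gauge segment is reached.

    Returns (tot_length, cur_seg). Each segment's length is added when the walk
    leaves that segment (an assumption made for this sketch).
    """
    tot_length = 0.0
    cur_seg = start_seg
    while not is_gauge[cur_seg]:
        if cur_seg not in down_seg:
            raise ValueError("no downstream gauge reachable from start_seg")
        tot_length += length[cur_seg]
        cur_seg = down_seg[cur_seg]
    return tot_length, cur_seg


# Hypothetical three-segment reach: 1 -> 2 -> 3, with a gauge on segment 3.
down_seg = {1: 2, 2: 3}
length = {1: 100.0, 2: 250.0, 3: 80.0}
is_gauge = {1: False, 2: False, 3: True}
tot_length, cur_seg = walk_down_sketch(down_seg, length, is_gauge, start_seg=1)
```

Starting on a gauge segment returns a distance of zero and the starting segment itself, which is consistent with gauge segments acting as their own reference.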

bmorph.util.mizuroute_utils.walk_up(ds, start_seg)[source]

Finds the nearest upstream gauge site and returns the distance traveled to reach it from start_seg.

Parameters
  • ds (xr.Dataset) – Dataset containing river segments, upstream segs, the length of the river segments, and which segs are gauge sites as ‘seg’, ‘up_seg’, ‘length’, and ‘is_gauge’, respectively.

  • start_seg (int) – River segment designation to start walking from to an upstream gauge site.

Returns

  • tot_length (float) – Total length traveled during walk, (i.e. cumulative river distance from start_seg to the upstream gauge site).

  • cur_seg (int) – River segment designation of the gauge site reached.

Evaluation

Plotting

bmorph.evaluation.plotting.anomaly_scatter2D(computations: dict, baseline_key: str, vert_key: str, horz_key: str, sites=[], multi=True, colors=['Maroon', 'Navy', 'Black', 'Orange', 'Yellow', 'Blue', 'Grey', 'Pink', 'Lavender'], show_legend=True)[source]

Plots two correction models against each other after Raw is subtracted from each.

Parameters
  • computations (dict) – Expecting {“Correction Name”: correction pandas.DataFrame}.

  • baseline_key (str) – Dictionary key for the computations dictionary that accesses what baseline the corrections should be compared to. This is typically observations.

  • vert_key (str) – Dictionary key for the computations dictionary that accesses the model to be plotted on the vertical axis.

  • horz_key (str) – Dictionary key for the computations dictionary that accesses the model to be plotted on the horizontal axis.

  • sites (list) – Site(s) to be compared in the plot, can have a size of 1. If multi is set to False and this is not changed to a single site, then the first value in the list will be chosen.

  • multi (boolean, optional) – Whether the plot uses data from multiple sites or a single site.

  • colors (list, optional) – Colors as strings to be plotted from. Plotting colors are different for each correction DataFrame, but same across sites for a singular correction. An error will be thrown if there are more cor_keys than colors.

  • show_legend (boolean, optional) – Whether or not to display the legend, defaults as True.

bmorph.evaluation.plotting.calc_water_year(df: DataFrame)[source]

Calculates the water year.

Parameters

df (pandas.DataFrame) – Flow timeseries with a DatetimeIndex.

Returns

A pandas.DataFrame index grouped by water year.

Return type

pandas.DataFrame.index
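
For readers unfamiliar with the term, a water year label can be computed from a date as in the sketch below. It assumes the common U.S. convention that the water year starts on 1 October and is named after the calendar year in which it ends; the start month used by bmorph is not stated here, so start_month is a hypothetical parameter of this sketch.

```python
from datetime import date


def water_year(day, start_month=10):
    """Water year label for a date, assuming the year begins on the 1st of start_month.

    With start_month=10 (an assumption), 1 Oct 2000 through 30 Sep 2001 is
    water year 2001.
    """
    return day.year + 1 if day.month >= start_month else day.year


days = [date(2000, 9, 30), date(2000, 10, 1), date(2001, 9, 30)]
labels = [water_year(d) for d in days]
```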

bmorph.evaluation.plotting.color_code_nxgraph(graph: networkx.graph, measure: Series, cmap=<matplotlib.colors.LinearSegmentedColormap object>, vmin=None, vmax=None) → dict[source]

Creates a dictionary mapping of nodes to color values.

Parameters
  • graph (networkx.graph) – Graph to be color coded

  • measure (pandas.Series) – Contains river segment ID’s as the index and desired measures as values.

  • cmap (matplotlib.colors.LinearSegmentedColormap, optional) – Colormap to be used for coloring the SimpleRiverNetwork plot. This defaults as ‘coolwarm_r’.

  • vmin (float, optional) – Minimum value for coloring

  • vmax (float, optional) – Maximum value for coloring

Returns

Dictionary of {i:color} where i is the index of the river segment.

Return type

dict

bmorph.evaluation.plotting.compare_CDF(flow_dataset: Dataset, plot_sites: list, raw_var: str, raw_name: str, ref_var: str, ref_name: str, bc_vars: list, bc_names: list, plot_colors: list, logit_scale=True, logarithm_base='10', units='Q [$m^3/s$]', markers=['o', 'x', '*', '*'], figsize=(20, 20), sharex=False, sharey=True, fontsize_title=40, fontsize_labels=30, fontsize_tick=20, markersize=1, alpha=0.3)[source]

Compare cumulative distribution functions on a logit scale.

Plots the CDF’s of the raw, reference, and bias corrected flows.

Parameters
  • flow_dataset (xarray.Dataset) – Contains raw, reference, and bias corrected flows.

  • plot_sites (list) – Gauge sites to be plotted, as contained in flow_dataset.

  • raw_var (str) – Accesses the raw flows in flow_dataset for flows before bias correction.

  • raw_name (str) – Label for the raw flows in the legend, corresponding to raw_var.

  • ref_var (str) – Accesses the reference flows in flow_dataset for true flows.

  • ref_name (str) – Label for the reference flows in the legend, corresponding to ref_var.

  • bc_vars (list) – Accesses the bias corrected flows in flow_dataset, where each element accesses its own bias corrected flows. Can be a size of 1.

  • bc_names (list) – Label for the bias corrected flows in the legend for each entry in bc_vars, assumed to be in the same order. Can be a size of 1.

  • logit_scale (boolean, optional) – Whether to plot the vertical scale on a logit axis (True) or not (False). Defaults as True.

  • logarithm_base (str, optional) – The logarithmic base to use for the horizontal scale. Only the following are currently supported:

    ‘10’ to use a log10 horizontal scale (default)

    ‘e’ to use a natural log horizontal scale

  • units (str, optional) – The horizontal axis’s label for units, defaults as r’Q [$m^3/s$]’.

  • plot_colors (list, optional) – Colors to be plotted for the flows corresponding to raw_var, ref_var, and bc_vars, defaulting as [‘grey’, ‘black’, ‘blue’, ‘red’], assuming there are two entries in bc_vars.

  • markers (list, optional) – Markers to be plotted for the flows corresponding to raw_var, ref_var, and bc_vars, defaulting as [‘o’, ‘x’, ‘*’, ‘*’], assuming there are two entries in bc_vars.

  • figsize (tuple, optional) – Figure size following matplotlib notation, defaults as (20, 20).

  • sharex (boolean or str, optional) – Whether horizontal axis should be shared amongst subplots, defaulting as False.

  • sharey (boolean or str, optional) – Whether vertical axis should be shared amongst subplots, defaulting as True.

  • fontsize_title (int, optional) – Font size of the title, defaults as 40.

  • fontsize_labels (int, optional) – Font size of the labels, defaults as 30.

  • fontsize_tick (int, optional) – Font size of the ticks, defaults as 20.

  • markersize (float, optional) – Size of the markers plotted, defaults as 1.

  • alpha (float, optional) – Transparency of the markers plotted, defaults as 0.3 to help see where markers clump together.

Return type

matplotlib.figure, matplotlib.axes

bmorph.evaluation.plotting.compare_CDF_all(flow_dataset: Dataset, plot_sites: list, raw_var: str, raw_name: str, ref_var: str, ref_name: str, bc_vars: list, bc_names: list, plot_colors: list, logit_scale=True, logarithm_base='10', units='Q [$m^3/s$]', figsize=(20, 20), fontsize_title=40, fontsize_labels=40, fontsize_tick=40, markersize=1, alpha=0.3)[source]

Compare cumulative distribution functions as a summary statistic.

Plots the CDF’s of the raw, reference, and bias corrected flows with data from all sites in plot_sites combined for a summary statistic.

Parameters
  • flow_dataset (xarray.Dataset) – Contains raw (uncorrected), reference (true), and bias corrected flows.

  • plot_sites (list) – Gauge sites to be plotted.

  • raw_var (str) – Accesses the raw (uncorrected) flows in flow_dataset.

  • raw_name (str) – Label for the raw flows in the legend, corresponding to raw_var.

  • ref_var (str) – Accesses the reference (true) flows in flow_dataset.

  • ref_name (str) – Label for the reference flows in the legend, corresponding to ref_var.

  • bc_vars (list) – Accesses the bias corrected flows in flow_dataset. Can be a size of 1.

  • bc_names (list) – Label for the bias corrected flows in the legend, corresponding to bc_vars. Can be a size of 1.

  • plot_colors (list, optional) – Colors to be plotted for the flows corresponding to raw_var, ref_var, and bc_vars, defaulting as [‘grey’, ‘black’, ‘blue’, ‘red’], assuming there are two entries in bc_vars.

  • logit_scale (boolean, optional) – Whether to plot the vertical scale on a logit axis (True) or not (False). Defaults as True.

  • logarithm_base (str, optional) – The logarithmic base to use for the horizontal scale. Only the following are currently supported:

    ‘10’ to use a log10 horizontal scale (default)

    ‘e’ to use a natural log horizontal scale

  • units (str, optional) – The horizontal axis’s label for units, defaults as r’Q [$m^3/s$]’.

  • figsize (tuple, optional) – Figure size following matplotlib conventions, defaults as (20, 20).

  • fontsize_title (int, optional) – Font size of the title, defaults as 40.

  • fontsize_labels (int, optional) – Font size of the labels, defaults as 40.

  • fontsize_tick (int, optional) – Font size of the ticks, defaults as 40.

  • markersize (float, optional) – Size of the markers plotted, defaults as 1. Linewidth is half of this value.

  • alpha (float, optional) – Transparency of the lines and markers, defaults as 0.3.

Return type

matplotlib.figure, matplotlib.axes

bmorph.evaluation.plotting.compare_PDF(flow_dataset: Dataset, gauge_sites=<class 'list'>, raw_var='raw_flow', ref_var='reference_flow', bc_var='bias_corrected_total_flow', raw_name='Mizuroute Raw', ref_name='NRNI Reference', bc_name='BMORPH BC', fontsize_title=40, fontsize_labels=30, fontsize_tick=20)[source]

Compare probability distribution functions.

Plots the PDF’s of the raw, reference, and bias corrected flows for each gauge site.

Parameters
  • flow_dataset (xarray.Dataset) – Contains raw, reference, and bias corrected flows.

  • gauge_sites (list) – Gauge sites to be plotted as used in the flow_dataset.

  • raw_var (str, optional) – The string to access the raw flows in flow_dataset for flows before bias correction. Defaults as ‘raw_flow’.

  • ref_var (str, optional) – The string to access the reference flows in flow_dataset for true flows. Defaults as ‘reference_flow’.

  • bc_var (str, optional) – The string to access the bias corrected flows in flow_dataset for flows after bias correction. Defaults as ‘bias_corrected_total_flow’.

  • raw_name (str, optional) – Label for the raw flows before bias correction. Defaults as ‘Mizuroute Raw’.

  • ref_name (str, optional) – Label for the reference flows. Defaults as ‘NRNI Reference’.

  • bc_name (str, optional) – Label for the bias corrected flows after bias correction. Defaults as ‘BMORPH BC’.

  • fontsize_title (int, optional) – Fontsize of the title. Defaults as 40.

  • fontsize_labels (int, optional) – Fontsize of the labels. Defaults as 30.

  • fontsize_tick (int, optional) – Fontsize of the ticks. Defaults as 20.

bmorph.evaluation.plotting.compare_correction_scatter(flow_dataset: Dataset, plot_sites: list, raw_var='raw', raw_name='Mizuroute Raw', ref_var='ref', ref_name='Reference', bc_vars=[], bc_names=[], plot_colors=['blue', 'purple', 'orange', 'red'], title='Absolute Error in Flow$(m^3/s)$', fontsize_title=80, fontsize_legend=68, alpha=0.05, fontsize_subplot=60, fontsize_tick=45, fontcolor='black', pos_cone_guide=False, neg_cone_guide=False, symmetry=True)[source]

Difference from reference flows before and after correction.

Plots differences between the raw and reference flows on the horizontal and differences between the bias corrected and reference flows on the vertical. This compares corrections needed before and after the bias correction method is applied.

Parameters
  • flow_dataset (xarray.Dataset) – contains raw, reference, and bias corrected flows.

  • plot_sites (list) – Sites to be plotted, expected as the seg coordinate in flow_dataset.

  • raw_var (str, optional) – The string to access the raw flows in flow_dataset, defaults as ‘raw’.

  • raw_name (str, optional) – Label for the raw flows in the legend, defaults as ‘Mizuroute Raw’.

  • ref_var (str, optional) – The string to access the reference flows in flow_dataset, defaults as ‘ref’.

  • ref_name (str, optional) – Label for the reference flows in the legend, defaults as ‘Reference’.

  • bc_vars (list) – The strings to access the bias corrected flows in flow_dataset.

  • bc_names (list) – Labels for the bias corrected flows in the legend, expected in the same order as bc_vars.

  • plot_colors (list, optional) – Colors to be plotted for each site in plot_sites. Defaults as [‘blue’, ‘purple’, ‘orange’, ‘red’].

  • fontsize_title (int, optional) – Fontsize of the title, defaults as 80.

  • fontsize_legend (int, optional) – Fontsize of the legend, defaults as 68.

  • fontsize_subplot (int, optional) – Fontsize of the subplots, defaults as 60.

  • fontsize_tick (int, optional) – Fontsize of the ticks, defaults as 45.

  • fontcolor (str, optional) – Color of the font, defaults as ‘black’.

  • pos_cone_guide (boolean, optional) – If True, plots a positive 1:1 line through the origin for reference.

  • neg_cone_guide (boolean, optional) – If True, plots a negative 1:1 line through the origin for reference.

  • symmetry (boolean, optional) – If True, the plot axes are symmetrical about the origin (default). If False, plotting limits will minimize empty space while not losing any data.

bmorph.evaluation.plotting.compare_mean_grouped_CPD(flow_dataset: Dataset, plot_sites: list, grouper_func, raw_var: str, raw_name: str, ref_var: str, ref_name: str, bc_vars: list, bc_names: list, plot_colors: list, subset_month=None, units='Mean Annual Flow [$m^3/s$]', figsize=(20, 20), sharex=False, sharey=False, pp_kws={'postype': 'cunnane'}, fontsize_title=80, fontsize_legend=68, fontsize_subplot=60, fontsize_tick=45, fontsize_labels=80, linestyles=['-', '-', '-'], markers=['.', '.', '.'], markersize=30, alpha=1, legend_bbox_to_anchor=(1, 1), fig=None, axes=None, start_ax_index=0, tot_plots=None)[source]

Cumulative Probability Distributions.

Plots the CPD’s of the raw, reference, and bias corrected flows on a probability axis.

Parameters
  • plot_sites (list) – A list of sites to be plotted.

  • grouper_func – Function to group a pandas.DataFrame index by to calculate the mean of the grouped values.

  • raw_var (str) – The string to access the raw flows in flow_dataset.

  • raw_name (str) – Label for the raw flows in the legend.

  • ref_var (str) – The string to access the reference flows in flow_dataset.

  • ref_name (str) – Label for the reference flows in the legend.

  • bc_vars (list) – List of strings to access the bias corrected flows in flow_dataset.

  • bc_names (list) – List of labels for the bias corrected flows in the legend.

  • plot_colors (list) – Contains the colors to be plotted for raw_var, ref_var, and bc_vars, respectively.

  • subset_month (int, optional) – The integer date of a month to subset out for plotting, (ex: if you want to subset out January, enter 1). Defaults as None to avoid subsetting and use all the data in the year.

  • units (str, optional) – Vertical axis’s label for units, defaults as r’Mean Annual Flow [$m^3/s$]’.

  • pp_kws (dict, optional) – Plotting position computation as specified by https://matplotlib.org/mpl-probscale/tutorial/closer_look_at_plot_pos.html. Defaults as dict(postype=’cunnane’) for cunnane plotting positions.

  • fontsize_title (int, optional) – Font size for the plot title, defaults as 80.

  • fontsize_legend (int, optional) – Font size for the plot legend, defaults as 68.

  • fontsize_subplot (int, optional) – Font size for the plot subplot text, defaults as 60.

  • fontsize_tick (int, optional) – Font size for the plot ticks, defaults as 45.

  • fontsize_labels (int, optional) – Font size for the horizontal and vertical axis labels, defaults as 80.

  • linestyles (list, optional) – Linestyles for plotting raw_var, ref_var, and bc_vars, respectively. Defaults as [‘-’, ‘-’, ‘-’], expecting one of each.

  • markers (list, optional) – Markers for plotting raw_var, ref_var, and bc_vars, respectively. Defaults as [‘.’, ‘.’, ‘.’], expecting one of each.

  • markersize (int, optional) – Size of the markers for plotting, defaults as 30.

  • alpha (float, optional) – Alpha transparency value for plotting, where 1 is opaque and 0 is transparent.

  • legend_bbox_to_anchor (tuple, optional) – Box that is used to position the legend to the final axes. Defaults as (1, 1). Modify this if the legend does not plot where you desire it to be.

  • fig (matplotlib.figure, optional) – matplotlib figure object to plot on, defaulting as None and creating a new object unless otherwise specified.

  • axes (matplotlib.axes, optional) – Array-like of matplotlib axes objects to plot multiple plots on, defaulting as None and creating a new object unless otherwise specified.

  • start_ax_index (int, optional) – If the plots should not be plotted starting at the first ax in axes, specify the index that plotting should begin on. Defaults as 0, so plotting begins from the first ax.

  • tot_plots (int, optional) – If more plotting is to be done than with the total data to be provided, describe how many total plots there should be. Defaults as None.
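
The ‘cunnane’ plotting positions named in pp_kws assign each ranked value a non-exceedance probability of (i - 0.4) / (n + 0.2), where i is the rank and n the sample size. The sketch below computes them directly in plain Python; cunnane_positions and the sample flows are hypothetical and only illustrate the formula, whereas in practice the positions are computed by mpl-probscale.

```python
def cunnane_positions(n):
    """Cunnane plotting positions p_i = (i - 0.4) / (n + 0.2) for ranks i = 1..n."""
    return [(i - 0.4) / (n + 0.2) for i in range(1, n + 1)]


annual_means = [12.0, 7.5, 9.1, 15.3, 8.2]  # hypothetical mean annual flows
ranked = sorted(annual_means)
probabilities = cunnane_positions(len(ranked))
cpd = list(zip(ranked, probabilities))       # (flow, non-exceedance probability)
```

Because the positions never reach 0 or 1, every point can be drawn on a probability axis, which is undefined at those two extremes.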

bmorph.evaluation.plotting.correction_scatter(site_dict: dict, raw_flow: DataFrame, ref_flow: DataFrame, bc_flow: DataFrame, colors: list, title='Flow Residuals', fontsize_title=80, fontsize_legend=68, fontsize_subplot=60, fontsize_tick=45, fontcolor='black', pos_cone_guide=False, neg_cone_guide=False)[source]

Difference from reference flows before and after correction.

Plots differences between the raw and reference flows on the horizontal and differences between the bias corrected and reference flows on the vertical. This compares corrections needed before and after the bias correction method is applied.

Parameters
  • site_dict (dict) – Expects {subgroup name: list of segments in subgroup}, describing how sites are to be separated.

  • raw_flow (pandas.DataFrame) – Contains flows before correction.

  • ref_flow (pandas.DataFrame) – Contains the reference flows to compare raw_flow and bc_flow.

  • bc_flow (pandas.DataFrame) – Contains flows after correction.

  • colors (list) – Colors to be plotted for each site in site_dict.

  • title (str, optional) – Title label for the plot, defaults as ‘Flow Residuals’.

  • fontsize_title (int, optional) – Fontsize of the title, defaults as 80.

  • fontsize_legend (int, optional) – Fontsize of the legend, defaults as 68.

  • fontsize_subplot (int, optional) – Fontsize of the subplots, defaults as 60.

  • fontsize_tick (int, optional) – Fontsize of the ticks, defaults as 45.

  • fontcolor (str, optional) – Color of the font, defaults as ‘black’.

  • pos_cone_guide (boolean, optional) – If True, plots a positive 1:1 line through the origin for reference.

  • neg_cone_guide (boolean, optional) – If True, plots a negative 1:1 line through the origin for reference.

bmorph.evaluation.plotting.create_adj_mat(topo: Dataset) → ndarray[source]

Forms the adjacency matrix for the graph of the topography.

Note that this is independent of whatever the segments are called; it is purely a map of the relative object locations.

Parameters

topo (xarray.Dataset) – Describes the topography of the river network.

Returns

An adjacency matrix describing the river network.

Return type

numpy.ndarray
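
As a rough illustration of how an adjacency matrix can be built from downstream links, the sketch below uses plain lists in place of the ‘seg’ and ‘Tosegment’ variables. The name adjacency_from_tosegment is hypothetical, and the orientation (row i, column j set when segment i drains into segment j) is an assumption; create_adj_mat may orient or populate its matrix differently.

```python
def adjacency_from_tosegment(seg_ids, to_segment):
    """Build a square 0/1 matrix with adj[i][j] = 1 when segment i drains into segment j.

    Rows and columns follow the position of each segment in seg_ids, so the
    result does not depend on how the segments are named.
    """
    index = {seg: i for i, seg in enumerate(seg_ids)}
    adj = [[0] * len(seg_ids) for _ in seg_ids]
    for seg, down in zip(seg_ids, to_segment):
        if down in index:  # outlets point outside the network and add no edge
            adj[index[seg]][index[down]] = 1
    return adj


# Hypothetical network: segments 101 and 102 both drain into 103, the outlet.
seg_ids = [101, 102, 103]
to_segment = [103, 103, 0]
adj = adjacency_from_tosegment(seg_ids, to_segment)
```

Renaming the segments while keeping their order leaves the matrix unchanged, which is the independence from segment names noted above.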

bmorph.evaluation.plotting.create_nxgraph(adj_mat: ndarray) → Graph[source]

Creates a NetworkX Graph object given an adjacency matrix.

Parameters

adj_mat (numpy.ndarray) – Adjacency matrix describing the river network.

Returns

NetworkX Graph of respective nodes.

Return type

networkx.graph

bmorph.evaluation.plotting.custom_legend(names: List, colors=['Maroon', 'Navy', 'Black', 'Orange', 'Yellow', 'Blue', 'Grey', 'Pink', 'Lavender'])[source]

Creates a list of patches to be passed in as handles for the plt.legend function.

Parameters
  • names (list) – Legend names.

  • colors (list) – A list of the colors corresponding to names.

Returns

handles – Handle parameter for matplotlib.legend.

Return type

list

bmorph.evaluation.plotting.determine_row_col(n: int, pref_rows=True)[source]

Determines rows and columns for rectangular subplots.

Calculates a rectangular subplot layout that contains at least n subplots; some may need to be turned off when plotting. If a square configuration is possible, then a square configuration will be proposed. This helps automate the process of plotting a variable number of subplots.

Parameters
  • n (int) – Total number of plots.

  • pref_rows (boolean) – If True, and only a rectangular arrangement is possible, then put the longer dimension in n_rows. If False, then it is placed in n_columns.

Returns

  • n_rows (int) – Number of rows for matplotlib.subplot.

  • n_columns (int) – Number of columns for matplotlib.subplot.
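
One way to compute such a layout is sketched below: take the longer side as the ceiling of the square root of n and the shorter side as the fewest lines that still hold n plots. The name grid_shape is hypothetical, and this is only one reasonable rule; determine_row_col may choose a different rectangle for some values of n.

```python
import math


def grid_shape(n, pref_rows=True):
    """Return (n_rows, n_columns) for a near-square grid holding at least n plots."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    long_side = math.isqrt(n - 1) + 1  # ceil(sqrt(n)) without floating point
    short_side = -(-n // long_side)    # ceil(n / long_side)
    return (long_side, short_side) if pref_rows else (short_side, long_side)


square = grid_shape(9)                  # a perfect square gives a square grid
tall = grid_shape(5)                    # longer dimension placed in the rows
wide = grid_shape(5, pref_rows=False)   # longer dimension placed in the columns
```

Any cells beyond n are left for the caller to turn off, as the description above notes.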

bmorph.evaluation.plotting.diff_maxflow_plotter(observed: DataFrame, names: list, colors: list, *models: DataFrame)[source]

Plots box plots of numerous models grouped by site.

Parameters
  • observed (pandas.DataFrame) – A dataframe containing observations.

  • names (list) – List of the model names.

  • colors (list) – List of colors to be plotted.

  • *models (List[pandas.DataFrame]) – Any number of pandas.DataFrame objects to be evaluated.

Return type

matplotlib.figure, matplotlib.axes

bmorph.evaluation.plotting.diff_maxflow_sites(observed: DataFrame, predicted: DataFrame)[source]

Calculates difference in maximum flows on a hydrologic year and site-by-site basis.

Parameters
  • observed (pandas.DataFrame) – Dataframe containing all observations.

  • predicted (pandas.DataFrame) – Dataframe containing all predictions.

Returns

DataFrame containing the difference in maximum flows.

Return type

pandas.DataFrame

bmorph.evaluation.plotting.draw_dataset(topo: Dataset, color_measure: Series, cmap=<matplotlib.colors.LinearSegmentedColormap object>)[source]

Plots the river network through networkx.

Draws a networkx graph from a topological xarray.Dataset and color codes it based on a pandas.Series.

Parameters
  • topo (xarray.Dataset) – Contains river segment identifications and relationships.

  • color_measure (pandas.Series) – Indices are concurrent with the number of segs in topo. Typically this contains statistical information about the flows that will be color coded by least to greatest value.

  • cmap (matplotlib.colors.LinearSegmentedColormap, optional) – Colormap to be used for coloring the SimpleRiverNetwork plot. This defaults as ‘coolwarm_r’.

bmorph.evaluation.plotting.find_all_upstream(topo: Dataset, segID: int, return_segs: list = []) → ndarray[source]

Finds all upstream river segments for a given river segment from the xarray.Dataset.

Parameters
  • topo (xarray.Dataset)

  • segID (int)

  • return_segs (list)

Return type

numpy.ndarray
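
The idea of collecting every upstream segment can be sketched with a dictionary that maps each segment to its immediate downstream segment, in the spirit of ‘Tosegment’. The name all_upstream and the toy network are hypothetical, and the order of the returned segments is not meant to match find_all_upstream. The sketch builds its result in a local list rather than a default argument, since a mutable default such as [] is shared between calls in Python.

```python
def all_upstream(to_segment, seg_id):
    """Collect every segment that eventually drains into seg_id.

    to_segment maps each segment ID to its immediate downstream segment ID.
    """
    found = []
    stack = [seg_id]
    while stack:
        current = stack.pop()
        for seg, down in to_segment.items():
            if down == current:
                found.append(seg)
                stack.append(seg)  # continue the search above this segment
    return found


# Hypothetical network: 1 and 2 join at 3, then 3 and 4 join at 5 (the outlet).
to_segment = {1: 3, 2: 3, 3: 5, 4: 5, 5: 0}
upstream_of_5 = sorted(all_upstream(to_segment, 5))
upstream_of_3 = sorted(all_upstream(to_segment, 3))
```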

bmorph.evaluation.plotting.find_index_water_year(data: DataFrame) → int[source]

Finds the index of the first hydrologic year.

Parameters

data (pd.DataFrame) – Flow timeseries with a DateTime index.

Returns

Index of the first hydrologic year.

Return type

int

bmorph.evaluation.plotting.find_upstream(topo: Dataset, segID: int, return_segs: list = [])[source]

Finds what river segment is directly upstream from the xarray.Dataset.

Parameters
  • topo (xarray.Dataset) – Contains river network topography. Expecting each river segment’s immediate downstream river segment is designated by ‘Tosegment’.

  • segID (int) – Current river segment identification number.

  • return_segs (list) – River segment identification numbers upstream from segID. This defaults as an empty list to be filled by the method.

bmorph.evaluation.plotting.kl_divergence_annual_compare(flow_dataset: Dataset, sites: list, raw_var: str, raw_name: str, ref_var: str, ref_name: str, bc_vars: list, bc_names: list, plot_colors: list, title='Annual KL Diveregence Before/After Bias Correction', fontsize_title=40, fontsize_tick=30, fontsize_labels=40, fontsize_legend=30, showfliers=False, sharex=True, sharey='row', TINY_VAL=1e-06, figsize=(30, 20), show_y_grid=True)[source]

Kullback-Leibler Divergence compared before and after bias correction as boxplots.

Plots the KL divergence for each year per site as KL(P_{ref} || P_{raw}) and KL( P_{ref} || P_{bc}).

Parameters
  • flow_dataset (xarray.Dataset) – Contains raw (uncorrected), reference (true), and bias corrected flows.

  • sites (list) – Contains all the sites to be plotted as included in flow_dataset, (note that if the number of sites to be plotted is square or rectangular, the last site will not be plotted to save room for the legend).

  • raw_var (str) – Accesses the raw flows in flow_dataset.

  • raw_name (str) – Label for the raw flows in the legend and horizontal labels corresponding to raw_var.

  • ref_var (str) – Accesses the reference flows in flow_dataset.

  • ref_name (str) – Label for the reference flows in the legend and horizontal labels corresponding to ref_var.

  • bc_vars (list) – String(s) to access the bias corrected flows in flow_dataset.

  • bc_names (list) – Label(s) for the bias corrected flows in the legend and horizontal labels, corresponding to each element in bc_vars.

  • plot_colors (list) – Colors to be plotted for the raw and the bias corrected flows, respectively.

  • title (str, optional) – Title to be plotted, defaults as “Annual KL Diveregence Before/After Bias Correction”.

  • fontsize_title (int, optional) – Font size of the title, defaults as 40.

  • fontsize_tick (int, optional) – Font size of the ticks, defaults as 30.

  • fontsize_labels (int, optional) – Font size of the labels, defaults as 40.

  • fontsize_legend (int, optional) – Font size of the legend text, defaults as 30.

  • showfliers (boolean, optional) – Whether to include fliers in the boxplots, defaults as False.

  • sharex (boolean or str, optional) – Whether the horizontal axis is shared, defaults as True.

  • sharey (boolean or str, optional) – Whether the vertical axis is shared, defaults as ‘row’ to share the vertical axis in the same row.

  • TINY_VAL (float, optional) – Used to ensure there are no zero values in the data because zero values cause unusual behavior in calculating the KL Divergence. Defaults as 1E-6.

  • figsize (tuple, optional) – Figure size following matplotlib conventions, defaults as (30,20).

  • show_y_grid (boolean, optional) – Whether to plot y grid lines, defaults as True.

Return type

matplotlib.figure, matplotlib.axes
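
For reference, the discrete KL divergence KL(P || Q) is the sum over bins of P_i * ln(P_i / Q_i). The sketch below shows one way a small floor value such as TINY_VAL can keep empty bins from producing a division by zero or the logarithm of zero. The function name, the flooring strategy, and the binned values are assumptions made for illustration; bmorph’s own handling of TINY_VAL and its binning may differ.

```python
import math


def kl_divergence(p, q, tiny_val=1e-6):
    """Discrete KL(P || Q) in nats after flooring both inputs at tiny_val and normalising."""
    p = [max(v, tiny_val) for v in p]
    q = [max(v, tiny_val) for v in q]
    p_sum, q_sum = sum(p), sum(q)
    return sum((a / p_sum) * math.log((a / p_sum) / (b / q_sum)) for a, b in zip(p, q))


ref_hist = [0.1, 0.4, 0.5, 0.0]  # hypothetical binned reference flows
raw_hist = [0.3, 0.3, 0.2, 0.2]  # hypothetical binned raw flows
kl_ref_raw = kl_divergence(ref_hist, raw_hist)
```

A value of zero means the two distributions are identical, so a bias correction that works well should move KL(P_{ref} || P_{bc}) closer to zero than KL(P_{ref} || P_{raw}).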

bmorph.evaluation.plotting.log10_1p(x: ndarray)[source]

Return the log10 of one plus the input array, element-wise.

Parameters

x (numpy.ndarray) – An array of values greater than -1. If values are less than or equal to -1, then a domain error will occur in computing the logarithm.

Returns

y – Array of the values having the log10(element+1) computed.

Return type

numpy.ndarray
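
A plain-Python sketch of the same transform is shown below, using a list in place of a numpy.ndarray; log10_1p_sketch is a hypothetical name used to keep it distinct from the bmorph function. Note that Python's math.log10 raises a ValueError for values at or below -1 in this sketch, which may differ from how the array version reports the domain error.

```python
import math


def log10_1p_sketch(values):
    """Element-wise log10(1 + x); every x must be greater than -1."""
    return [math.log10(1.0 + x) for x in values]


flows = [0.0, 9.0, 99.0]
transformed = log10_1p_sketch(flows)
```

Because a value of zero maps to zero, the transform stays defined for flow series that contain zero flows, unlike a plain log10.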

bmorph.evaluation.plotting.norm_change_annual_flow(sites: list, before_bc: DataFrame, after_bc: DataFrame, colors=<class 'list'>, fontsize_title=60, fontsize_labels=40, fontsize_tick=30)[source]

Normalized change in annual flow volume.

Plots a series of subplots containing bar charts that depict the difference in normalized annual flow volume due to bias correction.

Parameters
  • sites (list) – String names of all the sites to be plotted, matching sites contained in the DataFrames before_bc and after_bc.

  • before_bc (pandas.DataFrame) – Contains flows, (not aggregated), before bias correction is applied.

  • after_bc (pandas.DataFrame) – Contains flows, (not aggregated), after bias correction is applied.

  • colors (list) – Colors to be used for each site’s subplot in the same order as sites, (does not have to be unique).

  • fontsize_title (int, optional) – Font size of the title. Defaults as 60.

  • fontsize_labels (int, optional) – Font size of the labels. Defaults as 40.

  • fontsize_tick (int, optional) – Font size of the ticks. Defaults as 30.

bmorph.evaluation.plotting.organize_nxgraph(topo: Graph)[source]

Orders the node positions into a hierarchical structure.

Based on the “dot” layout and given topography.

Parameters

topo (xarray.Dataset) – Contains river segment identifications and relationships.

Return type

networkx.positions

bmorph.evaluation.plotting.pbias_compare_hist(sites: list, raw_flow: DataFrame, ref_flow: DataFrame, bc_flow: DataFrame, grouper=TimeGrouper(freq=<YearEnd: month=12>, axis=0, sort=True, closed='right', label='right', how='mean', convention='e', origin='start_day'), total_bins=None, title_freq='Yearly', fontsize_title=90, fontsize_subplot_title=60, fontsize_tick=40, fontsize_labels=84, x_extreme=150)[source]

Histograms comparing percent bias before/after bias correction.

Creates a number of histogram subplots, one for each given site, that plot percent bias both before and after bias correction.

Parameters
  • sites (list) – Sites corresponding to the columns of the flow DataFrames, raw_flow, ref_flow, and bc_flow, to be plotted.

  • raw_flow (pandas.DataFrame) – Flows before bias correction.

  • ref_flow (pandas.DataFrame) – Reference flows for comparison as true values.

  • bc_flow (pandas.DataFrame) – Flows after bias correction.

  • grouper (pandas.TimeGrouper) – How flows should be grouped for bias correction, defaults as yearly.

  • total_bins (int, optional) – Number of bins to use in the histogram plots. If none is specified, defaults to the floored square root of the number of pbias difference values.

  • title_freq (str, optional) – An adjective description of the frequency with which the flows are grouped corresponding to grouper. Defaults as ‘Yearly’.

  • fontsize_title (int, optional) – Font size of the title, defaulting as 90.

  • fontsize_subplot_title (int, optional) – Font size of the subplot title, defaulting as 60.

  • fontsize_tick (int, optional) – Font size of the ticks, defaulting as 40.

  • fontsize_labels (int, optional) – Font size of the labels, defaulting as 84.

  • x_extreme (float, optional) – Greatest magnitude on the horizontal axis to specify the range, defaulting as 150, which results in a range of (-150, 150). This is useful for zooming in closer to the origin and excluding outlying percent biases.
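
As a rough illustration of the total_bins default and the x_extreme range described above, the sketch below computes a floored-square-root bin count and a symmetric axis range. The helper names default_total_bins and symmetric_range are hypothetical and are not part of bmorph.

```python
import math


def default_total_bins(n_values: int) -> int:
    """Floored square root of the number of values (hypothetical helper)."""
    return math.isqrt(n_values)


def symmetric_range(x_extreme: float) -> tuple:
    """Symmetric horizontal range (-x_extreme, x_extreme)."""
    return (-abs(x_extreme), abs(x_extreme))


bins = default_total_bins(50)      # 50 percent-bias values
axis_range = symmetric_range(150)  # the documented default for x_extreme
```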

bmorph.evaluation.plotting.pbias_diff_hist(sites: list, colors: list, raw_flow: ~pandas.core.frame.DataFrame, ref_flow: ~pandas.core.frame.DataFrame, bc_flow: ~pandas.core.frame.DataFrame, grouper=TimeGrouper(freq=<MonthEnd>, axis=0, sort=True, closed='right', label='right', how='mean', convention='e', origin='start_day'), total_bins=None, title_freq='Monthly', fontsize_title=90, fontsize_subplot_title=60, fontsize_tick=40, fontsize_labels=84)[source]

Histograms of differences in percent bias before/after bias correction.

Creates a number of histogram subplots by each given site that plot the difference in percent bias before and after bias correction.

Parameters
  • sites (list) – Sites that are the columns of the flow DataFrames, raw_flow, ref_flow, and bc_flow.

  • colors (list) – Colors to plot the sites with, (do not have to be different), that are used in the same order as the list of sites.

  • raw_flow (pandas.DataFrame) – Contains flows before bias correction.

  • ref_flow (pandas.DataFrame) – Contains reference flows for comparison as true values.

  • bc_flow (pandas.DataFrame) – Contains flows after bias correction.

  • grouper (pandas.TimeGrouper, optional) – How flows should be grouped for bias correction. This defaults as monthly.

  • total_bins (int, optional) – Number of bins to use in the histogram plots. If none specified, defaults to the floored square root of the number of pbias difference values.

  • title_freq (str, optional) – An adjective description of the frequency with which the flows are grouped, should align with grouper, although there is no check to verify this. This defaults as ‘Monthly’.

  • fontsize_title (int, optional) – Font size of the title. Defaults as 90.

  • fontsize_subplot_title (int, optional) – Font size of the subplots. Defaults as 60.

  • fontsize_tick (int, optional) – Font size of the ticks. Defaults as 40.

  • fontsize_labels (int, optional) – Font size of the labels. Defaults as 84.

bmorph.evaluation.plotting.pbias_plotter(observed: DataFrame, names: list, colors: list, *models: DataFrame)[source]

Plots box plots of numerous models grouped by site.

Parameters
  • observed (pandas.DataFrame) – DataFrame containing observations.

  • names (list) – List of the model names.

  • colors (list) – List of colors to be plotted.

  • *models (List[pandas.DataFrame]) – Any number of pandas.DataFrame objects to be evaluated.

Return type

matplotlib.figure, matplotlib.axes

bmorph.evaluation.plotting.pbias_sites(observed: DataFrame, predicted: DataFrame)[source]

Calculates percent bias on a hydrologic year and site-by-site basis.

Parameters
  • observed (pandas.DataFrame) – Dataframe containing all observations.

  • predicted (pandas.DataFrame) – Dataframe containing all predictions.

Returns

DataFrame containing the computed percent bias.

Return type

pandas.DataFrame
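
The following is a minimal sketch of a percent bias calculation grouped by hydrologic year, written in plain Python rather than pandas. It assumes that percent bias is defined as 100 * sum(predicted - observed) / sum(observed) and that the hydrologic year begins on 1 October; both are assumptions for illustration and may differ from what pbias_sites does internally. All function names here are hypothetical.

```python
from collections import defaultdict
from datetime import date


def water_year(d: date) -> int:
    # Assumption: the hydrologic year starts on 1 October and is
    # labeled by the calendar year in which it ends.
    return d.year + 1 if d.month >= 10 else d.year


def percent_bias(observed, predicted):
    # Assumed definition: 100 * sum(predicted - observed) / sum(observed).
    return 100.0 * sum(p - o for o, p in zip(observed, predicted)) / sum(observed)


def pbias_by_water_year(dates, observed, predicted):
    """Group paired values by hydrologic year, then compute percent bias."""
    groups = defaultdict(lambda: ([], []))
    for d, o, p in zip(dates, observed, predicted):
        obs_list, pred_list = groups[water_year(d)]
        obs_list.append(o)
        pred_list.append(p)
    return {wy: percent_bias(o, p) for wy, (o, p) in sorted(groups.items())}


dates = [date(2000, 9, 30), date(2000, 10, 1), date(2001, 3, 1)]
observed = [10.0, 20.0, 20.0]
predicted = [11.0, 18.0, 20.0]
result = pbias_by_water_year(dates, observed, predicted)
```

A positive value indicates that the predictions overestimate the observations for that year, and a negative value indicates an underestimate.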

bmorph.evaluation.plotting.plot_reduced_flows(flow_dataset: ~xarray.core.dataset.Dataset, plot_sites: list, reduce_func=<function mean>, interval='day', statistic_label='Mean', units_label='$(m^3/s)$', title_label='Annual Mean Flows', raw_var='IRFroutedRunoff', raw_name='Mizuroute Raw', ref_var='upstream_ref_flow', ref_name='upstream_ref_flow', bc_vars=[], bc_names=[], fontsize_title=24, fontsize_legend=20, fontsize_subplot=20, fontsize_tick=20, fontcolor='black', figsize_width=20, figsize_height=12, plot_colors=['grey', 'black', 'blue', 'red'], return_reduced_flows=False)[source]

Creates a series of subplots plotting statistical day of year flows per gauge site.

Parameters
  • flow_dataset (xarray.Dataset) – Contains raw, reference, and bias corrected flows.

  • plot_sites (list) – Sites to be plotted.

  • reduce_func (function, optional) – A function to apply to flows grouped by interval, defaults as np.mean.

  • interval (str, optional) – What time interval annual reduce_func should be computed on. Currently supported is day for dayofyear (default), week for weekofyear, and month for monthly.

  • statistic_label (str, optional) – Label for the statistic representing the reduce_func, defaults as ‘Mean’ to fit reduce_func as np.mean.

  • units_label (str, optional) – Label for the units of flow, defaults as r`$(m^3/s)$`.

  • title_label (str, optional) – Label for the figure title representing the reduce_func, defaults as ‘Annual Mean Flows’ to fit reduce_func as np.mean.

  • raw_var (str, optional) – The string to access the raw flows in flow_dataset, defaults as ‘IRFroutedRunoff’.

  • raw_name (str, optional) – Label for the raw flows in the legend, defaults as ‘Mizuroute Raw’.

  • ref_var (str, optional) – The string to access the reference flows in flow_dataset, defaults as ‘upstream_ref_flow’.

  • ref_name (str, optional) – Label for the reference flows in the legend, defaults as ‘upstream_ref_flow’.

  • bc_vars (list) – The strings to access the bias corrected flows in flow_dataset.

  • bc_names (list) – Labels for the bias corrected flows in the legend, expected in the same order as bc_vars.

  • plot_colors (list, optional) – Colors to be plotted for raw_var, ref_var, bc_vars respectively. Defaults as [‘grey’, ‘black’, ‘blue’, ‘red’].

  • return_reduced_flows (boolean, optional) – If True, returns the reduced flows as calculated for plotting, defaults as False. This is typically used for debugging purposes.

  • fontsize_title (int, optional) – Font size of the plot title, defaults as 24.

  • fontsize_legend (int, optional) – Font size of the plot legend, defaults as 20.

  • fontsize_subplot (int, optional) – Font size for the subplots, defaults as 20.

  • fontsize_tick (int, optional) – Font size of the ticks, defaults as 20.

  • fontcolor (str, optional) – Color of the font, defaults as ‘black’.

  • figsize_width (int, optional) – Width of the figure, defaults as 20.

  • figsize_height (int, optional) – Height of the figure, defaults as 12.

Returns

If return_reduced_flows is False, matplotlib.figure and matplotlib.axes, otherwise the reduced_flows are returned as the xarray.Dataset.

Return type

xarray.Dataset or (matplotlib.figure, matplotlib.axes)
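
To make the role of reduce_func and interval concrete, here is a small sketch in plain Python that groups flows by day of year, week of year, or month across all years and applies a reducing function to each group. It does not use xarray, the function name reduce_by_interval is hypothetical, and the use of ISO week numbers for week is an assumption.

```python
from collections import defaultdict
from datetime import date
from statistics import mean


def reduce_by_interval(dates, flows, reduce_func=mean, interval="day"):
    """Group flows by a calendar interval across years and reduce each group."""
    key_funcs = {
        "day": lambda d: d.timetuple().tm_yday,   # day of year
        "week": lambda d: d.isocalendar()[1],     # ISO week number (assumption)
        "month": lambda d: d.month,
    }
    key = key_funcs[interval]
    groups = defaultdict(list)
    for d, q in zip(dates, flows):
        groups[key(d)].append(q)
    return {k: reduce_func(v) for k, v in sorted(groups.items())}


dates = [date(2001, 1, 1), date(2002, 1, 1), date(2001, 2, 1)]
flows = [1.0, 3.0, 5.0]
daily = reduce_by_interval(dates, flows, interval="day")
monthly = reduce_by_interval(dates, flows, reduce_func=max, interval="month")
```

Passing a different reduce_func, such as max or statistics.median, changes the statistic without changing the grouping.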

bmorph.evaluation.plotting.plot_residual_overlay(flows: DataFrame, upstream_sites: list, downstream_site: str, start_year: int, end_year: int, ax=None, fontsize_title=40, fontsize_labels=60, fontsize_tick=30, linecolor='k', alpha=0.3)[source]

Plots flow upstream/downstream residuals overlayed across one year span.

Plots residuals from each hydrologic year on top of each other with a reference line at zero flow. Residuals are calculated as downstream flows - sum(upstream flows).

Parameters
  • flows (pandas.DataFrame) – All flows to be used in plotting.

  • upstream_sites (list) – Strings of the site names stored in flows to aggregate.

  • downstream_site (str) – Name of the downstream site stored in flows that will have the upstream_sites subtracted from it.

  • start_year (int) – The starting year to plot.

  • end_year (int) – The year to conclude on.

  • ax (matplotlib.axes, optional) – Axes to plot on. If none is specified, a new one is created.

  • fontsize_title (int, optional) – Font size of the title.

  • fontsize_labels (int, optional) – Font size of the labels.

  • fontsize_tick (int, optional) – Font size of the ticks.

  • linecolor (str, optional) – Color of the lines plotted. Defaults as ‘k’ for black.

  • alpha (float, optional) – Transparency of the lines plotted. Defaults as 0.3 to help see how residuals line up across many years.

Return type

matplotlib.axes
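
The residual definition above, downstream flows minus the sum of upstream flows, can be sketched without pandas as follows. Here flows is a plain dictionary of equal-length lists keyed by site name, and the site names are made up for illustration.

```python
def residuals(flows, upstream_sites, downstream_site):
    """Downstream flow minus the summed upstream flows at each time step."""
    downstream = flows[downstream_site]
    return [
        d - sum(flows[site][i] for site in upstream_sites)
        for i, d in enumerate(downstream)
    ]


# Hypothetical site names and flow values.
flows = {"up_a": [1.0, 2.0], "up_b": [3.0, 4.0], "down": [5.0, 5.0]}
res = residuals(flows, ["up_a", "up_b"], "down")
```

A positive residual means the downstream site carries more flow than its upstream sites combined, and a negative residual means it carries less.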

bmorph.evaluation.plotting.plot_spearman_rank_difference(flow_dataset: ~xarray.core.dataset.Dataset, gauge_sites: list, start_year: str, end_year: str, relative_locations_triu: ~pandas.core.frame.DataFrame, basin_map_png, cmap=<matplotlib.colors.LinearSegmentedColormap object>, blank_plot_color='w', fontcolor='black', fontsize_title=60, fontsize_tick=30, fontsize_label=45)[source]

Creates a site-to-site rank correlation difference comparison plot with a map of the basin.

Parameters
  • flow_dataset (xarray.Dataset) – Contains raw flow values as ‘raw_flow’ and bias corrected flow values as ‘bias_corrected_total_flow’, the times of the flow values, and the names of the sites where those flow values exist.

  • gauge_sites (list) – Gauge sites to be plotted.

  • start_year (str) – String formatted as ‘yyyy-mm-dd’ to start rank correlation window.

  • end_year (str) – String formatted as ‘yyyy-mm-dd’ to end rank correlation window.

  • relative_locations_triu (pandas.DataFrame) – Denotes which sites are connected with a ‘1’ and has the lower triangle set to ‘0’.

  • basin_map_png (png file) – The basin map with site values marked.

  • cmap (matplotlib.colors.LinearSegmentedColormap, optional) – Colormap to be used for coloring the SimpleRiverNetwork plot. This defaults as ‘coolwarm_r’.

  • blank_plot_color (str, optional) – Color assigned to the lower extreme of cmap to keep extreme values from skewing the colormap and hiding other values. This defaults as ‘w’ for white. The color should match the plot background so that it appears as if no value is plotted there.

  • fontcolor (str, optional) – Color of the font, defaults as ‘black’.

  • fontsize_title (int, optional) – Font size of the title, defaults as 60.

  • fontsize_tick (int, optional) – Font size of the ticks, defaults as 30.

  • fontsize_label (int, optional) – Font size of the labels, defaults as 45.
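
As an illustration of the layout expected for relative_locations_triu, the sketch below builds an upper-triangular connection matrix from a list of connected site pairs using nested lists. The function name and site names are hypothetical, and leaving the diagonal at 0 is an assumption.

```python
def upper_triangle_connections(sites, connected_pairs):
    """Mark connected site pairs with 1 in the upper triangle only."""
    index = {site: i for i, site in enumerate(sites)}
    n = len(sites)
    matrix = [[0] * n for _ in range(n)]
    for a, b in connected_pairs:
        i, j = sorted((index[a], index[b]))
        if i != j:  # assumption: the diagonal is left as 0
            matrix[i][j] = 1
    return matrix


sites = ["A", "B", "C"]
triu = upper_triangle_connections(sites, [("B", "A"), ("B", "C")])
```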

bmorph.evaluation.plotting.rmseFracPlot(data_dict: dict, obs_key: str, sim_keys: list, sites=[], multi=True, colors=['Maroon', 'Navy', 'Black', 'Orange', 'Yellow', 'Blue', 'Grey', 'Pink', 'Lavender'])[source]

Root mean square values calculated by including descending values one-by-one.

Parameters
  • data_dict (dict) – Expecting {“Data Name”: data pandas.DataFrame}.

  • obs_key (str) – Dictionary key for data_dict that accesses the observations to be used as true values in calculating root mean squares.

  • sim_keys (list) – Dictionary keys accessing the simulated DataFrames in data_dict, used as predictions in calculating root mean squares.

  • sites (list) – Site(s) to be compared in the plot, can have a size of 1. If multi is set to False and this is not changed to a single site, then the first value in the list will be chosen.

  • multi (boolean, optional) – Whether the plot uses data from multiple sites or a single site.

  • colors (list, optional) – Colors as strings to be plotted from. Plotting colors are different for each correction DataFrame, but the same across sites for a single correction. An error will be thrown if there are more sim_keys than colors.

bmorph.evaluation.plotting.scatter_series_axes(data_x, data_y, label: str, color: str, alpha: float, ax=None) axes[source]

Creates a scatter axis for plotting.

Parameters
  • data_x (array-like) – Data for the x series.

  • data_y (array-like) – Data for the y series.

  • label (str) – Name for the axes.

  • color (str) – Color for the markers.

  • alpha (float) – Transparency for the markers.

Return type

matplotlib.axes

bmorph.evaluation.plotting.site_diff_scatter(predictions: dict, raw_key: str, model_keys: list, compare: dict, compare_key: str, site: str, colors=['Maroon', 'Navy', 'Black', 'Orange', 'Yellow', 'Blue', 'Grey', 'Pink', 'Lavender'])[source]

Creates a scatter plot of Raw-BC versus some measure.

Parameters
  • predictions (dict) – Expects {‘Prediction Names’ : Prediction pandas.DataFrame}. ‘Prediction Names’ will be printed in the legend.

  • raw_key (str) – The key for the predictions dictionary that directs to the raw data that each model will be subtracting.

  • model_keys (list) – A list of dictionary keys pertaining to the correction models to be plotted.

  • compare (dict) – Expecting {‘Measure name’ : measure pandas.DataFrame}. These are the measures plotted on the horizontal axis.

  • compare_key (str) – The dictionary key for the measure desired in the compare dictionary. ‘compare_key’ will be printed on the horizontal axis.

  • site (str) – A single site designation to be examined in the plot. This will be listed as the title of the plot.

  • colors (List[str], optional) – Colors as strings to be plotted from.

Return type

matplotlib.figure, matplotlib.axes

bmorph.evaluation.plotting.spearman_diff_boxplots_annual(raw_flows: DataFrame, bc_flows: DataFrame, site_pairings, fontsize_title=40, fontsize_tick=30, fontsize_labels=40, subtitle=None, median_plot_color='red')[source]

Annual difference in spearman rank as boxplots.

Creates box plots for each site pairing determining the difference in spearman rank for each year between the raw and bias corrected data.

Parameters
  • raw_flows (pandas.DataFrame) – Raw flows before bias correction with sites in the columns and time in the index.

  • bc_flows (pandas.DataFrame) – Bias corrected flows with sites in the columns and time in the index.

  • site_pairings (List[List[str]]) – List of list of string site pairs e.g. [[‘downstream_name’,’upstream_name’],…]. This is used to organize which sites should be paired together for computing the spearman rank difference.

  • fontsize_title (int, optional) – Font size of the title, defaults as 40.

  • fontsize_tick (int, optional) – Font size of the ticks, defaults as 30.

  • fontsize_labels (int, optional) – Font size of the labels, defaults as 40.

  • subtitle (str, optional) – Subtitle to include after “Annual Change in Spearman Rank: “. If no subtitle is specified, none is included and only the title is plotted.

  • median_plot_color (str) – Color to plot the boxplot’s median as, defaults as ‘red’.

bmorph.evaluation.plotting.spearman_diff_boxplots_annual_compare(flow_dataset: Dataset, site_pairings, raw_var: str, bc_vars: list, bc_names: list, plot_colors: list, showfliers=True, fontsize_title=40, fontsize_tick=25, fontsize_labels=30, figsize=(20, 20), sharey='row')[source]

Annual difference in spearman rank as boxplots.

Creates box plots for each site pairing determining the difference in spearman rank for each year between the raw and the bias corrected data.

Parameters
  • flow_dataset (xarray.Dataset) – Contains raw (uncorrected), reference (true), and bias corrected flows.

  • site_pairings (List[List[str]]) – List of list of string site pairs e.g. [[‘downstream_name’,’upstream_name’],…]. This is used to organize which sites should be paired together for computing the spearman rank difference.

  • raw_var (str) – Accesses the raw (uncorrected) flows in flow_dataset.

  • bc_vars (list) – Strings to access the bias corrected flows in flow_dataset.

  • bc_names (list) – Labels for the bias corrected flows from flow_dataset, corresponding to each element in bc_vars in the same order.

  • plot_colors (list) – Colors that are in the same order as the bc_vars and bc_names to be used in plotting.

  • showfliers (boolean, optional) – Whether to show fliers on the boxplots, defaults as True.

  • fontsize_title (int, optional) – Font size of the title, defaults as 40.

  • fontsize_tick (int, optional) – Font size of the ticks, defaults as 25.

  • fontsize_labels (int, optional) – Font size of the labels, defaults as 30.

  • figsize (tuple) – Figure size following matplotlib conventions, defaults as (20, 20).

  • sharey (boolean or str, optional) – Whether or how the vertical axis are to be shared, defaults as ‘row’ to have vertical axis in the same row shared.

Return type

matplotlib.figure, matplotlib.axes
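
The quantity behind these box plots can be sketched in plain Python as the Spearman rank correlation of a site pair computed once for raw flows and once for bias corrected flows, followed by a difference. The sketch below handles a single year of data and assumes the difference is taken as bias corrected minus raw; the sign convention and all function names are assumptions, not bmorph's implementation.

```python
from statistics import mean


def ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def spearman_difference(raw_down, raw_up, bc_down, bc_up):
    # Assumption: difference is bias corrected minus raw.
    return spearman(bc_down, bc_up) - spearman(raw_down, raw_up)


diff = spearman_difference(
    raw_down=[1, 2, 3, 4], raw_up=[4, 3, 2, 1],
    bc_down=[1, 2, 3, 4], bc_up=[2, 4, 6, 8],
)
```

Repeating this calculation for each year of each site pairing would give the sample of differences summarized by one box.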

bmorph.evaluation.plotting.stat_corrections_scatter2D(computations: dict, baseline_key: str, cor_keys: list, uncor_key: str, sites=[], multi=True, colors=['Maroon', 'Navy', 'Black', 'Orange', 'Yellow', 'Blue', 'Grey', 'Pink', 'Lavender'])[source]

Creates a scatter plot of the flow before/after corrections relative to observations.

Parameters
  • computations (dict) – Expecting {“Correction Name”: correction pandas.DataFrame}.

  • baseline_key (str) – Contains the dictionary key for the computations dictionary that accesses what baseline the corrections should be compared to. This is typically observations.

  • cor_keys (list) – Dictionary keys accessing the correction DataFrames in computations. These will be printed in the legend.

  • uncor_key (str) – The dictionary key that accesses the uncorrected data in computations.

  • sites (list) – Site(s) to be compared in the plot, can have a size of 1. If multi is set to False and this is not changed to a single site, then the first value in the list will be chosen.

  • multi (boolean, optional) – Determines whether the plot uses data from multiple sites or a single site, defaults as True.

  • colors (List[str], optional) – Colors as strings to be plotted from. Plotting colors are different for each correction DataFrame, but the same across sites for a single correction. An error will be thrown if there are more cor_keys than colors.

Return type

matplotlib.figure, matplotlib.axes

Simple River Network

class bmorph.evaluation.simple_river_network.SegNode(seg_id, pfaf_code)[source]

River segment node used in SimpleRiverNetwork.

Creates a node of a segment to be used in the simple river network.

Variables
  • pfaf_code (int) – Pfafstetter code for this river segment.

  • seg_id (int) – Identification for this river segment.

  • upstream (List[SegNode]) – Containing what is directly upstream from this SegNode.

  • basin_area (float) – Summative Basin Area for this seg.

  • _end_marker (boolean) – TRUE if this SegNode marks the end of an interbasin during simple_river_network.encode_pfaf. This variable is only used during the encoding of Pfafstetter codes for the SimpleRiverNetwork and is set to FALSE when not in use. Changing this variable could interfere with SimpleRiverNetwork operations.

  • encoded (boolean) – TRUE if this SegNode has been fully given a unique pfaf_code within the SimpleRiverNetwork, otherwise it is FALSE.

class bmorph.evaluation.simple_river_network.SimpleRiverNetwork(topo: ~xarray.core.dataset.Dataset, pfaf_seed=<class 'int'>, outlet_index=0, max_pfaf_level=42)[source]

Pseudo-physical visualization of watershed models.

The SimpleRiverNetwork maps nodes within a given topography to visualize their arrangements and simplify different parts of the network to track statistics propagating through the network. This tree network has the outlet as its root and parses upstream for all operations, opposite to the direction of flow.

Variables
  • topo (xarray.Dataset) – Dataset describing the topography of the river network.

  • seg_id_values (list) – All seg_id’s being used in the network that identify river segments in the watershed.

  • outlet (SegNode) – The end of the river network and start of the SimpleRiverNetwork, aka “root” of the network.

  • adj_mat (numpy.array) – A square adjacency matrix size N, where N is the length of seg_id_values, that can be used to graph the SimpleRiverNetwork, where the row/column index i corresponds to seg_id i in seg_id_values.

  • network_graph (networkx.graph) – A networkx graph part of the visual component of the network. More information about NetworkX can be found at https://networkx.org/

  • network_positions (dictionary) – A dictionary with networkx nodes as keys and positions as values, used in plotting the network.
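
To show how an adjacency matrix indexed by position in seg_id_values relates to a tree rooted at the outlet, here is a minimal sketch. The Node class is a hypothetical stand-in for SegNode, and treating the matrix as symmetric (undirected) is an assumption made for illustration.

```python
class Node:
    """Hypothetical stand-in for SegNode with only the fields used here."""

    def __init__(self, seg_id, upstream=None):
        self.seg_id = seg_id
        self.upstream = upstream or []


def build_adjacency(outlet, seg_id_values):
    """Square matrix where row/column i corresponds to seg_id_values[i]."""
    index = {seg_id: i for i, seg_id in enumerate(seg_id_values)}
    n = len(seg_id_values)
    adj = [[0] * n for _ in range(n)]
    stack = [outlet]
    while stack:  # walk upstream from the outlet
        node = stack.pop()
        for up in node.upstream:
            i, j = index[node.seg_id], index[up.seg_id]
            adj[i][j] = 1
            adj[j][i] = 1  # assumption: undirected, so the matrix is symmetric
            stack.append(up)
    return adj


outlet = Node(10, [Node(20), Node(30, [Node(40)])])
adj = build_adjacency(outlet, [10, 20, 30, 40])
```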

draw_network(label_map=[], color_measure=None)[source]

Plots the river network through networkx.

draw_multi_measure(color_dict, label_map=[])[source]

Overlays multiple network plots to compare multiple measures at once.

find_node(target_id, node: SegNode)[source]

Linear search of SimpleRiverNetwork for a specific SegNode.

find_like_pfaf(node: SegNode, target_pfaf_digits: list, degree: int)[source]

Finds nodes based on Pfafstetter codes.

collect_upstream_nodes(node: SegNode)[source]

Finds all upstream SegNode’s.

generate_pfaf_map()[source]

List of Pfafstetter codes corresponding to seg_id_values.

generate_weight_map()[source]

Creates a list of proportional upstream area ratios for each seg_id in seg_id_values.

generate_mainstream_map()[source]

Highlights the mainstream for plotting in draw_network.

generate_pfaf_color_map()[source]

Extracts the first Pfafstetter digit of each code for color coding.

generate_node_highlight_map(seg_ids: list)[source]

Highlight specific SegNode’s in a SimpleRiverNetwork.

pfaf_aggregate()[source]

Aggregates the flow network by one Pfafstetter level.

spawn_srn(spawn_outlet)[source]

Creates a new SimpleRiverNetwork from spawn_outlet and upstream of it.

aggregate_measure(dataset_seg_ids: ndarray, variable: ndarray, aggregation_function) Series[source]

This is a preliminary function.

Aggregates the measure value for the given variable based on how SimpleRiverNetwork has been aggregated and provides a pandas.Series to plot on the SimpleRiverNetwork.

Parameters
  • dataset_seg_ids (numpy.ndarray) – Contains all the seg_id values according to the original topology. This should be in the same seg order as variable.

  • variable (numpy.ndarray) – Contains all the variable values according to the original topology. This should be in the same seg order as dataset_seg_ids.

  • aggregation_function (numpy function) – A function to be passed in on how the aggregated segs should have this variable combined, recommended as a numpy function like np.sum, np.median, np.mean …

Returns

A pandas.Series formatted as: (seg_id_values_index, aggregated measure)

Return type

pandas.Series
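
The idea of combining per-segment values with a caller-supplied aggregation function can be sketched as below. This is not bmorph's implementation: the mapping group_of, from each original seg_id to the index of the aggregated node it belongs to, is a hypothetical input, and a plain dictionary is returned in place of a pandas.Series.

```python
def aggregate_by_group(dataset_seg_ids, variable, group_of, aggregation_function):
    """Combine values of segments that share an aggregated node."""
    groups = {}
    for seg_id, value in zip(dataset_seg_ids, variable):
        groups.setdefault(group_of[seg_id], []).append(value)
    return {g: aggregation_function(v) for g, v in sorted(groups.items())}


seg_ids = [101, 102, 103]
values = [1.0, 3.0, 10.0]
group_of = {101: 0, 102: 0, 103: 1}  # hypothetical aggregation mapping

summed = aggregate_by_group(seg_ids, values, group_of, sum)
largest = aggregate_by_group(seg_ids, values, group_of, max)
```

The max, mean, median, min, and sum variants documented for this class follow the same pattern with the aggregation function fixed.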

aggregate_measure_max(dataset_seg_ids: ndarray, variable: ndarray) Series[source]

This is a preliminary function.

Determines the maximum measure value for the given variable based on how SimpleRiverNetwork has been aggregated and provides a pandas.Series to plot on the SimpleRiverNetwork.

Parameters
  • dataset_seg_ids (numpy.ndarray) – Contains all the seg_id values according to the original topology. This should be in the same seg order as variable.

  • variable (numpy.ndarray) – Contains all the variable values according to the original topology. This should be in the same seg order as dataset_seg_ids.

Returns

A pandas.Series formatted as: (seg_id_values_index, aggregated measure)

Return type

pandas.Series

aggregate_measure_mean(dataset_seg_ids: ndarray, variable: ndarray) Series[source]

This is a preliminary function.

Determines the mean measure value for the given variable based on how SimpleRiverNetwork has been aggregated and provides a pandas.Series to plot on the SimpleRiverNetwork.

Parameters
  • dataset_seg_ids (numpy.ndarray) – Contains all the seg_id values according to the original topology. This should be in the same seg order as variable.

  • variable (numpy.ndarray) – Contains all the variable values according to the original topology. This should be in the same seg order as dataset_seg_ids.

Returns

A pandas.Series formatted as: (seg_id_values_index, aggregated measure)

Return type

pandas.Series

aggregate_measure_median(dataset_seg_ids: ndarray, variable: ndarray) Series[source]

This is a preliminary function.

Determines the median measure value for the given variable based on how SimpleRiverNetwork has been aggregated and provides a pandas.Series to plot on the SimpleRiverNetwork.

Parameters
  • dataset_seg_ids (numpy.ndarray) – Contains all the seg_id values according to the original topology. This should be in the same seg order as variable.

  • variable (numpy.ndarray) – Contains all the variable values according to the original topology. This should be in the same seg order as dataset_seg_ids.

Returns

A pandas.Series formatted as: (seg_id_values_index, aggregated measure)

Return type

pandas.Series

aggregate_measure_min(dataset_seg_ids: ndarray, variable: ndarray) Series[source]

Determines the minimum measure value for the given variable based on how SimpleRiverNetwork has been aggregated and provides a pandas.Series to plot on the SimpleRiverNetwork.

Parameters
  • dataset_seg_ids (numpy.ndarray) – Contains all the seg_id values according to the original topology. This should be in the same seg order as variable.

  • variable (numpy.ndarray) – Contains all the variable values according to the original topology. This should be in the same seg order as dataset_seg_ids.

Returns

A pandas.Series formatted as: (seg_id_values_index, aggregated measure)

Return type

pandas.Series

aggregate_measure_sum(dataset_seg_ids: ndarray, variable: ndarray) Series[source]

This is a preliminary function.

Determines the sum measure value for the given variable based on how SimpleRiverNetwork has been aggregated and provides a pandas.Series to plot on the SimpleRiverNetwork.

Parameters
  • dataset_seg_ids (numpy.ndarray) – Contains all the seg_id values according to the original topology. This should be in the same seg order as variable.

  • variable (numpy.ndarray) – Contains all the variable values according to the original topology. This should be in the same seg order as dataset_seg_ids.

Returns

A pandas.Series formatted as: (seg_id_values_index, aggregated measure)

Return type

pandas.Series

append_pfaf(node: SegNode, pfaf_digit: int)[source]

Adds a Pfafstetter code digit to all upstream nodes.

Parameters
  • node (SegNode) – A SegNode to designate the root of the flow tree.

  • pfaf_digit (int) – The digit to be added to the Pfafstetter codes.

append_sequential(sequence, base='')[source]

Adds odd digits for unbranching stream segments.

Adds odd digits to the pfaf_codes of SegNode’s in a row, or in sequence. This ensures all SegNode’s within the SimpleRiverNetwork have a unique code.

Parameters
  • sequence (list) – A list of SegNode’s in a sequence, typically aggregated from find_branch.

  • base (str) – An optional string added to the pfaf_code together with the Pfafstetter digits being appended.

check_upstream_end_marking(node: SegNode)[source]

Checks if any directly upstream nodes are marked by _end_marker.

Checks if any nodes directly upstream are _end_marker’s and returns True if so.

Parameters

node (SegNode) – A SegNode to check directly upstream from.

Returns

If any directly upstream nodes are marked by _end_marker.

Return type

boolean

clear_end_markers(node)[source]

Sets all upstream _end_marker’s to False.

Sets _end_marker to False for node and for all nodes upstream of it.

Parameters

node (SegNode) – SegNode to start setting _end_marker to False and moving upstream from.

clear_network()[source]

Sets the network to only the outlet SegNode.

Sets the adjacency matrix to an empty array and sets the upstream designation of the outlet to an empty list, clearing the shape of the network upstream of the outlet. This does not reset the topography or seg_id_values, so the original shape can be rebuilt.

collect_upstream_nodes(node: SegNode)[source]

Finds all upstream SegNode’s.

Finds all nodes upstream of a node and returns a list of them.

Parameters

node (SegNode) – A SegNode to collect upstream nodes from.

Returns

All upstream nodes of node.

Return type

list
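
A recursive upstream traversal of this kind can be sketched as follows. The Node class is a hypothetical stand-in for SegNode, and excluding the starting node from the result is an assumption rather than confirmed behavior of collect_upstream_nodes.

```python
class Node:
    """Hypothetical stand-in for SegNode with only the fields used here."""

    def __init__(self, seg_id, upstream=None):
        self.seg_id = seg_id
        self.upstream = upstream or []


def collect_upstream(node):
    """Return every node upstream of node, excluding node itself (assumption)."""
    collected = []
    for up in node.upstream:
        collected.append(up)
        collected.extend(collect_upstream(up))  # depth-first, moving upstream
    return collected


outlet = Node(1, [Node(2, [Node(4)]), Node(3)])
upstream_ids = [n.seg_id for n in collect_upstream(outlet)]
```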

color_network_graph(measure, cmap, vmax=None, vmin=None)[source]

Creates a dictionary and colorbar depicting measure.

Parameters
  • measure (pandas.Series, optional) – Describes how colors for each SegNode should be allocated relative to a linear colormap. The index is expected to match the indices of seg_id_values as a 0:len(seg_id_values)-1 array. If no measure is specified, then colors will be assigned sequentially in order of seg_id_values.

  • cmap (matplotlib.colors.LinearSegmentedColormap) – Colormap to be used for coloring the SimpleRiverNetwork plot.

  • vmin (float, optional) – Minimum value for coloring.

  • vmax (float, optional) – Maximum value for coloring.

Returns

  • color_dict (dict) – Dictionary of {i:color} where i is the index of the SegNode’s seg_id in seg_id_values.

  • color_bar (ScalarMappable) – A color bar used to plot color values determined by measure for plotting in draw_network.
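
The mapping from measure values to colors can be pictured as a linear normalization between vmin and vmax followed by a colormap lookup. The sketch below uses plain Python with a grayscale function standing in for a matplotlib colormap; the function names, the clipping of out-of-range values, and the fallback to the data minimum and maximum are all assumptions for illustration.

```python
def normalize_measure(measure, vmin=None, vmax=None):
    """Map each value linearly onto [0, 1], clipping values outside the limits."""
    lo = min(measure) if vmin is None else vmin
    hi = max(measure) if vmax is None else vmax
    span = hi - lo
    return {
        i: 0.0 if span == 0 else min(1.0, max(0.0, (v - lo) / span))
        for i, v in enumerate(measure)
    }


def grayscale(t):
    """Hypothetical stand-in for a colormap: returns an (r, g, b) tuple."""
    return (t, t, t)


measure = [2.0, 4.0, 6.0]
fractions = normalize_measure(measure)
color_dict = {i: grayscale(t) for i, t in fractions.items()}
clipped = normalize_measure(measure, vmax=4.0)
```

Setting vmin or vmax narrower than the data range, as in the last line, keeps a few extreme values from compressing the colors of the rest.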

count_net_upstream(node: SegNode)[source]

Inclusively counts number of upstream nodes.

Counts the number of SegNode’s upstream of node, including the original node.

Parameters

node (SegNode) – A SegNode to begin counting from.

Returns

Number of SegNode’s upstream.

Return type

int

draw_multi_measure(color_dict, label_map=[], node_size=200, font_size=8, font_weight='bold', node_shape='s', linewidths=2, font_color='w', with_labels=False, with_cbar=False, with_background=True)[source]

Overlays multiple network plots to compare multiple measures at once.

Plots several networkx plots of user-specified transparency for a single SimpleRiverNetwork to compare multiple measures at once. For example, if dataset_1 is “Blues” and dataset_2 is “Reds”, then a bivariate colormap can be used where shades of purple would represent the combinations of dataset_1 and dataset_2.

Parameters
  • color_dict (dict) – Expected as {name: [pandas.Series, cmap, alpha]} to organize which colormap applies to which data.

  • label_map (list, optional) – Text to be plotted on top of each node in the same order as seg_id_values. There must be a value for each seg_id in seg_id_values and the values must be unique, otherwise an error will arise in plotting.

  • node_size (float, optional) – Plotting size of the nodes, defaults as 200.

  • font_size (float, optional) – Font size of the text from label_map on top of each node, defaults as 8.

  • font_weight (str, optional) – Font weight of the text from label_map on top of each node, defaults as bold.

  • node_shape (str, optional) – Shape of the plotted nodes, defaults as ‘s’ for square. NetworkX can use any one of ‘so^>v<dph8’.

  • linewidths (float, optional) – Width of the connecting lines between nodes, defaults as 2.

  • font_color (str, optional) – Font color of the text from label_map on top of each node, defaulted as w for white.

  • with_labels (boolean, optional) – Whether labels should be plotted on top of each node, True, or not, False. This is defaulted as False.

  • with_cbar (boolean, optional) – Whether a colorbar should be plotted right of the network plot, True, or not, False. This is defaulted as False.

  • with_background (boolean, optional) – Whether a background should be plotted with the network figure, True, or not, False. This is defaulted as True. If desiring to download the image with a transparent background, such as a PNG, then set this to False.

draw_network(label_map=[], color_measure=None, cmap=<matplotlib.colors.LinearSegmentedColormap object>, node_size=200, font_size=8, font_weight='bold', node_shape='s', linewidths=2, font_color='w', node_color=None, with_labels=False, with_cbar=False, with_background=True, cbar_labelsize=10, vmin=None, vmax=None, edge_color='k', alpha=1, cbar_title='', cbar_label_pad=40, ax=None)[source]

Plots the river network through networkx.

Plots the visual component of the SimpleRiverNetwork where spatial connections between river segments can be seen. This graphical tool may not match the topographical shape of the actual river network, but it should be similar. Visualizing how the river segments are connected virtually can help find errors in the construction of large models or locate where analysis only associated with the seg_id of a river segment corresponds to the pseudo-physical network. Plotting the network with labels, highlighting specific nodes, color coding by Pfafstetter basin, and other coloring can help visually connect this plot with a topographical plot.

Parameters
  • label_map (list, optional) – Text to be plotted on top of each node in the same order as seg_id_values. There must be a value for each seg_id in seg_id_values and the values must be unique, otherwise an error will arise in plotting.

  • color_measure (pandas.Series, optional) – Describes how colors for each SegNode should be allocated relative to a linear colormap. The index is expected to match the indices of seg_id_values as a 0:len(seg_id_values)-1 array.

  • cmap (matplotlib.colors.LinearSegmentedColormap, optional) – Colormap to be used for coloring the SimpleRiverNetwork plot. This is defaulted as matplotlib.cm.get_cmap(‘hsv’), a vibrant set of colors to alert that a more specific colormap has not been specified.

  • node_size (float, optional) – Plotting size of the nodes, defaulted at 200.

  • font_size (float, optional) – Font size of the text from label_map on top of each node, defaulted at 8.

  • font_weight (str, optional) – Font weight of the text from label_map on top of each node, defaulted as bold.

  • node_shape (str, optional) – Shape of the plotted nodes, defaulted as ‘s’ for square. Networkx accepts any one of ‘so^>v<dph8’.

  • linewidths (float, optional) – Width of the connecting lines between nodes, defaults as 2.

  • font_color (str, optional) – Font color of the text from label_map on top of each node, defaulted as w for white.

  • with_labels (boolean, optional) – Whether labels should be plotted on top of each node, True, or not, False. This is defaulted as False.

  • with_cbar (boolean, optional) – Whether a colorbar should be plotted right of the network plot, True, or not, False. This is defaulted as False.

  • with_background (boolean, optional) – Whether a background should be plotted with the network figure, True, or not, False. This is defaulted as True. To save the image with a transparent background, such as a PNG, set this to False.

  • cbar_labelsize (float, optional) – Font size of the labels on the colorbar that is attached when with_cbar is set to True, defaulted as 10.

  • vmin (float, optional) – Minimum value for coloring

  • vmax (float, optional) – Maximum value for coloring

  • edge_color (str, optional) – Node outline color of each node, defaulted as ‘k’ for black.

  • alpha (float) – Transparency of each node, where 1 is perfectly opaque and 0 is perfectly transparent. This is primarily useful in draw_multi_measure, where plots are overlaid on top of each other.

  • cbar_title (str, optional) – Title of the colorbar that is attached when with_cbar is set to True. This is defaulted as ‘’ to exclude a title.

  • cbar_label_pad (float, optional) – Padding for the colorbar labels, defaulted as 40.

  • ax (Matplotlib Axes object, optional) – Draw the network in the specified Matplotlib axes.

encode_pfaf(root_node=<class 'bmorph.evaluation.simple_river_network.SegNode'>, level=0, max_level=42)[source]

Recursively encodes Pfafstetter codes on a SimpleRiverNetwork.

Parameters
  • root_node (SegNode) – A SegNode from which to start encoding.

  • level (int) – How many levels deep into recursion the method already is. By default, this starts at 0.

  • max_level (int) – The maximum number of levels encode_pfaf will run for before raising a RuntimeError. By default, this is set to the arbitrary number 42 as a safety mechanism.

find_branch(node: SegNode)[source]

Locates the nearest upstream branch.

Locates a node that branches into 2+ nodes, returning the node that branches and any nodes in a row leading up to the branch.

Parameters

node (SegNode) – A SegNode to start searching from.

Returns

  • branch (SegNode) – The SegNode found with an upstream branch.

  • sequential_nodes (list) – The list of nodes preceding and including branch.
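
The walk described for find_branch can be sketched as follows. This is a minimal illustration rather than bmorph's implementation: Node is a hypothetical stand-in for SegNode, and stopping at a headwater with no upstream nodes is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for SegNode: an id plus its upstream neighbours."""
    seg_id: int
    upstream: list = field(default_factory=list)


def find_branch(node):
    """Follow single-upstream links until a node with 2+ upstream nodes."""
    sequential_nodes = [node]
    while len(node.upstream) == 1:
        node = node.upstream[0]
        sequential_nodes.append(node)
    # Assumption: a headwater (no upstream nodes) also ends the walk.
    return node, sequential_nodes


# Outlet 1 <- 2 <- 3, where segment 3 is fed by segments 4 and 5.
fork = Node(3, [Node(4), Node(5)])
outlet = Node(1, [Node(2, [fork])])

branch, sequential_nodes = find_branch(outlet)
branch_id = branch.seg_id
sequential_ids = [n.seg_id for n in sequential_nodes]
```

Here the walk starts at segment 1 and stops at segment 3, the first node with more than one upstream neighbour.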

find_like_pfaf(node: SegNode, target_pfaf_digits: list, degree: int)[source]

Finds nodes based on Pfafstetter codes.

Finds all nodes with the matching digit at the exact same location in pfaf_code and returns all of them in a list.

Parameters
  • node (SegNode) – A SegNode to start searching from.

  • target_pfaf_digits (list) – A list of pfaf_digit to search for in the SimpleRiverNetwork. This can be a list of one element.

  • degree (int) – The zero-based position within pfaf_code to compare. For example, with degree 2 and a pfaf_code of 1234, the function examines “3” to check whether it is a match.

Returns

A list of nodes whose Pfafstetter codes match the target_pfaf_digits at the input degree.

Return type

list
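
As a rough sketch of the digit matching described above, the following compares the character at position degree of each code against the targets. Node is a hypothetical stand-in for SegNode, and storing pfaf_code as a string and skipping codes shorter than degree are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for SegNode with a string pfaf_code."""
    seg_id: int
    pfaf_code: str
    upstream: list = field(default_factory=list)


def find_like_pfaf(node, target_pfaf_digits, degree):
    """Collect nodes whose code has one of the target digits at index degree."""
    targets = {str(digit) for digit in target_pfaf_digits}
    matches = []
    # Assumption: codes too short to have this position never match.
    if degree < len(node.pfaf_code) and node.pfaf_code[degree] in targets:
        matches.append(node)
    for up in node.upstream:
        matches.extend(find_like_pfaf(up, target_pfaf_digits, degree))
    return matches


root = Node(1, "1234", [Node(2, "1235"), Node(3, "1243", [Node(4, "1239")])])
matched_ids = [n.seg_id for n in find_like_pfaf(root, [3], degree=2)]
```

Segment 3 is excluded because its code has “4” at index 2, while segment 4 upstream of it still matches.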

find_node(target_id, node: SegNode)[source]

Linear search of SimpleRiverNetwork for a specific SegNode.

Searches for and returns a node with the desired seg_id upstream of node.

Parameters
  • target_id (int) – A seg_id to search for within the SimpleRiverNetwork.

  • node (SegNode) – A SegNode to start searching from. The seg_id of this SegNode is checked against target_id.

Returns

SegNode with seg_id matching target_id. If the desired SegNode could not be found, None is returned.

Return type

SegNode
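
A search of this kind might look like the depth-first sketch below. Node is a hypothetical stand-in for SegNode; the traversal order used by bmorph is not stated in this reference, so depth-first is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for SegNode: an id plus its upstream neighbours."""
    seg_id: int
    upstream: list = field(default_factory=list)


def find_node(target_id, node):
    """Return the node with seg_id == target_id at or upstream of node."""
    if node.seg_id == target_id:
        return node
    for up in node.upstream:
        found = find_node(target_id, up)
        if found is not None:
            return found
    return None  # not found anywhere upstream


outlet = Node(1, [Node(2, [Node(4)]), Node(3)])
hit = find_node(4, outlet)
miss = find_node(99, outlet)
```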

find_tributary_basins(tributaries)[source]

Finds the four tributaries with the largest drainage areas.

Parameters

tributaries (list) – A list of tributary SegNode’s to be searched.

Returns

A list of the largest_tributaries found in the list of tributaries given.

Return type

list
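
Selecting the four largest tributaries can be illustrated with the standard library's heapq.nlargest. This is a sketch under assumptions: Node is a hypothetical stand-in for SegNode, ranking uses aggregate upstream area, and the order of the returned list in bmorph is not documented here.

```python
import heapq
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for SegNode with a local basin_area."""
    seg_id: int
    basin_area: float
    upstream: list = field(default_factory=list)


def upstream_area(node):
    """Area of this node plus everything upstream of it."""
    return node.basin_area + sum(upstream_area(up) for up in node.upstream)


def largest_tributaries(tributaries, count=4):
    """Pick the count tributaries with the largest aggregate upstream area."""
    return heapq.nlargest(count, tributaries, key=upstream_area)


areas = [5.0, 1.0, 9.0, 3.0, 7.0, 2.0]
tributaries = [Node(10 + i, area) for i, area in enumerate(areas)]
largest_ids = [n.seg_id for n in largest_tributaries(tributaries)]
```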

force_upstream_area(node: SegNode)[source]

Aggregates upstream basin area regardless of _end_marker.

Calculates the basin area upstream of the node of interest, ignoring _end_marker. This includes the area of the node of interest itself.

Parameters

node (SegNode) – A SegNode to start from and calculate both its and all upstream SegNode’s aggregate area.

Returns

Aggregate upstream area.

Return type

float
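
Ignoring any end markers, the aggregation reduces to a plain recursive sum. The sketch below assumes a hypothetical Node stand-in for SegNode with a basin_area attribute and does not model _end_marker at all.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for SegNode with a local basin_area."""
    seg_id: int
    basin_area: float
    upstream: list = field(default_factory=list)


def force_upstream_area(node):
    """Sum the node's own area and the area of every node upstream of it."""
    return node.basin_area + sum(force_upstream_area(up) for up in node.upstream)


# Outlet (2.0) fed by one branch (3.0) and another (1.5) with a headwater (0.5).
outlet = Node(1, 2.0, [Node(2, 3.0), Node(3, 1.5, [Node(4, 0.5)])])
total_area = force_upstream_area(outlet)
```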

generate_mainstream_map()[source]

Highlights the mainstream for plotting in draw_network.

Creates a list of which nodes are part of the mainstream in order of the seg_id_values.

Returns

int booleans denoting whether a river segment is part of the mainstream, 1, or off the mainstream, 0, corresponding to each seg_id in seg_id_values.

Return type

list

generate_node_highlight_map(seg_ids: list)[source]

Highlight specific SegNode’s in a SimpleRiverNetwork.

Takes a list of seg_id’s and creates a pandas.Series that highlights the nodes in the list. This is best used as a diagnostic tool, finding where a specific river segment is located on a network map. Using a colormap that has notably different colors on the extremes, such as matplotlib’s “Reds”, is recommended to make highlighted nodes stand out.

Parameters

seg_ids (list) – A list of seg_id values to mark specific SegNode’s apart from other SegNode’s.

Returns

A list that identifies the highlighted nodes for draw_network by int booleans, where 1 is highlighted and 0 is not.

Return type

list
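
The int-boolean list described here can be sketched in a few lines. The function name highlight_map and passing seg_id_values as an argument are assumptions for the example; in bmorph the values come from the network itself.

```python
def highlight_map(seg_id_values, seg_ids):
    """Return 1 for each seg_id to highlight and 0 otherwise, in network order."""
    wanted = set(seg_ids)
    return [1 if seg_id in wanted else 0 for seg_id in seg_id_values]


seg_id_values = [101, 102, 103, 104, 105]
flags = highlight_map(seg_id_values, [102, 105])
```

Because the result follows the order of seg_id_values, it lines up with the nodes when used as a color measure.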

generate_pfaf_codes()[source]

Creates a list of pfaf_code values corresponding to seg_id_values.

Returns

pfaf_code values mapped to seg_id_values.

Return type

list

generate_pfaf_color_map()[source]

Extracts the first Pfafstetter digit of each code for color coding.

This prepares a color_measure for draw_network, where each unique first-level Pfafstetter basin can be assigned its own color by a Colormap. If a SegNode has the pfaf_code “1234”, then it will have the value “1” in the returned map. Using a qualitative colormap of 10 distinct colors, such as matplotlib’s “tab10”, is recommended since there are 10 possible Pfafstetter digits (0 to 9).

Returns

Map of color codes, taken from the pfaf_code of each SegNode, indexed by the position of that SegNode’s seg_id in seg_id_values. The index should not be reassigned, as it is used to match the correct nodes together in draw_network.

Return type

pandas.Series

generate_pfaf_map()[source]

List of Pfafstetter codes corresponding to seg_id_values.

Creates a list of pfaf_code values in the order of the seg_id_values, including the seg_id_values. This is a little more cluttered and is recommended only for debugging purposes.

Returns

pfaf_code values for each corresponding seg_id in seg_id_values.

Return type

list

generate_weight_map()[source]

Creates a list of proportional upstream area ratios for each seg_id in seg_id_values.

Creates a list of fractional weights equivalent to the node’s upstream area divided by the overall basin_area of the whole SimpleRiverNetwork. These are in order of the seg_id_values.

Returns

(river segment’s cumulative upstream area)/(total basin area) for each river segment corresponding to the seg_id’s in seg_id_values.

Return type

list
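
The ratio above can be sketched as follows. Node is a hypothetical stand-in for SegNode, and taking the outlet's aggregate upstream area as the total basin area is an assumption made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for SegNode with a local basin_area."""
    seg_id: int
    basin_area: float
    upstream: list = field(default_factory=list)


def upstream_area(node):
    """Area of this node plus everything upstream of it."""
    return node.basin_area + sum(upstream_area(up) for up in node.upstream)


def weight_map(outlet, seg_id_values):
    """Cumulative upstream area over total basin area, in seg_id_values order."""
    total = upstream_area(outlet)  # assumption: outlet drains the whole basin
    areas = {}

    def collect(node):
        areas[node.seg_id] = upstream_area(node)
        for up in node.upstream:
            collect(up)

    collect(outlet)
    return [areas[seg_id] / total for seg_id in seg_id_values]


outlet = Node(1, 1.0, [Node(2, 1.0), Node(3, 2.0)])
weights = weight_map(outlet, [1, 2, 3])
```

The outlet always receives a weight of 1, and headwater segments receive the smallest weights.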

net_upstream_area(node: SegNode)[source]

Aggregates upstream basin area.

Calculates the basin area upstream of node. This includes the area of the node of interest itself.

Parameters

node (SegNode) – A SegNode to start from, calculating the aggregate area of it and all nodes upstream of it.

Returns

Aggregate upstream area.

Return type

float

parse_upstream(node: SegNode)[source]

Constructs and connects SegNodes according to the network.

Recursively constructs network by searching what SegNodes are upstream of the current SegNode and updates the current SegNode’s upstream list while also building the adjacency matrix.

Parameters

node (SegNode) – A SegNode to start building the network from.

pfaf_aggregate()[source]

Aggregates the flow network by one Pfafstetter level.

This “rolls up” a SimpleRiverNetwork to simplify the overall map, similar to decreasing the number of lines in a contour plot to make it more legible. If the longest pfaf_code in the network is four digits long, such as “1234” or “5678”, then all of the SegNodes sharing the first three digits will be replaced by a singular SegNode with the pfaf_code of those first three digits. For example: if you have pfaf_code’s “1231”, “1232”, “1233”, and “1234”, then they become a SegNode with the pfaf_code “123”. Basin area for each SegNode is summed to create the basin area of the new SegNode. Aggregating other properties is still in progress.
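
The roll-up of basin areas in the example above can be sketched on a plain mapping from pfaf_code to area. This is only an illustration: the function name is hypothetical, and leaving codes shorter than the longest one untouched is an assumption, since this reference does not say how they are handled.

```python
from collections import defaultdict


def aggregate_one_level(area_by_code):
    """Merge the longest codes into their parent prefix, summing basin area."""
    longest = max(len(code) for code in area_by_code)
    rolled = defaultdict(float)
    for code, area in area_by_code.items():
        # Assumption: only codes at the deepest level lose their last digit.
        key = code[:-1] if len(code) == longest else code
        rolled[key] += area
    return dict(rolled)


areas = {"1231": 2.0, "1232": 3.0, "1233": 1.5, "1234": 0.5, "1241": 4.0}
rolled = aggregate_one_level(areas)
```

The four codes beginning with “123” collapse into a single entry whose area is the sum of theirs.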

reconstruct_adj_mat(node, adj_mat: ndarray)[source]

Rebuilds the adjacency matrix from an existing flow tree.

Parameters
  • node (SegNode) – A SegNode to construct the adjacency matrix from.

  • adj_mat (numpy.ndarray) – A square numpy ndarray of zeros originally, the size equal to len(seg_id_values) by len(seg_id_values), to be filled.

Returns

The reconstructed adjacency matrix, which needs to be set as the network’s adjacency matrix to actually alter the flow tree.

Return type

numpy.ndarray
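
Filling a zeroed square matrix from a flow tree could look like the sketch below. Several things here are assumptions: Node is a hypothetical stand-in for SegNode, nested lists stand in for the numpy.ndarray, an index_of lookup is passed explicitly, and the orientation (row is the downstream segment, column is the segment directly upstream) may differ from bmorph's convention.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for SegNode: an id plus its upstream neighbours."""
    seg_id: int
    upstream: list = field(default_factory=list)


def fill_adjacency(node, adj_mat, index_of):
    """Mark each direct upstream link, then recurse into the upstream nodes."""
    row = index_of[node.seg_id]
    for up in node.upstream:
        adj_mat[row][index_of[up.seg_id]] = 1
        fill_adjacency(up, adj_mat, index_of)
    return adj_mat


seg_id_values = [10, 20, 30]
index_of = {seg_id: i for i, seg_id in enumerate(seg_id_values)}
zeros = [[0] * len(seg_id_values) for _ in seg_id_values]

outlet = Node(10, [Node(20), Node(30)])
adj_mat = fill_adjacency(outlet, zeros, index_of)
```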

size_network_graph(measure)[source]

TODO: Implement

sort_by_pfaf(nodes: list, degree=<class 'int'>)[source]

Sorts a list of SegNodes by pfaf_code.

Sorts a list of SegNode’s in decreasing order of the Pfafstetter digit at the given degree. For example, if you have degree 2 and a pfaf_code of 1234, it will use “3” in sorting.

Parameters
  • nodes (list) – A list of SegNode’s to sort.

  • degree (int) – Which index in a pfaf_code the nodes are to be sorted by.

Returns

The list of sorted nodes.

Return type

list
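
The ordering described above maps onto Python's sorted with a key function. In this sketch Node is a hypothetical stand-in for SegNode, pfaf_code is assumed to be a string, and every code is assumed to be long enough to have a digit at degree.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical stand-in for SegNode with a string pfaf_code."""
    seg_id: int
    pfaf_code: str


def sort_by_pfaf(nodes, degree):
    """Sort nodes in decreasing order of the digit at index degree."""
    return sorted(nodes, key=lambda n: int(n.pfaf_code[degree]), reverse=True)


nodes = [Node(1, "1214"), Node(2, "1236"), Node(3, "1228")]
sorted_codes = [n.pfaf_code for n in sort_by_pfaf(nodes, degree=2)]
```

With degree 2 the digits compared are 1, 3, and 2, so the code containing “3” at that position comes first.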

sort_streams(node=<class 'bmorph.evaluation.simple_river_network.SegNode'>)[source]

Sorts the mainstream and tributary branches from each other.

Returns which branches are part of the mainstream and which are part of the tributaries based on aggregate upstream area. This is typically used to determine even and odd Pfafstetter basins for encoding.

Parameters

node (SegNode) – A SegNode to start tracing the mainstream from. This is the “root” of the flow tree.

Returns

  • mainstreams (list) – A list of mainstream SegNode’s determined by having the greatest upstream area.

  • tributaries (list) – A list of tributaries having upstream_area less than the mainstream but still encountered along the way.
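
One way to picture this split is to follow, at every confluence, the upstream node with the greatest aggregate area and set the others aside as tributaries. The sketch below is an illustration under that assumption, with Node as a hypothetical stand-in for SegNode; tie-breaking and the order of the returned lists in bmorph are not documented here.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for SegNode with a local basin_area."""
    seg_id: int
    basin_area: float
    upstream: list = field(default_factory=list)


def upstream_area(node):
    """Area of this node plus everything upstream of it."""
    return node.basin_area + sum(upstream_area(up) for up in node.upstream)


def sort_streams(node):
    """Trace the mainstream from node, collecting tributaries along the way."""
    mainstreams, tributaries = [node], []
    while node.upstream:
        ranked = sorted(node.upstream, key=upstream_area, reverse=True)
        node = ranked[0]                # largest upstream area continues the mainstream
        tributaries.extend(ranked[1:])  # everything else joins as a tributary
        mainstreams.append(node)
    return mainstreams, tributaries


big = Node(2, 5.0, [Node(4, 2.0), Node(5, 1.0)])
root = Node(1, 1.0, [big, Node(3, 2.0)])

mainstreams, tributaries = sort_streams(root)
main_ids = [n.seg_id for n in mainstreams]
trib_ids = [n.seg_id for n in tributaries]
```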

spawn_srn(spawn_outlet)[source]

Creates a new SimpleRiverNetwork from spawn_outlet and upstream of it.

A new SimpleRiverNetwork structure is generated from the current network. This is useful when modeling a large watershed and you want to focus on one part of it without reselecting all of its nodes, for example the Snake River Basin within the Columbia River Basin dataset.

Parameters

spawn_outlet (int) – The seg_id of a SegNode in the current SimpleRiverNetwork to generate from. This creates an outlet that the new tree is to be spawned from.

Returns

A new SimpleRiverNetwork with the outlet set to spawn_outlet. Properties are transferred from the previous SimpleRiverNetwork to this one, but any SegNodes not upstream of spawn_outlet are not included in the new one.

Return type

SimpleRiverNetwork
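
Which segments survive in the spawned network can be sketched by locating the new outlet and collecting everything upstream of it. This does not build a SimpleRiverNetwork; Node and the helper names are hypothetical, and raising ValueError for an unknown seg_id is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for SegNode: an id plus its upstream neighbours."""
    seg_id: int
    upstream: list = field(default_factory=list)


def find_node(target_id, node):
    """Return the node with seg_id == target_id at or upstream of node."""
    if node.seg_id == target_id:
        return node
    for up in node.upstream:
        found = find_node(target_id, up)
        if found is not None:
            return found
    return None


def collect_ids(node):
    """List the seg_id of node and of every node upstream of it."""
    ids = [node.seg_id]
    for up in node.upstream:
        ids.extend(collect_ids(up))
    return ids


def spawned_seg_ids(root, spawn_outlet):
    """seg_ids that would remain if the network were re-rooted at spawn_outlet."""
    outlet = find_node(spawn_outlet, root)
    if outlet is None:
        raise ValueError(f"seg_id {spawn_outlet} is not in the network")
    return collect_ids(outlet)


root = Node(1, [Node(2, [Node(4), Node(5)]), Node(3)])
kept_ids = spawned_seg_ids(root, 2)
```

Segments 1 and 3 are dropped because neither is upstream of the new outlet.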

update_node_area(node: SegNode)[source]

Updates the desired node with basin area information.

Parameters

node (SegNode) – The SegNode whose basin_area, and only its basin_area, is to be updated.