User Guide#
Overview of regionalization workflow#
The regionalization workflow includes the following steps:
STEP 0: run calibration and collect formulation parameters and calibration/validation statistics
STEP 1: formulation & parameter regionalization (via nwm-region-mgr)
STEP 2: regionalized NGEN simulation setup (via nwm-mswm-mgr) and execution
STEP 3: evaluation of regionalized simulations (via nwm-verf and nwm-eval-mgr)

Run regionalization with NWM-RTE on INT/EA/UAT Clusters#
On the INT/EA/UAT clusters, all software dependencies for regionalization are installed and managed through
NWM-RTE (Run Time Environment, /ngencerf-app/nwm-rte). Regionalization workflows are executed via docker containers using an
nwm-rte image.
Test the sample regionalization workflow#
Navigate to your preferred working directory (e.g., /ngen-oe/$USER/run_region, /ngen-dev/$USER/run_region, or ~/run_region). Copy the sample config files from /ngencerf-app/nwm-region-mgr/configs/ to your working directory, e.g.,
cd /ngen-oe/$USER/run_region # or your preferred working directory
cp -r /ngencerf-app/nwm-region-mgr/configs .
Run the three regionalization steps below sequentially using one of the two scripts in nwm-rte:
- sbatch_run_region.sh: submits jobs to compute nodes on INT/EA/UAT via SBATCH (recommended)
- run_region.sh: runs directly on the controller node or a local AWS workspace (only for small regions or testing purposes)
Step 1. Run regionalization#
a) Run formulation regionalization alone (no parreg):
Typically this step can be skipped since parameter regionalization also runs formulation regionalization as a prerequisite. Prior to running, configure the settings in configs/config_general.yaml and configs/config_formreg.yaml.
# submit to compute nodes on INT/EA/UAT
/ngencerf-app/nwm-rte/sbatch_run_region.sh configs formreg
# or run directly in controller node or local AWS workspace
time /ngencerf-app/nwm-rte/run_region.sh -c configs --formreg
b) Run parameter regionalization (formreg is also run as a prerequisite):
Prior to running, configure the settings in configs/config_general.yaml, configs/config_formreg.yaml and configs/config_parreg.yaml.
# submit to compute nodes on INT/EA/UAT
/ngencerf-app/nwm-rte/sbatch_run_region.sh configs parreg
# or run directly in controller node or local AWS workspace
time /ngencerf-app/nwm-rte/run_region.sh -c configs --parreg
Step 2. Run NGEN#
Run NGEN simulations. Prior to running, configure the settings in configs/config_general.yaml and configs/config_ngen.yaml.
# submit to compute nodes on INT/EA/UAT
/ngencerf-app/nwm-rte/sbatch_run_region.sh configs ngen
# or run directly in controller node or local AWS workspace
time /ngencerf-app/nwm-rte/run_region.sh -c configs --ngen
Step 3. Run Evaluation#
Run an evaluation. Prior to running, configure the settings in configs/config_eval.yaml.
# submit to compute nodes on INT/EA/UAT
/ngencerf-app/nwm-rte/sbatch_run_region.sh configs eval
# or run directly in controller node or local AWS workspace
time /ngencerf-app/nwm-rte/run_region.sh -c configs --eval
Run all steps in one command#
Users may prefer running the above steps sequentially so they can inspect the outputs from each step before proceeding to the next. However, it is possible to run all three steps in one command as shown below:
# submit to compute nodes on INT/EA/UAT
/ngencerf-app/nwm-rte/sbatch_run_region.sh configs parreg ngen eval
# or run directly in controller node or local AWS workspace
time /ngencerf-app/nwm-rte/run_region.sh -c configs --parreg --ngen --eval
Note: When using run_region.sh, the short flags -f, -p, -n, and -e can also be used in place of --formreg,
--parreg, --ngen, and --eval, respectively. The short flags are not supported when using sbatch_run_region.sh.
# run all steps with short flags (only for run_region.sh)
time /ngencerf-app/nwm-rte/run_region.sh -c configs -f -p -n -e
Run regionalization with a specific RTE image tag#
By default, the nwm-rte image with tag latest will be used to run the regionalization workflow. To use a specific
image tag (e.g., for testing with a new image), pass the --image-tag option as shown below:
# run all steps with the sample configs and a specific image tag
/ngencerf-app/nwm-rte/sbatch_run_region.sh /ngencerf-app/nwm-region-mgr/configs parreg ngen eval --image-tag pr-22-build
Customize and run your own regionalization workflow#
Prepare input data files. Refer to the Input Data subsection for details.
Calibration/validation statistics can be collected from earlier ngenCERF calibration runs using the commands below, which will generate a csv file (in your current run directory) containing the statistics for all specified calibration job IDs, along with another csv file listing the corresponding calibrated parameter sets. These files can then be used in the regionalization configuration files.
ngencerf regionalization 609 610 # where 609 and 610 are example calibration job IDs
# or specify a list of calibration job IDs in a text file
ngencerf regionalization --id-file job_ids.txt
Adjust configuration files in configs/ to set up your desired regionalization experiment. Refer to the Configuration tab for details on each config file and available options. Follow Steps 1-3 above to execute the regionalization workflow.
Example application: comparing different regionalization methods#
In this section, we will walk through an example application where we compare gower vs. kmeans clustering for parameter regionalization in VPU 09, using selected ngen and StreamCat attributes.
0. Prepare configuration files#
We will start from the sample workflow above. First copy the configuration files to a new folder to avoid overwriting the original files, e.g.:
cp -r configs/ test1_configs/
0.1 Update test1_configs/config_general.yaml#
- Set general.vpu_list to ['09']
- Set general.run_id to a new name: test1. This will be used to name the output folder for this experiment (e.g., outputs/region/test1/)
0.2 Update test1_configs/config_parreg.yaml#
- Set general.attr_dataset_list to ['ngen', 'streamcat'] as the attribute datasets for computing catchment similarity
- Set general.algorithm_list to ['gower', 'kmeans']. This will run parameter regionalization using both algorithms sequentially.
- Set donor.buffer_km to 100 (instead of 200) to use a smaller donor search neighborhood (around the VPU) for this experiment
- Select specific attributes from each dataset using the attribute selection files:
  - Copy the sample attribute selection files from /ngencerf-app/nwm-region-mgr/data/inputs/region/attr_config/ to your working directory:
    cp -r /ngencerf-app/nwm-region-mgr/data/inputs/region/attr_config .
  - For ngen: use the file attr_selection_ngen.csv. Set the select column to 1 for desired attributes and to 0 for others. Here we select all available ngen attributes except centroid_x, centroid_y, impervious, ISLTPY, and IVGTYP. Then update the field attr_datasets.ngen.attr_select_file to reflect the new location of this file (e.g., {base_dir}/attr_config/attr_selection_ngen.csv).
  - For streamcat: use the file attr_selection_streamcat.csv. Set the select column to 1 for desired attributes and to 0 for others. Here we select the following attributes: BFI, DamDens, Perm, RckDep, WtDep, PctCarbResid, PctEolCrs, PctWater, Precip, Tmax, Tmean, Tmin, RdDens, Runoff, Clay, Sand, Precip_Minus_EVT. Then update the field attr_datasets.streamcat.attr_select_file to reflect the new location of this file (e.g., {base_dir}/attr_config/attr_selection_streamcat.csv).
  - Alternatively, selected attributes can be specified directly in the config file by editing the fields attr_datasets.ngen.attr_list and attr_datasets.streamcat.attr_list, respectively, for ngen and StreamCat.
- Set donor.metric_eval_period.value to 'valid' to use validation period statistics for donor selection
- Set snow_cover.threshold to 10 to define the catchment snowiness category based on 10% (mean annual) snowfall
- Edit output.params.plots.columns_to_plot to include a couple of CFE parameters to visualize spatial patterns (e.g., 'b' and 'slope')
- Edit output.attr_data_final.plots.columns_to_plot to include some selected attributes to visualize spatial patterns. Specifically:
  - remove the HLR attributes, since HLR is not chosen for this experiment
  - change streamcat_Elev to streamcat_Perm, since Elev is not selected in attr_datasets.streamcat.attr_list
  - add a few ngen attributes: ngen_slope, ngen_aspect, ngen_elevation
  Note that attribute names should be prefixed by their dataset names (e.g., ngen_ or streamcat_).
- Set algorithm.algo_general.max_spa_dist to 1000 to limit the maximum spatial distance for donor selection to 1000 km
- Set algorithm.gower.max_attr_dist to 0.3 to allow a larger maximum attribute distance for donor selection when using the gower method. Attribute distances range from 0 to 1, with smaller values indicating higher similarity.
- Set algorithm.kmeans.n_init to 5 to increase the number of random initializations for more robust clustering results (with slightly increased computational cost).
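Editing the select column by hand is error-prone for long attribute lists; a small helper can be sketched to do it programmatically. This is a hypothetical sketch, not part of nwm_region_mgr: it assumes the selection file has an attribute-name column (here called attribute) alongside the select column, so check the actual header of attr_selection_ngen.csv before adapting it.

```python
import csv
from pathlib import Path

def set_attr_selection(csv_path, selected, attr_col="attribute"):
    """Set the 'select' column to 1 for attributes in `selected`, 0 for all others.

    `attr_col` is an assumed column name; adjust it to match the actual file.
    """
    path = Path(csv_path)
    with path.open(newline="") as f:
        reader = csv.DictReader(f)
        fieldnames = reader.fieldnames
        rows = list(reader)
    for row in rows:
        row["select"] = "1" if row[attr_col] in selected else "0"
    with path.open("w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)

# e.g., keep only a few hypothetical ngen attributes:
# set_attr_selection("attr_config/attr_selection_ngen.csv", {"slope", "aspect", "elevation"})
```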
1. Run regionalization#
Run the regionalization step as in Step 1 above, using the updated configuration files in test1_configs/.
/ngencerf-app/nwm-rte/sbatch_run_region.sh test1_configs parreg
Execution takes 10-20 minutes depending on available computational resources. While running,
intermediate log messages are printed to the terminal and also written to the log file
outputs/region/test1.log, as specified in config_general.yaml.
After completion, check the output folder outputs/region/test1/, which contains sub-folders for
- attr_data_final/: files and plots for catchment attributes used in regionalization
- formulations/: regionalized formulation files and diagnostic plots
- params/: regionalized formulation and parameter files and plots for each algorithm, with file names indicating the algorithm used
- pairs/: donor-receiver pair files for each algorithm
- spatial_distance/: matrices of spatial distances between receiver (row) and donor (column) catchments
- summary_score/: summary scores for all donor candidates
- config_formreg_final.yaml and config_parreg_final.yaml: the final (expanded) configuration files used in this run
See the Output Directory Structure subsection in the Technical Reference tab for details on the output directory structure.
See the Output Tables and Output Plots subsections in the Technical Reference tab for details on output files and plots.
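The spatial_distance matrices store receiver-donor distances in km. Distances between catchment centroids are presumably great-circle distances, which can be sketched with the haversine formula; this is an illustration of the concept, not necessarily the exact metric used by nwm_region_mgr.

```python
from math import radians, sin, cos, asin, sqrt

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two (lat, lon) points in km."""
    R = 6371.0  # mean Earth radius in km
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    h = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * R * asin(sqrt(h))
```

For example, one degree of longitude at the equator is roughly 111 km, which is a quick way to sanity-check values in the distance matrices.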
2. Run NGEN simulations#
In this experiment, we will run NGEN simulations using the parameter sets derived from both gower and kmeans methods, respectively.
First, update the test1_configs/config_ngen.yaml file as follows:
- Set algorithm_list to ['gower'] for the first run
- Set start_time and end_time to define the simulation period (e.g., '2022-10-01T00:00:00' to '2022-10-10T00:00:00'). Here, for demonstration purposes, we use a 10-day period in October 2022.
The other fields can remain unchanged.
Run the NGEN simulation step as in Step 2 above.
/ngencerf-app/nwm-rte/sbatch_run_region.sh test1_configs ngen
After completion, the simulation outputs will be saved in the folder
outputs/ngen/regionalization/test1_gower/vpu09/Output/, where the streamflow outputs can be found in the file troute_output_202210010000.nc. Note that the sub-folder name test1_gower includes the run_name (here test1) and the algorithm used (here gower).
Next, update the test1_configs/config_ngen.yaml file again to set algorithm_list to ['kmeans'] for the second run, while keeping other fields unchanged. Run the NGEN simulation step again.
After completion, the simulation outputs will be saved in the folder
outputs/ngen/regionalization/test1_kmeans/vpu09/Output/, where the streamflow outputs can be found in the file troute_output_202210010000.nc.
Depending on available computational resources, each NGEN simulation may take up to an hour or more to complete.
Alternatively, you can also run NGEN simulation for both gower and kmeans methods in a single run by setting algorithm_list to ['gower', 'kmeans'].
3. Run evaluation#
Finally, we will evaluate the two NGEN simulations against observed streamflow data.
Update the test1_configs/config_eval.yaml file as follows:
Set general.location_set_name to vpu_09
Set general.dataset_name to [test1_kmeans, test1_gower]. This defines the names of the two datasets to be evaluated and intercompared, corresponding to the two algorithms used in parameter regionalization.
Set general.nwm_version to [ngen, ngen]. Both simulations use the ngen configuration.
- Set general.fcst_start_date and general.fcst_end_date to define the simulation period (e.g., '2022-10-01T00:00:00' to '2022-10-10T00:00:00'), consistent with the simulation period used above. Both fields should be lists with the same length as dataset_name, e.g.,
  forecast_start_date: ['2022-10-01 00:00:00', '2022-10-01 00:00:00']
  forecast_end_date: ['2022-10-10 00:00:00', '2022-10-10 00:00:00']
- Set general.eval_start_date and general.eval_end_date to define the evaluation period (e.g., '2022-10-03T00:00:00' to '2022-10-10T00:00:00'). Here we use an 8-day evaluation period starting from October 3, 2022, to allow a 2-day spin-up period. Both fields should be lists with the same length as dataset_name.
- Set file_paths.output_dir to point to the directory where evaluation outputs should be saved. Here we add the run_name from regionalization, test1 (e.g., '{base_dir}/outputs/eval/test1/{location_set_name}'), to ensure evaluation outputs are also organized by regionalization run.
- Update fields in the metrics and plotting sections as desired. Here we will compute and plot a set of default evaluation metrics: KGE (Kling-Gupta Efficiency), NSE (Nash-Sutcliffe Efficiency), NNSE (Normalized NSE), and Correlation (CORR). Note the lead_times fields are not applicable here since we are evaluating simulations.
Note: if you would like to explore other configuration options for evaluation, refer to the nwm.verf documentation.
Run the evaluation step as in Step 3 above.
/ngencerf-app/nwm-rte/sbatch_run_region.sh test1_configs eval
After completion, evaluation results will be saved in the folder data/outputs/eval/test1/vpu_09/, including
- joined/: combined observed and simulated streamflow data for all locations in parquet format; each file corresponds to one dataset (i.e., algorithm)
- metrics/: evaluation metrics tables for all locations in parquet format; each file corresponds to one dataset (i.e., algorithm)
- plots/ngen_simulation/: evaluation plots for all locations, comparing the two algorithms
  - boxplot/: boxplots of evaluation metrics across all locations
  - histogram/: histograms of evaluation metrics across all locations
  - spatial_map/: spatial maps of evaluation metrics for each algorithm
- test1_gower/ngen_simulation/: streamflow time series data for all locations using the gower method
- test1_kmeans/ngen_simulation/: streamflow time series data for all locations using the kmeans method
- usgs/: observed streamflow time series data for all locations
- nwm_verf_config_expanded.yaml: the final (expanded) configuration file used in this run
Check the metrics and plots to compare/analyze the performance of the two algorithms in parameter regionalization.
Run nwm_region_mgr in local environment#
The steps below walk through package installation and workflow configuration in local (non-containerized) environments.
Installation#
Installing nwm_region_mgr requires
Python 3.11
Python venv (typically included with Python)
git
Since nwm_region_mgr is not currently on PyPI, it must be installed from source. To download this repository, run
git clone https://github.com/NGWPC/nwm-region-mgr.git
cd nwm-region-mgr
To get the most up-to-date code, switch to the development branch.
git checkout development
Next, create a virtual environment to isolate the dependencies of this library from your base Python environment.
python3.11 -m venv .venv
source .venv/bin/activate
You will then be able to install nwm_region_mgr. There are a few install variants that users may be interested in.
# Regular package install
pip install .
# Install the package in edit mode (for development)
pip install -e .
# Install the additional dependencies for parameter regionalization
pip install .[parreg]
STEP 1: Run regionalization to produce regionalized parameters and formulations#
1) Set up configuration yaml files#
Three yaml config files are needed to run regionalization
config_general.yaml: general settings for the overall regionalization process.
config_formreg.yaml: specific settings for the formulation regionalization process.
config_parreg.yaml: specific settings for the parameter regionalization process.
Follow the sample config files (nwm_region_mgr/configs) to set up the configurations for your regionalization application as needed.
Sample input data can be downloaded from s3://ngwpc-dev/regionalization/inputs
2) Run the regionalization script#
python -m nwm_region_mgr [CONFIG_DIR] [REG_TYPE]
Where:
[CONFIG_DIR] refers to the directory containing the config files as noted in 1)
[REG_TYPE] refers to the type of regionalization to run, either 'formreg' (formulation regionalization only) or 'parreg' (parameter regionalization, which also runs formulation regionalization first if not done already). If not specified, the default is 'parreg'.
python -m nwm_region_mgr configs formreg # to run formulation regionalization only
python -m nwm_region_mgr configs parreg # to run parameter regionalization (and formulation regionalization if not done already)
STEP 2: Run NGEN simulation with regionalized parameters#
To avoid complications from building ngen and its submodules locally, we recommend you always run NGEN simulations with regionalized parameters and formulations from a Docker container. Follow the instructions from the Docker Run Time Environment (RTE) section above.
STEP 3: Evaluate NGEN simulation with nwm.verf#
1) Download and install nwm.verf#
It is recommended that you install nwm.verf in its own venv. Note that nwm.eval needs to be installed as a dependency.
2) Set up configurations for evaluation#
Follow example config at config_eval.yaml
Check out what metrics are currently supported here
Sample input data can be downloaded from s3://ngwpc-dev/regionalization/data/inputs/eval
3) Activate venv for nwm.verf#
source ~/repos/nwm-verf/venv/bin/activate
4) Run evaluation#
python -m nwm.verf config_eval.yaml
5) Check outputs#
Outputs from evaluation can be found in [output_dir] as specified in config_eval.yaml
Helpful tips and notes#
Running regionalization on INT/EA/UAT clusters#
Compute resources#
Currently, each regionalization job can only run on a single compute node on the INT/EA/UAT clusters. Two partitions
are available on these clusters: c5n-9xlarge and r8a-12xlarge. Each partition contains 50 compute nodes, with 18
CPUs per node for c5n-9xlarge and 48 CPUs per node for r8a-12xlarge.
Regionalization jobs are submitted to a partition based on the number of parallel processes (n_procs) specified in
the config file config_general.yaml, as follows:
- If n_procs <= 18, the job is submitted to the c5n-9xlarge partition
- If 18 < n_procs <= 48, the job is submitted to the r8a-12xlarge partition
- If n_procs > 48, the job is not submitted and an error message is raised, since a single compute node supports a maximum of 48 CPUs
To fully utilize available computational resources, it is recommended to set n_procs to match the number of CPUs per node:
Use n_procs = 18 for the c5n-9xlarge partition.
Use n_procs = 48 for the r8a-12xlarge partition.
Note the partition configuration on these clusters may change in the future (use sinfo to check the current configuration).
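The partition-selection rule can be summarized in a few lines. This is a sketch of the rule described above, not the actual nwm-rte submission code:

```python
def choose_partition(n_procs: int) -> str:
    """Map the requested number of parallel processes (n_procs in
    config_general.yaml) to a SLURM partition, per the rule above."""
    if n_procs <= 0:
        raise ValueError("n_procs must be positive")
    if n_procs <= 18:
        return "c5n-9xlarge"   # 18 CPUs per node
    if n_procs <= 48:
        return "r8a-12xlarge"  # 48 CPUs per node
    raise ValueError("n_procs > 48 exceeds the CPUs on a single compute node")
```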
Job submission#
Regionalization jobs are submitted via the /ngencerf-app/nwm-rte/sbatch_run_region.sh script. There are multiple options to
customize the job submission (see the header of the script for usage details). You can adapt the following bash script
for your needs:
#!/bin/bash
# required argument
CONFIG_DIR="./configs_test"
# Optional arguments to override the defaults
image_tag="pr-22-build" # default: latest. Check available image tags at: https://github.com/NGWPC/nwm-rte/pkgs/container/nwm-rte
pull_image=false #default: false
workflow_options=(parreg ngen eval) #default: parreg. Valid options: formreg, parreg, ngen, eval
dry_run=false #default: false
delete_runtime_dir=false #default: false
# ==== Typically no need to modify lines below ====
SCRIPT_TO_RUN="/ngencerf-app/nwm-rte/sbatch_run_region.sh"
# Build optional arguments
extra_args=()
if [ "$pull_image" = true ]; then
extra_args+=(--pull-image)
fi
if [ "$dry_run" = true ]; then
extra_args+=(--dry-run)
fi
if [ "$delete_runtime_dir" = true ]; then
extra_args+=(--delete-runtime-dir)
fi
"$SCRIPT_TO_RUN" \
"$CONFIG_DIR" \
"${workflow_options[@]}" \
--image-tag "$image_tag" \
"${extra_args[@]}" \
"$@"
Monitoring job status#
After a SLURM job is submitted, you can monitor the job status using:
squeue -u $USER
The job will typically remain in the 'CF' (configuring) state for a few minutes. Once the job status changes to 'R'
(running), you can monitor the progress by reviewing the log file logs/region-${JOB_SUFFIX}-%j.log, where:
- ${JOB_SUFFIX} is a string formed by joining the workflows being run with '-'
- %j is the SLURM job ID
tail -f logs/region-parreg-ngen-eval-1124.log
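The log filename follows directly from the workflow list and job ID; a hypothetical helper mirroring the naming rule above:

```python
def region_log_name(workflows, job_id):
    """Build the regionalization log filename: the workflows being run,
    joined with '-', followed by the SLURM job ID."""
    suffix = "-".join(workflows)
    return f"logs/region-{suffix}-{job_id}.log"
```

For example, submitting parreg, ngen, and eval as SLURM job 1124 yields the filename used in the tail command above.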
Viewing regionalization outputs#
By default, regionalization outputs are saved in parquet format for storage and runtime efficiency. Parquet files can be conveniently viewed with an editor extension (e.g., Parquet Explorer in VS Code); however, such tools are not readily available on the INT/EA/UAT clusters.
There are a couple of options for users to view the outputs:
Option 1: specify csv format for outputs in the config files config_formreg.yaml and config_parreg.yaml, which will save the outputs directly in csv format, e.g.,
output.pairs.format: 'csv'
output.params.format: 'csv'
Option 2: use the utility script view_parquet.sh in nwm-region-mgr to view parquet files. You can copy this script to your working directory, e.g.:
cp /ngencerf-app/nwm-region-mgr/util_scripts/view_parquet.sh .
The script allows you to:
preview the parquet file
query the file with SQL commands
convert the parquet file to csv format etc.
Check the header of the script for usage instructions.
Formulation regionalization#
- Configuration for formulation regionalization is specified in config_general.yaml and config_formreg.yaml.
- The python module nwm_region_mgr.formreg contains functions for performing formulation regionalization.
- Formulation regionalization can be run independently, without requiring parameter regionalization. However, parameter regionalization requires formulation regionalization to be completed first.
- If calib_basins_only is set to True in the configuration file, only calibrated catchments will be assigned formulations. During parameter regionalization, donors will be selected for uncalibrated catchments without any formulation constraints, i.e., any calibrated catchment is eligible as a donor. Otherwise, if calib_basins_only is set to False, eligible donors will be limited to only those calibrated catchments that share the same formulation as the uncalibrated catchment.
- Currently, formulation regionalization relies on calibration/validation statistics only. In the future, additional criteria (e.g., physiographic similarity) may be incorporated into the formulation selection process.
Parameter regionalization#
Configuration for parameter regionalization is specified in config_general.yaml and config_parreg.yaml. The python module nwm_region_mgr.parreg contains functions for performing parameter regionalization. Currently, three attribute datasets are supported:
NextGen attributes (available for all domains)
Hydrologic Landscape Regions (HLR) attributes (only available for conus, ak, and hi domains)
StreamCat attributes (only available for conus)
Parameter regionalization is carried out separately for each individual VPU, to avoid potential memory issues and algorithm inefficiency.
Parameter regionalization requires formulation regionalization to be completed first. Hence, for each parameter regionalization run, the workflow will first check if the required outputs from formulation regionalization for the relevant VPUs already exist; if not, the workflow will run formulation regionalization for the relevant VPUs before proceeding with parameter regionalization.
Parameter regionalization for a given VPU may also rely on formulation-regionalization outputs from neighboring VPUs, depending on whether calibration basins from those VPUs fall within the buffer distance specified in the configuration.
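For intuition about the gower method referenced throughout this guide: over numeric attributes, Gower distance is the mean of range-normalized absolute differences, which is why attribute distances fall between 0 and 1. The following is a simplified sketch of that idea; the actual implementation in nwm_region_mgr may also handle categorical attributes and apply weights.

```python
def gower_distance(a, b, ranges):
    """Mean range-normalized absolute difference across numeric attributes.

    `a` and `b` are attribute vectors for two catchments; `ranges` gives
    (max - min) over all catchments for each attribute. Attributes with a
    zero range carry no information and are skipped.
    """
    terms = [abs(x - y) / r for x, y, r in zip(a, b, ranges) if r > 0]
    return sum(terms) / len(terms)
```

A smaller value means the two catchments are more similar, which is why a larger max_attr_dist (e.g., 0.3 instead of a smaller default) admits less-similar donors.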
Other notes#
AK domain regionalization#
Configuration for the AK domain is slightly different in a few fields:
- config_general.yaml: id_col.huc12 should be set to huc12 (vs. huc_12 for other domains)
- config_general.yaml: layer_name.huc12 should be set to WBDHU12 (vs. WBDSnapshot_National for other domains)
- config_formreg.yaml: huc12_hydrofabric_file should be set to '{static_data_dir}/region/NHDPlusV21/NHD_H_Alaska_State_GPKG.gpkg'
Output cleanup#
Each run of ngen over a VPU will generate many catchment and nexus csv files in the output folder (e.g., cat-*.csv,
nex-*.csv), which can take up a lot of storage space. It is recommended to clean up these intermediate files after each run of regionalization, unless you want to keep them for debugging or other purposes. The follow-up eval step only requires the t-route output file from NGEN simulations. The following bash script can be adapted to clean up the intermediate csv files while keeping the t-route files for evaluation. Note that you should run this script separately for each algorithm (e.g., gower and kmeans) if you have run parameter regionalization with multiple algorithms. Make sure to update the path in the cd command to point to the correct output folder for each algorithm.
# update with your VPU, run name, and algorithm
VPU=03S
RUN_NAME=test1
ALGORITHM=gower
cd outputs/ngen/regionalization/${RUN_NAME}_${ALGORITHM}/vpu_${VPU}
mv -f output output_backup
mkdir output
mv -f output_backup/t-route* output/
rm -rf output_backup
Manual pairings#
Manual pairings can be specified in the config file config_parreg.yaml to override the algorithm-based donor
selection process for certain receiver catchments. This can be useful when users want to enforce specific
donor-receiver pairs based on their expert knowledge or other considerations. To specify manual pairings,
set the field manual_pairs_file to point to a comma-delimited csv file containing the manual pairings, with
one of the following column pairs:
receiver_divide_id, donor_divide_id
receiver_divide_id, donor_gage_id
receiver_gage_id, donor_gage_id
receiver_gage_id, donor_divide_id
Each row in the file should be populated with exactly one valid receiver column and one valid donor column.
See sample files in nwm_region_mgr/data/inputs/region/manual_pairs/ for examples of formatting the manual pairings
file. The following examples are all valid formats for the manual pairings file:
receiver_divide_id,receiver_gage_id,donor_divide_id,donor_gage_id
cat-410687,,cat-423550,
cat-410688,,cat-423550,
cat-423248,,,023177483
,02207385,,02314500
,02217475,cat-412526,
receiver_divide_id,donor_divide_id
cat-410687,cat-423550
cat-410688,cat-423550
receiver_gage_id,donor_gage_id
02207385,02314500
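Since each row must contain exactly one receiver column and one donor column, a quick sanity check of a manual-pairings file can be sketched as follows. This is a hypothetical helper (not part of nwm_region_mgr) that assumes the column names listed above:

```python
import csv

RECEIVER_COLS = ("receiver_divide_id", "receiver_gage_id")
DONOR_COLS = ("donor_divide_id", "donor_gage_id")

def validate_manual_pairs(path):
    """Return the line numbers of rows that do not have exactly one
    non-empty receiver column and exactly one non-empty donor column."""
    bad_rows = []
    with open(path, newline="") as f:
        for lineno, row in enumerate(csv.DictReader(f), start=2):  # line 1 is the header
            n_recv = sum(bool(row.get(c, "").strip()) for c in RECEIVER_COLS)
            n_donor = sum(bool(row.get(c, "").strip()) for c in DONOR_COLS)
            if n_recv != 1 or n_donor != 1:
                bad_rows.append(lineno)
    return bad_rows  # an empty list means every row is well-formed
```

Because missing columns default to empty strings, the same check works for both the four-column and the two-column file formats shown above.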
Note that the specified manual pairings will be used to update the final donor-receiver pair files, for each algorithm
specified in general.algorithm_list in config_parreg.yaml. A distSpatial column will be added to indicate the
spatial distance between the donor and receiver catchments for each manual pair, and a tag column will be added to
indicate that these pairs are manually specified. The original pair and parameter files will be saved to backup files
with _original added to the filename. Specifically, the following files will be updated with manual pairings:
- pairs/pairs_[ALGORITHM]_[DOMAIN]_[VPU].parquet: the receiver catchment/divide vs donor catchment/divide pair file
- pairs/pairs_[ALGORITHM]_[DOMAIN]_[VPU]_mswm.csv: the donor gage vs receiver catchment/divide pair file required by MSWM in the ngen simulation step
- params/formulation_params_[ALGORITHM]_[DOMAIN]_[VPU].csv: the donor gage formulation and parameter file required by MSWM in the ngen simulation step
pairs/pairs_[ALGORITHM]_[DOMAIN]_[VPU].parquet: thereceiver catchment/dividevsdonor catchment/dividepair filepairs/pairs_[ALGORITHM]_[DOMAIN]_[VPU]_mswm.csv: thedonor gagevsreceiver catchment/dividepair file required by MSWM in the ngen simulation step.params/formulation_params_[ALGORITHM]_[DOMAIN]_[VPU].csv: the donor gage formulation and parameter file required by MSWM in the ngen simulation step.