API Reference

Classes

Data

class prms_python.Data(base_file=None, na_rep=-999)[source]

Object to access or create a PRMS data file with ability to load/assign it to a date-time indexed pandas.DataFrame for data management, analysis and visualization. It can be used to build a new PRMS data file from user defined metadata and a pandas.DataFrame of PRMS datetime-indexed climatic forcing and observation variables.

The class properties metadata and data_frame can be assigned later if no base_file is given on initialization, allowing for the creation of a PRMS climatic forcing file in a Python environment.

Keyword Arguments
  • base_file (str, optional) – path to standard PRMS data file

  • na_rep (int, optional) – how to represent missing values, default = -999

date_header

date and time header for PRMS data file

Type

list

valid_input_variables

valid hydro-climate variables for PRMS data file

Type

tuple

Note

If using the Data class to create a new data file, it is up to the user to ensure that the metadata and pandas.DataFrame assigned are correct and compatible.

property data_frame

A property that gets and sets the climatic forcing data for a standard PRMS climate input data file as a pandas.DataFrame.

Example

d is a Data instance, calling

>>> d.data_frame
    input variables runoff 1 runoff 2 runoff 3 precip tmax tmin
    date
    1996-12-27      0.54    1.6     NaN     0.0     46      32.0
    1996-12-28      0.65    1.6     NaN     0.0     45      24.0
    1996-12-29      0.80    1.6     NaN     0.0     44      28.0
    1996-12-30      0.90    1.6     NaN     0.0     51      33.0
    1996-12-31      1.00    1.7     NaN     0.0     47      32.0

shows the date-indexed pd.DataFrame of the input data that is created when a Data object is initiated if given a valid base_file, i.e. file path to a PRMS climate data file.

Raises
  • ValueError – if the attribute is accessed before data is available, i.e. no PRMS data file was given on Data initialization and no compatible date-indexed pandas.DataFrame of hydro-climate variables has been assigned.

  • TypeError – if a data type other than pandas.DataFrame is assigned.

property metadata

A property that gets and sets the header information from a standard PRMS climate input data file held in a Python dictionary. As a property it can be assigned directly to overwrite or create a new PRMS data file. As such the user is in control and must supply the correct syntax for PRMS standard data files, e.g. text lines before header should begin with “//”. Here is an example of the information gathered and held in this attribute:

Example

>>> data.metadata
    {
     'data_startline' : 6,
     'data_variables' : ['runoff 1', 'runoff 2', 'tmin', 'tmax', 'ppt'],
     'text_before_header' : "Title of data file \n //some comments\nrunoff 2
                             \ntmin 1\ntmax 1\nppt 1\nrunoff 2\ntmin 1
                             \ntmax 1\nppt 1\n
                             ########################################\n"
    }

Note

When assigning or creating a new data file, the Data.write method will assign the appropriate date header that follows the line of number signs “#”.

Raises
  • ValueError – if data in metadata is accessed before data is assigned, e.g. if accessed to write a PRMS data file from a Data instance that was initialized without a valid PRMS data file.

  • TypeError – if an object that is not a Python dictionary is assigned.

Type

dict
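
Because the user must supply correctly formatted header text when assigning metadata, a small helper can make the layout explicit. The sketch below is illustrative only: build_text_before_header is a hypothetical function, not part of PRMS-Python, and it assumes the layout shown in the example above (a title line, comment lines that begin with "//", one "name count" line per variable, and a closing line of number signs whose width of 40 is an assumption).

```python
def build_text_before_header(title, comments, variable_counts, rule_width=40):
    """Assemble header text for a PRMS data file (illustrative helper only)."""
    lines = [title]
    # Comment lines before the date header must begin with "//".
    lines.extend("//" + comment for comment in comments)
    # One "name count" line per input variable, e.g. "tmin 1".
    lines.extend("{} {}".format(name, count) for name, count in variable_counts)
    # A line of number signs closes the header block.
    lines.append("#" * rule_width)
    return "\n".join(lines) + "\n"


text = build_text_before_header(
    "Title of data file",
    ["some comments"],
    [("runoff", 2), ("tmin", 1), ("tmax", 1), ("ppt", 1)],
)
print(text)
```

The resulting string would be supplied as the 'text_before_header' entry of the dictionary assigned to Data.metadata.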

modify(func, vars_to_adjust)[source]

Apply a user defined function to one or more variable(s) in the data file.

The modify method allows for inplace modification of one or more time series inputs in the data file based on a user defined function.

Parameters
  • func (function) – function to apply to each variable in vars_to_adjust

  • vars_to_adjust (list or tuple) – collection of variable names to apply func to.

Returns

None

Example

Here is an example of loading a data file, modifying the temperature inputs (tmin and tmax) by adding two degrees to each element, and rewriting the modified data to disk:

>>> d = Data('path_to_data_file')
>>> def f(x):
        return x + 2
>>> d.modify(f,['tmax','tmin'])
>>> d.write('data_temp_plus_2')

write(out_path)[source]

Writes the current state of the Data to PRMS text format utilizing the Data.metadata and Data.data_frame instance properties. If Data.data_frame was never accessed or assigned new values then this method simply copies the original PRMS data file to out_path.

Parameters

out_path (str) – full path to save or copy the current PRMS data in PRMS text format.

Returns

None

Raises

ValueError – if the write method is called without assigning either an initial data (base_file) path or assigning correct metadata and data_frame properties.

Parameters

class prms_python.Parameters(base_file)[source]

Disk-based representation of a PRMS parameter file.

For the sake of memory efficiency, we only load parameters from base_file that get modified through item assignment or accessed directly. Internally, a reference is kept only to previously accessed parameter data, so when write is called most copying is from base_file directly. When parameters are accessed or modified using the dictionary-like syntax, a np.ndarray representation of the parameter is returned. As a result, numpy mathematical rules, including efficient vectorization of math applied to arrays, can be applied to modify parameters directly. The Parameters object's user methods allow for visualization of most PRMS parameters, function-based modification of parameters, and a write function that writes the data back to PRMS text format.

Parameters

base_file (str) – path to PRMS parameters file

base_file

path to PRMS parameters file

Type

str

base_file_reader

file handle of PRMS parameters file

Type

file

dimensions

dictionary with parameter dimensions as defined in parameters file loaded on initialization

Type

collections.OrderedDict

base_params

list of dictionaries of parameter metadata loaded on initialization e.g. name, dimension(s), data type, length of data array, and lines where data starts and ends in file

Type

list of dicts

param_arrays

dictionary with parameter names as keys and numpy.array and numpy.ndarray representations of parameter values as values. Initially empty, uses getter and setter functions.

Type

dict

Example

>>> p = Parameters('path/to/a/parameter/file')
>>> p['jh_coef'] = p['jh_coef']*1.1
>>> p.write('example_modified_params')

will read parameter information from the params file to check that jh_coef is present in the parameter file, read the lines corresponding to jh_coef data and assign the new value as requested. Calling the write method next will copy all parameters except jh_coef to the new parameter file and append the newly modified jh_coef to the end of the new file from the modified values stored in the parameter instance p.
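
The load-on-access behavior described above can be pictured with a small stand-in class. This is a minimal sketch under stated assumptions, not the PRMS-Python implementation: LazyParameterStore is a hypothetical name, an in-memory dictionary stands in for the parameter file on disk, and plain lists are used where Parameters returns np.ndarray objects.

```python
class LazyParameterStore:
    """Load values only when first accessed; keep modified values in memory."""

    def __init__(self, source):
        self._source = source   # stands in for the parameter file on disk
        self._loaded = {}       # only accessed or modified parameters live here

    def __getitem__(self, name):
        if name not in self._loaded:
            if name not in self._source:
                raise KeyError(name)
            # Copy so the "file" itself is never changed in place.
            self._loaded[name] = list(self._source[name])
        return self._loaded[name]

    def __setitem__(self, name, values):
        if name not in self._source:
            raise KeyError(name)
        self._loaded[name] = list(values)

    def write(self):
        """Return all parameters: untouched ones come straight from the source."""
        merged = {k: list(v) for k, v in self._source.items()
                  if k not in self._loaded}
        # Accessed or modified parameters are appended last.
        merged.update(self._loaded)
        return merged


store = LazyParameterStore({"jh_coef": [0.01, 0.02], "tmax_adj": [1.0]})
store["jh_coef"] = [v * 1.1 for v in store["jh_coef"]]
result = store.write()
print(list(result))  # untouched parameters first, then the modified one
```

Only jh_coef is ever held in memory here; tmax_adj is copied from the source at write time, which mirrors the ordering described for the modified parameter file.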

plot(nrows, which='all', out_dir=None, xlabel=None, ylabel=None, cbar_label=None, title=None, mpl_style=None)[source]

Versatile method that plots most parameters in a standard PRMS parameter file assuming the PRMS model was built on a uniform spatial grid.

Plots parameters as line plots for series or as 2D spatial grids, depending on parameter dimension. The PRMS parameter file is assumed to hold parameters for a model that was set up on a uniform rectangular grid with the spatial index of HRUs starting in the upper left corner and moving left to right across columns and down rows. The default behavior is to write four files, each holding plots of parameters of a given dimension, as explained under the keyword argument which and in more detail in the example Jupyter notebook.

Parameters

nrows (int) – The number of rows in the PRMS model grid for plotting spatial parameters. Will only work correctly for rectangular gridded models with HRU indices starting in the upper left cell moving left to right across columns and down across rows.

Keyword Arguments
  • which (str) – name of PRMS parameter to plot or ‘all’. If ‘all’ then the function will print 3 multipage pdfs, one for nhru dimensional parameters, one for nhru by monthly parameters, one for other parameters of length > 1, and one html file containing single valued parameters.

  • out_dir (str) – path to an output dir, default current directory

  • xlabel (str) – x label for plot(s)

  • ylabel (str) – y label for plot(s)

  • cbar_label (str) – label for colorbar on spatial plot(s)

  • title (str) – plot title

  • mpl_style (str, list) – name or list of names of matplotlib style sheets to use for plot(s).

Returns

None
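
The grid assumption behind nrows can be made concrete with a short sketch. It assumes only what is stated above: HRU values are stored as a flat sequence whose index starts in the upper left cell and runs left to right, then down. The helper to_grid is hypothetical and uses plain lists rather than the arrays the plot method works with.

```python
def to_grid(values, nrows):
    """Reshape a flat, row-major sequence of HRU values into nrows rows."""
    if nrows <= 0 or len(values) % nrows != 0:
        raise ValueError("number of values must be a multiple of nrows")
    ncols = len(values) // nrows
    # HRU 1 is the upper-left cell; indices run left to right, then down.
    return [list(values[r * ncols:(r + 1) * ncols]) for r in range(nrows)]


grid = to_grid([1, 2, 3, 4, 5, 6], nrows=2)
print(grid)  # [[1, 2, 3], [4, 5, 6]]
```

If nrows does not divide the number of HRUs evenly, the model is not on a rectangular grid in this sense and spatial plots would not be meaningful.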

Examples

If the plot method is called with the keyword argument which set to a parameter that has length one, i.e. is single valued, it will simply print out the value, e.g.:

>>> p = Parameters('path/to/parameters')
>>> p.plot(nrows=10, which='radj_sppt')
    radj_sppt is single valued with value: 0.4924942352224324

The default action, which is particularly useful, makes four files covering most PRMS parameters (three multi-page pdfs and one html file), where each file contains parameters of different dimensions, e.g.:

>>> p.plot(nrows=10, which='all', mpl_style='ggplot')

will produce the following four files named by parameters of certain dimensions:

>>> import os
>>> os.listdir(os.getcwd()) # list files in current directory
    nhru_param_maps.pdf
    nhru_by_nmonths_param_maps.pdf
    non_spatial_param_plots.pdf
    single_valued_params.html

write(out_name)[source]

Writes current state of Parameters to disk in PRMS text format

To reduce memory usage the write method copies parameters from the initial base_file parameter file for all parameters that were never modified.

Parameters

out_name (str) – path to write Parameters data to PRMS text format.

Returns

None

Simulation

class prms_python.Simulation(input_dir=None, simulation_dir=None)[source]

Class that runs and manages file structure for a single PRMS simulation.

The Simulation class provides low-level management of a PRMS simulation by copying model input files from the input_dir argument to an output directory, simulation_dir. The file structure for an individual simulation is simple: after the Simulation.run() method executes the PRMS model, two subdirectories, “inputs” and “outputs”, are created under simulation_dir and the respective input and output files from the current PRMS simulation are transferred there (see the examples below in Simulation.run()).

A Simulation instance checks that all required PRMS inputs (control, parameters, data) exist in the expected locations. If simulation_dir is provided and does not exist, it will be created. If it does exist it will be overwritten.

Keyword Arguments
  • input_dir (str) – path to directory that contains control, parameter, and data files for the simulation

  • simulation_dir (str) – directory path to bundle inputs and outputs

Example

see Simulation.run()

Raises

RuntimeError – if input directory does not contain a PRMS data, parameters, and control file.

classmethod from_data(data, parameters, control_path, simulation_dir)[source]

Create a Simulation from a Data and Parameter object, plus a path to the control file, and providing a simulation_dir where the simulation should be run.

Parameters
  • data (Data) – Data object for simulation

  • parameters (Parameters) – Parameters object for simulation

  • control_path (str) – path to control file

  • simulation_dir (str) – path to directory where simulations will be run and where input and output will be stored. If it exists it will be overwritten.

Returns

Simulation ready to be run using simulation_dir for inputs and outputs

Example

>>> d = Data('path_to_data_file')
>>> p = Parameters('path_to_parameters_file')
>>> c = 'path_to_control_file'
>>> sim_dir = 'path_to_create_simulation'
>>> sim = Simulation.from_data(d, p, c, sim_dir)
>>> sim.run()

Raises

TypeError – if data and parameters arguments are not of type Data and Parameters

run(prms_exec='prms')[source]

Run a Simulation instance using PRMS input files from input_dir and copy them to the Simulation file structure under simulation_dir if given; otherwise leave PRMS input and output files unstructured in input_dir.

This method runs a single PRMS simulation from a Simulation instance, waits until the process has completed and then transfers model input and output files to respective newly created directories. See example of the file structure that is created under different workflows of the run method below.

Keyword Arguments

prms_exec (str) – name of PRMS executable on $PATH or path to executable

Examples

If we create a Simulation instance by only assigning the input_dir argument and call its run method the model will be run in the input_dir and all model input and output files will remain in input_dir,

>>> import os
>>> input_dir = os.path.join(
                              'PRMS-Python',
                              'prms_python',
                              'models',
                              'lbcd'
                            )
>>> os.listdir(input_dir)
    ['data',
     'data_3deg_upshift',
     'parameters',
     'parameters_adjusted',
     'control']
>>> sim = Simulation(input_dir)
>>> sim.run()
>>> os.listdir(input_dir) # all input and outputs in input_dir
    ['data',
     'data_3deg_upshift',
     'parameters',
     'parameters_adjusted',
     'control',
     'statvar.dat',
     'prms_ic.out',
     'prms.out' ]

Instead if we assigned a path for simulation_dir keyword argument and then called run, i.e.

>>> sim = Simulation(input_dir, 'path_simulation')
>>> sim.run()

the file structure for the PRMS simulation created by Simulation.run() would be:

path_simulation
├── inputs
│   ├── control
│   ├── data
│   └── parameters
└── outputs
    ├── data_3deg_upshift
    ├── parameters_adjusted
    ├── prms_ic.out
    ├── prms.out
    └── statvar.dat

Note

As shown in the last example, currently the Simulation.run routine only recognizes the data, parameters, and control files as PRMS inputs; all other files found in input_dir before and after normal completion of the PRMS simulation will be transferred to simulation_dir/outputs/.
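
The sorting rule in this note can be expressed in a few lines. The sketch below is not the Simulation.run code; partition_files is a hypothetical helper, and it assumes the three inputs are literally named control, data, and parameters, as in the examples above.

```python
PRMS_INPUT_NAMES = ("control", "data", "parameters")


def partition_files(file_names):
    """Split file names into (inputs, outputs) following the rule in the note."""
    inputs = sorted(name for name in file_names if name in PRMS_INPUT_NAMES)
    # Anything that is not one of the three recognized inputs counts as output.
    outputs = sorted(name for name in file_names if name not in PRMS_INPUT_NAMES)
    return inputs, outputs


inputs, outputs = partition_files([
    "data", "data_3deg_upshift", "parameters", "parameters_adjusted",
    "control", "statvar.dat", "prms_ic.out", "prms.out",
])
print(inputs)
print(outputs)
```

This explains why data_3deg_upshift and parameters_adjusted end up under outputs in the example tree: they are not among the three recognized input names.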

SimulationSeries

class prms_python.SimulationSeries(simulations)[source]

Series of simulations all to be run through a common interface.

Utilizes the multiprocessing.Pool class to parallelize the execution of a series of PRMS simulations. SimulationSeries also allows the user to define the PRMS executable command, which is set to “prms” by default. It is best to add the prms executable to your $PATH environment variable. Each simulation that is run through SimulationSeries will follow the strict file structure as defined by Simulation.run(). This class is particularly useful for creating new programmatic workflows not yet provided by PRMS-Python.

Parameters

simulations (list or tuple) – list of Simulation objects to be run.

Example

Let's say you have already created a series of PRMS models by modifying the input climatic forcing data, e.g. you have 100 data files and you want to run each using the same control and parameters file. For simplicity let's say there is a directory that contains all 100 data files, e.g. data1, data2, … or whatever they are named, and nothing else. This example also assumes that you want each simulation to be run and stored in directories named after the data files as shown.

>>> import os
>>> data_dir = 'dir_that_contains_all_data_files'
>>> params = Parameters('path_to_parameter_file')
>>> control_path = 'path_to_control'
>>> # a list comprehension to make multiple simulations with
>>> # different data files, alternatively you could use a for loop
>>> sims = [
            Simulation.from_data(
                # os.listdir returns bare file names, so join with data_dir
                Data(os.path.join(data_dir, data_file)),
                params,
                control_path,
                simulation_dir='sim_{}'.format(data_file)
              )
            for data_file in os.listdir(data_dir)
            ]

Next we can use SimulationSeries to run all of these simulations in parallel. For example we may use 8 logical cores on a common desktop computer.

>>> sim_series = SimulationSeries(sims)
>>> sim_series.run(nproc=8)

The SimulationSeries.run() method will run all 100 simulations, where chunks of 8 at a time will be run in parallel. Inputs and outputs of each simulation will be sent to each simulation’s simulation_dir following the file structure of Simulation.run().

Note

The Simulation and SimulationSeries classes are low-level in that they alone do not create metadata for PRMS simulation scenarios. In other words they do not produce any additional files that may help the user know what differs between individual simulations.

outputs_iter()[source]

Return a generator of dictionaries with the path to the simulation_dir as well as paths to the statvar.dat output file, and data and parameters input files used in the simulation.

Yields

dict

dictionary of paths to simulation directory, input, and output files.

Example

>>> ser = SimulationSeries(simulations)
>>> ser.run()
>>> g = ser.outputs_iter()

Would return something like

>>> print(next(g))
    {
       'simulation_dir': 'path/to/sim/',
       'statvar': 'path/to/statvar',
       'data': 'path/to/data',
       'parameters': 'path/to/parameters'
     }
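
Since the method yields one dictionary per simulation, the results are usually consumed in a loop or comprehension rather than one item at a time. The sketch below uses a stand-in generator, fake_outputs_iter, with made-up paths; only the four dictionary keys are taken from the example above.

```python
def fake_outputs_iter():
    """Stand-in for SimulationSeries.outputs_iter(); paths are made up."""
    for i in (1, 2):
        sim_dir = "sim_data{}".format(i)
        yield {
            "simulation_dir": sim_dir,
            "statvar": sim_dir + "/outputs/statvar.dat",
            "data": sim_dir + "/inputs/data",
            "parameters": sim_dir + "/inputs/parameters",
        }


# Collect the statvar path of every simulation, keyed by its directory.
statvar_by_dir = {d["simulation_dir"]: d["statvar"] for d in fake_outputs_iter()}
print(statvar_by_dir)
```

With a real SimulationSeries, the same comprehension would iterate over ser.outputs_iter() instead of the stand-in.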
run(prms_exec='prms', nproc=None)[source]

Method to run multiple Simulation objects in parallel.

Keyword Arguments
  • prms_exec (str) – name of PRMS executable on $PATH or path to executable

  • nproc (int or None) – number of logical or physical processors for parallel execution of PRMS simulations.

Example

see SimulationSeries

Note

If nproc is not assigned, the default action is to use half of the available processors on the machine, as detected by the Python multiprocessing module.
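
For reference, "half of the available processors" can be computed as follows. This is a sketch of the stated default, not the library's code: default_nproc is a hypothetical helper, and both the floor division and the lower bound of one process are assumptions.

```python
import multiprocessing


def default_nproc():
    """Half of the processors reported by multiprocessing, but at least one."""
    return max(1, multiprocessing.cpu_count() // 2)


print(default_nproc())
```

Passing nproc explicitly avoids depending on how the default is rounded on machines with an odd processor count.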

Scenario

class prms_python.Scenario(base_dir, scenario_dir, title=None, description=None)[source]

Container for the process in which one modifies input parameters then runs a simulation while tracking metadata.

Metadata includes a title and description, if provided, plus start/end datetime, and parameter names of parameters that were modified including string representations of the Python modification functions that were applied to each parameter. The metadata file is in json format making it conveniently read as a Python dictionary.

Parameters
  • base_dir (str) – path to directory that contains initial control, parameter, and data files to be used for Scenario. The parameters file in base_dir will not be modified; instead it will be copied to scenario_dir and then modified.

  • scenario_dir (str) – directory path to bundle inputs and outputs

  • title (str, optional) – title of Scenario, if given will be added to Scenario.metadata attribute as well as the metadata.json file in scenario_dir written after calling the Scenario.build() and Scenario.run() methods.

  • description (str, optional) – description of Scenario; like title, it is also added to Scenario.metadata if given.

metadata

a dictionary-like class in prms_python.scenario that tracks Scenario and ScenarioSeries information including user-defined parameter modifications and descriptions, and file structure.

Type

scenario.ScenarioMetadata

Examples

This example is kept simple for clarity: we adjust a single PRMS parameter, tmin_lapse, using a single arbitrary mathematical function. We use the example PRMS model included with PRMS-Python:

>>> input_dir = 'PRMS-Python/test/data/models/lbcd'
>>> scenario_directory = 'scenario_testing'
>>> title = 'Scenario example'
>>> desc = 'adjust tmin_lapse using sine wave function'
>>> # create Scenario instance
>>> scenario_obj = Scenario(
          base_dir=input_dir,
          scenario_dir=scenario_directory,
          title=title,
          description=desc
        )

Next we need to build a dictionary that maps each parameter to modify, in this case tmin_lapse, to a modification function. Here we use a vectorized sine function:

>>> import numpy as np
>>> # build the modification function and dictionary
>>> def a_func(arr):
        return 4 + np.sin(np.linspace(0,2*np.pi,num=len(arr)))
>>> # make dictionary with parameter names as keys and modification
>>> # function as values
>>> param_mod_dic = dict(tmin_lapse=a_func)
>>> scenario_obj.build(param_mod_funs=param_mod_dic)

After building a Scenario instance the input files are copied to scenario_dir which was assigned ‘scenario_testing’:

scenario_testing
├── control
├── data
└── parameters

After calling build, the input files from input_dir are first copied to scenario_dir and then the functions in param_mod_dic are applied to the parameters named by the keys of param_mod_dic. To run the Scenario, use the run method:

>>> scenario_obj.run()

Now the simulation is run and the metadata.json file is created. The final file structure will be similar to this:

scenario_testing
├── inputs
│   ├── control
│   ├── data
│   └── parameters
├── metadata.json
└── outputs
    ├── prms_ic.out
    ├── prms.out
    └── statvar.dat

Finally, here is what is contained in metadata.json for this example, which is also reflected in Scenario.metadata:

>>> scenario_obj.metadata
    {
      'title': 'Scenario example',
      'description': 'adjust tmin_lapse using sine wave function',
      'start_datetime': '2018-09-01T19:20:21.723003',
      'end_datetime': '2018-09-01T19:20:31.117004',
      'mod_funs_dict': {
                         'tmin_lapse': 'def a_func(arr):
                                            return 4 + np.sin(np.linspace(0,2*np.pi,num=len(arr)))'
                       }
    }

As shown, the metadata retrieved the parameter modification function as a string representation of the exact Python function(s) used for modifying the user-defined parameter(s).
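
Because the metadata file is plain JSON, it can be read back with the standard library. The sketch below writes a small stand-in metadata.json to a temporary directory and loads it again; the file content is hypothetical and only modeled on the keys shown in the example, with a stub in place of the real function source.

```python
import json
import os
import tempfile

# Hypothetical content modeled on the keys shown in the example above.
example = {
    "title": "Scenario example",
    "description": "adjust tmin_lapse using sine wave function",
    "mod_funs_dict": {"tmin_lapse": "def a_func(arr):\n    return arr"},
}

with tempfile.TemporaryDirectory() as scenario_dir:
    path = os.path.join(scenario_dir, "metadata.json")
    with open(path, "w") as f:
        json.dump(example, f)

    # Reading the file back gives an ordinary Python dictionary.
    with open(path) as f:
        loaded = json.load(f)

# The keys of mod_funs_dict name the parameters that were modified.
modified_params = sorted(loaded["mod_funs_dict"])
print(loaded["title"], modified_params)
```

For a real scenario, path would point at the metadata.json file inside scenario_dir after Scenario.run() has completed.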

Note

The main differentiator between Scenario and ScenarioSeries is that Scenario is designed for modifying one or more parameters of a single parameters file whereas ScenarioSeries is designed for modifying and tracking the modification of one or more parameters in multiple PRMS parameters files, therefore resulting in multiple PRMS simulations.

build(param_mod_funs=None)[source]

Take a user-defined dictionary with param names as keys and Python functions as values, copy the original input files as given when initializing a Scenario instance to the scenario_dir, then apply the functions in the user-defined dictionary to the parameters there. The build method must be called before running the Scenario (calling Scenario.run()).

Keyword Arguments

param_mod_funs (dict) – dictionary with parameter names as keys and Python functions as values to apply to the names (key)

Returns

None

Example

see Scenario for a full example.

Note

If the scenario_dir that was assigned for the current instance already exists, it will be overwritten when build is invoked.

run(prms_exec='prms')[source]

Run the PRMS simulation for a built Scenario instance.

Keyword Arguments

prms_exec (str) – name of PRMS executable on $PATH or path to executable

Returns

None

Examples

see Scenario for full example

Raises

RuntimeError – if the Scenario.build() method has not yet been called.

ScenarioSeries

class prms_python.ScenarioSeries(base_dir, scenarios_dir, title=None, description=None)[source]

Create and manage a series of model runs where parameters are modified.

First initialize the series with an optional title and description. Then to build the series the user provides a list of dictionaries with parameter-function key-value pairs, and optionally a title and description for each dictionary defining the individual scenario.

The ScenarioSeries’ build method creates a file structure under the series directory (scenarios_dir) where each subdirectory is named with a uuid which can be later matched to its title using the metadata in scenarios_dir/series_metadata.json (see json). In the future we may add a way for the user to access the results of the scenario simulations directly through the ScenarioSeries instance, but for now the results are written to disk. Therefore each scenario’s title metadata can be used to refer to which parameters were modified and how for post-processing and analysis. One could also use the description metadata for this purpose.

Parameters
  • base_dir (str) – path to base inputs; ‘control’, ‘parameters’, and ‘data’ must be present there

  • scenarios_dir (str) – directory where scenario data will be written to; will be overwritten or created if it does not exist

Keyword Arguments
  • title (str, optional) – title of the ScenarioSeries instance

  • description (str, optional) – description of the ScenarioSeries instance

metadata

dictionary with title, description, and UUID map dictionary for individual Scenario output directories; the UUID dictionary (uuid_title_map) is left empty until calling ScenarioSeries.build().

Type

dict

scenarios

empty list that will be filled with Scenario objects after defining them by calling ScenarioSeries.build().

Type

list

Example

There are three steps to both Scenario and ScenarioSeries, first we initialize the object

>>> series = ScenarioSeries(
        base_dir='dir_with_input_files',
        scenarios_dir='dir_to_run_series',
        title='title_for_group_of_scenarios',
        description='description_for_scenarios'
    )

The next step is to “build” the ScenarioSeries by calling the ScenarioSeries.build() method which defines which parameters to modify, how to modify them, and then performs the modification which readies the series to be “run” (the last step). See the ScenarioSeries.build() method for the next step example.

Also see Scenario & ScenarioSeries for full example

build(scenarios_list)[source]

Build the scenarios from a list of scenario definitions in dictionary form.

Each element of scenarios_list can have any number of parameters as keys with a function for each value. The other two acceptable keys are title and description, which will be passed on to each individual Scenario’s metadata in series_metadata.json for future lookups. The build method also creates a file structure that uses UUID values as individual Scenario subdirectories as shown below.

Parameters

scenarios_list (list) – list of dictionaries with key-value pairs being parameter-function definition pairs or title-title string or description-description string.

Returns

None

Examples

Following the initialization of a ScenarioSeries instance as shown in the example there, we “build” the series by defining a list of parameter-name-keyed, function-valued dictionaries. This example uses arbitrary functions on two PRMS parameters, snowinfil_max and snow_adj:

>>> def _function1(x): #Note, function must start with underscore
        return x * 0.5
>>> def _function2(x):
        return x + 5
>>> dic1 = {'snowinfil_max': _function1, 'title': 'scenario1'}
>>> dic2 = {'snowinfil_max': _function2,
            'snow_adj': _function1,
            'title': 'scenario2',
            'description': 'we adjusted two snow parameters'
           }
>>> example_scenario_list = [dic1, dic2]
>>> # now we can build the series
>>> series.build(example_scenario_list)

Continuing from the ScenarioSeries example, the file structure that is created by the build method is as follows:

dir_to_run_series
├── 670d6352-2852-400a-997e-7b12ba34f0b0
│   ├── control
│   ├── data
│   └── parameters
├── base_inputs
│   ├── control
│   ├── data
│   └── parameters
├── ee9526a9-8fe6-4e88-b357-7dfd7111208a
│   ├── control
│   ├── data
│   └── parameters
└── series_metadata.json

As shown, the build method has copied the original inputs from the base_dir given on initialization of ScenarioSeries to a new subdirectory of the scenarios_dir; it also applied the modifications to the parameters for both scenarios above and moved the input files to their respective directories. At this stage the metadata will not have updated the UUID map dictionary to each scenario's subdirectory because they have not yet been run. See the ScenarioSeries.run() method for further explanation including the final file structure and metadata file contents.
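
Each scenario definition mixes two kinds of keys: the reserved title and description entries, and parameter names whose values are functions. The sketch below shows one way to separate them. It is illustrative only; split_scenario_definition is a hypothetical helper and does not reflect how ScenarioSeries.build() is written.

```python
RESERVED_KEYS = ("title", "description")


def split_scenario_definition(definition):
    """Return (title, description, parameter -> function mapping)."""
    param_funcs = {
        key: value for key, value in definition.items()
        if key not in RESERVED_KEYS
    }
    # Every remaining value is expected to be a modification function.
    for name, func in param_funcs.items():
        if not callable(func):
            raise TypeError("value for {!r} must be a function".format(name))
    return definition.get("title"), definition.get("description"), param_funcs


def _halve(x):
    return x * 0.5


title, description, funcs = split_scenario_definition(
    {"snowinfil_max": _halve, "title": "scenario1"}
)
print(title, description, sorted(funcs))
```

Checking definitions this way before calling build can catch a misspelled function name or a non-callable value early.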

classmethod from_parameters_iter(base_directory, parameters_iter, title=None, description=None)[source]

Alternative way to initialize and build a ScenarioSeries in one step.

Create and build a ScenarioSeries by including the param-keyed, function-valued dictionaries (parameters_iter) that are otherwise passed to ScenarioSeries.build().

Parameters
  • base_directory (str) – directory that contains model input files

  • parameters_iter (list of dicts) – list of dictionaries for each Scenario as described in Scenario and ScenarioSeries.build().

  • title (str) – title for group of scenarios

  • description (str) – description for group of scenarios

Returns

None

run(prms_exec='prms', nproc=None)[source]

Run a “built” ScenarioSeries and make final updates to file structure and metadata.

Keyword Arguments
  • prms_exec (str) – name of PRMS executable on $PATH or path to executable. Default = ‘prms’

  • nproc (int or None) – number of processors available to parallelize PRMS simulations; if None (default) then use half of what the multiprocessing module detects on the machine.

Returns

None

Examples

This example starts where the example ends in ScenarioSeries.build(). Calling run will run the models for all scenarios and then update the file structure as well as create individual Scenario metadata files as such:

dir_to_run_series
├── 5498c21d-d064-45f4-9912-044734fd230e
│   ├── inputs
│   │   ├── control
│   │   ├── data
│   │   └── parameters
│   ├── metadata.json
│   └── outputs
│       ├── prms_ic.out
│       ├── prms.out
│       └── statvar.dat
├── 9d28ec5a-b570-4abb-8000-8dac113cbed3
│   ├── inputs
│   │   ├── control
│   │   ├── data
│   │   └── parameters
│   ├── metadata.json
│   └── outputs
│       ├── prms_ic.out
│       ├── prms.out
│       └── statvar.dat
├── base_inputs
│   ├── control
│   ├── data
│   └── parameters
└── series_metadata.json

As we can see the file structure follows the combined structures as defined by Simulation and Scenario. The content of the top-level metadata file series_metadata.json is as such:

{
  "title": "title_for_group_of_scenarios",
  "description": "description_for_scenarios",
  "uuid_title_map": {
    "5498c21d-d064-45f4-9912-044734fd230e": "scenario1",
    "9d28ec5a-b570-4abb-8000-8dac113cbed3": "scenario2"
  }
}

Therefore one can use the json file to map between UUIDs and individual scenario titles. The json files are read as Python dictionaries, which makes them particularly convenient. The contents of an individual scenario's metadata.json file include a string representation of the function(s) that were applied to the parameter(s):

{
    "description": null,
    "end_datetime": "2018-09-03T00:00:40.793817",
    "mod_funs_dict": {
        "snowinfil_max": "def _function1(x):
                              return x * 0.5"
    },
    "start_datetime": "2018-09-03T00:00:30.421353",
    "title": "scenario1"
}

Note

As shown, it is important to give appropriate scenario titles when building a ScenarioSeries dictionary in order to later understand how parameters were modified in each scenario. If not one would have to rely on the individual metadata.json files in each scenario directory which may be more cumbersome.
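
Looking up a scenario directory by its title amounts to inverting uuid_title_map. The sketch below parses the series_metadata.json content shown above from a string; in practice the same dictionary would come from reading the file in scenarios_dir. It assumes scenario titles are unique, since duplicate titles would overwrite each other in the inverted map.

```python
import json

series_metadata = json.loads("""
{
  "title": "title_for_group_of_scenarios",
  "description": "description_for_scenarios",
  "uuid_title_map": {
    "5498c21d-d064-45f4-9912-044734fd230e": "scenario1",
    "9d28ec5a-b570-4abb-8000-8dac113cbed3": "scenario2"
  }
}
""")

# Invert the map so a scenario title leads to its UUID-named subdirectory.
title_to_uuid = {
    title: uuid for uuid, title in series_metadata["uuid_title_map"].items()
}
print(title_to_uuid["scenario2"])
```

The UUID found this way is the name of the subdirectory of scenarios_dir that holds that scenario's inputs, outputs, and metadata.json.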

Optimizer

class prms_python.Optimizer(parameters, data, control_file, working_dir, title, description=None)[source]

Container for PRMS parameter optimization and related routines.

Currently the monte_carlo method provides random parameter resampling routines using uniform and normal random variables.

Example

>>> from prms_python import Data, Optimizer, Parameters
>>> params = Parameters('path/to/parameters')
>>> data = Data('path/to/data')
>>> control = 'path/to/control'
>>> work_directory = 'path/to/create/simulations'
>>> optr = Optimizer(
...     params,
...     data,
...     control,
...     work_directory,
...     title='the title',
...     description='desc')
>>> measured = 'path/to/measured/csv'
>>> statvar_name = 'basin_cfs' # or any other valid statvar
>>> params_to_resample = ['dday_intcp', 'dday_slope'] # list of params
>>> stage = 'ddsolrad' # custom name for this optimization stage
>>> optr.monte_carlo(measured, params_to_resample, statvar_name, stage)

monte_carlo(reference_path, param_names, statvar_name, stage, n_sims=10, method='uniform', mu_factor=1, noise_factor=0.1, nproc=None)[source]

The monte_carlo method of Optimizer applies random resampling techniques to a set of PRMS parameters and executes and manages the corresponding simulations.

Parameters
  • reference_path (str) – path to measured data for optimization

  • param_names (list) – list of parameter names to resample

  • statvar_name (str) – name of the statistical variable output to use for optimization

  • stage (str) – custom name of optimization stage e.g. ‘ddsolrad’

Keyword Arguments
  • n_sims (int) – number of simulations to conduct for parameter optimization/uncertainty analysis.

  • method (str) – resampling method for parameters (normal or uniform)

  • mu_factor (float) – coefficient to scale mean of the parameter(s) to resample from when using the normal distribution to resample i.e. a value of 1.5 will sample from a normal rv with mean 50% higher than the original parameter mean

  • noise_factor (float) – scales the variance of noise to add to parameter values when using normal rv (method=’normal’)

  • nproc (int) – number of processors available to run PRMS simulations

Returns

None

plot_optimization(freq='daily', method='time_series', plot_vars='both', plot_1to1=True, return_fig=False, n_plots=4)[source]

Basic plotting of current optimization results with limited options. Plots measured, original simulated, and optimization simulated variables. Not recommended for plotting results when n_sims is very large; instead use options from an OptimizationResult object, or employ a user-defined method using the result data.

Keyword Arguments
  • freq (str) – frequency of time series plots, value can be ‘daily’ or ‘monthly’ for solar radiation

  • method (str) – ‘time_series’ for time series subplots of each simulation alongside measured radiation. Another choice is ‘correlation’, which plots each measured daily solar radiation value versus the corresponding simulated variable as subplots, one for each simulation in the optimization, annotated with coefficients of determination, i.e. the square of the Pearson correlation coefficient.

  • plot_vars (str) – what to plot alongside simulated srad: ‘meas’ plots simulated along with measured swrad; ‘orig’ plots simulated along with the original simulated swrad; ‘both’ plots simulated with the original simulation and measured.

  • plot_1to1 (bool) – if True plot one to one line on correlation scatter plot, otherwise exclude.

  • return_fig (bool) – flag whether to return matplotlib figure

Returns

If kwarg return_fig=True, then return a copy of the generated figure to the user.

Return type

f (matplotlib.figure.Figure)

OptimizationResult

class prms_python.OptimizationResult(working_dir, stage)[source]

The OptimizationResult object serves to collect and manage output from an Optimizer method. Upon initialization with a working directory and the optimization stage that was used when running the Optimizer method, e.g. monte_carlo, the class gathers all JSON metadata that was produced for the given stage. The OptimizationResult has three main user methods. The first, result_table, returns the top n simulations according to four model performance metrics as calculated against measured data: Nash-Sutcliffe efficiency (NSE), root-mean-square error (RMSE), percent bias (PBIAS), and the coefficient of determination (COEF_DET). For example the table may look like:

>>> ddsolrad_res = OptimizationResult(work_directory, stage=stage)
>>> top10 = ddsolrad_res.result_table(freq='monthly',top_n=10)
>>> top10
    ========================  ========  =======  =========   ========
    ddsolrad parameters       NSE       RMSE     PBIAS       COEF_DET
    ========================  ========  =======  =========   ========
    orig_params               0.956267  39.4725  -0.885715   0.963116
    tmax_index_54.2224631748  0.921626  47.6092  -0.849256   0.94402
    tmax_index_44.8823940703  0.879965  58.9194  5.79603     0.922021
    tmax_index_47.6835387480  0.764133  82.5918  -4.78896    0.837582
    ========================  ========  =======  =========   ========

The second, get_top_ranked_sims, returns a dictionary that maps key information about the top n ranked simulations. An example returned dictionary may look like:

>>> {
      'dir_name' : ['pathToSim1', 'pathToSim2'],
      'param_path' : ['pathToSim1/input/parameters', 'pathToSim2/input/parameters'],
      'statvar_path' : ['pathToSim1/output/statvar.dat', 'pathToSim2/output/statvar.dat'],
      'params_adjusted' : [[param_names_sim1], [param_names_sim2]]
    }
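Because each key maps to a list, it can be convenient to regroup the dictionary into one record per simulation. The sketch below assumes the lists are aligned, so that index i in every list refers to the same simulation; the dictionary is a hypothetical return value in the shape shown above, with made-up entries for params_adjusted.

```python
# Hypothetical return value of get_top_ranked_sims, in the shape shown above.
top_sims = {
    "dir_name": ["pathToSim1", "pathToSim2"],
    "param_path": ["pathToSim1/input/parameters", "pathToSim2/input/parameters"],
    "statvar_path": ["pathToSim1/output/statvar.dat", "pathToSim2/output/statvar.dat"],
    "params_adjusted": [["dday_intcp"], ["dday_intcp", "dday_slope"]],
}

# Assumption: the lists are aligned by simulation, so zipping them row-wise
# yields one dictionary per simulation.
records = [
    dict(zip(top_sims.keys(), values))
    for values in zip(*top_sims.values())
]

for position, record in enumerate(records, start=1):
    print(position, record["dir_name"], record["params_adjusted"])
```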

The third method of OptimizationResult is archive, which opens all parameter and statvar files from each simulation of the given stage and archives the parameters that were modified, their modified values, and the statistical variable (PRMS time series output) associated with the optimization stage. Other Optimizer simulation metadata is also gathered, and new JSON metadata containing only this information is written to a newly created “archived” subdirectory within the same directory in which the Optimizer routine managed simulations. By default (remove_sims=True), the OptimizationResult.archive method then recursively deletes the simulation data for the given stage.

archive(remove_sims=True, remove_meta=False, metric_freq='daily')[source]

Create archive directory to hold json files that contain information of adjusted parameters, model output, and performance metrics for each Optimizer simulation of the OptimizationResult.stage in the OptimizationResult.working_dir.

Keyword Arguments
  • remove_sims (bool) – If True recursively delete all folders and files associated with original simulations of the OptimizationResult.stage in the OptimizationResult.working_dir, if False do not delete simulations.

  • remove_meta (bool) – Whether to delete original Optimizer JSON metadata files, default is False.

  • metric_freq (str) – Frequency of output metric computation for recording of model performance. Can be ‘daily’ (default) or ‘monthly’. Note that other results can be computed later from the archived results.

Returns

None

Select helper functions

load_data

prms_python.load_data(data_file)[source]

Read the data file and load into a datetime indexed Pandas dataframe object.

Parameters

data_file (str) – data file path

Returns

Pandas dataframe of input time series data from data file with datetime index

Return type

df (pandas.DataFrame)

load_statvar

prms_python.load_statvar(statvar_file)[source]

Read the statvar file and load into a datetime indexed Pandas dataframe object.

Parameters

statvar_file (str) – statvar file path

Returns

(pandas.DataFrame) Pandas DataFrame of PRMS variables from the statvar file, date indexed

modify_params

prms_python.modify_params(params_in, params_out, param_mods=None)[source]

Given a parameter file in and a dictionary of param_mods, write modified parameters to params_out.

Parameters
  • params_in (str) – location on disk of the base parameter file

  • params_out (str) – location on disk where the modified parameters will be written

Keyword Arguments

param_mods (dict) – param name-keyed, param modification function-valued

Returns

None

Example

Below we modify the monthly jh_coef parameter by increasing it 10% for every month,

>>> params_in = 'models/lbcd/parameters'
>>> params_out = 'scenarios/jh_coef_1.1/params'
>>> scale_10pct = lambda x: x * 1.1
>>> modify_params(params_in, params_out, {'jh_coef': scale_10pct})

So param_mods is a dictionary with keys being parameter names and values being functions that operate on a single value. Currently we only accept functions that operate without reference to any other parameters. The function will be applied to every cell, month, or cascade routing rule for which the parameter is defined.
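The element-wise behaviour of param_mods can be illustrated without a parameter file. In the sketch below a plain dictionary of lists stands in for the parsed parameters, and apply_param_mods is a hypothetical helper written for this example; it is not how modify_params is implemented and does no file I/O.

```python
def apply_param_mods(params, param_mods=None):
    """Return a copy of params with each modification function applied
    to every value of the parameter it is keyed by."""
    param_mods = param_mods or {}
    modified = {}
    for name, values in params.items():
        func = param_mods.get(name)
        if func is None:
            # No modification requested: copy the values unchanged.
            modified[name] = list(values)
        else:
            # The function only ever sees one value at a time.
            modified[name] = [func(value) for value in values]
    return modified


# Hypothetical parameter values, for illustration only.
params = {"jh_coef": [0.01, 0.02, 0.03], "snowinfil_max": [2.0, 2.0]}
scale_10pct = lambda x: x * 1.1

modified = apply_param_mods(params, {"jh_coef": scale_10pct})
print(modified["jh_coef"])
print(modified["snowinfil_max"])
```

Since each function receives a single value, a modification that depends on another parameter or on the position of the value cannot be expressed this way, which matches the restriction stated above.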

resample_param

prms_python.optimizer.resample_param(params, param_name, how='uniform', mu_factor=1, noise_factor=0.1)[source]

Resample a PRMS parameter by shifting all values by a constant taken from a uniform distribution, where the range of the uniform values is equal to the difference between the min and max of the allowable range. The parameter min and max are set in Optimizer.param_ranges. If the resampling method (how argument) is set to ‘normal’, randomly sample a normal distribution with mean = mean(parameter) × mu_factor and sigma = parameter allowable range × noise_factor. If a parameter has array length <= 366, individual parameter values are resampled; otherwise all parameter values are resampled at once, e.g. by taking a single random value from the uniform distribution. If all values are resampled at once using the normal method, the original values are scaled by mu_factor and a normal random variable with mean = 0 and standard deviation = parameter range × noise_factor is added.

Parameters
Keyword Arguments
  • how (str) – distribution to resample parameters from in the case that each parameter element can be resampled (len <= 366). Currently works for uniform and normal distributions.

  • noise_factor (float) – factor to multiply the parameter range by; the result is used as the standard deviation of the normal random variable that adds element-wise noise, i.e. a higher noise_factor results in higher variance. Must be > 0.

Returns

ndarray of param after resampling

Return type

ret (numpy.ndarray)

Raises
  • KeyError – if param_name not a valid parameter name

  • ValueError – if the parameter range has not been set in Optimizer.param_ranges
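The ‘normal’ case for short parameter arrays (length <= 366) can be sketched with the standard library alone. This is one reading of the description above, not the library's implementation: resample_normal_sketch and its param_range argument are hypothetical, the allowable range is passed in directly instead of being looked up in Optimizer.param_ranges, and whether resampled values are clipped to that range is not stated in the documentation, so no clipping is done here.

```python
import random
import statistics


def resample_normal_sketch(values, param_range, mu_factor=1.0,
                           noise_factor=0.1, rng=None):
    """Draw one new value per element from a normal distribution with
    mean = mean(values) * mu_factor and
    sigma = (max - min of allowable range) * noise_factor."""
    if noise_factor <= 0:
        raise ValueError("noise_factor must be > 0")
    rng = rng or random.Random()
    low, high = param_range
    mu = statistics.mean(values) * mu_factor
    sigma = (high - low) * noise_factor
    # One independent draw per original element (assumed behaviour).
    return [rng.gauss(mu, sigma) for _ in values]


# Hypothetical monthly-style parameter with an allowable range of 0 to 100.
rng = random.Random(42)  # seeded only to make the example repeatable
resampled = resample_normal_sketch(
    [10.0, 20.0, 30.0], (0.0, 100.0), mu_factor=1.5, rng=rng
)
print(resampled)
```

With mu_factor=1.5 the draws are centred on 30, which is 50% above the original mean of 20, and with the default noise_factor of 0.1 the standard deviation is 10.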

nash_sutcliffe

prms_python.nash_sutcliffe(observed, modeled)[source]

Calculates the Nash-Sutcliffe Goodness-of-fit
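For reference, the standard definition of the Nash-Sutcliffe efficiency is NSE = 1 − Σ(obs − mod)² / Σ(obs − mean(obs))². The sketch below implements that formula for plain Python sequences; nash_sutcliffe_sketch is a hypothetical name, and how prms_python.nash_sutcliffe treats missing values or pandas inputs is not covered here.

```python
def nash_sutcliffe_sketch(observed, modeled):
    """Standard Nash-Sutcliffe efficiency for two equal-length sequences."""
    if len(observed) != len(modeled):
        raise ValueError("observed and modeled must have the same length")
    mean_obs = sum(observed) / len(observed)
    # Squared model error, summed over all time steps.
    numerator = sum((o - m) ** 2 for o, m in zip(observed, modeled))
    # Variance of the observations about their mean (unnormalized).
    denominator = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - numerator / denominator


observed = [1.0, 2.0, 3.0, 4.0]
perfect = nash_sutcliffe_sketch(observed, observed)
mean_only = nash_sutcliffe_sketch(observed, [2.5, 2.5, 2.5, 2.5])
print(perfect, mean_only)
```

A perfect model scores 1, while a model that always predicts the observed mean scores 0; values below 0 mean the model performs worse than that mean.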

Parameters

Indices and tables