API Reference
Classes
Data
-
class prms_python.Data(base_file=None, na_rep=-999)
Object to access or create a PRMS data file, with the ability to load it into, or assign it from, a date-time indexed pandas.DataFrame for data management, analysis, and visualization. It can be used to build a new PRMS data file from user-defined metadata and a pandas.DataFrame of PRMS datetime-indexed climatic forcing and observation variables.
The class properties metadata and data_frame can be assigned later if no base_file is given on initialization, allowing for the creation of a PRMS climatic forcing file in a Python environment.
- Keyword Arguments
Note
If using the Data class to create a new data file, it is up to the user to ensure that the metadata and pandas.DataFrame assigned are correct and compatible.
-
property data_frame
A property that gets and sets the climatic forcing data for a standard PRMS climate input data file as a pandas.DataFrame.
Example
If d is a Data instance, calling
>>> d.data_frame
input variables  runoff 1  runoff 2  runoff 3  precip  tmax  tmin
date
1996-12-27           0.54       1.6       NaN     0.0    46  32.0
1996-12-28           0.65       1.6       NaN     0.0    45  24.0
1996-12-29           0.80       1.6       NaN     0.0    44  28.0
1996-12-30           0.90       1.6       NaN     0.0    51  33.0
1996-12-31           1.00       1.7       NaN     0.0    47  32.0
shows the date-indexed pandas.DataFrame of the input data that is created when a Data object is initialized with a valid base_file, i.e. a file path to a PRMS climate data file.
- Raises
ValueError – if the attribute is accessed before either assigning a PRMS data file on Data initialization or assigning a compatible date-indexed pandas.DataFrame of hydro-climate variables.
TypeError – if a data type other than pandas.DataFrame is assigned.
-
property metadata
A property that gets and sets the header information from a standard PRMS climate input data file, held in a Python dictionary. As a property it can be assigned directly to overwrite or create a new PRMS data file. The user is therefore in control and must supply the correct syntax for PRMS standard data files, e.g. text lines before the header should begin with “//”. Here is an example of the information gathered and held in this attribute:
Example
>>> data.metadata
{
    'data_startline' : 6,
    'data_variables' : ['runoff 1', 'runoff 2', 'tmin', 'tmax', 'ppt'],
    'text_before_header' : "Title of data file \n //some comments\nrunoff 2 \ntmin 1\ntmax 1\nppt 1\nrunoff 2\ntmin 1 \ntmax 1\nppt 1\n ########################################\n"
}
Note
When assigning or creating a new data file, the Data.write method will assign the appropriate date header that follows the line of number signs “#”.
- Raises
ValueError – if metadata is accessed before data is assigned, e.g. if accessed to write a PRMS data file from a Data instance that was initialized without a valid PRMS data file.
TypeError – if an object that is not a Python dictionary is assigned.
- Type
-
modify(func, vars_to_adjust)
Apply a user-defined function to one or more variables in the data file.
The modify method allows for in-place modification of one or more time series inputs in the data file based on a user-defined function.
- Parameters
- Returns
None
Example
Here is an example of loading a data file, modifying the temperature inputs (tmin and tmax) by adding two degrees to each element, and rewriting the modified data to disk:
>>> d = Data('path_to_data_file')
>>> def f(x):
...     return x + 2
>>> d.modify(f, ['tmax', 'tmin'])
>>> d.write('data_temp_plus_2')
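The idea behind modify, applying one function element-wise to a chosen subset of variables while leaving the rest untouched, can be sketched without PRMS-Python installed. The snippet below is only an illustration: the columns dictionary and the modify_columns helper are hypothetical stand-ins for Data.data_frame and Data.modify, not the library's implementation.

```python
def modify_columns(columns, func, vars_to_adjust):
    """Apply func to every value of the named variables, in place."""
    for name in vars_to_adjust:
        if name not in columns:
            raise KeyError("unknown variable: {}".format(name))
        columns[name] = [func(value) for value in columns[name]]


def add_two(x):
    return x + 2


# Hypothetical stand-in for a small PRMS data table (variable -> daily values).
columns = {
    "tmax": [46, 45, 44],
    "tmin": [32.0, 24.0, 28.0],
    "precip": [0.0, 0.0, 0.0],
}
modify_columns(columns, add_two, ["tmax", "tmin"])
print(columns["tmax"], columns["tmin"], columns["precip"])
```

Only the variables named in the list change; precip is left as it was.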
-
write(out_path)
Writes the current state of the Data object to PRMS text format using the Data.metadata and Data.data_frame instance properties. If Data.data_frame was never accessed or assigned new values, this method simply copies the original PRMS data file to out_path.
- Parameters
out_path (str) – full path to save or copy the current PRMS data in PRMS text format.
- Returns
None
- Raises
ValueError – if the write method is called without assigning either an initial data file path (base_file) or correct metadata and data_frame properties.
Parameters
-
class prms_python.Parameters(base_file)
Disk-based representation of a PRMS parameter file.
For the sake of memory efficiency, only parameters from base_file that are modified through item assignment or accessed directly are loaded. Internally, a reference is kept only to previously accessed parameter data, so when write is called most copying is done from base_file directly. When parameters are accessed or modified using the dictionary-like syntax, a np.ndarray representation of the parameter is returned. As a result, numpy mathematical rules, including efficient vectorization of math applied to arrays, can be used to modify parameters directly. The Parameters object's user methods allow for visualization of most PRMS parameters, function-based modification of parameters, and a write function that writes the data back to PRMS text format.
- Parameters
base_file (str) – path to PRMS parameters file
-
base_file_reader
file handle of PRMS parameters file
- Type
file
-
dimensions
dictionary with parameter dimensions as defined in the parameters file loaded on initialization
-
base_params
list of dictionaries of parameter metadata loaded on initialization, e.g. name, dimension(s), data type, length of data array, and lines where data starts and ends in file
- Type
list of dicts
-
param_arrays
dictionary with parameter names as keys and numpy.array and numpy.ndarray representations of parameter values as values. Initially empty, uses getter and setter functions.
- Type
Example
>>> p = Parameters('path/to/a/parameter/file')
>>> p['jh_coef'] = p['jh_coef']*1.1
>>> p.write('example_modified_params')
will read parameter information from the params file to check that jh_coef is present in the parameter file, read the lines corresponding to jh_coef data, and assign the new value as requested. Calling the write method next will copy all parameters except jh_coef to the new parameter file and append the newly modified jh_coef to the end of the new file from the modified values stored in the parameter instance p.
-
plot(nrows, which='all', out_dir=None, xlabel=None, ylabel=None, cbar_label=None, title=None, mpl_style=None)
Versatile method that plots most parameters in a standard PRMS parameter file, assuming the PRMS model was built on a uniform spatial grid.
Plots parameters as line plots for series or as 2D spatial grids, depending on parameter dimension. The PRMS parameter file is assumed to hold parameters for a model that was set up on a uniform rectangular grid with the spatial index of HRUs starting in the upper left corner and moving left to right across columns and down rows. The default behavior is to write four files, each with plots of parameters of a given dimension, as explained under the keyword argument which and in more detail in the example Jupyter notebook.
- Parameters
nrows (int) – The number of rows in the PRMS model grid for plotting spatial parameters. Will only work correctly for rectangular gridded models with HRU indices starting in the upper left cell moving left to right across columns and down across rows.
- Keyword Arguments
which (str) – name of PRMS parameter to plot or ‘all’. If ‘all’ then the function will print 3 multipage pdfs, one for nhru dimensional parameters, one for nhru by monthly parameters, one for other parameters of length > 1, and one html file containing single valued parameters.
out_dir (str) – path to an output dir, default current directory
xlabel (str) – x label for plot(s)
ylabel (str) – y label for plot(s)
cbar_label (str) – label for colorbar on spatial plot(s)
title (str) – plot title
mpl_style (str, list) – name or list of names of matplotlib style sheets to use for plot(s).
- Returns
None
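The grid assumption used by plot (HRU indices starting in the upper left cell, moving left to right across columns and then down rows) is a row-major layout. A minimal sketch of that mapping follows; hru_values_to_grid is a hypothetical helper written for illustration and is not part of PRMS-Python.

```python
def hru_values_to_grid(values, nrows):
    """Reshape a flat list of per-HRU values into nrows rows (row-major)."""
    if nrows <= 0 or len(values) % nrows != 0:
        raise ValueError("number of HRUs must be a multiple of nrows")
    ncols = len(values) // nrows
    # Row r holds HRUs r*ncols .. (r+1)*ncols - 1, counted from the top.
    return [values[r * ncols:(r + 1) * ncols] for r in range(nrows)]


# Twelve HRUs on a 3-row grid: HRU 1 is the upper left cell.
grid = hru_values_to_grid(list(range(1, 13)), nrows=3)
for row in grid:
    print(row)
```

If the model's HRUs are numbered in any other order, a reshape like this would scramble the map, which is why the method is described as working correctly only for rectangular gridded models.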
Examples
If the plot method is called with the keyword argument which set to a parameter that has length one, i.e. is single valued, it will simply print out the value, e.g.:
>>> p = Parameters('path/to/parameters')
>>> p.plot(nrows=10, which='radj_sppt')
radj_sppt is single valued with value: 0.4924942352224324
The default action, which is particularly useful, writes four files covering most PRMS parameters (three multi-page PDFs and one HTML file), where each file contains parameters of different dimensions, e.g.:
>>> p.plot(nrows=10, which='all', mpl_style='ggplot')
will produce the following four files named by parameters of certain dimensions:
>>> import os
>>> os.listdir(os.getcwd()) # list files in current directory
nhru_param_maps.pdf
nhru_by_nmonths_param_maps.pdf
non_spatial_param_plots.pdf
single_valued_params.html
-
write(out_name)
Writes the current state of Parameters to disk in PRMS text format.
To reduce memory usage, the write method copies parameters from the initial base_file parameter file for all parameters that were never modified.
- Parameters
out_name (str) – path to write Parameters data to PRMS text format.
- Returns
None
Simulation
-
class prms_python.Simulation(input_dir=None, simulation_dir=None)
Class that runs and manages file structure for a single PRMS simulation.
The Simulation class provides low-level management of a PRMS simulation by copying model input files from the input_dir argument to an output directory, simulation_dir. The file structure for an individual simulation is simple: after the Simulation.run() method is called, which executes the PRMS model, two subdirectories “inputs” and “outputs” are created under simulation_dir and the respective input and output files from the current PRMS simulation are transferred there (see examples below in Simulation.run()).
A Simulation instance checks that all required PRMS inputs (control, parameters, data) exist in the expected locations. If simulation_dir is provided and does not exist, it will be created. If it does exist, it will be overwritten.
- Keyword Arguments
Example
see Simulation.run()
- Raises
RuntimeError – if the input directory does not contain a PRMS data, parameters, and control file.
-
classmethod from_data(data, parameters, control_path, simulation_dir)
Create a Simulation from a Data and a Parameters object, plus a path to the control file and a simulation_dir where the simulation should be run.
- Parameters
data (Data) – Data object for simulation
parameters (Parameters) – Parameters object for simulation
control_path (str) – path to control file
simulation_dir (str) – path to directory where simulations will be run and where input and output will be stored. If it exists it will be overwritten.
- Returns
Simulation ready to be run using simulation_dir for inputs and outputs
Example
>>> d = Data('path_to_data_file')
>>> p = Parameters('path_to_parameters_file')
>>> c = 'path_to_control_file'
>>> sim_dir = 'path_to_create_simulation'
>>> sim = Simulation.from_data(d, p, c, sim_dir)
>>> sim.run()
- Raises
TypeError – if data and parameters arguments are not of type Data and Parameters
run(prms_exec='prms')
Run a Simulation instance using PRMS input files from input_dir and copy to the Simulation file structure under simulation_dir if given; otherwise leave PRMS input and output unstructured in input_dir.
This method runs a single PRMS simulation from a Simulation instance, waits until the process has completed, and then transfers model input and output files to their respective newly created directories. See examples of the file structure that is created under different workflows of the run method below.
- Keyword Arguments
prms_exec (str) – name of PRMS executable on $PATH or path to executable
Examples
If we create a Simulation instance by only assigning the input_dir argument and call its run method, the model will be run in the input_dir and all model input and output files will remain in input_dir:
>>> import os
>>> input_dir = os.path.join(
...     'PRMS-Python',
...     'prms_python',
...     'models',
...     'lbcd'
... )
>>> os.listdir(input_dir)
['data', 'data_3deg_upshift', 'parameters', 'parameters_adjusted', 'control']
>>> sim = Simulation(input_dir)
>>> sim.run()
>>> os.listdir(input_dir) # all input and outputs in input_dir
['data', 'data_3deg_upshift', 'parameters', 'parameters_adjusted', 'control', 'statvar.dat', 'prms_ic.out', 'prms.out']
If instead we assigned a path for the simulation_dir keyword argument and then called run, i.e.
>>> sim = Simulation(input_dir, 'path_simulation')
>>> sim.run()
the file structure for the PRMS simulation created by Simulation.run() would be:
path_simulation
├── inputs
│   ├── control
│   ├── data
│   └── parameters
└── outputs
    ├── data_3deg_upshift
    ├── parameters_adjusted
    ├── prms_ic.out
    ├── prms.out
    └── statvar.dat
Note
As shown in the last example, currently the Simulation.run routine only recognizes the data, parameters, and control files as PRMS inputs; all other files found in input_dir before and after normal completion of the PRMS simulation will be transferred to simulation_dir/outputs/.
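The sorting rule described in the note (three recognized input names, everything else treated as output) can be illustrated with the standard library alone. This is a sketch under stated assumptions, not the library's code: organize_simulation and INPUT_NAMES are hypothetical names, the match is assumed to be on exact file names, and files are copied here whereas the documentation says they are transferred.

```python
import os
import shutil
import tempfile

INPUT_NAMES = {"control", "data", "parameters"}  # assumed exact file names


def organize_simulation(input_dir, simulation_dir):
    """Copy recognized inputs to inputs/ and every other file to outputs/."""
    for sub in ("inputs", "outputs"):
        os.makedirs(os.path.join(simulation_dir, sub))
    for name in sorted(os.listdir(input_dir)):
        sub = "inputs" if name in INPUT_NAMES else "outputs"
        shutil.copy(os.path.join(input_dir, name),
                    os.path.join(simulation_dir, sub, name))


# Demonstrate on throwaway directories filled with empty placeholder files.
root = tempfile.mkdtemp()
input_dir = os.path.join(root, "lbcd")
os.makedirs(input_dir)
for name in ["control", "data", "parameters", "data_3deg_upshift", "statvar.dat"]:
    open(os.path.join(input_dir, name), "w").close()

simulation_dir = os.path.join(root, "path_simulation")
organize_simulation(input_dir, simulation_dir)
inputs = sorted(os.listdir(os.path.join(simulation_dir, "inputs")))
outputs = sorted(os.listdir(os.path.join(simulation_dir, "outputs")))
print(inputs, outputs)
```

The sketch makes the practical consequence visible: an alternative input such as data_3deg_upshift ends up under outputs, so it is worth keeping input_dir free of files that are not part of the run.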
SimulationSeries
-
class prms_python.SimulationSeries(simulations)
Series of simulations all to be run through a common interface.
Utilizes the multiprocessing.Pool class to parallelize the execution of a series of PRMS simulations. SimulationSeries also allows the user to define the PRMS executable command, which is set to “prms” by default. It is best to add the prms executable to your $PATH environment variable. Each simulation that is run through SimulationSeries will follow the strict file structure as defined by Simulation.run(). This class is particularly useful for creating new programmatic workflows not yet provided by PRMS-Python.
- Parameters
simulations (list or tuple) – list of Simulation objects to be run.
Example
Let's say you have already created a series of PRMS models by modifying the input climatic forcing data, e.g. you have 100 data files and you want to run each using the same control and parameters file. For simplicity, let's say there is a directory that contains all 100 data files, e.g. data1, data2, … or whatever they are named, and nothing else. This example also assumes that you want each simulation to be run and stored in directories named after the data files as shown.
>>> import os
>>> data_dir = 'dir_that_contains_all_data_files'
>>> params = Parameters('path_to_parameter_file')
>>> control_path = 'path_to_control'
>>> # a list comprehension to make multiple simulations with
>>> # different data files, alternatively you could use a for loop
>>> # os.listdir returns bare file names, so join them with data_dir
>>> sims = [
...     Simulation.from_data(
...         Data(os.path.join(data_dir, data_file)),
...         params,
...         control_path,
...         simulation_dir='sim_{}'.format(data_file)
...     )
...     for data_file in os.listdir(data_dir)
... ]
Next we can use SimulationSeries to run all of these simulations in parallel. For example we may use 8 logical cores on a common desktop computer.
>>> sim_series = SimulationSeries(sims)
>>> sim_series.run(nproc=8)
The SimulationSeries.run() method will run all 100 simulations, where chunks of 8 at a time will be run in parallel. Inputs and outputs of each simulation will be sent to each simulation’s simulation_dir following the file structure of Simulation.run().
Note
The Simulation and SimulationSeries classes are low-level in that they alone do not create metadata for PRMS simulation scenarios. In other words they do not produce any additional files that may help the user know what differs between individual simulations.
-
outputs_iter()
Return a generator of dictionaries with the path to the simulation_dir as well as paths to the statvar.dat output file and the data and parameters input files used in the simulation.
- Yields
dict – dictionary of paths to simulation directory, input, and output files.
Example
>>> ser = SimulationSeries(simulations)
>>> ser.run()
>>> g = ser.outputs_iter()
Would return something like
>>> print(next(g))
{
    'simulation_dir': 'path/to/sim/',
    'statvar': 'path/to/statvar',
    'data': 'path/to/data',
    'parameters': 'path/to/parameters'
}
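For readers unfamiliar with generators, the following stand-alone sketch shows the general shape of such an iterator. It is not the library's implementation: the free function below only mimics the method's output keys, and the inputs/ and outputs/ subdirectory names are assumed from the file structure documented under Simulation.run().

```python
import os


def outputs_iter(simulation_dirs):
    """Yield one dictionary of paths per simulation directory."""
    for sim_dir in simulation_dirs:
        yield {
            "simulation_dir": sim_dir,
            "statvar": os.path.join(sim_dir, "outputs", "statvar.dat"),
            "data": os.path.join(sim_dir, "inputs", "data"),
            "parameters": os.path.join(sim_dir, "inputs", "parameters"),
        }


g = outputs_iter(["sim_data1", "sim_data2"])
first = next(g)  # generators are consumed lazily, one item at a time
print(first["simulation_dir"])
```

Because the generator is lazy, looping over it with a for statement visits each simulation once without building the full list of dictionaries in memory.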
-
run(prms_exec='prms', nproc=None)
Method to run multiple Simulation objects in parallel.
- Keyword Arguments
Example
see SimulationSeries
Note
If nproc is not assigned, the default action is to use half of the available processors on the machine using the Python multiprocessing module.
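The default described in the note can be approximated as follows. This is a hedged sketch: resolve_nproc is a hypothetical helper, and the integer division and the floor of one worker are assumptions about rounding that the documentation does not spell out.

```python
import multiprocessing


def resolve_nproc(nproc=None):
    """Return nproc if given, otherwise half of the detected processors."""
    if nproc is None:
        # Assumption: integer division, with a floor of one worker.
        return max(1, multiprocessing.cpu_count() // 2)
    return nproc


print(resolve_nproc())   # depends on the machine
print(resolve_nproc(8))  # an explicit value is passed through unchanged
```

Passing nproc explicitly is the safer choice on shared machines, where half of the detected processors may still be more than is available.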
Scenario
-
class prms_python.Scenario(base_dir, scenario_dir, title=None, description=None)
Container for the process in which one modifies input parameters then runs a simulation while tracking metadata.
Metadata includes a title and description, if provided, plus start/end datetime, and the names of parameters that were modified, including string representations of the Python modification functions that were applied to each parameter. The metadata file is in json format, so it is conveniently read as a Python dictionary.
- Parameters
base_dir (str) – path to directory that contains initial control, parameter, and data files to be used for Scenario. The parameters file in base_dir will not be modified; instead it will be copied to scenario_dir and then modified.
scenario_dir (str) – directory path to bundle inputs and outputs
title (str, optional) – title of Scenario, if given will be added to the Scenario.metadata attribute as well as the metadata.json file in scenario_dir written after calling the Scenario.build() and Scenario.run() methods.
description (str, optional) – description of Scenario, also added to Scenario.metadata in the same way as title.
-
metadata
a dictionary-like class in prms_python.scenario that tracks Scenario and ScenarioSeries information including user-defined parameter modifications and descriptions, and file structure.
- Type
scenario.ScenarioMetadata
Examples
This example is kept simple for clarity: here we adjust a single PRMS parameter, tmin_lapse, by using a single arbitrary mathematical function. We use the example PRMS model included with PRMS-Python for this example:
>>> import numpy as np
>>> input_dir = 'PRMS-Python/test/data/models/lbcd'
>>> scenario_directory = 'scenario_testing'
>>> title = 'Scenario example'
>>> desc = 'adjust tmin_lapse using sine wave function'
>>> # create Scenario instance
>>> scenario_obj = Scenario(
...     base_dir=input_dir,
...     scenario_dir=scenario_directory,
...     title=title,
...     description=desc
... )
Next we need to build a dictionary of the parameters to modify, in this case tmin_lapse. Here we use a vectorized sine function:
>>> # build the modification function and dictionary
>>> def a_func(arr):
...     return 4 + np.sin(np.linspace(0, 2*np.pi, num=len(arr)))
>>> # make dictionary with parameter names as keys and modification
>>> # function as values
>>> param_mod_dic = dict(tmin_lapse=a_func)
>>> scenario_obj.build(param_mod_funs=param_mod_dic)
After building a Scenario instance the input files are copied to scenario_dir, which was assigned ‘scenario_testing’:
scenario_testing
├── control
├── data
└── parameters
After calling build, the input files from input_dir were first copied to scenario_dir and then the functions in param_mod_dic were applied to the parameters named by the keys in param_mod_dic. To run the Scenario use the run method:
>>> scenario_obj.run()
Now the simulation is run and the metadata.json file is created. The final file structure will be similar to this:
scenario_testing
├── inputs
│   ├── control
│   ├── data
│   └── parameters
├── metadata.json
└── outputs
    ├── prms_ic.out
    ├── prms.out
    └── statvar.dat
Finally, here is what is contained in metadata.json for this example, which is also updated in Scenario.metadata:
>>> scenario_obj.metadata
{
    'title': 'Scenario example',
    'description': 'adjust tmin_lapse using sine wave function',
    'start_datetime': '2018-09-01T19:20:21.723003',
    'end_datetime': '2018-09-01T19:20:31.117004',
    'mod_funs_dict': {
        'tmin_lapse': 'def a_func(arr): return 4 + np.sin(np.linspace(0,2*np.pi,num=len(arr)))'
    }
}
As shown, the metadata retrieved the parameter modification function as a string representation of the exact Python function(s) used for modifying the user-defined parameter(s).
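Since the metadata file is plain JSON, it can be inspected later with the standard json module, independently of PRMS-Python. The sketch below writes a small stand-in metadata.json to a temporary directory so that it is self-contained; the file contents and the function string are placeholders, not output from a real run.

```python
import json
import os
import tempfile

# Write a small stand-in metadata.json so the example is self-contained.
scenario_dir = tempfile.mkdtemp()
metadata_path = os.path.join(scenario_dir, "metadata.json")
example = {
    "title": "Scenario example",
    "description": "adjust tmin_lapse using sine wave function",
    "mod_funs_dict": {"tmin_lapse": "def f(arr): return arr"},  # placeholder
}
with open(metadata_path, "w") as f:
    json.dump(example, f, indent=4)

# Reading it back yields an ordinary Python dictionary.
with open(metadata_path) as f:
    metadata = json.load(f)

modified = sorted(metadata["mod_funs_dict"])  # names of modified parameters
print(metadata["title"], modified)
```

The keys of mod_funs_dict give the modified parameter names, and each value is the source text of the function that was applied.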
Note
The main difference between Scenario and ScenarioSeries is that Scenario is designed for modifying one or more parameters of a single parameters file, whereas ScenarioSeries is designed for modifying and tracking the modification of one or more parameters in multiple PRMS parameters files, therefore resulting in multiple PRMS simulations.
-
build(param_mod_funs=None)
Take a user-defined dictionary with parameter names as keys and Python functions as values, copy the original input files as given when initializing a Scenario instance to the scenario_dir, then apply the functions in the user-defined dictionary to the parameters there. The build method must be called before running the Scenario (calling Scenario.run()).
- Keyword Arguments
param_mod_funs (dict) – dictionary with parameter names as keys and Python functions as values to apply to the names (key)
- Returns
None
Example
see Scenario for a full example.
Note
If the scenario_dir that was assigned for the current instance already exists, it will be overwritten when build is invoked.
-
run(prms_exec='prms')
Run the PRMS simulation for a built Scenario instance.
- Keyword Arguments
prms_exec (str) – name of PRMS executable on $PATH or path to executable
- Returns
None
Examples
see Scenario for a full example
- Raises
RuntimeError – if the Scenario.build() method has not yet been called.
ScenarioSeries
-
class prms_python.ScenarioSeries(base_dir, scenarios_dir, title=None, description=None)
Create and manage a series of model runs where parameters are modified.
First initialize the series with an optional title and description. Then, to build the series, the user provides a list of dictionaries with parameter-function key-value pairs, and optionally a title and description for each dictionary defining the individual scenario.
The ScenarioSeries’ build method creates a file structure under the series directory (scenarios_dir) where each subdirectory is named with a uuid, which can later be matched to its title using the metadata in scenarios_dir/series_metadata.json (see json). In the future we may add a way for the user to access the results of the scenario simulations directly through the ScenarioSeries instance, but for now the results are written to disk. Therefore each scenario’s title metadata can be used to refer to which parameters were modified and how, for post-processing and analysis. One could also use the description metadata for this purpose.
- Parameters
- Keyword Arguments
-
metadata
dictionary with title, description, and UUID map dictionary for individual Scenario output directories; the UUID dictionary (uuid_title_map) is left empty until calling ScenarioSeries.build().
- Type
-
scenarios
empty list that will be filled with Scenario objects after defining them by calling ScenarioSeries.build().
- Type
There are three steps to both Scenario and ScenarioSeries. First we initialize the object:
>>> series = ScenarioSeries(
...     base_dir='dir_with_input_files',
...     scenarios_dir='dir_to_run_series',
...     title='title_for_group_of_scenarios',
...     description='description_for_scenarios'
... )
The next step is to “build” the ScenarioSeries by calling the ScenarioSeries.build() method, which defines which parameters to modify, how to modify them, and then performs the modification, which readies the series to be “run” (the last step). See the ScenarioSeries.build() method for the next step example.
Also see Scenario & ScenarioSeries for a full example.
-
build(scenarios_list)
Build the scenarios from a list of scenario definitions in dictionary form.
Each element of scenarios_list can have any number of parameters as keys with a function for each value. The other two acceptable keys are title and description, which will be passed on to each individual Scenario’s metadata in series_metadata.json for future lookups. The build method also creates a file structure that uses UUID values as individual Scenario subdirectories as shown below.
- Parameters
scenarios_list (list) – list of dictionaries with key-value pairs being parameter-function definition pairs or title-title string or description-description string.
- Returns
None
Examples
Following the initialization of a ScenarioSeries instance as shown in the example docstring there, we “build” the series by defining a list of parameter-name-keyed, function-valued dictionaries. This example uses arbitrary functions on two PRMS parameters, snowinfil_max and snow_adj:
>>> def _function1(x): # Note, function must start with underscore
...     return x * 0.5
>>> def _function2(x):
...     return x + 5
>>> dic1 = {'snowinfil_max': _function1, 'title': 'scenario1'}
>>> dic2 = {'snowinfil_max': _function2,
...         'snow_adj': _function1,
...         'title': 'scenario2',
...         'description': 'we adjusted two snow parameters'
...        }
>>> example_scenario_list = [dic1, dic2]
>>> # now we can build the series
>>> series.build(example_scenario_list)
Continuing from the ScenarioSeries example, the file structure created by the build method is as follows:
dir_to_run_series
├── 670d6352-2852-400a-997e-7b12ba34f0b0
│   ├── control
│   ├── data
│   └── parameters
├── base_inputs
│   ├── control
│   ├── data
│   └── parameters
├── ee9526a9-8fe6-4e88-b357-7dfd7111208a
│   ├── control
│   ├── data
│   └── parameters
└── series_metadata.json
As shown, the build method has copied the original inputs from the base_dir given on initialization of ScenarioSeries to a new subdirectory of the scenarios_dir; it also applied the modifications to the parameters for both scenarios above and moved the input files to their respective directories. At this stage the metadata will not have updated the UUID map dictionary to each scenario's subdirectory because they have not yet been run. See the ScenarioSeries.run() method for further explanation including the final file structure and metadata file contents.
-
classmethod from_parameters_iter(base_directory, parameters_iter, title=None, description=None)
Alternative way to initialize and build a ScenarioSeries in one step.
Create and build a ScenarioSeries by including the list of param-keyed, function-valued dictionaries (parameters_iter) that is otherwise passed to ScenarioSeries.build().
- Parameters
base_directory (str) – directory that contains model input files
parameters_iter (list of dicts) – list of dictionaries for each Scenario as described in Scenario and ScenarioSeries.build().
title (str) – title for group of scenarios
description (str) – description for group of scenarios
- Returns
None
-
run(prms_exec='prms', nproc=None)
Run a “built” ScenarioSeries and make final updates to file structure and metadata.
- Keyword Arguments
prms_exec (str) – name of PRMS executable on $PATH or path to executable. Default = ‘prms’
nproc (int or None) – number of processors available to parallelize PRMS simulations; if None (default) then use half of what the multiprocessing module detects on the machine.
- Returns
None
Examples
This example starts where the example ends in ScenarioSeries.build(). Calling run will run the models for all scenarios and then update the file structure as well as create individual Scenario metadata files, as such:
dir_to_run_series
├── 5498c21d-d064-45f4-9912-044734fd230e
│   ├── inputs
│   │   ├── control
│   │   ├── data
│   │   └── parameters
│   ├── metadata.json
│   └── outputs
│       ├── prms_ic.out
│       ├── prms.out
│       └── statvar.dat
├── 9d28ec5a-b570-4abb-8000-8dac113cbed3
│   ├── inputs
│   │   ├── control
│   │   ├── data
│   │   └── parameters
│   ├── metadata.json
│   └── outputs
│       ├── prms_ic.out
│       ├── prms.out
│       └── statvar.dat
├── base_inputs
│   ├── control
│   ├── data
│   └── parameters
└── series_metadata.json
As we can see, the file structure follows the combined structures as defined by Simulation and Scenario. The content of the top-level metadata file series_metadata.json is as such:
{
    "title": "title_for_group_of_scenarios",
    "description": "description_for_scenarios",
    "uuid_title_map": {
        "5498c21d-d064-45f4-9912-044734fd230e": "scenario1",
        "9d28ec5a-b570-4abb-8000-8dac113cbed3": "scenario2"
    }
}
Therefore one can use the json file to map between UUIDs and individual scenario titles. The json files are read as a Python dictionary, which makes them particularly convenient. The contents of an individual scenario's metadata.json file include a string representation of the function(s) that were applied to the parameter(s):
{
    "description": null,
    "end_datetime": "2018-09-03T00:00:40.793817",
    "mod_funs_dict": {
        "snowinfil_max": "def _function1(x): return x * 0.5"
    },
    "start_datetime": "2018-09-03T00:00:30.421353",
    "title": "scenario1"
}
Note
As shown, it is important to give appropriate scenario titles when building a ScenarioSeries dictionary in order to later understand how parameters were modified in each scenario. If not, one would have to rely on the individual metadata.json files in each scenario directory, which may be more cumbersome.
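Looking up a scenario's directory from its title amounts to inverting uuid_title_map. The sketch below uses the standard json module on the series_metadata.json content shown above, embedded as a string so that it runs on its own. It assumes titles are unique; duplicate titles would silently overwrite one another in the inverted dictionary.

```python
import json

series_metadata = json.loads("""
{
    "title": "title_for_group_of_scenarios",
    "description": "description_for_scenarios",
    "uuid_title_map": {
        "5498c21d-d064-45f4-9912-044734fd230e": "scenario1",
        "9d28ec5a-b570-4abb-8000-8dac113cbed3": "scenario2"
    }
}
""")

# Invert the map so a scenario title leads to its UUID-named directory.
title_to_uuid = {
    title: uid for uid, title in series_metadata["uuid_title_map"].items()
}
print(title_to_uuid["scenario2"])
```

Joining the resulting UUID with the series directory gives the path that holds that scenario's inputs, outputs, and metadata.json.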
Optimizer
-
class prms_python.Optimizer(parameters, data, control_file, working_dir, title, description=None)
Container for PRMS parameter optimization and related routines.
Currently the monte_carlo method provides random parameter resampling routines using uniform and normal random variables.
Example
>>> from prms_python import Data, Optimizer, Parameters
>>> params = Parameters('path/to/parameters')
>>> data = Data('path/to/data')
>>> control = 'path/to/control'
>>> work_directory = 'path/to/create/simulations'
>>> optr = Optimizer(
...     params,
...     data,
...     control,
...     work_directory,
...     title='the title',
...     description='desc')
>>> measured = 'path/to/measured/csv'
>>> statvar_name = 'basin_cfs' # or any other valid statvar
>>> params_to_resample = ['dday_intcp', 'dday_slope'] # list of params
>>> stage = 'ddsolrad' # stage is a required argument; this label is only an example
>>> optr.monte_carlo(measured, params_to_resample, statvar_name, stage)
-
monte_carlo(reference_path, param_names, statvar_name, stage, n_sims=10, method='uniform', mu_factor=1, noise_factor=0.1, nproc=None)
The monte_carlo method of Optimizer applies random resampling techniques to a set of PRMS parameters and executes and manages the corresponding simulations.
- Parameters
- Keyword Arguments
n_sims (int) – number of simulations to conduct for parameter optimization/uncertainty analysis.
method (str) – resampling method for parameters (normal or uniform)
mu_factor (float) – coefficient to scale the mean of the parameter(s) to resample from when using the normal distribution, i.e. a value of 1.5 will sample from a normal random variable with mean 50% higher than the original parameter mean
noise_factor (float) – scales the variance of noise to add to parameter values when using a normal random variable (method=’normal’)
nproc (int) – number of processors available to run PRMS simulations
- Returns
None
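To make the roles of method, mu_factor, and noise_factor more concrete, here is one possible reading of the two resampling modes using only the standard library. It is an illustrative sketch, not the library's algorithm: the helper names are hypothetical, the choice of noise spread (noise_factor times the absolute parameter mean) and the uniform bounds (original minimum and maximum) are assumptions, and the actual sampling rules should be checked in the Optimizer source.

```python
import random
import statistics


def resample_normal(values, mu_factor=1.0, noise_factor=0.1, rng=random):
    """Draw new values from a normal distribution centred on a scaled mean."""
    mean = statistics.mean(values)
    # Assumption: noise spread is noise_factor times the absolute mean.
    sigma = noise_factor * abs(mean)
    return [rng.gauss(mu_factor * mean, sigma) for _ in values]


def resample_uniform(values, rng=random):
    """Draw new values uniformly between the original minimum and maximum."""
    low, high = min(values), max(values)
    return [rng.uniform(low, high) for _ in values]


rng = random.Random(42)  # seeded so repeated runs give the same draws
original = [0.2, 0.3, 0.4, 0.5]
normal_draw = resample_normal(original, mu_factor=1.5, rng=rng)
uniform_draw = resample_uniform(original, rng=rng)
print(normal_draw)
print(uniform_draw)
```

With mu_factor=1.5 the normal draws centre on 1.5 times the original mean (0.525 here), matching the "mean 50% higher" description of the keyword argument.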
-
plot_optimization(freq='daily', method='time_series', plot_vars='both', plot_1to1=True, return_fig=False, n_plots=4)
Basic plotting of current optimization results with limited options. Plots measured, original simulated, and optimization simulated variables. Not recommended for plotting results when n_sims is very large; instead use options from an OptimizationResult object, or employ a user-defined method using the result data.
- Keyword Arguments
freq (str) – frequency of time series plots, value can be ‘daily’ or ‘monthly’ for solar radiation
method (str) – ‘time_series’ for a time series subplot of each simulation alongside measured radiation. Another choice is ‘correlation’, which plots each measured daily solar radiation value versus the corresponding simulated variable as subplots, one for each simulation in the optimization, with coefficients of determination, i.e. the square of the Pearson correlation coefficient.
plot_vars (str) – what to plot alongside simulated srad:
‘meas’: plot simulated along with measured swrad
‘orig’: plot simulated along with the original simulated swrad
‘both’: plot simulated, with original simulation and measured
plot_1to1 (bool) – if True plot one to one line on correlation scatter plot, otherwise exclude.
return_fig (bool) – flag whether to return matplotlib figure
- Returns
f – if kwarg return_fig=True, a copy of the generated figure is returned to the user.
- Return type
matplotlib.figure.Figure
-
OptimizationResult¶
-
class
prms_python.
OptimizationResult
(working_dir, stage)[source]¶ The
OptimizationResult
object serves to collect and manage output from an Optimizer
method. Upon initialization with a given optimization stage that was used when running the Optimizer method, e.g. monte_carlo
, the class gathers all JSON metadata that was produced for the given stage. The OptimizationResult
has three main user methods: first result_table
which returns the top n simulations according to four model performance metrics (Nash-Sutcliffe efficiency (NSE), root-mean-squared error (RMSE), percent bias (PBIAS), and the coefficient of determination (COEF_DET)) as calculated against measured data. For example the table may look like:
>>> ddsolrad_res = OptimizationResult(work_directory, stage=stage)
>>> top10 = ddsolrad_res.result_table(freq='monthly',top_n=10)
>>> top10
======================== ======== ======= ========= ========
ddsolrad parameters      NSE      RMSE    PBIAS     COEF_DET
======================== ======== ======= ========= ========
orig_params              0.956267 39.4725 -0.885715 0.963116
tmax_index_54.2224631748 0.921626 47.6092 -0.849256 0.94402
tmax_index_44.8823940703 0.879965 58.9194 5.79603   0.922021
tmax_index_47.6835387480 0.764133 82.5918 -4.78896  0.837582
======================== ======== ======= ========= ========
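The four metrics in the table can be sketched with NumPy as follows. This is an illustration, not the package's implementation: the helper names `performance_metrics` and `rank_simulations` are hypothetical, the PBIAS sign convention (positive when the model overestimates) is an assumption, and ranking by NSE is only inferred from the ordering of the example table.

```python
import numpy as np


def performance_metrics(observed, modeled):
    """Return NSE, RMSE, PBIAS and COEF_DET for two equally long series."""
    obs = np.asarray(observed, dtype=float)
    sim = np.asarray(modeled, dtype=float)
    resid = sim - obs
    return {
        # 1 minus the ratio of residual variance to observed variance
        'NSE': 1.0 - np.sum(resid ** 2) / np.sum((obs - obs.mean()) ** 2),
        'RMSE': float(np.sqrt(np.mean(resid ** 2))),
        # ASSUMPTION: positive PBIAS means the model overestimates
        'PBIAS': 100.0 * np.sum(resid) / np.sum(obs),
        # square of the Pearson correlation coefficient
        'COEF_DET': np.corrcoef(obs, sim)[0, 1] ** 2,
    }


def rank_simulations(observed, simulations, top_n=10):
    """Rank a {name: modeled series} mapping, best NSE first (assumed order)."""
    rows = [(name, performance_metrics(observed, sim))
            for name, sim in simulations.items()]
    rows.sort(key=lambda row: row[1]['NSE'], reverse=True)
    return rows[:top_n]


# Made-up numbers, for illustration only
measured = [1.0, 2.0, 3.0, 4.0]
candidates = {'sim_a': [2.0, 3.0, 4.0, 5.0], 'sim_b': [1.0, 2.0, 3.0, 4.0]}
for name, metrics in rank_simulations(measured, candidates, top_n=2):
    print(name, metrics)
```

An NSE of 1 indicates a perfect match, while a value at or below 0 indicates the model performs no better than the observed mean.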
Second, the
get_top_ranked_sims
method returns a dictionary that maps key information about the top n ranked simulations; an example returned dictionary may look like:
>>> {
    'dir_name' : ['pathToSim1', 'pathToSim2'],
    'param_path' : ['pathToSim1/input/parameters', 'pathToSim2/input/parameters'],
    'statvar_path' : ['pathToSim1/output/statvar.dat', 'pathToSim2/output/statvar.dat'],
    'params_adjusted' : [[param_names_sim1], [param_names_sim2]]
    }
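The example dictionary suggests that its lists are parallel, so that position i in each list describes the same simulation. Assuming that holds, a short sketch can regroup the values into one record per simulation. The dictionary below is typed in by hand to mirror the example; the `'tmax_index'` entries are placeholder parameter names.

```python
# Hand-written stand-in for the dictionary shown in the example above
top_sims = {
    'dir_name': ['pathToSim1', 'pathToSim2'],
    'param_path': ['pathToSim1/input/parameters',
                   'pathToSim2/input/parameters'],
    'statvar_path': ['pathToSim1/output/statvar.dat',
                     'pathToSim2/output/statvar.dat'],
    'params_adjusted': [['tmax_index'], ['tmax_index']],  # placeholder names
}

# ASSUMPTION: the lists are parallel, so zip() pairs up values that belong
# to the same simulation
records = [dict(zip(top_sims.keys(), values))
           for values in zip(*top_sims.values())]

for rank, record in enumerate(records, start=1):
    print(rank, record['dir_name'], record['statvar_path'])
```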
The third method of
OptimizationResult
is archive
which opens all parameter and statvar files from each simulation of the given stage and archives the parameters that were modified, their modified values, and the statistical variable (PRMS time series output) that is associated with the optimization stage. Other Optimizer
simulation metadata is also gathered, and new JSON metadata containing only this information is written to a newly created “archived” subdirectory within the same directory in which the Optimizer
routine managed simulations. The OptimizationResult.archive
method then recursively deletes the simulation data for each simulation of the given stage.
-
archive
(remove_sims=True, remove_meta=False, metric_freq='daily')[source]¶ Create an archive directory to hold JSON files that contain information on adjusted parameters, model output, and performance metrics for each Optimizer simulation of the OptimizationResult.stage in the OptimizationResult.working_dir.
- Keyword Arguments
remove_sims (bool) – If True recursively delete all folders and files associated with original simulations of the OptimizationResult.stage in the OptimizationResult.working_dir, if False do not delete simulations.
remove_meta (bool) – Whether to delete original Optimizer JSON metadata files, default is False.
metric_freq (str) – Frequency of output metric computation for recording of model performance. Can be ‘daily’ (default) or ‘monthly’. Note, other results can be computed later with archived results.
- Returns
None
-
Select helper functions¶
load_data¶
-
prms_python.
load_data
(data_file)[source]¶ Read the data file and load it into a datetime-indexed pandas.DataFrame object.
- Parameters
data_file (str) – data file path
- Returns
- Pandas dataframe of input time series data
from data file with datetime index
- Return type
df (pandas.DataFrame)
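To make the idea concrete, here is a rough sketch of how a PRMS-style data file could be parsed into a datetime-indexed DataFrame. It is not the package's code. It assumes the usual layout of a title line, optional `//` comment lines, one `name count` line per variable, a `####` delimiter, and then rows of `year month day hour minute second` followed by the values. The function name `load_data_sketch`, the column naming, and the sample text are made up for illustration.

```python
import io

import numpy as np
import pandas as pd


def load_data_sketch(handle, na_rep=-999):
    """Parse an open PRMS-style data file into a date-indexed DataFrame."""
    names = []
    # Header: collect "name count" lines until the '####' delimiter
    for line in handle:
        line = line.strip()
        if line.startswith('####'):
            break
        parts = line.split()
        if line.startswith('//') or len(parts) != 2 or not parts[1].isdigit():
            continue  # title, comment, or blank line
        name, count = parts[0], int(parts[1])
        if count == 1:
            names.append(name)
        else:
            names.extend('{} {}'.format(name, i) for i in range(1, count + 1))
    # Body: six date/time columns, then one value per declared column
    dates, values = [], []
    for line in handle:
        parts = line.split()
        if not parts:
            continue
        year, month, day = (int(p) for p in parts[:3])
        dates.append(pd.Timestamp(year=year, month=month, day=day))
        values.append([float(p) for p in parts[6:]])
    df = pd.DataFrame(values, index=pd.DatetimeIndex(dates, name='date'),
                      columns=names)
    return df.replace(na_rep, np.nan)  # missing-value flag becomes NaN


# Made-up sample file contents
sample = io.StringIO(
    "Example data file\n"
    "// a comment line\n"
    "runoff 2\n"
    "precip 1\n"
    "########\n"
    "1996 12 27 0 0 0 0.54 1.6 0.0\n"
    "1996 12 28 0 0 0 0.65 -999 0.0\n"
)
df = load_data_sketch(sample)
print(df)
```

The sketch ignores the hour, minute, and second columns, which is only appropriate for daily data.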
modify_params¶
-
prms_python.
modify_params
(params_in, params_out, param_mods=None)[source]¶ Given a parameter file in and a dictionary of param_mods, write modified parameters to params_out.
- Parameters
- Keyword Arguments
param_mods (dict) – param name-keyed, param modification function-valued
- Returns
None
Example
Below we modify the monthly jh_coef parameter by increasing it 10% for every month,
>>> params_in = 'models/lbcd/parameters'
>>> params_out = 'scenarios/jh_coef_1.1/params'
>>> scale_10pct = lambda x: x * 1.1
>>> modify_params(params_in, params_out, {'jh_coef': scale_10pct})
So param_mods is a dictionary whose keys are parameter names and whose values are functions that each operate on a single value. Currently we only accept functions that operate without reference to any other parameters. The function will be applied to every cell, month, or cascade routing rule for which the parameter is defined.
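The element-wise behaviour described for param_mods can be sketched without any PRMS files. The snippet below uses a plain dictionary of NumPy arrays as a stand-in for a parameter file. The helper `apply_param_mods`, the parameter values, and the `tmax_index` entry are illustrative assumptions and are not part of prms_python.

```python
import numpy as np


def apply_param_mods(params, param_mods=None):
    """Return a copy of params with each modification function applied
    to every element of the parameter it is keyed by."""
    param_mods = param_mods or {}
    out = {name: np.array(values, dtype=float)
           for name, values in params.items()}
    for name, func in param_mods.items():
        if name not in out:
            raise KeyError(name)
        # np.vectorize calls func once per element, matching the
        # "operates on a single value" contract described above
        out[name] = np.vectorize(func, otypes=[float])(out[name])
    return out


# Stand-in parameters: twelve monthly jh_coef values and one scalar-like entry
params = {'jh_coef': [0.01] * 12, 'tmax_index': [50.0]}
scale_10pct = lambda x: x * 1.1
modified = apply_param_mods(params, {'jh_coef': scale_10pct})
print(modified['jh_coef'])
```

Because each function sees one value at a time, a modification cannot depend on another parameter, which matches the restriction stated above.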
resample_param¶
-
prms_python.optimizer.
resample_param
(params, param_name, how='uniform', mu_factor=1, noise_factor=0.1)[source]¶ Resample PRMS parameter by shifting all values by a constant that is taken from a uniform distribution, where the range of the uniform values is equal to the difference between the min and max of the allowable range. The parameter min and max are set in
Optimizer.param_ranges
. If the resampling method (how
argument) is set to ‘normal’, randomly sample a normal distribution with mean = mean(parameter) × mu_factor
and sigma = parameter allowable range × noise_factor
. If a parameter has array length <= 366 then individual parameter values are resampled; otherwise all parameter values are resampled at once, e.g. by taking a single random value from the uniform distribution. If they are resampled all at once using the normal method, then the original values are scaled by mu_factor and a normal random variable is added, with mean = 0 and std dev = parameter range × noise_factor.
- Parameters
params (
prms_python.Parameters
) – Parameters
object
param_name (str) – name of PRMS parameter to resample
- Keyword Arguments
how (str) – distribution to resample parameters from in the case that each parameter element can be resampled (len <= 366). Currently works for uniform and normal distributions.
noise_factor (float) – factor to multiply the parameter range by; the result is used as the standard deviation of the normal random variable that adds element-wise noise, i.e. a higher noise_factor results in higher variance. Must be > 0.
- Returns
ndarray of param after resampling
- Return type
ret (
numpy.ndarray
)
- Raises
KeyError – if
param_name
is not a valid parameter name
ValueError – if the parameter range has not been set in
Optimizer.param_ranges
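The resampling rules described above can be sketched with NumPy. This is one reading of the description and not the library's source. The function name `resample_sketch`, the explicit `p_min`/`p_max` arguments (standing in for the range held in Optimizer.param_ranges), the zero-centred uniform shift, the single noise draw for long arrays, and the final clipping to the allowable range are all assumptions.

```python
import numpy as np


def resample_sketch(values, p_min, p_max, how='uniform', mu_factor=1,
                    noise_factor=0.1, rng=None):
    """Resample parameter values inside the allowable range [p_min, p_max]."""
    if how not in ('uniform', 'normal'):
        raise ValueError("how must be 'uniform' or 'normal'")
    rng = np.random.default_rng() if rng is None else rng
    values = np.asarray(values, dtype=float)
    p_range = p_max - p_min
    sigma = p_range * noise_factor
    if values.size <= 366:
        # Short arrays (e.g. monthly values): draw every element separately
        if how == 'uniform':
            out = rng.uniform(p_min, p_max, size=values.shape)
        else:
            out = rng.normal(values.mean() * mu_factor, sigma,
                             size=values.shape)
    elif how == 'uniform':
        # Long arrays: one constant shift for all values
        # ASSUMPTION: the shift interval is centred on zero, width = p_range
        out = values + rng.uniform(-p_range / 2, p_range / 2)
    else:
        # Long arrays, normal method: scale by mu_factor, add one noise draw
        out = values * mu_factor + rng.normal(0.0, sigma)
    # ASSUMPTION: results are clipped back into the allowable range
    return np.clip(out, p_min, p_max)


rng = np.random.default_rng(0)
monthly = resample_sketch(np.full(12, 0.014), 0.005, 0.06, how='normal',
                          mu_factor=1.5, rng=rng)
print(monthly)
```

Passing a seeded generator, as in the usage lines, keeps a run reproducible, which can help when comparing optimization stages.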
nash_sutcliffe¶
-
prms_python.
nash_sutcliffe
(observed, modeled)[source]¶ Calculates the Nash-Sutcliffe goodness-of-fit.
- Parameters
observed (numpy.ndarray) – historic observational data
modeled (numpy.ndarray) – model output with matching time index