Tutorial¶
Note
This tutorial is currently under development.
Overview¶
Here we provide a brief introduction to the elements of this tutorial. The tutorial
explains how to set up pygcam and manage the GCAM workflow. The steps involved
in using pygcam are explained briefly here, and in more
detail in the sections that follow.
1. Install pygcam¶
Before using pygcam, you must install a Python 2.7 environment and then
install the pygcam package. See the Installation page for details.
Windows users should also see Using pygcam under Windows.
2. Configure pygcam¶
The pygcam scripts and libraries rely on a configuration file to:
- define the location of essential and optional files,
- allow the user to set defaults for many command-line arguments to scripts, and
- define both global default and project-specific values for all parameters.
See Initial Configuration for how to set up the configuration file for the first time, and the Configuration System page for detailed information about the configuration file.
3. Define the project¶
Many of the features of the GCAM tool (gt) script can be used directly without
setting up a project definition. However, the full workflow-management capabilities
of pygcam require an XML-based project definition file that describes:
- one or more projects that may have different workflow steps
- one or more “scenario groups” that define a baseline and related policy scenarios
- a set of steps to perform (e.g., run GCAM, query the database, compute differences between scenario results, plot figures of scenario results and differences from baselines)
- data required by some of the steps
See project.xml for a detailed description of the file’s XML schema, and more information later in this document. See GCAM XML-Setup and scenarios.xml for information regarding how to define scenarios in XML or in Python.
Note
The new sub-command of the GCAM tool (gt) script can be used to
create the initial structure and files required for a new project and, optionally,
insert a section for the new project in the $HOME/.pygcam.cfg configuration file.
4. Set up the project files¶
The setup sub-command of the GCAM tool (gt) script provides support for modifying GCAM’s XML data files and configuration file according to the needs of your project. See GCAM XML-Setup for details.
This is the only step of the pygcam workflow process that requires Python programming. Work is underway to allow simple projects to be defined without requiring Python code.
5. Run the project¶
Project workflow is managed using the run sub-command of the GCAM tool (gt) script, which reads the project.xml file to understand the project setup, and offers numerous options allowing you to choose which project, scenario group, or scenarios to operate on and which steps to run.
Initial configuration¶
The pygcam package uses a configuration file called .pygcam.cfg, stored in
the user’s home directory, i.e., $HOME/.pygcam.cfg. When gt runs, it
checks whether this file exists. If the file is not found, it is created with all
available configuration parameters shown in comments (i.e., lines starting with ‘#’)
that explain their purpose and show their default values. To uncomment a line,
simply remove the leading ‘#’ character.
Edit the configuration file with any editor capable of working with plain text.
(Word-processors such as Word introduce formatting information into the file which
renders it unusable by pygcam.) You can use the command gt config -e to
invoke a system-appropriate editor on the configuration file. See the Configuration System
page for details.
Configuration file sections¶
The configuration file is divided into sections, each introduced by a name within square brackets. All variable declarations that follow a section declaration, up to the next section declaration (if any), belong to the declared section. You can declare a section multiple times to add new values to the section. (See Sample Configuration File, below.)
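As a rough illustration of how repeated section declarations accumulate, the sketch below uses the configparser module from the Python 3 standard library rather than pygcam's own configuration code. The variable names My.FirstValue and My.SecondValue are hypothetical, and strict=False is passed because configparser otherwise rejects a section that is repeated within a single source.

```python
import configparser

# Hypothetical file contents: the [paper1] section is declared twice.
TEXT = """
[paper1]
My.FirstValue = 1

[DEFAULT]
GCAM.LogLevel = INFO

[paper1]
My.SecondValue = 2
"""

# strict=False lets a repeated section add to the earlier declaration
# instead of raising DuplicateSectionError.
parser = configparser.ConfigParser(strict=False)
parser.read_string(TEXT)

# Both variables end up in the same [paper1] section.
first = parser.get("paper1", "My.FirstValue")
second = parser.get("paper1", "My.SecondValue")
print(first, second)
```

Whether pygcam itself applies exactly these parsing rules is an assumption here; the sketch only shows the general behavior of INI-style section declarations.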
Project sections¶
Each project should have its own section. For example, to set up a project called, say,
“paper1”, I would create the section [paper1]. Following this, I would define variables
particular to this project, e.g., where to find the files defining scenarios, queries,
and so on.
Default section¶
Default values are defined in the [DEFAULT] section. When pygcam requests the value
of a variable from a project section, the default value is returned if the variable is not
defined in the project section. Variables that you want to set uniformly for all of your
projects can be defined in the [DEFAULT] section.
All pre-defined pygcam variables are defined in the [DEFAULT] section,
allowing them to be overridden on a project-by-project basis.
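The fallback from a project section to [DEFAULT] can be sketched with the configparser module from the Python 3 standard library. This is not pygcam's own lookup code; it is a minimal sketch that assumes the same INI-style semantics, and the value given for GCAM.MI.Dir is a placeholder.

```python
import configparser

# Placeholder configuration: one value is overridden in [paper1], one is not.
TEXT = """
[DEFAULT]
GCAM.LogLevel = INFO
GCAM.MI.Dir = /opt/ModelInterface

[paper1]
GCAM.LogLevel = DEBUG
"""

parser = configparser.ConfigParser()
parser.read_string(TEXT)

# Defined in the project section, so the project value is returned.
project_level = parser.get("paper1", "GCAM.LogLevel")

# Not defined in [paper1], so the lookup falls back to [DEFAULT].
mi_dir = parser.get("paper1", "GCAM.MI.Dir")

# Reading [DEFAULT] directly still returns the global default.
default_level = parser.get("DEFAULT", "GCAM.LogLevel")
print(project_level, mi_dir, default_level)
```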
Sample configuration file¶
Below is a sample configuration file for a project called paper1. By convention,
variables are named with a prefix identifying where they are defined. All variables
defined by pygcam begin with the prefix “GCAM.”, so if you create your own variables (e.g.,
to define values used in defining other variables), you should avoid this prefix
to prevent confusion. You can use any other prefix, or none at all.
    [DEFAULT]
    GCAM.DefaultProject = paper1
    GCAM.ProjectRoot = %(Home)s/gcamProjects
    GCAM.SandboxRoot = %(Home)s/ws
    GCAM.LogLevel = INFO
    GCAM.MI.LogFile = %(Home)s/tmp/mi.log
    GCAM.MI.Dir = /pic/projects/GCAM/ModelInterface
    GCAM.OtherBatchArgs = -A my_account
    GCAM.QueryDir = %(GCAM.ProjectDir)s/queries
    GCAM.QueryPath = %(GCAM.QueryDir)s
    GCAM.TextEditor = open -a emacs

    # Setup config files to not write extraneous files, some of which are very large
    GCAM.WriteDebugFile = False
    GCAM.WritePrices = False
    GCAM.WriteXmlOutputFile = False
    GCAM.WriteOutputCsv = False

    [paper1]
    GCAM.RewriteSetsFile = %(GCAM.ProjectDir)s/etc/rewriteSets.xml
    GCAM.ScenarioSetupFile = %(GCAM.ProjectDir)s/etc/scenarios.xml
    GCAM.LogLevel = DEBUG
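The %(Name)s references in the sample are expanded by substituting the value of the named variable. The sketch below reproduces that expansion with Python 3's standard configparser module on a subset of the sample. The values given for Home and GCAM.ProjectDir are assumptions made so that the excerpt is self-contained; the sample above does not define them.

```python
import configparser

TEXT = """
[DEFAULT]
# Assumed values, supplied here only so the excerpt stands alone.
Home = /home/user
GCAM.ProjectRoot = %(Home)s/gcamProjects
GCAM.ProjectDir = %(GCAM.ProjectRoot)s/paper1

# Taken from the sample configuration file.
GCAM.QueryDir = %(GCAM.ProjectDir)s/queries
GCAM.QueryPath = %(GCAM.QueryDir)s

[paper1]
GCAM.ScenarioSetupFile = %(GCAM.ProjectDir)s/etc/scenarios.xml
"""

parser = configparser.ConfigParser()
parser.read_string(TEXT)

# Each %(Name)s reference is replaced, recursively, by that variable's value.
query_path = parser.get("paper1", "GCAM.QueryPath")
setup_file = parser.get("paper1", "GCAM.ScenarioSetupFile")
print(query_path)
print(setup_file)
```

Because references are resolved when a value is read, changing a single variable such as GCAM.ProjectRoot changes every path derived from it.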
Running a GCAM experiment¶
The basic GCAM experiment consists of running a baseline scenario and one or more policy
scenarios that are compared to the baseline. In pygcam, the experiment is defined in
a project.xml file, the location of which is specified by the config parameter
GCAM.ProjectXmlFile, which defaults to %(GCAM.ProjectDir)s/etc/project.xml.
The project.xml file describes all the workflow steps required to set up, run, and analyze the scenarios. The entire workflow or selected steps can be run using the run sub-command of the GCAM tool (gt) script.
After you have created a project.xml file describing the scenarios, workflow steps,
and other parameters and data required by the workflow steps, and created a configuration
file to set appropriate defaults, you can run the entire analysis with a single command:
gt run
With no other options specified (as above), the default scenario group (identified in the project.xml file) of the default project (defined in your configuration file) will be run, starting with the scenario identified as the baseline, followed by all other policy scenarios. All defined workflow steps will be executed in the order defined, for all scenarios.
Of course, there are several options available to the GCAM tool (gt) command, including the ability to set the desired level of diagnostic output (the “log level”), and to run the command on a compute node on a cluster computing system.
The “run” sub-command also provides many options, including the ability to select which scenario group to run and limit which scenarios and steps to run (or not run).
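If you prefer to launch the workflow from a Python script, one possible approach is to call gt as an external command. The helper names gt_command and run_gt below are hypothetical and not part of pygcam. The sketch passes only the bare run sub-command shown above, since the available options are documented elsewhere.

```python
import shutil
import subprocess


def gt_command(*args):
    """Build the argument list for a gt invocation, e.g. gt_command("run")."""
    return ["gt", *args]


def run_gt(*args):
    """Run gt with the given arguments.

    Returns the exit status, or None if no gt executable is found on PATH.
    """
    if shutil.which("gt") is None:
        return None
    return subprocess.run(gt_command(*args)).returncode


# Equivalent to typing "gt run" at the command line; call
# run_gt("run") to actually execute it.
command = gt_command("run")
print(command)
```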
Customizing project steps¶
The generic workflow steps defined in the project.xml file may suffice for many projects. It is likely, however, that you will want to customize several other elements of the project file.
Queries¶
The queries identified in the project file (or in an external file) determine which results are extracted from the GCAM database for each run of the model, and thus determine which subsequent steps (computing differences, creating charts) can be performed. To plot results, you must first extract them from the database using a query.
Queries can be extracted on-the-fly from files used with ModelInterface by specifying
the location of the XML file in the configuration variable GCAM.QueryPath and
referencing the desired query by its defined “title”. (See the
query sub-command and the pygcam.query API documentation
for more information.)
Rewrite sets¶
Standard GCAM XML queries can define “rewrites” which modify the values of chosen data elements to allow them to be aggregated. For example, you can aggregate all values of CornAEZ01, CornAEZ02, ..., CornAEZ18 to be returned simply as “Corn”.
In pygcam, this idea is taken a step further by allowing you to define reusable,
named “rewrite sets” that can be applied on-the-fly to
queries named in the project file. For example, if you are working with a particular
regional aggregation, you can define this aggregation once in a rewrites.xml file
and reference the name of the rewrite set when specifying queries in project.xml.
See rewrite sets for more information.
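The effect of a named rewrite set can be sketched in plain Python. This is a conceptual model only, not pygcam's implementation or its XML format. The set name "crops", the helper apply_rewrite_set, and the sample values are all hypothetical, and summing the relabeled rows is an assumption about how the aggregation is performed.

```python
from collections import defaultdict

# Hypothetical named rewrite sets: each maps detailed names to an aggregate label.
REWRITE_SETS = {
    "crops": {f"CornAEZ{i:02d}": "Corn" for i in range(1, 19)},
}


def apply_rewrite_set(rows, set_name):
    """Relabel names listed in the rewrite set, then sum values sharing a label."""
    rewrites = REWRITE_SETS[set_name]
    totals = defaultdict(float)
    for name, value in rows:
        # Names not listed in the rewrite set pass through unchanged.
        totals[rewrites.get(name, name)] += value
    return dict(totals)


# Made-up query results as (name, value) pairs.
rows = [("CornAEZ01", 1.5), ("CornAEZ02", 2.5), ("Wheat", 4.0)]
aggregated = apply_rewrite_set(rows, "crops")
print(aggregated)
```

Defining the mapping once under a name is what makes it reusable: any number of queries can refer to "crops" without repeating the eighteen individual rewrites.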