1.7.1. What is PTCLMmkdata?
PTCLMmkdata (pronounced Pee-Tee Cee-L-M make data) is a Python script to help you set up PoinT CLM simulations.
It runs the CLM tools for you to create the datasets and copies them to a location you can use, along with the namelist and XML changes a case needs in order to use those datasets.
Then you run create_newcase and point to that directory so that the namelist and XML changes are applied automatically.
PTCLMmkdata has a simple ASCII text file for storing basic information for your sites.
We also have complete lists for AmeriFlux and Fluxnet-Canada sites, although we only have the meteorology data for one site.
For other sites you will need to obtain the meteorology data and translate it to a format that the CESM datm model can use.
Even without meteorology data, PTCLMmkdata is useful for setting up datasets to run with standard CLM_QIAN data.
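As a rough illustration of the kind of information a site file carries, the sketch below reads a small comma-separated site list. The column names (`site_code`, `name`, `lon`, `lat`), the site codes, and the coordinates are all hypothetical and chosen only for illustration; check the files in the PTCLM_sitedata directory for the actual layout.

```python
import csv
import io

# Hypothetical site list: column names and values are illustrative only,
# not the real PTCLM_sitedata format.
SAMPLE = """site_code,name,lon,lat
XX-Ts1,Example tower one,-100.0,40.0
XX-Ts2,Example tower two,10.5,-5.25
"""


def read_sites(text):
    """Return a dict mapping site code to a (lon, lat) tuple of floats."""
    sites = {}
    for row in csv.DictReader(io.StringIO(text)):
        sites[row["site_code"]] = (float(row["lon"]), float(row["lat"]))
    return sites


sites = read_sites(SAMPLE)
print(sites["XX-Ts1"])
```

A lookup table like this is enough to find the coordinates for a site code before any datasets are generated.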
The original authors of PTCLMmkdata are Daniel M. Ricciuto, Dali Wang, Peter E. Thornton, and Wilfred M. Post, all at the Environmental Sciences Division, Oak Ridge National Laboratory (ORNL), and R. Quinn Thomas at Cornell University. It was then modified fairly extensively by Erik Kluzek at NCAR. We want to thank all of these individuals for this contribution to the CESM effort. We also want to thank the folks at the University of Michigan Biological Station (US-UMB) who allowed us to use their Fluxnet station data and import it into our inputdata repository, especially Gil Bohrer, the PI on record for this site.
1.7.2. Details of PTCLMmkdata
To get help on PTCLM2_180611, use the "--help" option as follows.
> cd $CTSMROOT/tools/PTCLM
> ./PTCLMmkdata --help
The output to the above command is as follows:
Usage: PTCLM.py [options] -d inputdatadir -m machine -s sitename
Python script to create cases to run single point simulations with tower site data.
Options:
--version show program's version number and exit
-h, --help show this help message and exit
Required Options:
-d CCSM_INPUT, --csmdata=CCSM_INPUT
Location of CCSM input data
-m MYMACHINE, --machine=MYMACHINE
Machine, valid CESM script machine (-m list to list valid
machines)
-s MYSITE, --site=MYSITE
Site-code to run, FLUXNET code or CLM1PT name (-s list to list
valid names)
Configure and Run Options:
-c MYCOMPSET, --compset=MYCOMPSET
Compset for CCSM simulation (Must be a valid 'I' compset [other
than IG compsets], use -c list to list valid compsets)
--coldstart Do a coldstart with arbitrary initial conditions
--caseidprefix=MYCASEID
Unique identifier to include as a prefix to the case name
--cesm_root=BASE_CESM
Root CESM directory (top level directory with models and scripts
subdirs)
--debug Flag to turn on debug mode so won't run, but display what would
happen
--finidat=FINIDAT Name of finidat initial conditions file to start CLM from
--list List all valid: sites, compsets, and machines
--namelist=NAMELIST
List of namelist items to add to CLM namelist (example:
--namelist="hist_fincl1='TG',hist_nhtfrq=-1"
--QIAN_tower_yrs Use the QIAN forcing data year that correspond to the tower
years
--rmold Remove the old case directory before starting
--run_n=MYRUN_N Number of time units to run simulation
--run_units=MYRUN_UNITS
Time units to run simulation (steps,days,years, etc.)
--quiet Print minimul information on what the script is doing
--sitegroupname=SITEGROUP
Name of the group of sites to search for you selected site in
(look for prefix group names in the PTCLM_sitedata directory)
--stdurbpt If you want to setup for standard urban namelist settings
--useQIAN use QIAN input forcing data instead of tower site meterology data
--verbose Print out extra information on what the script is doing
Input data generation options:
These are options having to do with generation of input datasets. Note: When
running for supported CLM1PT single-point datasets you can NOT generate new
datasets. For supported CLM1PT single-point datasets, you MUST run with the
following settings: --nopointdata And you must NOT set any of these: --soilgrid
--pftgrid --owritesrf
--nopointdata Do NOT make point data (use data already created)
--owritesrf Overwrite the existing surface datasets if they exist (normally
do NOT recreate them)
--pftgrid Use pft information from global gridded file (rather than site
data)
--soilgrid Use soil information from global gridded file (rather than site
data)
Main Script Version Id: $Id: PTCLM.py 47576 2013-05-29 19:11:16Z erik $ Scripts URL: $HeadURL: https://svn-ccsm-models.cgd.ucar.edu/PTCLM/trunk_tags/PTCLM1_130529/PTCLM.py $:
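The help text above states a rule for supported CLM1PT single-point datasets: --nopointdata must be set, and --soilgrid, --pftgrid, and --owritesrf must not be. A minimal sketch of that rule as a check is shown below. The function name and return format are hypothetical and are not part of PTCLMmkdata itself.

```python
# Flags the help text says must NOT be set for supported CLM1PT datasets.
FORBIDDEN_FOR_SUPPORTED = {"--soilgrid", "--pftgrid", "--owritesrf"}


def check_supported_site_options(options):
    """Return a list of problems for a supported CLM1PT single-point dataset.

    `options` is an iterable of flag strings, e.g. ["--nopointdata"].
    An empty list means the combination satisfies the documented rule.
    """
    opts = set(options)
    problems = []
    if "--nopointdata" not in opts:
        problems.append("missing --nopointdata")
    for flag in sorted(FORBIDDEN_FOR_SUPPORTED & opts):
        problems.append(f"{flag} is not allowed")
    return problems


print(check_supported_site_options(["--nopointdata"]))
print(check_supported_site_options(["--pftgrid"]))
```

Running a check like this before submitting a batch job can catch an invalid option combination early.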
Here we give a simple example of using PTCLMmkdata for a straightforward case: running at the US-UMB Fluxnet site on cheyenne, where the meteorology data is already on the machine. See the section called Converting AmeriFlux Data for use by PTCLMmkdata for information on permission to use this data.
1.7.2.1. Example 6-1. Example of running PTCLMmkdata for US-UMB on cheyenne
> setenv CSMDATA $CESMDATAROOT/inputdata
> setenv MYDATAFILES `pwd`/mydatafiles
> setenv SITE US-UMB
> setenv MYCASE testPTCLM
# Next build all of the clm tools you will need
> cd $CTSMROOT/tools/PTCLM
> ./buildtools
# Next run PTCLMmkdata (NOTE -- MAKE SURE python IS IN YOUR PATH)
> cd $CTSMROOT/tools/PTCLM
# Here we run it using qcmd so that it will be run on a batch node
> qcmd -- ./PTCLMmkdata --site=$SITE --csmdata=$CSMDATA --mydatadir=$MYDATAFILES >& ptclmrun.log &
> cd $CIMEROOT/scripts
> ./create_newcase --user-mods-dir $MYDATAFILES/1x1pt_$SITE --case $MYCASE --res CLM_USRDAT --compset I1PtClm50SpGs
# Next setup, build and run as normal
> cd $MYCASE
> ./case.setup
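The two key commands in the example share the site name and the data directory, so it can help to build them in one place. The sketch below assembles the same argument lists shown above. The function name and the paths in the usage lines are hypothetical; only the flags that appear in the example are used.

```python
import posixpath
import shlex


def ptclm_commands(site, csmdata, mydatafiles, case, compset="I1PtClm50SpGs"):
    """Build the PTCLMmkdata and create_newcase argument lists from the example."""
    mkdata = [
        "./PTCLMmkdata",
        f"--site={site}",
        f"--csmdata={csmdata}",
        f"--mydatadir={mydatafiles}",
    ]
    # PTCLMmkdata places each site in its own 1x1pt_<site> subdirectory.
    user_mods = posixpath.join(mydatafiles, f"1x1pt_{site}")
    newcase = [
        "./create_newcase",
        "--user-mods-dir", user_mods,
        "--case", case,
        "--res", "CLM_USRDAT",
        "--compset", compset,
    ]
    return mkdata, newcase


# Hypothetical paths, for illustration only.
mkdata, newcase = ptclm_commands(
    "US-UMB", "/data/inputdata", "/work/mydatafiles", "testPTCLM"
)
print(shlex.join(mkdata))
print(shlex.join(newcase))
```

The lists can be passed to `subprocess.run` from the appropriate working directory. PTCLMmkdata must finish before create_newcase is run, because the user-mods directory is created by PTCLMmkdata.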
PTCLMmkdata includes a README file that gives some extra details and a simple example.
PTCLM/README 04/10/2015
PTCLMmkdata is a python tool built on top of CLM tools and CESM scripts
for building datasets to run CLM "I" cases for data from Ameriflux Tower-sites,
or other user-supplied single-point datasets.
Original Authors:
Daniel M. Ricciuto, Dali Wang, Peter E. Thornton, Wilfred M. Post
Environmental Sciences Division, Oak Ridge National Laboratory (ORNL)
R. Quinn Thomas
Cornell University
Modified by:
Erik Kluzek (NCAR)
General Directory structure:
PTCLM/PTCLMmkdata ----- Main script
PTCLM/PTCLM_sitedata - Site data files of
static information (latitude, longitude, soil info., and PFT information)
for each site. Also different "groups" of site-data lists, and the script to
convert the transient years landuse_timeseries files into landuse_timeseries text files that
mksurfdata can use.
PTCLM/mydatafiles ----- Default location of
data files that will be created by PTCLMmkdata. Sites will be built
in their own subdirectories under here. Optionally you can give your
own location you'd like to use for your data.
PTCLM/PTCLMsublist --------- Script to submit a list of PTCLM
sites to the batch queue (only set up for a few machines).
PTCLM/PTCLMsublist_prog.py - Python module to support submit
list script. Handles command line arguments and such.
PTCLM/batchque.py ---------- Python module for batch submittal.
PTCLM/buildtools ----------- Script to build the CLM
tools needed to run PTCLMmkdata (mksurfdata_map and gen_domain). Works on cheyenne.
Quickstart:
# ASSUMPTIONS:
# For this example I'm running a I1PtClm50SpGs case on cheyenne using
# CSMDATA in the standard location
# Finally we use the 6-digit AmeriFlux site code for the University of Mich. Biological
# Station US-UMB (data for this station is checked into the inputdata repository).
# I also assume you are using UNIX C-shell, and GNU make is called gmake
setenv CSMDATA /glade/p/cesm/cseg/inputdata
setenv SITE US-UMB
cd PTCLM
setenv MYDATAFILES `pwd`/mydatafiles
# Next build all of the clm tools you will need
# The following script assumes cheyenne, hobart, or izumi; for other machines
# you'll need to build each tool by hand
./buildtools
# Next run PTCLMsublist which will submit PTCLMmkdata to batch queue (NOTE -- MAKE SURE python, NCO AND NCL ARE IN YOUR PATH)
# PTCLMsublist is only set up for a few batch machines, you'll need to update it to add new machines
# or create your own batch submission script.
# NOTE: Every day you run PTCLMmkdata it will remake the map files.
# However, you can use the script in mydatafiles called renamemapfiles
# to rename the existing files with today's creation date.
# This keeps the run time of PTCLMmkdata reasonable.
#
qcmd -l walltime=02:00:00 -- ./PTCLMsublist -l $SITE -d $CSMDATA --account=XXXXXXXXX --mach=cheyenne
# NOTE: To submit several sites at once, make the "-l" option a comma delimited
# list of site names.
# Next copy the tower-site meteorology datafiles into your $MYDATAFILES space
# (For the US-UMB station you can skip this step as the .build step will bring the data over)
cd $MYDATAFILES/1x1pt_$SITE
mkdir $MYDATAFILES/1x1pt_$SITE/CLM1PT_data
# Copy meteorology data NetCDF files into 1x1pt_$SITE sub-directory
# (with filenames of yyyy-mm.nc)
# The variables assumed to be on the files are:
# ZBOT, TBOT, RH, WIND, PRECTmms, FSDS, PSRF, FLDS
# (if other fields are available, or have different names, this can be changed by
# adding a user_datm.streams.txt.CLM1PT.CLM_USRDAT file as we outline below)
# Make sure your data has time with the attribute: calendar="gregorian"
# Make sure the forcing directory points to the location of your data
# (PTCLMmkdata should already do this)
./xmlchange DIN_LOC_ROOT_CLMFORC=$MYDATAFILES/1x1pt_$SITE
# Then create a case using the data you just created
setenv MYCASE "testPTCLM"
cd $CTSMROOT/cime
setenv CIMEROOT `pwd`
cd $CIMEROOT/scripts
./create_newcase --user-mods-dir $MYDATAFILES/1x1pt_$SITE --case $MYCASE --res CLM_USRDAT --compset I1PtClm50SpGs --mach cheyenne
# Next setup as normal
cd $MYCASE
./case.setup
# If you need to customize your list of fields uncomment and do the following...
# cp CaseDocs/datm.streams.txt.CLM1PT.CLM_USRDAT user_datm.streams.txt.CLM1PT.CLM_USRDAT
# chmod u+w user_datm.streams.txt.CLM1PT.CLM_USRDAT
# $EDITOR user_datm.streams.txt.CLM1PT.CLM_USRDAT
# ./preview_namelists
# Finally build, and run the case as normal
./case.build
./case.submit
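The README's requirements for tower meteorology files (filenames of the form yyyy-mm.nc and a fixed list of variables) can be checked before a case is built. The sketch below is one way to do that, assuming you already have the directory listing and the variable names from a file. The function names are hypothetical, and the year range in the usage lines is only an example.

```python
import re

# Variable names the README says are assumed to be on the forcing files.
REQUIRED_VARS = {"ZBOT", "TBOT", "RH", "WIND", "PRECTmms", "FSDS", "PSRF", "FLDS"}

# Matches filenames such as 2000-01.nc (four-digit year, two-digit month).
NAME_RE = re.compile(r"^(\d{4})-(0[1-9]|1[0-2])\.nc$")


def missing_months(filenames, first_year, last_year):
    """Return 'yyyy-mm' strings in the year range with no yyyy-mm.nc file."""
    found = set()
    for name in filenames:
        match = NAME_RE.match(name)
        if match:
            found.add((int(match.group(1)), int(match.group(2))))
    return [
        f"{year:04d}-{month:02d}"
        for year in range(first_year, last_year + 1)
        for month in range(1, 13)
        if (year, month) not in found
    ]


def missing_variables(names_in_file):
    """Return the required variable names that are absent, sorted."""
    return sorted(REQUIRED_VARS - set(names_in_file))


# Example listing with December absent.
files = [f"2000-{month:02d}.nc" for month in range(1, 12)]
print(missing_months(files, 2000, 2000))
print(missing_variables(["ZBOT", "TBOT", "RH", "WIND", "PRECTmms", "FSDS", "PSRF"]))
```

The variable names could come from any NetCDF reader. The check only compares names and does not inspect units or the calendar attribute.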