1.3.6. Creating Surface Datasets
When just creating a replacement file for an existing one, the relevant tool should be used directly to create the file. When you are creating a set of files for a new resolution there are some dependencies between the tools that you need to keep in mind when creating them. The main dependency is that you MUST create a SCRIP grid file first as the SCRIP grid dataset is then input into the other tools. Also look at Table 1.4.3.1.1 which gives information on the files required and when. Figure 1.3.2 shows an overview of the general data-flow for creation of the fsurdat datasets.
Figure 1.3.2 Data Flow for Creation of Surface Datasets from Raw SCRIP Grid Files
Starting from a SCRIP grid file that describes the grid you will run the model on, you first run `mkmapdata.sh to create a list of mapping files. See Figure 1.3.1 for a more detailed view of how mkmapdata.sh works. The mapping files tell mksurfdata_esmf how to map between the output grid and the raw datasets that it uses as input. The output of mksurfdata_esmf is a surface dataset that you then use for running the model. See Figure 2.27.3 for a more detailed view of how mksurfdata_esmf works.
Figure 1.3.3 is the legend for this figure (Figure 1.3.2) and other figures in this chapter (Figure 1.3.4 and Figure 1.3.5).
Figure 1.3.3 Legend for Data Flow Figures
Green arrows define the input to a program, while red arrows define the output. Cylinders define files that are either created by a program or used as input for a program. Boxes are programs.
You start with a description of a SCRIP grid file for your output grid file and then create mapping files from the raw datasets to it. Once, the mapping files are created mksurfdata_esmf is run to create the surface dataset to run the model.
1.3.6.1. Creating a Complete Set of Files for Input to CLM
Create SCRIP grid datasets (if NOT already done)
First you need to create a descriptor file for your grid, that includes the locations of cell centers and cell corners. There is also a “mask” field, but in this case the mask is set to one everywhere (i.e. all of the masks for the output model grid are “nomask”). An example SCRIP grid file is:
$CSMDATA/lnd/clm2/mappingdata/grids/SCRIPgrid_10x15_nomask_c110308.nc. Themkmapgridsandmkscripgrid.nclNCL script in the$CTSMROOT/tools/mkmapgridsdirectory can help you with this. SCRIP grid files for all the standard CLM grids are already created for you. See the Section called Creating an output SCRIP grid file at a resolution to run the model on for more information on this.
Todo
Update the below, as domain files aren’t needed with nuopc.
Create domain dataset (if NOT already done)
Next use
gen_domainto create a domain file for use by DATM and CLM. This is required, unless a domain file was already created. See the Section called Creating a domain file for CLM and DATM for more information on this.Create mapping files for
mksurfdata_esmf(if NOT already done)Create mapping files for
mksurfdata_esmfwithmkmapdata.shin$CTSMROOT/tools/mkmapdata. See the Section called Creating mapping files thatmksurfdata_esmfwill use for more information on this.Create surface datasets
Next use
mksurfdata_esmfto create a surface dataset, using the mapping datasets created on the previous step as input. There is a version for either clm4_0 or CTSM1 for this program. See the Section called Usingmksurfdata_esmfto create surface datasets from grid datasets for more information on this.Enter the new datasets into the
build-namelistXML database The last optional thing to do is to enter the new datasets into thebuild-namelistXML database. See Chapter 3 for more information on doing this. This is optional because the user may enter these files into their namelists manually. The advantage of entering them into the database is so that they automatically come up when you create new cases.
The $CTSMROOT/tools/README goes through the complete process for creating input files needed to run CLM. We repeat that file here:
$CTSMROOT/tools/README Jun/08/2018
CTSM tools for analysis of CTSM history files -- or for creation or
modification of CTSM input files.
I. General directory structure:
$CTSMROOT/tools
mksurfdata_esmf -- Create surface datasets.
crop_calendars --- Regrid and process GGCMI sowing and harvest date files for use in CTSM.
mkmapgrids ------- Create regular lat/lon SCRIP grid files
site_and_regional Scripts for handling input datasets for site and regional cases.
These scripts both help with creation of datasets using the
standard process as well as subsetting existing datasets and overwriting
some aspects for a specific case.
modify_input_files Scripts to modify CTSM input files. Specifically modifying the surface
datasets and mesh files.
contrib ---------- Miscellaneous tools for pre or post processing of CTSM.
Typically these are contributed by anyone who has something
they think might be helpful to the community. They may not
be as well tested or supported as other tools.
cime-tools ($CIMEROOT/tools/) (CIMEROOT is ../cime for a CTSM checkout and ../../../cime for a CESM checkout)
$CIMEROOT/mapping/gen_domain_files
gen_domain ------- Create data model domain datasets from SCRIP mapping datasets.
II. Notes on building/running for each of the above tools:
mksurfdata_esmf has a cime configure and CMake based build using the following files:
gen_mksurfdata_build ---- Build mksurfdata_esmf
src/CMakeLists.txt ------ Tells CMake how to build the source code
Makefile ---------------- GNU makefile to link the program together
cmake ------------------- CMake macros for finding libraries
mkmapgrids, and site_and_regional only contain scripts so don't have the above build files.
Some tools have copies of files from other directories -- see the README.filecopies
file for more information on this.
Tools may also have files with the directory name followed by namelist to provide sample namelists.
<directory>.namelist ------ Namelist to create a global file.
These files are also used by the test scripts to test the tools (see the
README.testing) file.
NOTE: Be sure to change the path of the datasets references by these namelists to
point to where you have exported your CESM inputdata datasets.
III. Process sequence to create input datasets needed to run CTSM
1.) Create SCRIP grid files (if needed)
a.) For standard resolutions these files will already be created. (done)
b.) To create regular lat-lon regional/single-point grids run site_and_regional/mknoocnmap.pl
This will create both SCRIP grid files and a mapping file that will
be valid if the region includes NO ocean whatsoever (so you can skip step 2).
You can also use this script to create SCRIP grid files for a region
(or even a global grid) that DOES include ocean if you use step 2 to
create mapping files for it (simply discard the non-ocean map created by
this script).
Example, for single-point over Boulder Colorado.
cd site_and_regional
./mknoocnmap.pl -p 40,255 -n 1x1_boulderCO
c.) General case
You'll need to convert or create SCRIP grid files on your own (using scripts
or other tools) for the general case where you have an unstructured grid, or
a grid that is not regular in latitude and longitude.
example format
==================
netcdf fv1.9x2.5_090205 {
dimensions:
grid_size = 13824 ;
grid_corners = 4 ;
grid_rank = 2 ;
variables:
double grid_center_lat(grid_size) ;
grid_center_lat:units = "degrees" ;
double grid_center_lon(grid_size) ;
grid_center_lon:units = "degrees" ;
double grid_corner_lat(grid_size, grid_corners) ;
grid_corner_lat:units = "degrees" ;
double grid_corner_lon(grid_size, grid_corners) ;
grid_corner_lon:units = "degrees" ;
int grid_dims(grid_rank) ;
int grid_imask(grid_size) ;
grid_imask:units = "unitless" ;
2.) Create ocean to atmosphere mapping file (if needed)
a.) Standard resolutions (done)
If this is a standard resolution with a standard ocean resolution -- this
step is already done, the files already exist.
b.) Region without Ocean (done in step 1.b)
IF YOU RAN mknoocnmap.pl FOR A REGION WITHOUT OCEAN THIS STEP IS ALREADY DONE.
c.) New atmosphere or ocean resolution
If the region DOES include ocean, use $CIMEROOT/tools/mapping/gen_domain_files/gen_maps.sh to create a
mapping file for it.
Example:
cd $CIMEROOT/tools/mapping/gen_domain_files
./gen_maps.sh -focn <ocngrid> -fatm <atmgrid> -nocn <ocnname> -natm <atmname>
3.) Add SCRIP grid file(s) created in (1) into XML database in CTSM (optional)
See the "Adding New Resolutions or New Files to the build-namelist Database"
Chapter in the CTSM User's Guide
http://www.cesm.ucar.edu/models/cesm1.0/clm/models/lnd/clm/doc/UsersGuide/book1.html
If you don't do this step, you'll need to specify the file to mksurfdata_esmf
in step (3) using the "-f" option.
4.) Convert map of ocean to atm for use by DATM and CTSM with gen_domain
(See $CIMEROOT/tools/mapping/README for more help on doing this)
- gen_domain uses the map from step (2) (or previously created CESM maps)
Example:
cd $CIMEROOT/tools/mapping/gen_domain_files/src
gmake
cd ..
setenv CDATE 090206
setenv OCNGRIDNAME gx1v6
setenv ATMGRIDNAME fv1.9x2.5
setenv MAPFILE $CSMDATA/cpl/cpl6/map_${OCNGRIDNAME}_to_${ATMGRIDNAME}_aave_da_${CDATE}.nc
./gen_domain -m $MAPFILE -o $OCNGRIDNAME -l $ATMGRIDNAME
Normally for I compsets running CTSM only you will discard the ocean domain
file, and only use the atmosphere domain file for datm and as the fatmlndfrc
file for CTSM. Output domain files will be named according to the input OCN/LND
gridnames.
5.) Create surface datasets with mksurfdata_esmf on Derecho
(See mksurfdata_esmf/README.md for more help on doing this)
- gen_mksurfdata_build to build
- gen_mksurfdata_namelist to build the namelist
- gen_mksurfdata_jobscript_single to build a batch script to run on Derecho
- Submit the batch script just created above
- This step uses the results of step (3) entered into the XML database
in step (4).
- If datasets were NOT entered into the XML database, set the resolution
by entering the mesh file using the options: --model-mesh --model-mesh-nx --model-mesh-ny
Example: for 0.9x1.25 resolution fro 1850
cd mksurfdata_esmf
./gen_mksurfdata_build
./gen_mksurfdata_namelist --res 0.9x1.25 --start-year 1850 --end-year 1850
./gen_mksurfdata_jobscript_single --number-of-nodes 24 --tasks-per-node 12 --namelist-file target.namelist
qsub mksurfdata_jobscript_single.sh
NOTE that surface dataset will be used by default for fatmgrid - and it will
contain the lat,lon,edges and area values for the atm grid - ASSUMING that
the atm and land grid are the same
6.) Add new files to XML data or using user_nl_clm (optional)
See notes on doing this in step (3) above.
IV. Notes on which input datasets are needed for CTSM
global or regional/single-point grids
- need fsurdata and fatmlndfrc
fsurdata ---- from mksurfdata_esmf in step (III.7)
fatmlndfrc -- use the domain.lnd file from gen_domain in step (III.6)