10. Extending CDEPS

10.1. Adding New Data Mode

While the existing copyall data modes make it easy to bring new data streams into CDEPS, more complex data streams might require a new data mode. For example, a new DATM data mode might need to calculate derived fields, such as humidity from temperature, pressure and dew point temperature, or wind speed from wind components.
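
The derived fields mentioned above can be sketched in a few lines. The following Python example is only an illustration, not code taken from any CDEPS data mode: the function names are hypothetical, and the saturation vapor pressure uses the Bolton (1980) form of the Magnus formula, which is an assumption made here. Specific humidity needs only pressure and dew point; temperature would additionally be needed for relative humidity.

```python
import math


def wind_speed(u, v):
    """Wind speed (m/s) from eastward (u) and northward (v) components."""
    return math.hypot(u, v)


def specific_humidity(pressure_pa, dewpoint_k):
    """Specific humidity (kg/kg) from surface pressure (Pa) and dew point (K).

    Assumption: vapor pressure over water from the Bolton (1980) form of the
    Magnus formula; a real data mode may use a different formulation.
    """
    td_c = dewpoint_k - 273.15
    e_pa = 100.0 * 6.112 * math.exp(17.67 * td_c / (td_c + 243.5))
    eps = 0.622  # ratio of the gas constants of dry air and water vapor
    return eps * e_pa / (pressure_pa - (1.0 - eps) * e_pa)


print(wind_speed(3.0, 4.0))
print(specific_humidity(101325.0, 288.15))
```

In a real data mode, calculations of this kind would be applied to whole field arrays in the Fortran advance routine.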

Adding a new data mode to an existing data component involves multiple steps:

  1. Creating an ESMF mesh file for the data stream

  2. Creating a new Fortran module file for the new data mode

  3. Adding the new data mode to the data component

10.1.1. Creating SCRIP File

The easiest way to create an ESMF mesh file is to use the ESMF_Scrip2Unstruct application. The application is a parallel program that converts a SCRIP format grid file into an unstructured grid file in the ESMF unstructured file format or in the UGRID file format.

To use the ESMF_Scrip2Unstruct application, a SCRIP or UGRID format grid file needs to be created first from the coordinate information found in the stream data files. A SCRIP format grid file can be created with existing tools such as netCDF Operator (NCO) and NCAR Command Language (NCL), while generating a UGRID format grid file requires developing a custom tool.

For more sophisticated mesh definitions, such as unstructured grids and polygons with more sides, the SCRIP format grid file or ESMF mesh file needs to be created directly by developing a custom tool that follows the existing conventions used to define the SCRIP and ESMF mesh file formats.

netCDF Operator (NCO)

The ncks command provided by NCO generates accurate and complete SCRIP format grid files for select grid types, including uniform, capped and Gaussian rectangular latitude/longitude grids, either global or regional. In this case, the grids are stored in an external grid file. More information and examples can be found in the NCO documentation.

As an example, a SCRIP grid definition file for the Weather Research & Forecasting Model (WRF) can be created using the following command:

ncks --rgr infer --rgr scrip=wrfinput_d01_scrip.nc wrfinput_d01_2d_lat_lon.nc wrfinput_d01_foo.nc

where wrfinput_d01_scrip.nc is the output file (in SCRIP format), wrfinput_d01_2d_lat_lon.nc is the input file with the 2D grid information, and wrfinput_d01_foo.nc is an output file containing metadata. In this case, wrfinput_d01_2d_lat_lon.nc needs to be created from the original WRF input file by following the netCDF CF conventions.

The following example creates a SCRIP grid file with the ncremap command, without providing any input file:

ncremap -g glo_30m.SCRIP.nc -G latlon=321,720#snwe=-80.25,80.25,-0.25,359.75#lat_typ=uni#lat_drc=s2n

where glo_30m.SCRIP.nc is the output file, the -G option indicates that the command will create the grid file in SCRIP format, latlon gives the sizes of the latitude and longitude coordinates, the snwe option specifies the outer edges of a regional rectangular grid, the lat_typ option defines the grid type (uni is used for global and uniform grids), and the lat_drc option specifies whether latitudes monotonically increase or decrease in rectangular grids (s2n for grids that begin with the most southerly latitude).
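
As a quick check on the options above, the grid spacing and the first and last cell centers implied by latlon and snwe can be worked out by hand. The Python sketch below does this arithmetic, assuming equally sized cells between the given outer edges; the helper name uniform_axis is hypothetical and is not part of NCO.

```python
def uniform_axis(n_cells, lower_edge, upper_edge):
    """Return (spacing, centers) for n_cells equal cells between two outer edges."""
    spacing = (upper_edge - lower_edge) / n_cells
    centers = [lower_edge + (i + 0.5) * spacing for i in range(n_cells)]
    return spacing, centers


# Values from the ncremap example: latlon=321,720 and snwe=-80.25,80.25,-0.25,359.75
dlat, lats = uniform_axis(321, -80.25, 80.25)
dlon, lons = uniform_axis(720, -0.25, 359.75)

print(dlat, lats[0], lats[-1])
print(dlon, lons[0], lons[-1])
```

Under this assumption the example describes a 0.5 degree grid whose cell centers run from 80S to 80N and from 0 to 359.5 degrees east.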

NCAR Command Language (NCL)

The NCL method of creating a SCRIP grid definition file requires additional development to represent the grid information of the stream data. NCL provides a set of procedures that take grid coordinates as input and create a SCRIP grid definition file. These procedures are listed as follows:

latlon_to_SCRIP

This procedure writes the description of the requested lat/lon grid to a netCDF SCRIP output file. It does not take any coordinate arguments; instead, it generates the coordinates internally based on the given grid_type option (“1deg”, “0.25deg”, etc).

rectilinear_to_SCRIP

This procedure writes the description of a rectilinear grid, given the 1D coordinate lat/lon arrays, to a netCDF SCRIP file.

curvilinear_to_SCRIP

This procedure writes the description of a curvilinear grid to a NetCDF SCRIP file, given the 2D lat/lon arrays.

The following code snippet demonstrates how the NCL provided routines can be used to create a SCRIP grid file:

;--- open file and read variables ---
ncGrdFilePath = "domain.nc"
grd_file = addfile(ncGrdFilePath,"r")
lon = grd_file->lon
lat = grd_file->lat

;--- set options for SCRIP generation ---
opt = True
opt@ForceOverwrite = True
opt@NetCDFType = "netcdf4"
opt@Title = "Global Grid"

;--- generate SCRIP file ---
dstGridPath = "SCRIP.nc"
rectilinear_to_SCRIP(dstGridPath, lat, lon, opt)
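
To make the content of such a file more concrete, the Python sketch below builds the main SCRIP arrays for a small rectilinear grid in memory. It is an illustration only and does not reproduce what rectilinear_to_SCRIP does internally. The assumptions are uniform spacing on each axis, longitude varying fastest, and corners listed counterclockwise starting from the lower-left corner. The function name is hypothetical, and no clipping of corners at the poles is attempted.

```python
def rectilinear_scrip_arrays(lat, lon):
    """Build SCRIP-style arrays from 1D cell-center latitudes and longitudes.

    Assumes uniform spacing and at least two points on each axis.
    """
    half_dlat = 0.5 * abs(lat[1] - lat[0])
    half_dlon = 0.5 * abs(lon[1] - lon[0])
    center_lat, center_lon, corner_lat, corner_lon = [], [], [], []
    for y in lat:          # latitude is the slow index
        for x in lon:      # longitude varies fastest
            center_lat.append(y)
            center_lon.append(x)
            # Corners: lower-left, lower-right, upper-right, upper-left
            corner_lat.append([y - half_dlat, y - half_dlat, y + half_dlat, y + half_dlat])
            corner_lon.append([x - half_dlon, x + half_dlon, x + half_dlon, x - half_dlon])
    return {
        "grid_dims": [len(lon), len(lat)],
        "grid_center_lat": center_lat,
        "grid_center_lon": center_lon,
        "grid_corner_lat": corner_lat,
        "grid_corner_lon": corner_lon,
    }


grid = rectilinear_scrip_arrays([-45.0, 45.0], [0.0, 90.0, 180.0, 270.0])
print(grid["grid_dims"], len(grid["grid_center_lat"]))
```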

To add an area field to the SCRIP file:

;--- add area to SCRIP file ---
scripFile = addfile("SCRIP.nc", "w")

grid_size = dimsizes(scripFile->grid_center_lat)
grid_area = new(grid_size,double)
grid_area!0 = "grid_size"

do i = 0,grid_size-1
  temp_tlat = (/ scripFile->grid_corner_lat(i,3), \
            scripFile->grid_corner_lat(i,1), \
            scripFile->grid_corner_lat(i,0), \
            scripFile->grid_corner_lat(i,2)    /)
  temp_tlon = (/ scripFile->grid_corner_lon(i,3), \
            scripFile->grid_corner_lon(i,1), \
            scripFile->grid_corner_lon(i,0), \
            scripFile->grid_corner_lon(i,2)    /)
  grid_area(i) = area_poly_sphere(temp_tlat, temp_tlon, 1)
end do

scripFile->grid_area = (/ grid_area /)
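
For cells bounded by parallels and meridians, the resulting areas can be sanity checked with a closed-form expression. The Python sketch below computes the area of such a cell on a sphere of the given radius; the function name is hypothetical. It is not a reimplementation of area_poly_sphere, which is understood to treat polygon edges as great-circle arcs, so individual values can differ slightly from this formula.

```python
import math


def latlon_cell_area(lat_south, lat_north, lon_west, lon_east, radius=1.0):
    """Area of a cell bounded by two parallels and two meridians (inputs in degrees)."""
    dlon = math.radians(lon_east - lon_west)
    band = math.sin(math.radians(lat_north)) - math.sin(math.radians(lat_south))
    return radius ** 2 * dlon * band


# A single cell covering the whole unit sphere should have area 4*pi.
total = latlon_cell_area(-90.0, 90.0, 0.0, 360.0)
print(total, 4.0 * math.pi)
```

A useful check on a global SCRIP file is that the sum of grid_area over all cells is close to 4*pi when the areas are computed on a unit sphere.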

Note

The NCL project is feature frozen. Development of the next generation Python tools is now underway, and more information can be found on the Geoscience Community Analysis Toolkit (GeoCAT) project site.

10.1.2. Creating ESMF Mesh File

Once a SCRIP grid definition file is created, the ESMF mesh file can be created using the following command:

ESMF_Scrip2Unstruct input_SCRIP.nc output_ESMFmesh.nc 0

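where input_SCRIP.nc is the input SCRIP grid file, output_ESMFmesh.nc is the output ESMF mesh file, and the final argument is the dual flag (0 for a straight conversion, 1 to generate the dual mesh).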
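
When the conversion is scripted, the command line can be assembled programmatically. The Python sketch below only builds the argument list. The helper name is hypothetical, and the MPI launcher name (mpirun) and its -np flag are assumptions that depend on the local MPI installation.

```python
import shlex


def scrip2unstruct_command(scrip_file, mesh_file, dual_flag=0, mpi_tasks=1, launcher="mpirun"):
    """Return the argument list for an ESMF_Scrip2Unstruct run."""
    cmd = ["ESMF_Scrip2Unstruct", scrip_file, mesh_file, str(dual_flag)]
    if mpi_tasks > 1:
        # Assumption: the launcher accepts "-np <tasks>"; adjust for the local system.
        cmd = [launcher, "-np", str(mpi_tasks)] + cmd
    return cmd


cmd = scrip2unstruct_command("input_SCRIP.nc", "output_ESMFmesh.nc", mpi_tasks=4)
print(shlex.join(cmd))
# To execute the command: subprocess.run(cmd, check=True)
```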

Note

Creating SCRIP grid definition files and ESMF mesh files can be very memory intensive for very high-resolution global grids such as the GHRSST dataset (0.01 deg.).

In this case, the NCL method could fail due to memory usage, since the process is not parallel and cannot be distributed across multiple nodes. A workaround is to generate the SCRIP and ESMF mesh files for smaller domains or only for the region of interest. In some cases, taking advantage of the parallelization in ESMF_Scrip2Unstruct might help, but the current implementation of ESMF_Scrip2Unstruct requires every MPI task to read the whole coordinate information, which could prevent the job from scaling in terms of memory usage.
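
A back-of-the-envelope estimate shows why memory becomes a problem. The Python sketch below counts only the center and corner coordinate arrays of a SCRIP file, assuming a full global 0.01 degree grid of 18000 x 36000 cells, four corners per cell and double precision values. The actual dimensions of a given dataset, and the working memory used by the tools, may differ.

```python
def scrip_coordinate_bytes(n_lat, n_lon, n_corners=4, bytes_per_value=8):
    """Return (cell count, bytes) for SCRIP center and corner lat/lon arrays."""
    n_cells = n_lat * n_lon
    centers = 2 * n_cells * bytes_per_value               # center lat + lon
    corners = 2 * n_cells * n_corners * bytes_per_value   # corner lat + lon
    return n_cells, centers + corners


n_cells, n_bytes = scrip_coordinate_bytes(18000, 36000)
print(n_cells, round(n_bytes / 1024 ** 3, 1), "GiB")
```

Under these assumptions the coordinate arrays alone take roughly 50 GB, and that amount would be needed by each MPI task that reads the whole coordinate information.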

10.1.3. Creating New Fortran Module

The existing data mode specific Fortran module files can be used as a reference to create a new data mode. As an example, the existing clmncep data mode under DATM can be used for this purpose.

In datm_datamode_clmncep_mod.F90, there are five main routines:

datm_datamode_clmncep_advertise()

This routine advertises a field in a state. In this case, an empty field is created and added to the state through the ESMF/NUOPC provided NUOPC_Advertise() call. dshr_fldList_add() is a generic routine defined in dshr/dshr_fldlist_mod.F90 that populates the internal data structure.

datm_datamode_clmncep_init_pointers()

This routine initializes pointers for module level stream arrays. It can access data pointers both for fields in the actual stream data file (shr_strdata_get_stream_pointer()) and for ESMF fields (dshr_state_getfldptr()). Being able to check which fields exist in the stream data file allows the behaviour of the data mode to be controlled under different conditions. The routine can also access data provided by other model components, which supports interaction with them, such as the prognostic mode defined in this data mode.

datm_datamode_clmncep_advance()

This routine is called every time the data component needs to provide data to other components. It also includes custom calculations such as limiting the temperature field, calculating specific humidity or downward longwave radiation, and applying unit conversions.

datm_datamode_clmncep_restart_write()

This routine writes restart information to the data model specific restart file through the dshr_restart_write() call.

datm_datamode_clmncep_restart_read()

This routine reads restart information from the data model specific restart file through the dshr_restart_read() call.
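
The division of work among the five routines can be summarized with a small, language-neutral sketch. The Python class below is a hypothetical stand-in written for illustration; it is not the CDEPS Fortran API. The field name tbot and the 180 K temperature floor are made-up values.

```python
class SketchDataMode:
    """Hypothetical stand-in for a data mode module (illustration only)."""

    def __init__(self):
        self.exported = []   # names of the fields advertised to other components
        self.pointers = {}   # field name -> list acting as the field's data array
        self.steps = 0       # number of completed advance calls

    def advertise(self, field_names):
        self.exported = list(field_names)

    def init_pointers(self, stream_data):
        # Bind only the advertised fields that the stream actually provides.
        self.pointers = {n: stream_data[n] for n in self.exported if n in stream_data}

    def advance(self):
        # Example of a custom calculation: limit temperature to a made-up floor.
        if "tbot" in self.pointers:
            tbot = self.pointers["tbot"]
            tbot[:] = [max(value, 180.0) for value in tbot]
        self.steps += 1

    def restart_write(self):
        return {"steps": self.steps}

    def restart_read(self, saved):
        self.steps = saved["steps"]


stream = {"tbot": [150.0, 250.0]}
mode = SketchDataMode()
mode.advertise(["tbot", "wind"])
mode.init_pointers(stream)
mode.advance()
saved = mode.restart_write()
print(stream["tbot"], saved)
```

Because init_pointers binds the same array objects that the stream holds, the update in advance is visible to whoever owns the field data, which mirrors the role of pointers in the Fortran module.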

10.1.4. Adding New Data Mode to Data Component

The data modes are defined in the data component specific Fortran modules named CDEPS/d[model_name]/[model_name]_comp_nuopc.F90, where model_name can be atm, ice, lnd, ocn, rof or wav. In the clmncep example, the data mode is defined in CDEPS/datm/atm_comp_nuopc.F90, and the DATM component calls a different routine depending on the datamode argument selected in the d[model_name]_in namelist file (datm_in for DATM).
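
The selection of routines by data mode can be pictured as a simple dispatch on the datamode string. The Python sketch below is only an analogy for that control flow; the handler functions and the registry are hypothetical, and the real component is written in Fortran.

```python
def advance_copyall(state):
    state["log"].append("copyall")


def advance_clmncep(state):
    state["log"].append("clmncep")


# Hypothetical registry: a new data mode is supported by adding one entry here.
ADVANCE_BY_DATAMODE = {
    "copyall": advance_copyall,
    "clmncep": advance_clmncep,
}


def advance(datamode, state):
    """Call the advance routine that matches the datamode read from the namelist."""
    try:
        handler = ADVANCE_BY_DATAMODE[datamode.strip()]
    except KeyError:
        raise ValueError(f"unsupported datamode: {datamode!r}") from None
    handler(state)


state = {"log": []}
advance("clmncep", state)
print(state["log"])
```

In the same way, a new data mode needs a corresponding branch at each point where the component calls the advertise, init_pointers, advance and restart routines.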