Tree species diversity distribution

Estimating tree diversity distribution patterns using the r.series.diversity addon for GRASS GIS


In this post I’ll use the r.series.diversity addon (Breugel 2016) for GRASS GIS (GRASS Development Team 2020) to compute the tree species diversity distribution patterns in the contiguous USA, based on tree distribution data from Wilson et al. (2013). With the addon, you can compute a number of diversity indices, including the Renyi entropy index and a number of specialized cases of the Renyi enthropy, viz.the species richness, the Shannon index, the Shannon based effective number of species (ENS), the Simpson index (inverse and gini variants), and Pielou’s eveness (Legendre and Legendre 1998). Install the addon first (see here how).

The data

Wilson et al. (2013) published raster maps with the estimated live tree basal area (square feet/acre) for 324 tree species in the United States. The raster maps, at a resolution of 250 meter and representing the time period 2000-2009, are based on MODIS imagery, raster data describing relevant environmental parameters, and field plot data of tree species basal area. The methods are described in (2012) and Wilson et al. (2013). We’ll use the basal area as proxy for the species abundances.

Data import

You can download the data from this website. There are three zip files, each containing approximately 80 raster layers (img format), one per species. The map projection is Albers Conical Equal Area (datum NAD83). First step is to download the data and unzip the rasters from the three zip files in one folder.

Working with data in GRASS GIS differs from what you may be used to in e.g. QGIS. So for readers less familiar with GRASS GIS, I would recommend to first read the introduction to GRASS GIS here. Here you are also introduced to another important concept, the region.

The next step is to create a new location and mapset. Below the steps are oulined in more details. Skip these steps if you are already familiar with how to create a new location and mapset.

Steps to create a new mapset

  1. Open GRASS and select the option to create a new location
Create new location

Figure 1: Create new location

  1. Enter a name and optionally a description. In the example, I named the new location NAD83CA, which is the location in which I will keep all source data in Albers Conical Equal Area (datum NAD83) projection.
Define database and location name

Figure 2: Define database and location name

  1. Select the method to create a new location. In this case, we want to read the projection and datum terms from the georeferenced file.
Choose method to determine crs

Figure 3: Choose method to determine crs

  1. In the next window, you can select one of the raster images you downloaded and unzipped.
Choose layer to be used to set teh crs

Figure 4: Choose layer to be used to set teh crs

  1. In the next window you are presented with a summary of the new location you are about to create. If it all looks good, click finish.

  2. Now you are asked if you want to import the data layer that you used to create the location. If you select yes, the data layer will be imported in the PERMANENT mapset. The default region extent and resolution will be set based on the imported data layer. Selected ‘no’. You’ll import the layer together with the others in one go in the next steps.

  3. If you selected ‘Create user mapset’ in step 2, you are now asked for a name for the new mapset.

Set name for new mapset

Figure 5: Set name for new mapset

After creating the new location and mapset, you can continue with importing the data layers in your newly created mapset. The files have an .img extension. We can check with the function gdalinfo what kind of raster file this is.

Check the raster file type

Run the following on the command line.

# Go to the directory with your raster files
cd Desktop/treesUSA

# Get information about the layer
gdalinfo s10.img

This will give you a lot of information, but what we are after is the file type. You can find this near the top of the output; it is a HFA/Erdas Imagine Image (.img).

Now that you know the type of the raster layers, you can import them using the r.import function.

Importing all raster layers in a folder

You can use the function r.import to import all the layers in a folder in one go. Go to Menu > File > Import raster data > Simplified raster import with reprojection. Don’t worry about the reprojection bit. The coordinate reference system (crs) of the mapset is equal to that of the raster layers (obviously, we used one of the raster layer to define the crs of the mapset), so no reprojection is carried out.

Import the data

Figure 6: Import the data

See the numbers in the image above: (1) Select as source type Directory and (2) select the directory with the raster layers. (3) Select Erdas Imagine Images (.img) as format and make sure the extension is set to img. After selecting the directory and raster format, a list will appear with all raster layers (4). Select all layers, and deselect Add imported layers into layer tree (5). Optionally, go to Import settings (6) to change some settings (e.g., if you have some RAM to spare, increase the maximum memory to be used).

Regions and masks

Before we compute the diversity indici, we need to set the computational region and define a mask. The region defines the extent and resolution that will be used for all raster computations. It can be set with the g.region function.

Set the region

The region should match the raster layers we just imported. To do so, open one of them and right-click on the layer name. In the context menu, select ‘Set computation region from selected map’

Set computation region

Figure 7: Set computation region

Alternatively, you can use the g.region function on the command line or console.

g.region raster=s10

The tree distribution data is for the continental part of the USA, excluding Alaska. So we want to mask out all other areas. To be able to do so, first download the USA boundaries, e.g., from the GADM website and import the data. The vector layer can subsequently be used to define the mask.

Set the mask

To download a geopackage with the vector data, use this link. After download, extract the geopackage. Note that the the data is in latlon (wgs84). Easiest is to use the v.import function, which reprojects the data on the fly. Make sure to select the layer with the national boundary gadm36_USA_0.

Import vector layer with USA boundaries

Figure 8: Import vector layer with USA boundaries

You can also run this from the command line or console.

v.import input=gadm36_USA_gpkg/gadm36_USA.gpkg layer=gadm36_USA_0

Next, we can create a mask with the r.mask function. This function essentially converts the vector layer to a raster layer, with the special name MASK. You can display this layer like any other layer.

# Set the mask
r.mask vector=gadm36_USA_0

Remember that all raster operations use the extent and resolution set by the region. This is true for the MASK layer as well. This means that only the areas within the bounds of the region are included. E.g. Alaska falls outside the region.

Compute diversity

Now we are finally ready to compute the tree species diversity, using the r.series.diversity addon for GRASS GIS. This can be done in two steps on the command line.

Get a list of input layers

You can create a list with the names of the raster layers using the g.list function. The names of our raster layers all start with a s. The following command will create a list with all names, and assign these to the variable MAPS.

# List all raster layers with species
MAPS=`g.list type=raster pattern=s* separator=comma mapset=trees`

The variable MAPS can now be used as input in the r.series.diversity function.

Use this as input for the r.series.diversity function.

Compute diversity indici

The flags indicate which diversity indici to calculate. In the code below, these are (r) Renyi enthropy index, (s) the richness index, (h), the shannon index, (p) the reversed Simpson index, (g) the Gini-Simpson index,(e) the Pielou’s evenness index, and (n) the Shannon effective number of species.

# Compute the diversity layers
r.series.diversity -s -h -p -g -e input=$MAPS output=TreeDiversity

Note that with this dataset, the computation will take a really long time. Long enough for you to forget about it for the rest of the day, hopefully to have it finished next morning. So yes, the code definitely needs some optimization (any suggestion as to how to accomplish this are most welcome).

The results

The output are maps with patterns of tree diversity across contiguous United States. In Figure 9 the maps with the species richness and the Shannon-wiener diversity index are shown. The maps show that in the mideastern states like West Virginia, Kentucky and Tennessee, tree diversity is relatively high, while tree diversity in the states on the west coast is clearly lower.

Tree diversity patterns across the USA

Figure 9: Tree diversity patterns across the USA

The observed patterns correspond fairly well with the map of the tree diversity by Heath et al. (2015). This is not surprising. Both the map by Heath et al. (2015) and the species distribution maps by Wilson et al. (2013) are based on the Forest Inventory and Analysis (FIA) field plot data from the U.S. Forest Service.

The tree diversity map by Jenkins et al. (2015) shows a different pattern, with a clear diversity hotspot in the southeastern United States. The use of different sources of information (and underlying methodologies and assumptions) might be one reason for the differences. Jenkins et al. used the range maps of tree species by Little (1971), digitized by the US Geological Survey (1999).

In addition, the diversity map by Jenkins et al. is based on the range maps of 641 species, out of the more than 700 different native and naturalized tree species in the United States (Little 1979). The Wilson et al. (2013) dataset contains a much smaller subset of 324 species. If these are e.g., the most common species, this could result in an underestimation of the species richness in regions with more rare or endemic species. And indeed, these maps show that the southeastern United States has a high number of endemic tree species. Another point to reiterate is that the basal area was used as proxy indicator for tree abundance.


I have uploaded the data to Open Science Framework (OSF) data sharing site from the Center for Open Science. If you use the data, please cite as follows: Van Breugel, P. (2020, August 3). Tree diversity distribution patterns in contiguous USA.


Breugel, Paulo van. 2016. “R.series.diversity - a GRASS GIS Addon to Compute Diversity Indici over Input Layers.”

GRASS Development Team. 2020. Geographic Resources Analysis Support System (Grass Gis) Software. USA: Open Source Geospatial Foundation.

Heath, Linda S., Sarah M. Anderson, Marla R. Emery, Jeffrey A. Hicke, Jeremy Littell, Alan Lucier, Jeffrey G. Masek, et al. 2015. “Indicators of Climate Impacts for Forests: Recommendations for the US National Climate Assessment Indicators System.” NRS-GTR-155. Newtown Square, PA: U.S. Department of Agriculture, Forest Service, Northern Research Station.

Jenkins, Clinton N., Kyle S. Van Houtan, Stuart L. Pimm, and Joseph O. Sexton. 2015. “US Protected Lands Mismatch Biodiversity Priorities.” Proceedings of the National Academy of Sciences 112 (16): 5081–6.

Legendre, P., and L. Legendre. 1998. Numerical Ecology. Second English edition. Developments in Environmental Modelling 20. Amsterdam: Elsevier.

Little, Elbert L. Jr. 1971. Atlas of United States Trees: Volume 1: Conifers and Important Hardwoods. First Edition edition. United States Government Printing Office,, Washington:

Little, Elbert Luther. 1979. Checklist of United States Trees (Native and Naturalized). D.C. : Forest Service, U.S. Department of Agriculture.

US Geological Survey. 1999. “Digital Representation of Tree Species Range Maps from "Atlas of United States Trees" by Elbert L. Little, Jr.” Geosciences; Environmental Change Science Center, USGS.

Wilson, Barry Tyler, Andrew J Lister, R I Riemann, Douglas M Griffith, and M Douglas. 2013. “Live Tree Species Basal Area of the Contiguous United States (2000-2009).” Newtown Square, PA: USDA Forest Service, Rocky Mountain Research Station.

Wilson, B. Tyler, Andrew J. Lister, and Rachel I. Riemann. 2012. “A Nearest-Neighbor Imputation Approach to Mapping Tree Species over Large Areas Using Forest Inventory Plots and Moderate Resolution Raster Data.” Forest Ecology and Management 271 (May): 182–98.

Paulo van Breugel
Paulo van Breugel
Lecturer & researcher

My interests range from biodiversity and ecology to spatial data analysis. I am also what one could describe as a lifelong learner; I enjoy to learn new things and widen my horizon, both professionally and personally.

comments powered by Disqus