Import GBIF data in QGIS
This tutorial shows how to download GBIF occurrence data for Ambrosia psilostachya in QGIS. The common name of this species is perennial ragweed. It is native to North America and has been introduced to many countries in Europe, Asia, Africa and Australia (CABI 2019). Let’s see whether the GBIF data confirm this.
Install the plugin
To install the plugin, launch QGIS and, in the main menu, go to Plugins -> Manage and Install Plugins.... In the All tab, search for ‘GBIF Occurrences’. Select the plugin and click ‘Install plugin’.
Use the plugin
You can open the plugin window from the main menu: Vector -> GBIF Occurrences -> Load GBIF Occurrences. Alternatively, you can use the plugin icon in the toolbar (circled in red in the image below).
Next, fill in the details for your search. At a minimum, fill in the scientific name of the species. Then click “Load occurrences” and the plugin will start fetching the records.
Optionally, you can fill in other fields to narrow the search. For example, you can restrict it to certain years or to certain kinds of observations.
When the download is finished, a new QGIS layer with the occurrences appears. The attribute table of this point vector layer contains all the details GBIF holds for each record, which makes it possible to make further selections or to check the downloaded data.
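To make the search fields more concrete, here is a minimal Python sketch of what an equivalent request to the public GBIF occurrence search API might look like. It is not the plugin's own code. The function name build_search_url is hypothetical, and the mapping of the plugin's form fields to the API parameters scientificName, year, basisOfRecord and hasCoordinate is an assumption. The sketch only builds the URL and does not send a request.

```python
from urllib.parse import urlencode

# Public GBIF occurrence search endpoint.
GBIF_SEARCH_URL = "https://api.gbif.org/v1/occurrence/search"


def build_search_url(scientific_name, year=None, basis_of_record=None,
                     limit=300, offset=0):
    """Build a GBIF occurrence search URL (hypothetical helper, not plugin code)."""
    params = {
        "scientificName": scientific_name,
        "hasCoordinate": "true",  # only records that can be mapped as points
        "limit": limit,
        "offset": offset,
    }
    if year is not None:
        # A single year ("2015") or a range ("2000,2020").
        params["year"] = year
    if basis_of_record is not None:
        # For example "HUMAN_OBSERVATION" or "PRESERVED_SPECIMEN".
        params["basisOfRecord"] = basis_of_record
    return GBIF_SEARCH_URL + "?" + urlencode(params)


url = build_search_url(
    "Ambrosia psilostachya",
    year="2000,2020",
    basis_of_record="HUMAN_OBSERVATION",
)
print(url)
```

Opening the printed URL in a browser should return a JSON page of matching records, which is a quick way to check a filter combination before loading it in QGIS.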
Note that the downloaded layer is a so-called temporary scratch layer. Such layers are held in memory only, meaning that they are not saved to disk and are discarded when QGIS is closed. To avoid data loss, save the downloaded layer as a permanent vector layer. You can do this in any vector format supported by QGIS, using any of the following methods:
- click the memory indicator icon next to the layer.
- right-click the layer name and, in the context menu, select the Make permanent entry.
- in the menu, select Layer ► Save As….
Each of these commands opens the Save Vector Layer as dialog, described in the Creating new layers from an existing layer section of the QGIS documentation, and the saved file replaces the temporary one in the Layers panel.
Currently, you can select only one item from each filter dropdown list. That is, it is not possible to select multiple items, nor to exclude certain items. So you can select all ‘human observations’, but you cannot filter all ‘human observations’ out. Also important to know: due to limitations of the GBIF API, searches are limited to 200,000 records. An alternative is to download the data directly from the GBIF website, which offers convenient and more advanced ways of filtering the data.
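The effect of the record limit can be illustrated with a small Python sketch of how a paged download could be planned. This is an assumption about how a client might page through results, not a description of the plugin's internals. The function name page_offsets is hypothetical, the page size of 300 is assumed to be the largest page the API returns, and the cap of 200,000 is the figure given above.

```python
def page_offsets(total_records, page_size=300, max_records=200_000):
    """Yield (offset, limit) pairs covering the records that can be reached.

    Hypothetical helper: page_size and max_records are assumptions,
    not values read from the plugin.
    """
    reachable = min(total_records, max_records)
    offset = 0
    while offset < reachable:
        # The last page may be shorter than page_size.
        yield offset, min(page_size, reachable - offset)
        offset += page_size


# A search with 1,000 matches needs four requests.
pages = list(page_offsets(1000))
print(pages)

# A search with 500,000 matches stops at the cap.
reachable_total = sum(limit for _, limit in page_offsets(500_000))
print(reachable_total)
```

For a species with more matching records than the cap, the remaining records are never requested, which is why a download from the GBIF website is the better route for very large result sets.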
The plugin offers a convenient way to download data quickly. However, it does not provide tools to help you check and clean the data. Errors in GBIF data, such as problematic geographic coordinates or duplicate records, are quite common and need to be dealt with. You can do this manually, of course, but that requires expert knowledge and is only feasible at small taxonomic or geographic scales. It is furthermore time-consuming and difficult to reproduce (Zizka et al. 2020; Jin and Yang 2020). One tool that helps you automate part of the detection of problems and the cleaning of the data is the CoordinateCleaner package for R (Zizka et al. 2019). After cleaning the data, you can export the cleaned data set as a GeoPackage for use in QGIS. See this tutorial for how to do this.
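To give an idea of what automated checks look like, here is a minimal Python sketch of three simple ones: missing or out-of-range coordinates, coordinates at exactly 0,0, and duplicate records. It is a simplified stand-in and not the CoordinateCleaner implementation, which runs in R and performs many more tests. The function name clean_occurrences and the sample records are hypothetical. The field names decimalLatitude, decimalLongitude, species and eventDate follow the Darwin Core terms used by GBIF.

```python
def clean_occurrences(records):
    """Drop records with unusable coordinates and exact duplicates.

    Hypothetical helper illustrating a few basic checks only.
    """
    seen = set()
    cleaned = []
    for rec in records:
        lat = rec.get("decimalLatitude")
        lon = rec.get("decimalLongitude")
        if lat is None or lon is None:
            continue  # no coordinates, cannot be mapped
        if not (-90 <= lat <= 90 and -180 <= lon <= 180):
            continue  # outside the valid coordinate range
        if lat == 0 and lon == 0:
            continue  # 0,0 is usually a placeholder, not a real location
        # Same species, place (to about 1 m) and date counts as a duplicate.
        key = (rec.get("species"), round(lat, 5), round(lon, 5),
               rec.get("eventDate"))
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(rec)
    return cleaned


# Made-up sample records for illustration only.
name = "Ambrosia psilostachya"
records = [
    {"species": name, "decimalLatitude": 35.1, "decimalLongitude": -101.9, "eventDate": "2015-07-01"},
    {"species": name, "decimalLatitude": 35.1, "decimalLongitude": -101.9, "eventDate": "2015-07-01"},
    {"species": name, "decimalLatitude": 0.0, "decimalLongitude": 0.0, "eventDate": "2016-05-12"},
    {"species": name, "decimalLatitude": 95.0, "decimalLongitude": 10.0, "eventDate": "2017-08-03"},
    {"species": name, "decimalLatitude": None, "decimalLongitude": 4.5, "eventDate": "2018-06-20"},
]
cleaned = clean_occurrences(records)
print(len(records), "records in,", len(cleaned), "records kept")
```

Checks like these catch only the most obvious problems. Records that fall in the sea, at country centroids or at the location of a museum need reference data, which is what a dedicated tool such as CoordinateCleaner provides.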
If you use the data for analysis and want to publish or share the results, it is important that your analyses are reproducible. This means it should be perfectly clear which data you have used. GBIF data are constantly updated, and there are many ways to filter them. So just mentioning that you downloaded the data from GBIF isn’t really enough.
A clear advantage of downloading the data directly from the GBIF website is that it automatically produces a Digital Object Identifier (DOI), thus providing a persistent link to the data download from GBIF.org, including information about the search date and filters used.
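When the data come from the plugin rather than from a website download, no DOI is created, so it helps to record the search date and filters yourself. The Python sketch below shows one possible way to keep such a record next to the saved layer. The function name make_provenance, the structure of the record and the example date are all hypothetical, and the doi field is left empty here because the sketch does not refer to a real download.

```python
import json
from datetime import date


def make_provenance(filters, retrieved_on, source="GBIF.org", doi=None):
    """Return a small, serialisable record of how the data were obtained.

    Hypothetical helper: the record layout is an illustration, not a standard.
    """
    return {
        "source": source,
        "retrieved_on": retrieved_on.isoformat(),
        # Sort the filters so the record is stable between runs.
        "filters": dict(sorted(filters.items())),
        "doi": doi,  # fill in when the data come from a GBIF website download
    }


record = make_provenance(
    {"scientificName": "Ambrosia psilostachya",
     "basisOfRecord": "HUMAN_OBSERVATION"},
    retrieved_on=date(2020, 1, 31),  # example date only
)
print(json.dumps(record, indent=2))
```

Saving this output as a small text file alongside the exported layer makes it possible to state later exactly which search produced the data.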