Overview
Teaching: 5 min Exercises: 10 min
Questions
How do I aggregate a time series of raster data over a time period?
How do I summarize data by vector regions?
How do I export tabular data summaries?
Objectives
Use reducers to aggregate a daily image collection to annual values
Import vector data to summarize values by polygon regions
Use climate data products available through GEE
Export tabular data
Link to a static version of the full script used in this module: https://code.earthengine.google.com/269a0d4a6b9854e6f81ac87187a72559
In Google Earth Engine (GEE), reducers are used to aggregate data over time, space, and other data structures. They belong to the ee.Reducer class and include summary statistics, histograms, and linear regression, among others. Here’s a visual from Google demonstrating a reducer applied to an ImageCollection:
[Figure: a reducer applied to an ImageCollection]
Reductions can also occur in space, over bands within an image, or over the attributes of a FeatureCollection. See the Reducer Overview in the Google Developer’s Guide for more information.
In Episode 3: Accessing Satellite Imagery, we used a vector boundary and date range to filter an image collection, mapped an algorithm (NDVI) over that collection, and then reduced that collection to one image in which each pixel value was its maximum NDVI. Here we follow the same workflow, but instead reduce using imageCollection.sum() to calculate total annual precipitation for each pixel in the US (a temporal reduction). We then take it a step further and use the spatial reducer ‘reduceRegions’ to summarize that annual precipitation for each US county.
Here, we will demonstrate a temporal reducer and a spatial reducer by obtaining data on annual precipitation by US county.
A secondary objective of this exercise is to use GEE to access common datasets stored in the data archive that may appeal to those not directly interested in remote sensing applications. As described in the Introduction, GEE has co-located a number of datasets relevant to earth systems analyses. The full archive can be browsed in the Earth Engine Data Catalog. In this exercise, we will use the GRIDMET Meteorological Dataset to obtain precipitation. Briefly, GRIDMET blends PRISM and NLDAS to produce a daily, 4 km gridded climate dataset for the contiguous United States from 1979 to present.
As discussed in Accessing Satellite Imagery, an ImageCollection is a stack or time series of images. Reducers are used to derive a single Image based on the ImageCollection. Operations occur on a per-pixel basis. We will follow this workflow:
ImageCollection
First, we need to identify the ImageCollection ID for the GRIDMET data product and the band name for the precipitation data (and check any relevant metadata). You can find this either in the data catalog or directly in the GEE Code Editor at the top above the center panel.
From the GRIDMET description, we know the ImageCollection ID is ‘IDAHO_EPSCOR/GRIDMET’ and the precipitation band name is ‘pr’. We will select only this band.
By printing the resulting collection to the Console, we can see we’ve accessed 365 images, each with 1 band named ‘pr’.
The imageCollection.reduce() operator allows you to apply any reducer of class ee.Reducer to all images in the collection. If your ImageCollection has multiple bands, the reducer is applied separately to each band (unless the reducer uses multiple bands as inputs, in which case the number of bands in the image collection must match the number of inputs required by the reducer). You can find available reducers and their descriptions in the searchable API reference under the Docs tab in the upper left panel of the code editor.
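To make the per-pixel behavior concrete, here is a minimal sketch in plain JavaScript. It is not Earth Engine code and does not use the ee API: the tiny grids, their values, and the helper name reduceStack are all assumptions for illustration. It shows a stack of "images" collapsing to a single image by applying one reducer function to the values at each pixel position.

```javascript
// Plain JavaScript, NOT the Earth Engine API.
// Each "image" is a 2 x 2 grid of hypothetical daily precipitation values (mm).
const dailyImages = [
  [[1, 0], [2, 5]],
  [[0, 3], [2, 0]],
  [[4, 1], [0, 0]],
];

// Collapse a stack of equally sized grids into one grid by applying
// `reducer` to the list of values found at each pixel position.
function reduceStack(images, reducer) {
  const rows = images[0].length;
  const cols = images[0][0].length;
  const out = [];
  for (let r = 0; r < rows; r++) {
    const row = [];
    for (let c = 0; c < cols; c++) {
      row.push(reducer(images.map((img) => img[r][c])));
    }
    out.push(row);
  }
  return out;
}

// Two simple reducers: a per-pixel sum and a per-pixel maximum.
const sum = (values) => values.reduce((a, b) => a + b, 0);
const max = (values) => Math.max(...values);

const totalPrecip = reduceStack(dailyImages, sum);
const wettestDay = reduceStack(dailyImages, max);

console.log(totalPrecip); // [ [ 5, 4 ], [ 4, 5 ] ]
console.log(wettestDay);  // [ [ 4, 3 ], [ 2, 5 ] ]
```

Swapping the reducer changes the statistic without changing the traversal, which is the same separation that reduce() and ee.Reducer provide on the server side.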
Some commonly used reducers have shortcut syntax, such as imageCollection.mean(), imageCollection.min(), and conveniently, imageCollection.sum(). Both syntaxes are demonstrated in the following code chunk.
By printing the resulting image to the Console, we can see we now have 1 image with 1 band named ‘pr_sum’. Here’s what it looks like:
Now let’s take the image of annual precipitation we just created and get the mean annual precipitation by county in the United States. To get image statistics for multiple regions, we can use an image.reduceRegions() call. We will use a FeatureCollection to store our vector dataset of counties. Note that there is also an image.reduceRegion() operator if you want to summarize one polygon region only. The result of the reduceRegions() operation is added to the properties of each feature in the FeatureCollection.
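The idea of a regional reduction can be sketched in plain JavaScript. This is not Earth Engine code: the grids, the region labels, and the helper name meanByRegion are hypothetical, and real regions are polygons rather than a label grid. The sketch averages the pixel values that fall in each region and returns copies of the features with a new "mean" property, mirroring how the result is attached to each feature.

```javascript
// Plain JavaScript, NOT the Earth Engine API.
// Hypothetical annual precipitation totals (mm) on a 2 x 3 grid.
const annualPrecip = [
  [100, 200, 400],
  [300, 500, 600],
];
// Same-shaped grid naming the (hypothetical) region that covers each pixel.
const regionOfPixel = [
  ['A', 'A', 'B'],
  ['A', 'B', 'B'],
];
// Features carry an id plus a bag of attribute properties.
const regions = [
  { id: 'A', properties: { name: 'County A' } },
  { id: 'B', properties: { name: 'County B' } },
];

// Average the pixels in each region and return new features whose
// properties include the result under the key "mean".
function meanByRegion(image, zones, features) {
  const totals = new Map(); // region id -> { sum, count }
  image.forEach((row, r) => {
    row.forEach((value, c) => {
      const id = zones[r][c];
      const entry = totals.get(id) || { sum: 0, count: 0 };
      entry.sum += value;
      entry.count += 1;
      totals.set(id, entry);
    });
  });
  return features.map((feature) => {
    const entry = totals.get(feature.id);
    const mean = entry ? entry.sum / entry.count : null; // null if no pixels
    return { ...feature, properties: { ...feature.properties, mean } };
  });
}

const regionPrecip = meanByRegion(annualPrecip, regionOfPixel, regions);
console.log(regionPrecip.map((f) => [f.properties.name, f.properties.mean]));
// [ [ 'County A', 200 ], [ 'County B', 500 ] ]
```

The input features are left unchanged. Each output feature has one more property than its input, matching the extra "mean" column the lesson observes after reducing by region.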
An important note on the scale parameter
GEE uses lazy code evaluation that only executes parts of your script needed for results. In the case of the JavaScript API code editor environment, that means things needed to fulfill print statements, map visualizations, or export tasks. GEE will run your computations at the resolution of your current map view in the code editor unless you tell it otherwise. Whenever possible, explicitly set the scale argument to force GEE to work at a scale that makes sense for your imagery/analysis. Read the modifiable areal unit problem wiki or the Developers Docs to see why this matters.
There are several ways to obtain vector data in GEE as discussed in 03 Accessing Satellite Imagery. Here, we will use an existing public feature collection from the US Census Bureau.
This dataset includes entities outside of the contiguous US such as Alaska, Puerto Rico, and American Samoa. We will remove these based on their unique IDs in a property attribute containing “state” FIPS codes to demonstrate vector filtering.
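The filtering logic can be sketched in plain JavaScript. This is not Earth Engine code: the property name STATEFP and the sample records are assumptions for illustration, so check the actual property name in the dataset you load. The FIPS codes listed are those for states and territories outside the contiguous US.

```javascript
// Plain JavaScript, NOT the Earth Engine API.
// State FIPS codes outside the contiguous US: Alaska (02), Hawaii (15),
// American Samoa (60), Guam (66), Northern Mariana Islands (69),
// Puerto Rico (72), US Virgin Islands (78).
const nonConusFips = ['02', '15', '60', '66', '69', '72', '78'];

// Hypothetical county records; STATEFP is an assumed property name.
const sampleCounties = [
  { NAME: 'Douglas', STATEFP: '20' },   // Kansas
  { NAME: 'Anchorage', STATEFP: '02' }, // Alaska
  { NAME: 'San Juan', STATEFP: '72' },  // Puerto Rico
  { NAME: 'Ingham', STATEFP: '26' },    // Michigan
];

// Keep only counties whose state code is NOT in the exclusion list.
const excluded = new Set(nonConusFips);
const conusCounties = sampleCounties.filter((c) => !excluded.has(c.STATEFP));

console.log(conusCounties.map((c) => c.NAME)); // [ 'Douglas', 'Ingham' ]
```

FIPS codes are compared as strings here because they carry leading zeros: a numeric 2 would not match '02'.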
By printing the county FeatureCollection, we see there are 3108 county polygons and 11 columns of attribute data.
By printing the countyPrecip FeatureCollection, we see there are 3108 county polygons and now 12 columns of attribute data, with the addition of the “mean” column.
GEE can export tables in CSV (default), GeoJSON, KML, or KMZ. Here, we do a little formatting to prepare our FeatureCollection for export as a CSV.
Formatting includes:
Note on the folder name: If this folder exists within your Google Drive, GEE will find it and export here regardless of the full file path for the folder. If the folder doesn’t exist, GEE will create it upon export.
In order to actually export your data, you have to explicitly hit the “Run” button under the “Tasks” tab in the upper right panel of the code editor. It should take 20-30 seconds to export, depending on GEE user loads.
A helpful feature: hold your mouse over the right side of the completed task and click on the question mark to open a window with details on the task, as in the diagram below.
Key Points
GEE hosts a wide variety of useful spatial datasets
Reducers aggregate or summarize data in space and time
There are several ways to use vector data in GEE
Results can be exported to Google Drive or Google Cloud