HarmonyPy Tutorial
Harmony Py Tutorial¶
This notebook shows a basic example of a Harmony job using a Harmony test Collection to perform a combination of both spatial and temporal subsetting.
First, we import a few things that will help us create a request and display images. We then import the Harmony Py classes we need to make a request.
import helper
# Install the project and 'examples' dependencies
helper.install_project_and_dependencies('..', libs=['examples'])
import sys
import datetime as dt
from IPython.display import display, JSON
import rasterio
import rasterio.plot
import netCDF4 as nc4
from matplotlib import pyplot as plt
import numpy as np
from harmony import BBox, WKT, Client, Collection, Request, Environment
Now we create a Harmony Client object, letting it pick up our credentials from a .netrc file.
harmony_client = Client(env=Environment.UAT)
Next, let's create a Collection object with the CMR collection id for the CMR collection we'd like to look at.
We then create a Request which specifies that collection, a spatial BBox describing the bounding box for the area we're interested in (we'll look at the BBox in other tutorials). In this case we're interested in looking at Alaska (and who wouldn't be?). We also include a date/time range to narrow down the data.
Because this data includes a lot of different variables, we limit it by passing in a list of variables we're interested in; in this test collection we'll look at the blue variable. We include a max_results parameter to limit the results to the first 10 images just to get a sample of what things look like.
Next, we include a coordinate reference system (CRS) indicating we'd like to reproject the data into the Arctic Polar Stereographic projection. We also specify that we'd like the output to be in the GeoTIFF format with a resolution of 512x512 pixels.
Finally we check if the request we've created is valid.
collection = Collection(id='C1234088182-EEDTEST')
request = Request(
collection=collection,
spatial=BBox(-165, 52, -140, 77),
temporal={
'start': dt.datetime(2010, 1, 1),
'stop': dt.datetime(2020, 12, 30)
},
variables=['blue_var'],
max_results=10,
crs='EPSG:3995',
format='image/tiff',
height=512,
width=512
)
request.is_valid()
True
Now that we have a request, we can submit it to Harmony using the Harmony Client object we created earlier. We'll get back an id for our request which we can use to find the job's status and get the results.
job_id = harmony_client.submit(request)
Let's see how it's going. This will show the percentage complete in the progress field. (We use the JSON helper function to show the results in a nicer-to-look-at format).
JSON(harmony_client.status(job_id))
<IPython.core.display.JSON object>
Let's download the results to our system temp directory, overwriting files if they already exist. This returns us a list of Future objects. Each of these "stand in" for a file in our set of results. We can ask a Future for its result and when it's available, it will return the filename to us.
results = harmony_client.download_all(job_id, directory='/tmp', overwrite=True)
Allright, now let's show some colorful Alaska images! Here we iterate over the results, asking each Future for its result, and then using rasterio to open the file and display the image.
for r in results:
rasterio.plot.show(rasterio.open(r.result()))
/tmp/4862438_2020_01_01_7f00ff_global_blue_var_regridded_subsetted.tif
/tmp/4862439_2020_01_01_7f00ff_global_blue_var_regridded_subsetted.tif
/tmp/4862440_2020_01_02_3200ff_global_blue_var_regridded_subsetted.tif
/tmp/4862441_2020_01_03_0019ff_global_blue_var_regridded_subsetted.tif
/tmp/4862442_2020_01_04_0065ff_global_blue_var_regridded_subsetted.tif
/tmp/4862445_2020_01_07_00ffb2_global_blue_var_regridded_subsetted.tif
/tmp/4862443_2020_01_05_00b2ff_global_blue_var_regridded_subsetted.tif
/tmp/4862444_2020_01_06_00feff_global_blue_var_regridded_subsetted.tif
/tmp/4862446_2020_01_08_00ff66_global_blue_var_regridded_subsetted.tif
/tmp/4862447_2020_01_09_00ff19_global_blue_var_regridded_subsetted.tif
We can also get a URL corresponding to our request that we can use in a browser. Note: This will not work if the request includes a shapefile.
url = harmony_client.request_as_url(request)
print(url)
https://harmony.uat.earthdata.nasa.gov/C1234088182-EEDTEST/ogc-api-coverages/1.0.0/collections/parameter_vars/coverage/rangeset?forceAsync=true&subset=lat%2852%3A77%29&subset=lon%28-165%3A-140%29&subset=time%28%222010-01-01T00%3A00%3A00%22%3A%222020-12-30T00%3A00%3A00%22%29&outputcrs=EPSG%3A3995&width=512&height=512&format=image%2Ftiff&maxResults=10&variable=blue_var
Let's build another request. This time, we'll request a specific granule and grid using the UMM grid name from the CMR. You can find grids in the CMR at https://cmr.uat.earthdata.nasa.gov/search/grids.umm_json. Include a query parameter ?grid=LambertExample to list detailed information about a specific grid.
collection = Collection(id='C1233860183-EEDTEST')
request = Request(
collection=collection,
granule_id='G1233860486-EEDTEST',
interpolation='near',
grid='GEOS1x1test'
)
request.is_valid()
True
Submit the job and check on the status.
job_id = harmony_client.submit(request)
JSON(harmony_client.status(job_id))
<IPython.core.display.JSON object>
Download and plot the file using pyplot and numpy.
results = harmony_client.download_all(job_id, directory='/tmp', overwrite=True)
nc4_file=nc4.Dataset(list(results)[0].result())
arrays = []
for var in ['red_var', 'green_var', 'blue_var', 'alpha_var']:
ds = nc4_file.variables[var][0,:]
arrays.append(ds)
plt.imshow(np.dstack(arrays))
<matplotlib.image.AxesImage at 0x122b4db50>
For request that completes directly without creating a job, the submit call returns the harmony JSON response including the direct download links.
collection = Collection(id='C1233800302-EEDTEST')
request = Request(
collection=collection,
max_results=1,
variables=['all']
)
response = harmony_client.submit(request)
response
'eb073a98-a2b4-4ffb-91cf-2adf3456933e'
Download the result and show the file can be opened with NetCDF library.
results = harmony_client.download_all(response, directory='/tmp', overwrite=True)
file_names = [f.result() for f in results]
for filename in file_names:
if filename.endswith("nc"):
print(nc4.Dataset(filename))
<class 'netCDF4._netCDF4.Dataset'>
root group (NETCDF4 data model, file format HDF5):
GDAL_AREA_OR_POINT: Area
GDAL_HARMONY_HEX_COLOR: 7f00ff
GDAL_HARMONY_HUE: 0.75
Conventions: CF-1.5
GDAL: GDAL 3.2.0, released 2020/10/26
history: Fri May 27 16:34:47 2022: GDAL CreateCopy( out/nc/001_00_7f00ff_global.nc, ... )
dimensions(sizes): lon(3600), lat(1800)
variables(dimensions): |S1 crs(), float64 lat(lat), float64 lon(lon), uint8 red_var(lat, lon), uint8 green_var(lat, lon), uint8 blue_var(lat, lon), uint8 alpha_var(lat, lon)
groups:
Example of submitting a request with WKT spatial. The supported WKT geometry types are listed at: https://harmony-py.readthedocs.io/en/latest/api.html#harmony.client.WKT
collection = Collection(id='C1233800302-EEDTEST')
request = Request(
collection=collection,
spatial=WKT('POLYGON((-140 20, -50 20, -50 60, -140 60, -140 20))'),
granule_id=['C1233800302-EEDTEST'],
max_results=1,
temporal={
'start': dt.datetime(1980, 1, 1),
'stop': dt.datetime(2020, 12, 30)
},
variables=['blue_var'],
crs='EPSG:31975',
format='image/png'
)
request.is_valid()
True
response = harmony_client.submit(request)
response
'a29ae061-9c49-49e1-9997-cc04b9607dbe'
Show the result:
for filename in [f.result() for f in harmony_client.download_all(response, directory='/tmp', overwrite=True)]:
if filename.endswith("png"):
helper.show_result(filename)
/tmp/4862452_001_00_7f00ff_global_blue_var_regridded.nc.png
/Users/yliu10/.pyenv/versions/3.11.9/envs/harmony-py/lib/python3.11/site-packages/rasterio/__init__.py:304: NotGeoreferencedWarning: Dataset has no geotransform, gcps, or rpcs. The identity matrix will be returned. dataset = DatasetReader(path, driver=driver, sharing=sharing, **kwargs)