API Documentation
Here we cover the package and its modules, focusing first on the classes normally imported when working with Harmony.
Top-level Package API
These are the classes in the harmony package used for crafting a request, submitting it to Harmony, and getting the results.
- class harmony.Request(collection: Collection, *, spatial: BBox = None, temporal: Mapping[str, datetime] = None, dimensions: List[Dimension] = None, extend: List[str] = None, crs: str = None, destination_url: str = None, format: str = None, granule_id: List[str] = None, granule_name: List[str] = None, height: int = None, interpolation: str = None, max_results: int = None, scale_extent: List[float] = None, scale_size: List[float] = None, shape: Tuple[IO, str] | None = None, variables: List[str] = ['all'], width: int = None, concatenate: bool = None, skip_preview: bool = None, ignore_errors: bool = None, grid: str = None)
A Harmony request with the CMR collection and various parameters expressing how the data is to be transformed.
- Parameters:
collection – The CMR collection that should be queried
spatial – Bounding box spatial constraints on the data
temporal – Date/time constraints on the data provided as a dict mapping “start” and “stop” keys to corresponding start/stop datetime.datetime objects
dimensions – A list of dimensions to use for subsetting the data
extend – A list of dimensions to extend
crs – reproject the output coverage to the given CRS. Recognizes CRS types that can be inferred by gdal, including EPSG codes, Proj4 strings, and OGC URLs (http://www.opengis.net/def/crs/…)
interpolation – specify the interpolation method used during reprojection and scaling
scale_extent – scale the resulting coverage along an axis to a given extent
scale_size – scale the resulting coverage along an axis to a given size
shape – a file path to an ESRI Shapefile zip, GeoJSON file, or KML file to use for spatial subsetting. Note: not all collections support shapefile subsetting
granule_id – The CMR Granule ID for the granule which should be retrieved
granule_name – The granule UR or provider ID for the granule(s) to be retrieved. The wildcards * (multi-character match) and ? (single-character match) are supported
width – number of columns to return in the output coverage
height – number of rows to return in the output coverage
format – the output mime type to return
max_results – limits the number of input granules processed in the request
concatenate – Whether to invoke a service that supports concatenation
skip_preview – Whether Harmony should skip auto-pausing and generating a preview for large jobs
ignore_errors – if “true”, continue processing a request to completion even if some items fail
destination_url – Destination URL specified by the client (only S3 is supported, e.g. s3://my-bucket-name/mypath)
grid – The name of the output grid to use for regridding requests. The name must match the UMM grid name in the CMR.
- Returns:
A Harmony Transformation Request instance
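The parameters above can be combined as in the following minimal sketch. The helper names `temporal_range` and `build_request` are hypothetical, the start-before-stop check is this sketch's own assumption rather than documented library behavior, and any collection ID passed in is the caller's choice. The harmony import sits inside the function so the pure helper can be used without harmony-py installed.

```python
from datetime import datetime
from typing import Dict


def temporal_range(start: datetime, stop: datetime) -> Dict[str, datetime]:
    """Build the mapping that the temporal parameter expects."""
    # Assumption of this sketch: reject ranges that are empty or reversed.
    if start >= stop:
        raise ValueError("start must be earlier than stop")
    return {"start": start, "stop": stop}


def build_request(collection_id: str):
    """Build a Request constrained in space and time (hypothetical helper)."""
    # Imported lazily so temporal_range works without harmony-py installed.
    from harmony import BBox, Collection, Request

    return Request(
        collection=Collection(id=collection_id),
        spatial=BBox(-130, 30, -100, 60),  # west, south, east, north
        temporal=temporal_range(datetime(2020, 1, 1), datetime(2020, 1, 2)),
        max_results=5,  # limit the number of input granules processed
    )
```

After building a request this way, calling `is_valid()` and, if needed, `error_messages()` on it shows whether the parameters were accepted before anything is submitted.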
- class harmony.CapabilitiesRequest(**request_params)
A Harmony request to get the Harmony capabilities of a CMR collection.
- Parameters:
request_params – Keyword arguments with optional collection_id, short_name and capabilities_version fields
collection_id – The CMR collection ID that should be queried
short_name – The CMR collection shortName that should be queried
capabilities_version – The version of the collection capabilities request API
- Returns:
A Harmony Capability Request instance
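As a hedged sketch of how the keyword arguments might be assembled: `capabilities_params` and `fetch_capabilities` are hypothetical helper names, and dropping unset values before constructing the request is a choice made here, not something the library requires.

```python
from typing import Dict, Optional


def capabilities_params(collection_id: Optional[str] = None,
                        short_name: Optional[str] = None,
                        capabilities_version: Optional[str] = None) -> Dict[str, str]:
    """Keep only the keyword arguments that were actually supplied."""
    candidates = {
        "collection_id": collection_id,
        "short_name": short_name,
        "capabilities_version": capabilities_version,
    }
    return {key: value for key, value in candidates.items() if value is not None}


def fetch_capabilities(client, **kwargs):
    """Submit a CapabilitiesRequest through an existing harmony Client."""
    # Imported lazily so capabilities_params works without harmony-py installed.
    from harmony import CapabilitiesRequest

    return client.submit(CapabilitiesRequest(**capabilities_params(**kwargs)))
```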
- class harmony.Client(*, auth: Tuple[str, str] | None = None, should_validate_auth: bool = True, env: Environment = Environment.PROD, token: str = None, check_interval: float = 3.0)
A Harmony client object which can be used to submit requests to Harmony.
Examples:
With no arguments:
>>> client = Client()
will create a Harmony client that will either use the EDL_USERNAME & EDL_PASSWORD environment variables to authenticate with Earthdata Login, or will use the credentials in the user’s .netrc file, if one is available.
To explicitly include the user’s credentials:
>>> client = Client(auth=('rfeynman', 'quantumf1eld5'))
You can also create a Harmony client using the user’s EDL token:
>>> client = Client(token='myEDLTokenValue')
By default, the Client will validate the provided credentials immediately. This can be disabled by passing should_validate_auth=False.
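The construction options above can be gathered into one place, as in this sketch. `client_kwargs` and `make_client` are hypothetical helpers, and preferring a token over a username/password pair when both are given is an assumption of the sketch, not documented Client behavior.

```python
from typing import Any, Dict, Optional


def client_kwargs(token: Optional[str] = None,
                  username: Optional[str] = None,
                  password: Optional[str] = None) -> Dict[str, Any]:
    """Choose which documented Client keyword to pass."""
    if token:
        return {"token": token}
    if username and password:
        return {"auth": (username, password)}
    # Nothing explicit: Client() falls back to environment variables or .netrc.
    return {}


def make_client(token: Optional[str] = None,
                username: Optional[str] = None,
                password: Optional[str] = None,
                validate: bool = True):
    """Create a Client, optionally skipping immediate credential validation."""
    # Imported lazily so client_kwargs works without harmony-py installed.
    from harmony import Client

    return Client(should_validate_auth=validate,
                  **client_kwargs(token, username, password))
```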
When creating a request, the BBox and Collection classes are used to create a valid request.
- class harmony.BBox(w: float, s: float, e: float, n: float)
A bounding box specified by western & eastern longitude, southern & northern latitude constraints in degrees.
Example
An area bounded by latitudes 30N and 60N and longitudes 130W and 100W:
>>> spatial = BBox(-130, 30, -100, 60)
Important: When specified positionally, the parameters must be given in order: west, south, east, north.
Alternatively, one can explicitly set each bound using its single-letter name:
>>> spatial = BBox(n=60, s=30, e=-100, w=-130)
Print a readable representation of the spatial bounds:
>>> print(spatial)
BBox: West:-130, South:30, East:-100, North:60
- Parameters:
w – The western longitude bounds (degrees)
s – The southern latitude bounds (degrees)
e – The eastern longitude bounds (degrees)
n – The northern latitude bounds (degrees)
- Returns:
A BBox instance with the provided bounds.
- class harmony.Collection(id: str)
The identity of a CMR Collection.
Authenticating with Earthdata Login
HarmonyPy requires that you have a valid Earthdata Login account. There are four ways to use your EDL account with HarmonyPy:
1. Provide your EDL token when creating a HarmonyPy Client:
harmony_client = Client(token='myEDLTokenValue')
2. Provide your credentials when creating a HarmonyPy Client:
harmony_client = Client(auth=('captainmarvel', 'marve10u5'))
3. Set your credentials using environment variables:
$ export EDL_USERNAME='captainmarvel'
$ export EDL_PASSWORD='marve10u5'
4. Use a .netrc file. Create a .netrc file in your home directory, using the example below:
machine urs.earthdata.nasa.gov login captainmarvel password marve10u5
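To check that a .netrc file contains a usable Earthdata Login entry, the standard-library `netrc` module can parse it. This is a minimal sketch: `read_edl_credentials` and `write_sample_netrc` are hypothetical helpers, and the sample file is written to a caller-supplied directory rather than the real home directory so that nothing is overwritten.

```python
import netrc
import os
import tempfile
from typing import Optional, Tuple

EDL_HOST = "urs.earthdata.nasa.gov"


def read_edl_credentials(netrc_path: str,
                         host: str = EDL_HOST) -> Optional[Tuple[str, str]]:
    """Return (login, password) for the host, or None if there is no entry."""
    entry = netrc.netrc(netrc_path).authenticators(host)
    if entry is None:
        return None
    login, _account, password = entry
    return login, password


def write_sample_netrc(directory: str) -> str:
    """Write the example entry from this section to a throwaway file."""
    path = os.path.join(directory, "sample.netrc")
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(f"machine {EDL_HOST} login captainmarvel password marve10u5\n")
    return path
```

For example, `read_edl_credentials(write_sample_netrc(tempfile.mkdtemp()))` parses the sample entry back out of the file.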
Exceptions
Exceptions that may be raised when authenticating with Earthdata Login.
- exception harmony.auth.MalformedCredentials
The provided Earthdata Login credentials were not correctly specified.
- exception harmony.auth.BadAuthentication
The provided Earthdata Login credentials were invalid.
Developer Documentation
Here we show the full API documentation. This will most often be used when developing on the HarmonyPy package, and will not likely be needed if you are using HarmonyPy to make requests.
Submodules
harmony.auth module
Earthdata Login Authorization extensions to the requests package.
This module defines two functions that enable seamless integration between the requests module and NASA Earthdata Login. The create_session function constructs a requests.Session that will correctly handle the OAuth redirect ‘dance’ that is necessary to authenticate a user. The validate_auth function checks that the authentication credentials are valid, and can be used before attempting to download data, for example.
The SessionWithHeaderRedirection class, a requests.Session subclass, is used to perform authentication with Earthdata Login. The create_session function uses this class, and clients of the Harmony Py package do not need to use it explicitly.
- exception harmony.auth.BadAuthentication
Bases:
Exception
The provided Earthdata Login credentials were invalid.
- exception harmony.auth.MalformedCredentials
Bases:
Exception
The provided Earthdata Login credentials were not correctly specified.
- class harmony.auth.SessionWithHeaderRedirection(auth: Tuple[str, str] | None = None, token: str = None)
Bases:
Session
A requests.Session that modifies HTTP Authorization headers in accordance with Earthdata Login (EDL) common usage.
Example:
session = SessionWithHeaderRedirection(auth=(username, password))
- Parameters:
auth – A tuple of the form (‘edl_username’, ‘edl_password’)
- rebuild_auth(prepared_request: PreparedRequest, response: Response) None
Override Session.rebuild_auth. Strips the Authorization header if neither original URL nor redirected URL belong to an Earthdata Login (EDL) host. Also allows the default requests behavior of searching for relevant .netrc credentials if and only if a username and password weren’t provided during object instantiation.
- Parameters:
prepared_request – Object for the redirection destination.
response – Object for where we just came from.
- harmony.auth.create_session(config: Config, auth: Tuple[str, str] = None, token: str = None) Session
Creates a configured requests session.
Attempts to create an authenticated session in the following order:
1. If auth is a tuple of (username, password), create a session.
2. Attempt to read a username and password from environment variables, either from the system or from a .env file, to return a session.
3. Return a session that attempts to read credentials from a .netrc file.
- Parameters:
config – Configuration object with EDL and authentication context
auth – A tuple of the form (‘edl_username’, ‘edl_password’)
- Returns:
The authenticated requests session.
- Raises:
MalformedCredentials – auth credential not in the correct format.
BadAuthentication – Incorrect credentials or unknown error.
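The fallback order described for create_session can be illustrated with a small stand-alone function. This is a sketch of the documented order only, not the library's implementation: `credential_source` is a hypothetical name, it does not read .env files, and it ignores the token argument.

```python
import os
from typing import Mapping, Optional, Tuple


def credential_source(auth: Optional[Tuple[str, str]] = None,
                      environ: Optional[Mapping[str, str]] = None) -> str:
    """Report which credential source would be tried first."""
    env = os.environ if environ is None else environ
    # 1. An explicit (username, password) tuple wins.
    if isinstance(auth, tuple) and len(auth) == 2:
        return "auth"
    # 2. Otherwise look for both EDL environment variables.
    if env.get("EDL_USERNAME") and env.get("EDL_PASSWORD"):
        return "environment"
    # 3. Otherwise fall back to whatever .netrc provides.
    return "netrc"
```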
harmony.config module
Provides a Config class for conveniently specifying the environment for Harmony Py.
The Config class can be instantiated without parameters and will default to the Harmony production environment. To create a configuration for the testing (UAT) environment, for example:
cfg = Config(Environment.UAT)
This configuration object can then be passed as an argument when creating the harmony.Client.
- class harmony.config.Config(environment: Environment = Environment.PROD, localhost_port: int = 3000)
Bases:
object
Runtime configuration variables including defaults and environment vars.
Example:
>>> cfg = Config()
>>> cfg.foo
'bar'
- Parameters:
None –
- config = {'DOWNLOAD_CHUNK_SIZE': '4194304', 'NUM_REQUESTS_WORKERS': '3'}
- property edl_validation_url
Returns the full URL to a Harmony endpoint used to validate the user’s Earthdata Login credentials for this Config’s Environment.
- property harmony_hostname
Returns the hostname for this Config object’s Environment.
- property root_url: str
- property url_scheme: str
harmony.harmony module
This module defines the main classes used to interact with Harmony.
The classes defined here are also available by importing them from the top-level harmony package, e.g.:
from harmony import Client, Request
Overview of the classes:
Collection: A CMR Collection ID
BBox: A bounding box (lat/lon) used in Requests
Request: A complete Harmony request with all criteria
Client: Allows submission of a Harmony Request and getting results
- class harmony.harmony.BBox(w: float, s: float, e: float, n: float)
Bases:
NamedTuple
A bounding box specified by western & eastern longitude, southern & northern latitude constraints in degrees.
Example
An area bounded by latitudes 30N and 60N and longitudes 130W and 100W:
>>> spatial = BBox(-130, 30, -100, 60)
Important: When specified positionally, the parameters must be given in order: west, south, east, north.
Alternatively, one can explicitly set each bound using its single-letter name:
>>> spatial = BBox(n=60, s=30, e=-100, w=-130)
Print a readable representation of the spatial bounds:
>>> print(spatial)
BBox: West:-130, South:30, East:-100, North:60
- Parameters:
w – The western longitude bounds (degrees)
s – The southern latitude bounds (degrees)
e – The eastern longitude bounds (degrees)
n – The northern latitude bounds (degrees)
- Returns:
A BBox instance with the provided bounds.
- e: float
Alias for field number 2
- n: float
Alias for field number 3
- s: float
Alias for field number 1
- w: float
Alias for field number 0
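Because BBox derives from NamedTuple, its fields can be read by name, by index, or by unpacking. The sketch below uses a stand-in class, `BoundsSketch` (a hypothetical name), that mirrors the documented field order so that it runs without harmony-py installed. It is not the library's class and does not reproduce its printed representation.

```python
from typing import NamedTuple


class BoundsSketch(NamedTuple):
    """Stand-in with the same field order as the documented BBox."""
    w: float  # field number 0
    s: float  # field number 1
    e: float  # field number 2
    n: float  # field number 3


# Positional arguments must follow west, south, east, north.
positional = BoundsSketch(-130, 30, -100, 60)
# Keyword arguments may be given in any order.
by_name = BoundsSketch(n=60, s=30, e=-100, w=-130)
# Tuple unpacking follows the field order.
west, south, east, north = positional
```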
- class harmony.harmony.BaseRequest(*, collection: Collection)
Bases:
object
A Harmony base request with the CMR collection. It is the base class of all harmony requests.
- Parameters:
collection – The CMR collection that should be queried
- Returns:
A Harmony Request instance
- error_messages() List[str]
- is_valid() bool
Determines if the request and its parameters are valid.
- parameter_values() List[Tuple[str, Any]]
Returns tuples of each query parameter that has been set and its value.
- class harmony.harmony.CapabilitiesRequest(**request_params)
Bases:
BaseRequest
A Harmony request to get the Harmony capabilities of a CMR collection.
- Parameters:
request_params – Keyword arguments with optional collection_id, short_name and capabilities_version fields
collection_id – The CMR collection ID that should be queried
short_name – The CMR collection shortName that should be queried
capabilities_version – The version of the collection capabilities request API
- Returns:
A Harmony Capability Request instance
- error_messages() List[str]
A list of error messages, if any, for the request.
- class harmony.harmony.Client(*, auth: Tuple[str, str] | None = None, should_validate_auth: bool = True, env: Environment = Environment.PROD, token: str = None, check_interval: float = 3.0)
Bases:
object
A Harmony client object which can be used to submit requests to Harmony.
Examples:
With no arguments:
>>> client = Client()
will create a Harmony client that will either use the EDL_USERNAME & EDL_PASSWORD environment variables to authenticate with Earthdata Login, or will use the credentials in the user’s .netrc file, if one is available.
To explicitly include the user’s credentials:
>>> client = Client(auth=('rfeynman', 'quantumf1eld5'))
You can also create a Harmony client using the user’s EDL token:
>>> client = Client(token='myEDLTokenValue')
By default, the Client will validate the provided credentials immediately. This can be disabled by passing should_validate_auth=False.
- aws_credentials() dict
Retrieve temporary AWS credentials for retrieving data in S3.
- Returns:
A python dict containing aws_access_key_id, aws_secret_access_key, and aws_session_token.
- Raises:
Exception – Can raise when e.g. server is unreachable.
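One way to use the returned dict is to map its keys onto the standard AWS environment variable names, which many AWS tools read. This is a sketch under that assumption, and `aws_environment` is a hypothetical helper.

```python
from typing import Dict, Mapping


def aws_environment(credentials: Mapping[str, str]) -> Dict[str, str]:
    """Translate the documented dict keys to AWS environment variable names."""
    return {
        "AWS_ACCESS_KEY_ID": credentials["aws_access_key_id"],
        "AWS_SECRET_ACCESS_KEY": credentials["aws_secret_access_key"],
        "AWS_SESSION_TOKEN": credentials["aws_session_token"],
    }
```

With a real client this could be applied as `os.environ.update(aws_environment(client.aws_credentials()))`.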
- download(url: str, directory: str = '', overwrite: bool = False) Future
Downloads data and saves it to a file asynchronously.
- Parameters:
url – The location (URL) of the file to be downloaded
directory – Optional. If specified, saves files there. Saves files to the current working directory by default.
overwrite – If True, will overwrite a local file that shares a filename with the downloaded file. Defaults to False. If you're seeing malformed data or truncated files from incomplete downloads, set overwrite to True.
- Returns:
A Future that resolves to the full path to the file.
- download_all(job_id_or_result_json: str | dict, directory: str = '', overwrite: bool = False) Generator[Future, None, None]
Using a job_id, fetches all the data files from a finished job.
After this method is able to contact Harmony and query a finished job, it will immediately return with a list of python concurrent.Futures corresponding to each of the files to be downloaded. Call the result() method to block until the downloading of that file is complete. When finished, the Future will return the filename.
Files are downloaded by an executor backed by a thread pool. Number of threads in the thread pool can be specified with the environment variable NUM_REQUESTS_WORKERS.
Performance should be close to native with an appropriate chunk size. This can be changed via environment variable DOWNLOAD_CHUNK_SIZE.
Filenames are automatically determined by using the latter portion of the provided URL.
Will wait for an unfinished job to finish before downloading.
- Parameters:
job_id_or_result_json – UUID string for the job you wish to interrogate, or the job's result JSON as a dict.
directory – Optional. If specified, saves files there. Saves files to the current working directory by default.
overwrite – If True, will overwrite a local file that shares a filename with the downloaded file. Defaults to False. If you're seeing malformed data or truncated files from incomplete downloads, set overwrite to True.
- Returns:
A list of Futures, each of which will return the filename (with path) for each result.
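Since each returned Future resolves to a filename, a caller usually wants to wait on all of them and separate successes from failures. A minimal sketch follows. `collect_downloads` and `download_job` are hypothetical helpers, and `client` is assumed to be an already constructed harmony Client.

```python
from concurrent.futures import Future
from typing import Iterable, List, Tuple


def collect_downloads(futures: Iterable[Future]) -> Tuple[List[str], List[Exception]]:
    """Block on each Future, gathering file paths and any raised errors."""
    paths: List[str] = []
    errors: List[Exception] = []
    for future in futures:
        try:
            paths.append(future.result())  # blocks until this file is done
        except Exception as error:  # keep going so other files still finish
            errors.append(error)
    return paths, errors


def download_job(client, job_id: str, directory: str = ""):
    """Download every output file of a job and wait for all of them."""
    futures = client.download_all(job_id, directory=directory, overwrite=False)
    return collect_downloads(futures)
```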
- get_download_filename_from_url(url: str) str
For a given URL, returns the filename that will be used for download. It will include a Harmony generated ID prefix if the data is staged.
- Parameters:
url – The location (URL) of the file to be downloaded
- Returns:
The filename that will be used to name the downloaded file.
- iterator(job_id: str, directory: str = '', overwrite: bool = False) Iterator
Create an iterator that will poll for data in the background and download it as it is available and requested via next().
Each iteration returns a dictionary, or None when all granules have been iterated. The dictionary has the following form:
{
  'path': Future,
  'bbox': BBox object containing the bounding box for the granule,
  'temporal': {
    'start': '2020-01-11T14:00:00.000Z',
    'end': '2020-01-11T15:59:59.000Z'
  }
}
The Future resolves to the path to the downloaded file.
If the job is paused and all processed granules have already been returned in the iteration, the status returned will be GranuleStatus.PAUSED until the job is resumed.
If the job fails during iteration then calls to next will raise an exception. Note that this is not true if the job completed with errors, in which case specific granules may return errors, but no exception is raised. This allows the caller to retrieve any granules that were successfully processed.
Note: if a job gets stuck in the ‘running’ state, this iterator will happily wait forever re-checking the status page periodically.
- Parameters:
job_id – UUID string for the job you wish to interrogate.
directory – Optional. If specified, saves files there. Saves files to the current working directory by default.
overwrite – If True, will overwrite a local file that shares a filename with the downloaded file. Defaults to False. If you're seeing malformed data or truncated files from incomplete downloads, set overwrite to True.
- Returns:
An Iterator that can be used to iterate over the granule results from a job
- pause(job_id: str)
Pause a job.
- Parameters:
job_id – UUID string for the job you wish to pause.
- Raises:
Exception – This can happen if an invalid job_id is provided or Harmony services can’t be reached or the job cannot be paused (usually because it is already in a terminal state).
- progress(job_id: str) Tuple[int, str, str]
Retrieve a submitted job’s completion status in percent.
- Parameters:
job_id – UUID string for the job you wish to interrogate.
- Returns:
A tuple of the job’s processing progress as a percentage, the job’s processing state, and the job’s status message
- Raises:
Exception – This can happen if an invalid job_id is provided or Harmony services can’t be reached.
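The progress tuple lends itself to a simple polling loop. In this sketch, `poll_progress` is a hypothetical helper and the state names in `terminal_states` are assumptions, not values taken from this documentation. Adjust them to the states your Harmony deployment reports. The loop is bounded so that a job stuck in a running state cannot block forever.

```python
import time
from typing import Callable, Sequence, Tuple


def poll_progress(client,
                  job_id: str,
                  interval: float = 3.0,
                  terminal_states: Sequence[str] = ("successful", "failed", "canceled"),
                  max_polls: int = 200,
                  sleep: Callable[[float], None] = time.sleep) -> Tuple[int, str, str]:
    """Poll client.progress until the job state is terminal or polls run out."""
    for _ in range(max_polls):
        percent, state, message = client.progress(job_id)
        if state in terminal_states:  # assumed state names, see lead-in
            return percent, state, message
        sleep(interval)
    raise TimeoutError(f"job {job_id} was not terminal after {max_polls} polls")
```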
- read_text(url: str) str
Uses the harmony-py Client session to fetch a URL.
- Parameters:
url – A URL, such as one from stac_catalog_url().
- Returns:
The response text.
- Raises:
Exception – Can occur on malformed or unreachable URLs.
- request_as_curl(request: BaseRequest) str
Returns a curl command representation of the given request. Note: Authorization headers are masked to reduce the risk of accidental exposure, and cookies containing the string ‘token’ are removed from the curl command.
- Parameters:
request – The Request to build a curl command for
- Returns:
An equivalent curl command based on this client and request.
- request_as_url(request: BaseRequest) str
Returns a URL string representation of the given request. Note: headers and cookies are not included, just the URL. Shapefiles are not supported.
- Parameters:
request – The Request to build the URL string for
- Returns:
A URL string that can be pasted into a browser.
- Raises:
Exception – If a shapefile is included in the request.
- result_json(job_id: str, show_progress: bool = False, link_type: LinkType = LinkType.https) str
Retrieve a job’s final json output.
Harmony jobs’ output is built as the job is processed and this method fetches the complete and final output.
- Parameters:
job_id – UUID string for the job you wish to interrogate.
show_progress – Whether a progress bar should show via stdout.
link_type – The type of link to output, s3:// or https://
- Returns:
The job’s complete json output.
- result_urls(job_id: str, show_progress: bool = False, link_type: LinkType = LinkType.https) Generator[str, None, None]
Retrieve the data URLs for a job.
The URLs include links to all of the job’s data output. Blocks until the Harmony job is done processing.
- Parameters:
job_id – UUID string for the job you wish to interrogate.
show_progress – Whether a progress bar should show via stdout.
link_type – The type of link to output, s3:// or https://
- Returns:
The job’s complete list of data URLs.
- resume(job_id: str)
Resume a job.
- Parameters:
job_id – UUID string for the job you wish to resume.
- Raises:
Exception – This can happen if an invalid job_id is provided or Harmony services can’t be reached or the job cannot be resumed (usually because it is already in a terminal state).
- stac_catalog_url(job_id: str, show_progress: bool = False, link_type: LinkType = LinkType.https) str
Extract the STAC catalog URL from job results.
Blocks until the Harmony job is done processing.
- Parameters:
job_id – UUID string for the job you wish to interrogate.
show_progress – Whether a progress bar should show via stdout.
link_type – The type of link to output, s3:// or https://
- Returns:
A STAC catalog URL.
- Raises:
Exception – This can happen if an invalid job_id is provided or Harmony services can’t be reached.
- status(job_id: str) dict
Retrieve a submitted job’s metadata from Harmony.
- Parameters:
job_id – UUID string for the job you wish to interrogate.
- Returns:
A dict of metadata.
- Raises:
Exception – This can happen if an invalid job_id is provided or Harmony services can’t be reached.
- submit(request: BaseRequest) any
Submits a request to Harmony and returns the Harmony Job ID.
- Parameters:
request – The Request to submit to Harmony (will be validated before sending)
- Returns:
The Harmony Job ID for requests done through async jobs
The JSON response for direct download requests
The capabilities response for capabilities requests
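A typical asynchronous workflow chains validation, submission, waiting, and URL retrieval. The following is a minimal sketch: `run_job` is a hypothetical helper, it assumes the request runs as an async job so that submit returns a job ID, and `client` and `request` are assumed to be a harmony Client and Request created elsewhere.

```python
from typing import List, Tuple


def run_job(client, request) -> Tuple[str, List[str]]:
    """Validate, submit, wait for completion, then list the result URLs."""
    if not request.is_valid():
        raise ValueError("; ".join(request.error_messages()))
    job_id = client.submit(request)  # assumed to be a job ID (async request)
    client.wait_for_processing(job_id, show_progress=False)
    return job_id, list(client.result_urls(job_id))
```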
- wait_for_processing(job_id: str, show_progress: bool = False) None
Wait for a submitted job to finish processing.
- Parameters:
job_id – UUID string for the job you wish to interrogate.
show_progress – Whether a progress bar should show via stdout.
- Returns:
None
- Raises:
Exception – This can happen if an invalid job_id is provided or Harmony services can’t be reached.
- zarr_download_exception = Exception('The zarr library must be used for zarr files. See https://github.com/nasa/harmony/blob/main/docs/Harmony%20Feature%20Examples.ipynb for zarr library usage example.')
- zarr_download_exception_msg = 'The zarr library must be used for zarr files. See https://github.com/nasa/harmony/blob/main/docs/Harmony%20Feature%20Examples.ipynb for zarr library usage example.'
- class harmony.harmony.Collection(id: str)
Bases:
object
The identity of a CMR Collection.
- class harmony.harmony.Dimension(name: str, min: float = None, max: float = None)
Bases:
object
An arbitrary dimension to subset against. A dimension can take a minimum value and a maximum value to subset against.
Example
Requesting the data to be subset by the dimension lev with a minimum value of 10.0 and maximum value of 20.0:
>>> dimension = Dimension('lev', 10.0, 20.0)
Important: When specified positionally, the parameters must be given in the order: dimension name, minimum value, maximum value.
Alternatively, one can explicitly set each value using named parameters:
>>> dimension = Dimension(name='lev', min=10.0, max=20.0)
Print a readable representation of the dimension:
>>> print(dimension)
Dimension: Name: lev, Minimum: 10.0, Maximum: 20.0
- Parameters:
name – The dimension name
min – The minimum value for the given dimension to subset against (optional)
max – The maximum value for the given dimension to subset against (optional)
- Returns:
A Dimension instance with the provided dimension subset values.
- max: float
- min: float
- name: str
- class harmony.harmony.LinkType(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)
Bases:
Enum
The type of URL to provide when returning links to data.
s3: Returns an Amazon Web Services (AWS) S3 URL
https: Returns a standard HTTP URL
- http = 'http'
- https = 'https'
- s3 = 's3'
- exception harmony.harmony.ProcessingFailedException(job_id: str, message: str)
Bases:
Exception
Indicates a Harmony job has failed during processing
- class harmony.harmony.Request(collection: Collection, *, spatial: BBox = None, temporal: Mapping[str, datetime] = None, dimensions: List[Dimension] = None, extend: List[str] = None, crs: str = None, destination_url: str = None, format: str = None, granule_id: List[str] = None, granule_name: List[str] = None, height: int = None, interpolation: str = None, max_results: int = None, scale_extent: List[float] = None, scale_size: List[float] = None, shape: Tuple[IO, str] | None = None, variables: List[str] = ['all'], width: int = None, concatenate: bool = None, skip_preview: bool = None, ignore_errors: bool = None, grid: str = None)
Bases:
BaseRequest
A Harmony request with the CMR collection and various parameters expressing how the data is to be transformed.
- Parameters:
collection – The CMR collection that should be queried
spatial – Bounding box spatial constraints on the data
temporal – Date/time constraints on the data provided as a dict mapping “start” and “stop” keys to corresponding start/stop datetime.datetime objects
dimensions – A list of dimensions to use for subsetting the data
extend – A list of dimensions to extend
crs – reproject the output coverage to the given CRS. Recognizes CRS types that can be inferred by gdal, including EPSG codes, Proj4 strings, and OGC URLs (http://www.opengis.net/def/crs/…)
interpolation – specify the interpolation method used during reprojection and scaling
scale_extent – scale the resulting coverage along an axis to a given extent
scale_size – scale the resulting coverage along an axis to a given size
shape – a file path to an ESRI Shapefile zip, GeoJSON file, or KML file to use for spatial subsetting. Note: not all collections support shapefile subsetting
granule_id – The CMR Granule ID for the granule which should be retrieved
granule_name – The granule UR or provider ID for the granule(s) to be retrieved. The wildcards * (multi-character match) and ? (single-character match) are supported
width – number of columns to return in the output coverage
height – number of rows to return in the output coverage
format – the output mime type to return
max_results – limits the number of input granules processed in the request
concatenate – Whether to invoke a service that supports concatenation
skip_preview – Whether Harmony should skip auto-pausing and generating a preview for large jobs
ignore_errors – if “true”, continue processing a request to completion even if some items fail
destination_url – Destination URL specified by the client (only S3 is supported, e.g. s3://my-bucket-name/mypath)
grid – The name of the output grid to use for regridding requests. The name must match the UMM grid name in the CMR.
- Returns:
A Harmony Transformation Request instance
- error_messages() List[str]
A list of error messages, if any, for the request.