API Documentation

Here we cover the package and its modules, focusing first on the classes normally imported when working with Harmony.

Top-level Package API

These are the classes in the harmony package used for crafting a request, submitting it to Harmony, and getting the results.

class harmony.Request(collection: Collection, *, spatial: BBox = None, temporal: Mapping[str, datetime] = None, dimensions: List[Dimension] = None, extend: List[str] = None, crs: str = None, destination_url: str = None, format: str = None, granule_id: List[str] = None, granule_name: List[str] = None, height: int = None, interpolation: str = None, max_results: int = None, scale_extent: List[float] = None, scale_size: List[float] = None, shape: Tuple[IO, str] | None = None, variables: List[str] = ['all'], width: int = None, concatenate: bool = None, skip_preview: bool = None, ignore_errors: bool = None, grid: str = None)

A Harmony request with the CMR collection and various parameters expressing how the data is to be transformed.

Parameters:
  • collection – The CMR collection that should be queried

  • spatial – Bounding box spatial constraints on the data

  • temporal – Date/time constraints on the data provided as a dict mapping “start” and “stop” keys to corresponding start/stop datetime.datetime objects

  • dimensions – A list of dimensions to use for subsetting the data

  • extend – A list of dimensions to extend

  • crs – reproject the output coverage to the given CRS. Recognizes CRS types that can be inferred by gdal, including EPSG codes, Proj4 strings, and OGC URLs (http://www.opengis.net/def/crs/…)

  • interpolation – specify the interpolation method used during reprojection and scaling

  • scale_extent – scale the resulting coverage along one axis to a given extent

  • scale_size – scale the resulting coverage along one axis to a given size

  • shape – a file path to an ESRI Shapefile zip, GeoJSON file, or KML file to use for spatial subsetting. Note: not all collections support shapefile subsetting

  • granule_id – The CMR Granule ID for the granule which should be retrieved

  • granule_name – The granule UR or provider ID for the granule(s) to be retrieved. The wildcards * (multi-character match) and ? (single-character match) are supported

  • width – number of columns to return in the output coverage

  • height – number of rows to return in the output coverage

  • format – the output mime type to return

  • max_results – limits the number of input granules processed in the request

  • concatenate – Whether to invoke a service that supports concatenation

  • skip_preview – Whether Harmony should skip auto-pausing and generating a preview for large jobs

  • ignore_errors – if True, continue processing a request to completion even if some items fail

  • destination_url – Destination URL specified by the client (only S3 is supported, e.g. s3://my-bucket-name/mypath)

  • grid – The name of the output grid to use for regridding requests. The name must match the UMM grid name in the CMR.

Returns:

A Harmony Transformation Request instance
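The temporal parameter is a plain mapping, so it can be prepared before the request is created. The following is a minimal sketch using only the standard library. The helper name make_temporal and its check that the start precedes the stop are illustrative assumptions, not part of the harmony package.

```python
from datetime import datetime, timezone
from typing import Dict


def make_temporal(start: datetime, stop: datetime) -> Dict[str, datetime]:
    """Build the {'start': ..., 'stop': ...} mapping described above."""
    # Hypothetical sanity check: reject an empty or reversed time range early.
    if start >= stop:
        raise ValueError("start must be earlier than stop")
    return {"start": start, "stop": stop}


temporal = make_temporal(
    start=datetime(2020, 1, 11, 14, 0, tzinfo=timezone.utc),
    stop=datetime(2020, 1, 11, 16, 0, tzinfo=timezone.utc),
)
print(sorted(temporal))
```

The resulting mapping would then be passed to the request as the temporal keyword argument.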

class harmony.CapabilitiesRequest(**request_params)

A Harmony request to get the Harmony capabilities of a CMR collection. Takes keyword arguments with optional collection_id, short_name, and capabilities_version fields.

Parameters:
  • collection_id – The CMR collection ID that should be queried

  • short_name – The CMR collection shortName that should be queried

  • capabilities_version – The version of the collection capabilities request API

Returns:

A Harmony Capability Request instance

class harmony.Client(*, auth: Tuple[str, str] | None = None, should_validate_auth: bool = True, env: Environment = Environment.PROD, token: str = None, check_interval: float = 3.0)

A Harmony client object which can be used to submit requests to Harmony.

Examples:

With no arguments:

>>> client = Client()

will create a Harmony client that will either use the EDL_USERNAME & EDL_PASSWORD environment variables to authenticate with Earthdata Login, or will use the credentials in the user’s .netrc file, if one is available.

To explicitly include the user’s credentials:

>>> client = Client(auth=('rfeynman', 'quantumf1eld5'))

You can also create a Harmony client using the user’s EDL token:

>>> client = Client(token='myEDLTokenValue')

By default, the Client will validate the provided credentials immediately. This can be disabled by passing should_validate_auth=False.

The BBox and Collection classes are used when building a valid request.

class harmony.BBox(w: float, s: float, e: float, n: float)

A bounding box specified by western & eastern longitude, southern & northern latitude constraints in degrees.

Example

An area bounded by latitudes 30N and 60N and longitudes 130W and 100W:

>>> spatial = BBox(-130, 30, -100, 60)

Important: When specified positionally, the parameters must be given in order: west, south, east, north.

Alternatively, one can explicitly set each bound using its single-letter name:

>>> spatial = BBox(n=60, s=30, e=-100, w=-130)

Print a readable representation of the spatial bounds:

>>> print(spatial)
BBox: West:-130, South:30, East:-100, North:60

Parameters:
  • w – The western longitude bounds (degrees)

  • s – The southern latitude bounds (degrees)

  • e – The eastern longitude bounds (degrees)

  • n – The northern latitude bounds (degrees)

Returns:

A BBox instance with the provided bounds.
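The ordering rule is easy to check with a small stand-in. The sketch below defines BBoxSketch, a hypothetical NamedTuple that mirrors the field order and printed form documented above. It is not the harmony.BBox implementation, only an illustration of why positional and keyword construction give the same result.

```python
from typing import NamedTuple


class BBoxSketch(NamedTuple):
    """Stand-in for harmony.BBox: fields in west, south, east, north order."""
    w: float
    s: float
    e: float
    n: float

    def __repr__(self) -> str:
        # Mirrors the printed form shown in the documentation above.
        return f"BBox: West:{self.w}, South:{self.s}, East:{self.e}, North:{self.n}"


positional = BBoxSketch(-130, 30, -100, 60)
by_keyword = BBoxSketch(n=60, s=30, e=-100, w=-130)

print(positional == by_keyword)
print(positional)
```

Using keywords avoids the most common mistake, which is passing the bounds in a north-first or latitude-first order.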

class harmony.Collection(id: str)

The identity of a CMR Collection.

Authenticating with Earthdata Login

HarmonyPy requires that you have a valid Earthdata Login account. There are four ways to use your EDL account with HarmonyPy:

  1. Provide an EDL token when creating a HarmonyPy Client

    harmony_client = Client(token='myEDLTokenValue')
    
  2. Provide your credentials when creating a HarmonyPy Client

    harmony_client = Client(auth=('captainmarvel', 'marve10u5'))
    
  3. Set your credentials using environment variables

    $ export EDL_USERNAME='captainmarvel'
    $ export EDL_PASSWORD='marve10u5'
    
  4. Use a .netrc file:

    Create a .netrc file in your home directory, using the example below:

    machine urs.earthdata.nasa.gov
    login captainmarvel
    password marve10u5
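Before relying on a .netrc entry, it can help to confirm that it parses as expected. The sketch below uses Python's standard netrc module on a temporary file rather than the real ~/.netrc. It is only a local sanity check and makes no claim about how HarmonyPy itself reads the file.

```python
import netrc
import os
import tempfile

# Same placeholder credentials as the example entry above.
NETRC_TEXT = (
    "machine urs.earthdata.nasa.gov\n"
    "login captainmarvel\n"
    "password marve10u5\n"
)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "netrc-example")
    with open(path, "w") as handle:
        handle.write(NETRC_TEXT)
    # authenticators() returns a (login, account, password) tuple for the host.
    entry = netrc.netrc(path).authenticators("urs.earthdata.nasa.gov")

login, password = entry[0], entry[2]
print(login)
```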
    

Exceptions

Exceptions that may be raised when authenticating with Earthdata Login.

exception harmony.auth.MalformedCredentials

The provided Earthdata Login credentials were not correctly specified.

exception harmony.auth.BadAuthentication

The provided Earthdata Login credentials were invalid.

Developer Documentation

Here we show the full API documentation. This will most often be used when developing on the HarmonyPy package, and will not likely be needed if you are using HarmonyPy to make requests.

Submodules

harmony.auth module

Earthdata Login Authorization extensions to the requests package.

This module defines two functions that enable seamless integration between the requests module and NASA Earthdata Login. The create_session function constructs a requests.Session that will correctly handle the OAuth redirect ‘dance’ that is necessary to authenticate a user. The validate_auth function checks that the authentication credentials are valid, and can be used before attempting to download data, for example.

The SessionWithHeaderRedirection class, a requests.Session subclass, is used to perform authentication with Earthdata Login. The create_session function uses this class, so clients of the Harmony Py package do not need to use it explicitly.

exception harmony.auth.BadAuthentication

Bases: Exception

The provided Earthdata Login credentials were invalid.

exception harmony.auth.MalformedCredentials

Bases: Exception

The provided Earthdata Login credentials were not correctly specified.

class harmony.auth.SessionWithHeaderRedirection(auth: Tuple[str, str] | None = None, token: str = None)

Bases: Session

A requests.Session that modifies HTTP Authorization headers in accordance with Earthdata Login (EDL) common usage.

Example:

session = SessionWithHeaderRedirection(auth=(username, password))

Parameters:
  • auth – A tuple of the form (‘edl_username’, ‘edl_password’)

  • token – An Earthdata Login (EDL) token

rebuild_auth(prepared_request: PreparedRequest, response: Response) → None

Override Session.rebuild_auth. Strips the Authorization header if neither original URL nor redirected URL belong to an Earthdata Login (EDL) host. Also allows the default requests behavior of searching for relevant .netrc credentials if and only if a username and password weren’t provided during object instantiation.

Parameters:
  • prepared_request – Object for the redirection destination.

  • response – Object for where we just came from.
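The header-stripping rule described for rebuild_auth can be sketched with the standard library alone. The function names and the EDL_HOSTS set below are assumptions made for illustration. The library keeps its own list of Earthdata Login hosts, and this is not its implementation.

```python
from urllib.parse import urlparse

# Assumed host list for illustration only; the library maintains its own.
EDL_HOSTS = {"urs.earthdata.nasa.gov", "uat.urs.earthdata.nasa.gov"}


def should_strip_authorization(original_url: str, redirect_url: str) -> bool:
    """True when neither end of the redirect is an Earthdata Login host."""
    original_host = urlparse(original_url).hostname
    redirect_host = urlparse(redirect_url).hostname
    return original_host not in EDL_HOSTS and redirect_host not in EDL_HOSTS


def headers_after_redirect(headers: dict, original_url: str, redirect_url: str) -> dict:
    """Return a copy of the headers, dropping Authorization when required."""
    kept = dict(headers)
    if should_strip_authorization(original_url, redirect_url):
        kept.pop("Authorization", None)
    return kept


headers = {"Authorization": "Bearer placeholder", "Accept": "*/*"}
to_cdn = headers_after_redirect(
    headers, "https://data.example.org/f.nc", "https://cdn.example.net/f.nc"
)
to_edl = headers_after_redirect(
    headers, "https://data.example.org/f.nc", "https://urs.earthdata.nasa.gov/oauth/authorize"
)
print(sorted(to_cdn), sorted(to_edl))
```

The point of the rule is that credentials are only ever forwarded when an Earthdata Login host is involved in the redirect.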

harmony.auth.create_session(config: Config, auth: Tuple[str, str] = None, token: str = None) → Session

Creates a configured requests session.

Attempts to create an authenticated session in the following order:

  1. If auth is a tuple of (username, password), create a session.

  2. Attempt to read a username and password from environment variables, either from the system or from a .env file to return a session.

  3. Return a session that attempts to read credentials from a .netrc file.

Parameters:
  • config – Configuration object with EDL and authentication context

  • auth – A tuple of the form (‘edl_username’, ‘edl_password’)

Returns:

The authenticated requests session.
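The lookup order described for create_session can be sketched as a small decision function. The name pick_credential_source and its return shape are hypothetical. The sketch covers only the three documented steps and is not the library's code.

```python
from typing import Mapping, Optional, Tuple


def pick_credential_source(
    auth: Optional[Tuple[str, str]],
    environ: Mapping[str, str],
) -> Tuple[str, Optional[Tuple[str, str]]]:
    """Return (source, credentials) following the documented order."""
    # 1. An explicit (username, password) tuple wins.
    if isinstance(auth, tuple) and len(auth) == 2:
        return "auth", auth
    # 2. Otherwise fall back to the EDL_* environment variables.
    username = environ.get("EDL_USERNAME")
    password = environ.get("EDL_PASSWORD")
    if username and password:
        return "environment", (username, password)
    # 3. Nothing explicit: leave it to the .netrc lookup.
    return "netrc", None


env = {"EDL_USERNAME": "captainmarvel", "EDL_PASSWORD": "marve10u5"}
print(pick_credential_source(None, env))
print(pick_credential_source(None, {}))
```

In real use the second argument would be os.environ. Passing a mapping explicitly keeps the sketch easy to test.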

harmony.auth.validate_auth(config: Config, session: Session)

Validates the credentials against the EDL authentication URL.

harmony.config module

Provides a Config class for conveniently specifying the environment for Harmony Py.

The Config class can be instantiated without parameters and will default to the Harmony production environment. To create a configuration for the testing (UAT) environment, for example:

cfg = Config(Environment.UAT)

This configuration object can then be passed as an argument when creating the harmony.Client.

class harmony.config.Config(environment: Environment = Environment.PROD, localhost_port: int = 3000)

Bases: object

Runtime configuration variables including defaults and environment vars.

Example:

>>> cfg = Config()
>>> cfg.foo
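'bar'

Parameters:
  • environment – The Harmony Environment to use (default: Environment.PROD)

  • localhost_port – The port used for a local Harmony environment (default: 3000)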

config = {'DOWNLOAD_CHUNK_SIZE': '4194304', 'NUM_REQUESTS_WORKERS': '3'}

property edl_validation_url

Returns the full URL to a Harmony endpoint used to validate the user’s Earthdata Login credentials for this Config’s Environment.

property harmony_hostname

Returns the hostname for this Config object’s Environment.

property root_url: str

property url_scheme: str

class harmony.config.Environment(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)

Bases: Enum

LOCAL = 1
PROD = 4
SIT = 2
UAT = 3

harmony.harmony module

This module defines the main classes used to interact with Harmony.

The classes defined here are also available by importing them from the top-level harmony package, e.g.:

from harmony import Client, Request

Overview of the classes:

  • Collection: A CMR Collection ID

  • BBox: A bounding box (lat/lon) used in Requests

  • Request: A complete Harmony request with all criteria

  • Client: Allows submission of a Harmony Request and getting results

class harmony.harmony.BBox(w: float, s: float, e: float, n: float)

Bases: NamedTuple

A bounding box specified by western & eastern longitude, southern & northern latitude constraints in degrees.

Example

An area bounded by latitudes 30N and 60N and longitudes 130W and 100W:

>>> spatial = BBox(-130, 30, -100, 60)

Important: When specified positionally, the parameters must be given in order: west, south, east, north.

Alternatively, one can explicitly set each bound using its single-letter name:

>>> spatial = BBox(n=60, s=30, e=-100, w=-130)

Print a readable representation of the spatial bounds:

>>> print(spatial)
BBox: West:-130, South:30, East:-100, North:60

Parameters:
  • w – The western longitude bounds (degrees)

  • s – The southern latitude bounds (degrees)

  • e – The eastern longitude bounds (degrees)

  • n – The northern latitude bounds (degrees)

Returns:

A BBox instance with the provided bounds.

e: float

Alias for field number 2

n: float

Alias for field number 3

s: float

Alias for field number 1

w: float

Alias for field number 0

class harmony.harmony.BaseRequest(*, collection: Collection)

Bases: object

A Harmony base request with the CMR collection. It is the base class of all harmony requests.

Parameters:

collection – The CMR collection that should be queried

Returns:

A Harmony Request instance

error_messages() → List[str]

is_valid() → bool

Determines if the request and its parameters are valid.

parameter_values() → List[Tuple[str, Any]]

Returns tuples of each query parameter that has been set and its value.
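The idea of returning only the parameters that have been set can be illustrated with a plain dictionary. The helper set_parameters and the sample values are hypothetical. The actual query parameter names Harmony uses may differ from the Python argument names shown here.

```python
from typing import Any, Dict, List, Tuple
from urllib.parse import urlencode


def set_parameters(params: Dict[str, Any]) -> List[Tuple[str, Any]]:
    """Keep only the parameters that were given a value."""
    return [(name, value) for name, value in params.items() if value is not None]


# height is left unset, so it should not appear in the output.
params = {"format": "image/tiff", "width": 512, "height": None, "crs": "EPSG:4326"}
pairs = set_parameters(params)
print(pairs)
# A list of (name, value) tuples is exactly what urlencode accepts.
print(urlencode(pairs))
```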

class harmony.harmony.CapabilitiesRequest(**request_params)

Bases: BaseRequest

A Harmony request to get the Harmony capabilities of a CMR collection. Takes keyword arguments with optional collection_id, short_name, and capabilities_version fields.

Parameters:
  • collection_id – The CMR collection ID that should be queried

  • short_name – The CMR collection shortName that should be queried

  • capabilities_version – The version of the collection capabilities request API

Returns:

A Harmony Capability Request instance

error_messages() → List[str]

A list of error messages, if any, for the request.

class harmony.harmony.Client(*, auth: Tuple[str, str] | None = None, should_validate_auth: bool = True, env: Environment = Environment.PROD, token: str = None, check_interval: float = 3.0)

Bases: object

A Harmony client object which can be used to submit requests to Harmony.

Examples:

With no arguments:

>>> client = Client()

will create a Harmony client that will either use the EDL_USERNAME & EDL_PASSWORD environment variables to authenticate with Earthdata Login, or will use the credentials in the user’s .netrc file, if one is available.

To explicitly include the user’s credentials:

>>> client = Client(auth=('rfeynman', 'quantumf1eld5'))

You can also create a Harmony client using the user’s EDL token:

>>> client = Client(token='myEDLTokenValue')

By default, the Client will validate the provided credentials immediately. This can be disabled by passing should_validate_auth=False.

aws_credentials() → dict

Retrieve temporary AWS credentials for retrieving data in S3.

Returns:

A python dict containing aws_access_key_id, aws_secret_access_key, and aws_session_token.

Raises:

Exception – Can raise when e.g. server is unreachable.
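One way to hand such a dict to other tools is to map it onto the standard AWS environment variable names. The sketch below uses placeholder values in place of a real call to aws_credentials(), and the helper name credentials_to_env is hypothetical.

```python
from typing import Dict

# Dict key -> standard AWS environment variable name.
ENV_NAMES = {
    "aws_access_key_id": "AWS_ACCESS_KEY_ID",
    "aws_secret_access_key": "AWS_SECRET_ACCESS_KEY",
    "aws_session_token": "AWS_SESSION_TOKEN",
}


def credentials_to_env(creds: Dict[str, str]) -> Dict[str, str]:
    """Map the credential dict onto AWS environment variable names."""
    missing = [key for key in ENV_NAMES if key not in creds]
    if missing:
        raise KeyError(f"missing credential fields: {missing}")
    return {ENV_NAMES[key]: creds[key] for key in ENV_NAMES}


# Placeholder values standing in for the dict returned by aws_credentials().
example = {
    "aws_access_key_id": "EXAMPLEKEYID",
    "aws_secret_access_key": "example-secret",
    "aws_session_token": "example-token",
}
env = credentials_to_env(example)
print(sorted(env))
```

Because the credentials are temporary, they should be fetched again rather than cached for long periods.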

download(url: str, directory: str = '', overwrite: bool = False) → Future

Downloads data and saves it to a file asynchronously.

Parameters:
  • url – The location (URL) of the file to be downloaded

  • directory – Optional. If specified, saves files there. Saves files to the current working directory by default.

  • overwrite – If True, will overwrite a local file that shares a filename with the downloaded file. Defaults to False. If you're seeing malformed data or truncated files from incomplete downloads, set overwrite to True.

Returns:

A Future that resolves to the full path to the file.

download_all(job_id_or_result_json: str | dict, directory: str = '', overwrite: bool = False) → Generator[Future, None, None]

Using a job_id, fetches all the data files from a finished job.

After this method is able to contact Harmony and query a finished job, it will immediately return with a list of python concurrent.Futures corresponding to each of the files to be downloaded. Call the result() method to block until the downloading of that file is complete. When finished, the Future will return the filename.

Files are downloaded by an executor backed by a thread pool. Number of threads in the thread pool can be specified with the environment variable NUM_REQUESTS_WORKERS.

Performance should be close to native with an appropriate chunk size. This can be changed via environment variable DOWNLOAD_CHUNK_SIZE.

Filenames are automatically determined by using the latter portion of the provided URL.

Will wait for an unfinished job to finish before downloading.

Parameters:
  • job_id_or_result_json – UUID string for the job you wish to interrogate, or the result JSON for that job as a dict.

  • directory – Optional. If specified, saves files there. Saves files to the current working directory by default.

  • overwrite – If True, will overwrite a local file that shares a filename with the downloaded file. Defaults to False. If you're seeing malformed data or truncated files from incomplete downloads, set overwrite to True.

Returns:

A list of Futures, each of which will return the filename (with path) for each result.
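The Future-based pattern described for download_all can be tried out without contacting Harmony. In the sketch below, fake_download is a hypothetical stand-in that writes a placeholder file, and the example.invalid URLs are not real. Only the thread pool and the blocking result() calls reflect the behaviour described above.

```python
import os
import tempfile
from concurrent.futures import Future, ThreadPoolExecutor
from typing import List

# Same environment variable and default as in the documented configuration.
NUM_WORKERS = int(os.environ.get("NUM_REQUESTS_WORKERS", "3"))


def fake_download(url: str, directory: str) -> str:
    """Stand-in for a real download: writes a small file named after the URL."""
    path = os.path.join(directory, url.rsplit("/", 1)[-1])
    with open(path, "wb") as handle:
        handle.write(b"placeholder")
    return path


urls = ["https://example.invalid/a.nc", "https://example.invalid/b.nc"]
with tempfile.TemporaryDirectory() as directory:
    with ThreadPoolExecutor(max_workers=NUM_WORKERS) as executor:
        futures: List[Future] = [executor.submit(fake_download, u, directory) for u in urls]
        # result() blocks until that particular file has been written.
        names = [os.path.basename(f.result()) for f in futures]

print(names)
```

With a real client, the futures would come from download_all, and calling result() on each one would yield the path of the downloaded file.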

get_download_filename_from_url(url: str) → str

For a given URL, returns the filename that will be used for download. It will include a Harmony generated ID prefix if the data is staged.

Parameters:

url – The location (URL) of the file to be downloaded

Returns:

The filename that will be used to name the downloaded file.
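Deriving a filename from the latter portion of a URL can be sketched as follows. The helper filename_from_url is hypothetical and covers only that simple rule. It does not reproduce the Harmony-generated ID prefix that is added for staged data.

```python
import posixpath
from urllib.parse import unquote, urlparse


def filename_from_url(url: str) -> str:
    """Use the last path segment of the URL as the local file name."""
    # urlparse separates the path from any query string or fragment.
    return posixpath.basename(unquote(urlparse(url).path))


print(filename_from_url("https://example.invalid/data/granule_001.nc?download=1"))
```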

iterator(job_id: str, directory: str = '', overwrite: bool = False) → Iterator

Create an iterator that will poll for data in the background and download it as it is available and requested via next().

Each iteration returns a dictionary, or None when all granules have been iterated. The dictionary has the following form:

{
    'path': Future,
    'bbox': BBox object containing the bounding box for the granule,
    'temporal': {
        'start': '2020-01-11T14:00:00.000Z',
        'end': '2020-01-11T15:59:59.000Z'
    }
}

The Future resolves to the path to the downloaded file.

If the job is paused and all processed granules have already been returned in the iteration, the status returned will be GranuleStatus.PAUSED until the job is resumed.

If the job fails during iteration then calls to next will raise an exception. Note that this is not true if the job completed with errors, in which case specific granules may return errors, but no exception is raised. This allows the caller to retrieve any granules that were successfully processed.

Note: if a job gets stuck in the ‘running’ state, this iterator will happily wait forever re-checking the status page periodically.

Parameters:
  • job_id – UUID string for the job you wish to interrogate.

  • directory – Optional. If specified, saves files there. Saves files to the current working directory by default.

  • overwrite – If True, will overwrite a local file that shares a filename with the downloaded file. Defaults to False. If you're seeing malformed data or truncated files from incomplete downloads, set overwrite to True.

Returns:

An Iterator that can be used to iterate over the granule results from a job
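A consumer of such an iterator might look like the sketch below. The generator fake_results is a hypothetical stand-in that yields already-resolved Futures followed by None. It ignores the paused and failed cases described above, and it assumes the end of the results is signalled by None.

```python
from concurrent.futures import Future
from typing import Dict, Iterator, List, Optional


def fake_results() -> Iterator[Optional[Dict]]:
    """Stand-in for the job iterator: one dict per granule, then None."""
    for name in ("a.nc", "b.nc"):
        done: Future = Future()
        # A real Future resolves once the download has finished.
        done.set_result(f"/tmp/{name}")
        yield {
            "path": done,
            "temporal": {"start": "2020-01-11T14:00:00.000Z", "end": "2020-01-11T15:59:59.000Z"},
        }
    yield None


paths: List[str] = []
for granule in fake_results():
    if granule is None:  # all granules have been iterated
        break
    paths.append(granule["path"].result())

print(paths)
```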

pause(job_id: str)

Pause a job.

Parameters:

job_id – UUID string for the job you wish to pause.

Raises:

Exception – This can happen if an invalid job_id is provided or Harmony services can’t be reached or the job cannot be paused (usually because it is already in a terminal state).

progress(job_id: str) → Tuple[int, str, str]

Retrieve a submitted job’s completion status in percent.

Parameters:

job_id – UUID string for the job you wish to interrogate.

Returns:

A tuple of the job’s processing progress as a percentage, the job’s processing state, and the job’s status message.

Raises:

Exception – This can happen if an invalid job_id is provided or Harmony services can’t be reached.
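A polling loop built on a progress-style tuple might look like the sketch below. The scripted responses stand in for real calls, and the state name 'successful' is an assumption made for illustration. The 3.0 second default mirrors the Client's check_interval argument.

```python
import time
from typing import Callable, Tuple

Progress = Tuple[int, str, str]  # (percent, state, message)


def poll_until_not_running(
    get_progress: Callable[[], Progress],
    check_interval: float = 3.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Progress:
    """Call get_progress repeatedly until the state is no longer 'running'."""
    while True:
        percent, state, message = get_progress()
        if state != "running":
            return percent, state, message
        sleep(check_interval)


# Scripted responses standing in for repeated progress(job_id) calls.
responses = iter([(10, "running", "working"), (100, "successful", "done")])
final = poll_until_not_running(lambda: next(responses), sleep=lambda _: None)
print(final)
```

Injecting the sleep function keeps the loop testable. A production loop would also want an overall timeout, since a job stuck in the running state would otherwise be polled forever.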

read_text(url: str) → str

Uses the harmony-py Client session to fetch a URL.

Parameters:

url – A URL, such as one from stac_catalog_url().

Returns:

The response text.

Raises:

Exception – Can occur on malformed or unreachable URLs.

request_as_curl(request: BaseRequest) → str

Returns a curl command representation of the given request. Note: Authorization headers will be masked to reduce the risk of accidental exposure. Also, cookies containing the string ‘token’ will be removed from the curl command.

Parameters:

request – The Request to build a curl command for

Returns:

An equivalent curl command based on this client and request.
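The masking behaviour can be sketched independently of the client. The helper mask_for_display is hypothetical, and it assumes that 'token' is matched against cookie names. The library may apply the rule differently.

```python
from typing import Dict, Tuple


def mask_for_display(
    headers: Dict[str, str], cookies: Dict[str, str]
) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Mask Authorization values and drop cookies whose name contains 'token'."""
    safe_headers = {
        name: ("*****" if name.lower() == "authorization" else value)
        for name, value in headers.items()
    }
    safe_cookies = {
        name: value for name, value in cookies.items() if "token" not in name.lower()
    }
    return safe_headers, safe_cookies


safe_headers, safe_cookies = mask_for_display(
    {"Authorization": "Bearer placeholder", "Accept": "application/json"},
    {"token": "placeholder", "state": "abc"},
)
print(safe_headers, safe_cookies)
```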

request_as_url(request: BaseRequest) → str

Returns a URL string representation of the given request. Note: headers and cookies are not included, just the URL. Shapefiles are not supported.

Parameters:

request – The Request to build the URL string for

Returns:

A URL string that can be pasted into a browser.

Raises:

Exception – If a shapefile is included in the request.

result_json(job_id: str, show_progress: bool = False, link_type: LinkType = LinkType.https) → str

Retrieve a job’s final json output.

Harmony jobs’ output is built as the job is processed and this method fetches the complete and final output.

Parameters:
  • job_id – UUID string for the job you wish to interrogate.

  • show_progress – Whether a progress bar should show via stdout.

  • link_type – The type of link to output, s3:// or https://

Returns:

The job’s complete json output.

result_urls(job_id: str, show_progress: bool = False, link_type: LinkType = LinkType.https) → Generator[str, None, None]

Retrieve the data URLs for a job.

The URLs include links to all of the job’s data output. Blocks until the Harmony job is done processing.

Parameters:
  • job_id – UUID string for the job you wish to interrogate.

  • show_progress – Whether a progress bar should show via stdout.

  • link_type – The type of link to output, s3:// or https://

Returns:

The job’s complete list of data URLs.

resume(job_id: str)

Resume a job.

Parameters:

job_id – UUID string for the job you wish to resume.

Raises:

Exception – This can happen if an invalid job_id is provided or Harmony services can’t be reached or the job cannot be resumed (usually because it is already in a terminal state).

stac_catalog_url(job_id: str, show_progress: bool = False, link_type: LinkType = LinkType.https) → str

Extract the STAC catalog URL from job results.

Blocks until the Harmony job is done processing.

Parameters:
  • job_id – UUID string for the job you wish to interrogate.

  • show_progress – Whether a progress bar should show via stdout.

  • link_type – The type of link to output, s3:// or https://

Returns:

A STAC catalog URL.

Raises:

Exception – This can happen if an invalid job_id is provided or Harmony services can’t be reached.

status(job_id: str) → dict

Retrieve a submitted job’s metadata from Harmony.

Parameters:

job_id – UUID string for the job you wish to interrogate.

Returns:

A dict of metadata.

Raises:

Exception – This can happen if an invalid job_id is provided or Harmony services can’t be reached.

submit(request: BaseRequest) → any

Submits a request to Harmony and returns the Harmony Job ID.

Parameters:

request – The Request to submit to Harmony (will be validated before sending)

Returns:

One of the following, depending on the kind of request:
  • The Harmony Job ID, for requests processed as async jobs

  • The JSON response, for direct download requests

  • The capabilities response, for capabilities requests

wait_for_processing(job_id: str, show_progress: bool = False) → None

Wait for a submitted job to finish processing.

Parameters:
  • job_id – UUID string for the job you wish to interrogate.

  • show_progress – Whether a progress bar should show via stdout.

Returns:

None

Raises:

Exception – This can happen if an invalid job_id is provided or Harmony services can’t be reached.

zarr_download_exception = Exception('The zarr library must be used for zarr files. See https://github.com/nasa/harmony/blob/main/docs/Harmony%20Feature%20Examples.ipynb for zarr library usage example.')

zarr_download_exception_msg = 'The zarr library must be used for zarr files. See https://github.com/nasa/harmony/blob/main/docs/Harmony%20Feature%20Examples.ipynb for zarr library usage example.'

class harmony.harmony.Collection(id: str)

Bases: object

The identity of a CMR Collection.

class harmony.harmony.Dimension(name: str, min: float = None, max: float = None)

Bases: object

An arbitrary dimension to subset against. A dimension can take a minimum value and a maximum value to subset against.

Example

Requesting the data to be subset by the dimension lev with a minimum value of 10.0 and maximum value of 20.0:

>>> dimension = Dimension('lev', 10.0, 20.0)

Important: When specified positionally, the parameters must be given in the order: dimension name, minimum value, maximum value.

Alternatively, one can explicitly set each value using named parameters:

>>> dimension = Dimension(name='lev', min=10.0, max=20.0)

Print a readable representation of the dimension:

>>> print(dimension)
Dimension: Name: lev, Minimum: 10.0, Maximum: 20.0

Parameters:
  • name – The dimension name

  • min – The minimum value for the given dimension to subset against (optional)

  • max – The maximum value for the given dimension to subset against (optional)

Returns:

A Dimension instance with the provided dimension subset values.

max: float

min: float

name: str

class harmony.harmony.LinkType(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)

Bases: Enum

The type of URL to provide when returning links to data.

  • s3 – Returns an Amazon Web Services (AWS) S3 URL

  • https – Returns a standard HTTP URL

http = 'http'
https = 'https'
s3 = 's3'

exception harmony.harmony.ProcessingFailedException(job_id: str, message: str)

Bases: Exception

Indicates a Harmony job has failed during processing

class harmony.harmony.Request(collection: Collection, *, spatial: BBox = None, temporal: Mapping[str, datetime] = None, dimensions: List[Dimension] = None, extend: List[str] = None, crs: str = None, destination_url: str = None, format: str = None, granule_id: List[str] = None, granule_name: List[str] = None, height: int = None, interpolation: str = None, max_results: int = None, scale_extent: List[float] = None, scale_size: List[float] = None, shape: Tuple[IO, str] | None = None, variables: List[str] = ['all'], width: int = None, concatenate: bool = None, skip_preview: bool = None, ignore_errors: bool = None, grid: str = None)

Bases: BaseRequest

A Harmony request with the CMR collection and various parameters expressing how the data is to be transformed.

Parameters:
  • collection – The CMR collection that should be queried

  • spatial – Bounding box spatial constraints on the data

  • temporal – Date/time constraints on the data provided as a dict mapping “start” and “stop” keys to corresponding start/stop datetime.datetime objects

  • dimensions – A list of dimensions to use for subsetting the data

  • extend – A list of dimensions to extend

  • crs – reproject the output coverage to the given CRS. Recognizes CRS types that can be inferred by gdal, including EPSG codes, Proj4 strings, and OGC URLs (http://www.opengis.net/def/crs/…)

  • interpolation – specify the interpolation method used during reprojection and scaling

  • scale_extent – scale the resulting coverage along one axis to a given extent

  • scale_size – scale the resulting coverage along one axis to a given size

  • shape – a file path to an ESRI Shapefile zip, GeoJSON file, or KML file to use for spatial subsetting. Note: not all collections support shapefile subsetting

  • granule_id – The CMR Granule ID for the granule which should be retrieved

  • granule_name – The granule UR or provider ID for the granule(s) to be retrieved. The wildcards * (multi-character match) and ? (single-character match) are supported

  • width – number of columns to return in the output coverage

  • height – number of rows to return in the output coverage

  • format – the output mime type to return

  • max_results – limits the number of input granules processed in the request

  • concatenate – Whether to invoke a service that supports concatenation

  • skip_preview – Whether Harmony should skip auto-pausing and generating a preview for large jobs

  • ignore_errors – if True, continue processing a request to completion even if some items fail

  • destination_url – Destination URL specified by the client (only S3 is supported, e.g. s3://my-bucket-name/mypath)

  • grid – The name of the output grid to use for regridding requests. The name must match the UMM grid name in the CMR.

Returns:

A Harmony Transformation Request instance
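The wildcard semantics described for granule_name can be previewed locally with Python's fnmatch module. This is only an approximation for experimenting with patterns. The real matching is performed by the service, the granule names below are made up, and fnmatch additionally treats square brackets specially.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List


def matching_names(pattern: str, names: Iterable[str]) -> List[str]:
    """Return the names matched by a pattern using * and ? wildcards."""
    return [name for name in names if fnmatchcase(name, pattern)]


# Made-up granule names for illustration.
names = ["GRANULE_2020_01.nc", "GRANULE_2020_02.nc", "GRANULE_2021_01.nc"]

# * matches any run of characters.
print(matching_names("GRANULE_2020_*", names))
# ? matches exactly one character.
print(matching_names("GRANULE_202?_01.nc", names))
```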

error_messages() → List[str]

A list of error messages, if any, for the request.

Module contents