Motu Client Java

TODO

This client is part of the CMEMS Dataset Importer.

Overview

TODO

Access to CMEMS data is subject to user authentication. An account can be created at: http://marine.copernicus.eu/services-portfolio/register-now

Authentication is based on the CAS protocol[1]. The CAS server for the CMEMS data infrastructure is https://cmems-cas.cls.fr/cas/login?service=... The user does not need to configure the CAS endpoint, since the client reaches it via HTTP redirect.

Features

TODO

How CMEMS data are published

  • CMEMS data are published through different Motu servers, usually operated by different institutions.
  • A Motu server publishes a number of Products.
  • A Product is exposed as a set of Datasets.
  • A Dataset usually contains a subset of the variables in a Product, or a different time resolution (e.g. daily means or monthly means).

Note: CMEMS products and datasets are referred to in Motu terminology as services and products, respectively. This client library adopts the Motu terminology. Whenever the term dataset is used here, it is equivalent to a Motu product.

Usage

Configure your project

Maven coordinates:

  <dependency>
    <groupId>org.gcube.dataanalysis</groupId>
    <artifactId>cmems-client</artifactId>
    <version>[1.0.0, 2.0.0)</version>
  </dependency>

The current version of the library is 1.0.0.

Create a client

As a first step, you need to create a MotuClient object providing enough information to connect to the corresponding server:

  MotuClient client = new MotuClient("server endpoint");
  client.setUsername("username");
  client.setPassword("password");

Setting the preferred download size

You can optionally specify the size of the chunks to be downloaded. If not provided, the client will use the maximum size allowed by the server (currently most servers allow between 1 and 2 GB). This setting is meaningful only when using the enhanced API, which splits a request into a number of smaller ones.

  client.setPreferredDownloadSize(20*SizeUtils.MB);

Retrieving the product catalogue

A full catalogue containing all services published by the Motu server can be obtained with the code below.

Note: building a catalogue for a Motu server takes some time (about 8 minutes for http://cmems-med-mfc.eu/motu-web/Motu), since it recursively goes into nested server resources. If you plan to deliver the catalogue in an interactive way (e.g. in a GUI), please strongly consider caching to improve the user experience. Conversely, if you only need some specific information (e.g. only the list of services), it's better to query for them individually, as described later in this document.

  // retrieve the catalogue (it might take some time)
  MotuCatalogue catalogue = client.getCatalogue();
 
  // get available services
  Collection<ServiceMetadata> services = catalogue.getServices();
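
Since building the catalogue is slow, an interactive application may want to keep the result in memory and rebuild it only after some time. The sketch below shows one possible time-based cache. `CachedValue` is a hypothetical helper, not part of the client library, and the six-hour lifetime is an arbitrary assumption; in real code the loader would wrap `client.getCatalogue()`.

```java
import java.time.Duration;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

final class CatalogueCacheSketch {

    /** Hypothetical helper (not part of the client library): caches a slow-to-build value. */
    static final class CachedValue<T> {
        private final Supplier<T> loader;
        private final long ttlMillis;
        private final LongSupplier clock;
        private T value;
        private long loadedAt;
        private boolean loaded;

        CachedValue(Supplier<T> loader, Duration ttl, LongSupplier clock) {
            this.loader = loader;
            this.ttlMillis = ttl.toMillis();
            this.clock = clock;
        }

        /** Returns the cached value, reloading it when missing or at least as old as the TTL. */
        synchronized T get() {
            long now = clock.getAsLong();
            if (!loaded || now - loadedAt >= ttlMillis) {
                value = loader.get();
                loadedAt = now;
                loaded = true;
            }
            return value;
        }
    }

    public static void main(String[] args) {
        // Stand-in loader; with the real library this would be () -> client.getCatalogue()
        CachedValue<String> cache = new CachedValue<>(
                () -> "catalogue built at " + System.nanoTime(),
                Duration.ofHours(6),
                System::currentTimeMillis);

        String first = cache.get();   // triggers the (slow) load
        String second = cache.get();  // served from memory
        System.out.println(first.equals(second)); // prints true
    }
}
```

The clock is injected so that expiry can be exercised without waiting; a GUI would typically also refresh the cache in a background thread rather than on the first user request.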

A service usually contains several products (datasets). Typically there are products for different time resolutions and/or different variables.

  // get the list of products for a service
  Collection<ProductMetadataInfo> products = service.getProducts();

Product metadata can be obtained with:

  ProductMetadataInfo product = ...;
 
  // get timestamps for which there are data available
  List<Calendar> times = product.getAvailableTimeCodes();
 
  // the oldest timestamp in the dataset
  Calendar start = product.getFirstAvailableTimeCode();
 
  // the most recent timestamp in the dataset
  Calendar end = product.getLastAvailableTimeCode();
 
  // get the time resolution of the dataset (in hours)
  Long hours = product.getTimeResolution();
 
  // get a list of depths for which there are data available
  List<Double> depths = product.getAvailableDepths();
 
  // get the lowest depth value (i.e. the closest to the surface)
  Double firstDepth = product.getFirstAvailableDepth();
 
  // get the highest depth value (i.e. the deepest level)
  Double lastDepth = product.getLastAvailableDepth();
 
  // get the set of variables in the dataset
  Collection<Variable> variables = product.getVariables();
 
  // get the dimensions of the dataset (e.g. lat, lon, depth)
  Collection<Axis> axes = product.getDataGeospatialCoverage();
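
The first timestamp, the last timestamp and the time resolution are related: if the timestamps are regularly spaced, their number can be derived from the other three values. The sketch below shows that arithmetic using plain `Calendar` values as stand-ins for the results of `getFirstAvailableTimeCode()`, `getLastAvailableTimeCode()` and `getTimeResolution()`. `TimeStepsSketch` is a hypothetical helper, and regular spacing is an assumption that may not hold for every dataset.

```java
import java.util.Calendar;
import java.util.TimeZone;

final class TimeStepsSketch {

    /** Number of timestamps between first and last (inclusive), assuming regular spacing. */
    static long expectedSteps(Calendar first, Calendar last, long resolutionHours) {
        if (resolutionHours <= 0) {
            throw new IllegalArgumentException("resolution must be positive");
        }
        long spanMillis = last.getTimeInMillis() - first.getTimeInMillis();
        return spanMillis / (resolutionHours * 3_600_000L) + 1;
    }

    /** Builds a UTC calendar at midnight of the given day (month is a Calendar constant). */
    static Calendar utc(int year, int month, int day) {
        Calendar c = Calendar.getInstance(TimeZone.getTimeZone("UTC"));
        c.clear();
        c.set(year, month, day, 0, 0, 0);
        return c;
    }

    public static void main(String[] args) {
        Calendar start = utc(2014, Calendar.JANUARY, 1);
        Calendar end = utc(2014, Calendar.JANUARY, 30);
        // daily means: one timestamp every 24 hours
        System.out.println(expectedSteps(start, end, 24)); // prints 30
    }
}
```

Comparing this value with the size of the list returned by `getAvailableTimeCodes()` is one way to spot gaps in a dataset.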

Getting the list of services

Instead of retrieving the full catalogue (it might be slow), you might want to get the list of available services, without any nested product information:

  // get a shallow list of available services
  Collection<ServiceMetadata> services = client.listServices();

Getting product metadata

Similarly, once you know the service and product name, you can retrieve product metadata directly:

  // get product metadata by service and product name
  ServiceMetadata service = ...;
  ProductMetadataInfo product = client.describeProduct(service, "productName");
 
  // or, using the service name
  ProductMetadataInfo product = client.describeProduct("serviceName", "productName");

Downloading products

Products can be downloaded using two approaches:

  • the legacy Motu API (asynchronous, with limited download size);
  • the enhanced gCube API (synchronous, without any limitation on the download size).

Building a download request

Once you've identified the product you want to download, a download request has to be prepared with references to the product, bounding box, time frame and variables.

  // build a download request
  DownloadRequest request = new DownloadRequest();
 
  // the service name is mandatory
  request.setService("MEDSEA_ANALYSIS_FORECAST_PHYS_006_001-TDS");
 
  // product name is also mandatory
  request.setProduct("cmemsv02-med-ingv-cur-an-fc-d");
 
  // optionally, set the longitude range. If not set, the whole extent is downloaded
  request.setxRange(15d, 20d);
 
  // optionally, set the latitude range. If not set, the whole extent is downloaded
  request.setyRange(35d, 40d);
 
  // optionally, set the depth range. If not set, the whole depth extent is downloaded
  request.setzRange(1.4721018075942993, 5334.64794921875);
 
  // optionally, set the time window. If not set, all timestamps are downloaded
  request.settRange("2014-01-01 00:00:00", "2014-01-30 00:00:00");
 
  // this seems to be mandatory.
  request.setScriptVersion("1.4.00-20170410143941999");
 
  // process the request in asynchronous mode.
  request.setMode("status");
 
  // the output format (no alternatives here)
  request.setOutput("netcdf");
 
  // optionally set which variables to download. If not set, all of them are downloaded
  request.addVariable("vomecrty");
  request.addVariable("vozocrtx");
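
The time window is passed as a pair of strings. Judging only from the example above, the expected pattern appears to be `yyyy-MM-dd HH:mm:ss`; this is an inference, not something confirmed by the library documentation. Under that assumption, the strings can be produced from `java.time` values as sketched below (`TimeRangeFormat` is a hypothetical helper, not part of the client library).

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

final class TimeRangeFormat {

    // Pattern inferred from the example request; treat it as an assumption.
    static final DateTimeFormatter PATTERN = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    /** Formats a timestamp in the form used by the example time window. */
    static String format(LocalDateTime time) {
        return time.format(PATTERN);
    }

    public static void main(String[] args) {
        String from = format(LocalDateTime.of(2014, 1, 1, 0, 0));
        String to = format(LocalDateTime.of(2014, 1, 30, 0, 0));
        System.out.println(from); // prints 2014-01-01 00:00:00
        System.out.println(to);   // prints 2014-01-30 00:00:00
    }
}
```

The resulting strings would then be passed to `request.settRange(from, to)`.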

Getting the estimated download size

Motu has an operation to estimate the size of a download request:

  // get the estimated output size
  RequestSize requestSize = client.getSize(request);
 
  // here is the size (in bytes)
  Long actualSize = requestSize.getSizeInBytes();
 
  // here is also the maximum allowed download size
  Long maxSize = requestSize.getMaxAllowedSizeInBytes();

When using the legacy Motu download API, the estimated size must be smaller than the maximum allowed by the server. If it is larger, you'll need to shrink the request by reducing the bounding box, the time span and/or the set of variables.
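
One way to shrink a request is to split its time window into several shorter ones. The sketch below shows only the arithmetic, assuming the download size grows roughly linearly with the time span. `ChunkPlanner` and its methods are hypothetical and not part of the client library; the two sizes would come from `getSizeInBytes()` and `getMaxAllowedSizeInBytes()`.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.util.ArrayList;
import java.util.List;

final class ChunkPlanner {

    /** Smallest number of parts such that each part is at most maxBytes (ceiling division). */
    static long partsNeeded(long sizeBytes, long maxBytes) {
        if (maxBytes <= 0) {
            throw new IllegalArgumentException("maxBytes must be positive");
        }
        if (sizeBytes <= maxBytes) {
            return 1;
        }
        return sizeBytes / maxBytes + (sizeBytes % maxBytes == 0 ? 0 : 1);
    }

    /**
     * Splits [start, end] into consecutive windows of (almost) equal length.
     * Adjacent windows share their boundary instant; real code would have to
     * avoid downloading that timestamp twice.
     */
    static List<LocalDateTime[]> splitWindow(LocalDateTime start, LocalDateTime end, int parts) {
        if (parts < 1) {
            throw new IllegalArgumentException("parts must be at least 1");
        }
        long totalSeconds = Duration.between(start, end).getSeconds();
        List<LocalDateTime[]> windows = new ArrayList<>();
        for (int i = 0; i < parts; i++) {
            LocalDateTime from = start.plusSeconds(totalSeconds * i / parts);
            LocalDateTime to = (i == parts - 1)
                    ? end
                    : start.plusSeconds(totalSeconds * (i + 1) / parts);
            windows.add(new LocalDateTime[] {from, to});
        }
        return windows;
    }

    public static void main(String[] args) {
        long gb = 1024L * 1024 * 1024;
        // e.g. a 5 GB estimate against a 2 GB server limit needs 3 parts
        int parts = (int) partsNeeded(5 * gb, 2 * gb);
        LocalDateTime start = LocalDateTime.of(2014, 1, 1, 0, 0);
        LocalDateTime end = LocalDateTime.of(2014, 1, 30, 0, 0);
        for (LocalDateTime[] window : splitWindow(start, end, parts)) {
            System.out.println(window[0] + " -> " + window[1]);
        }
    }
}
```

Because the linear-size assumption is only approximate, it is safer to call `client.getSize(...)` again on each sub-request before queueing it.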

Using the legacy Motu API

Once a request has been built and the result size is within allowed limits, you can queue the download request:

  // queue the request and return immediately (async call)
  StatusModeResponse submitStatus = client.queueProductDownload(request);

Then, the status of remote processing can be checked with:

  // fetch the id of the request
  String requestId = submitStatus.getRequestId();
 
  // ask the server for an updated status
  StatusModeResponse status = client.checkStatus(requestId);

The status of the request is encoded in the 'status' property, with the same values returned by the Motu server. Possible values are:

 "0": the request is being processed
 "1": the file is ready for download
 "2": an error occurred
 "3": the request is pending

The status can be checked with the following methods:

  // check for specific status
  boolean r = status.isReady();
  boolean e = status.isError();
  boolean p = status.isInProgress();
 
  // or get the status code
  String code = status.getStatus();

When the request reaches the 'ready' status, the dataset is ready for download. The URL is also included in the status message:

  URI uri = status.getRemoteUri();

Note: the file is usually available on the remote server for a limited amount of time (TODO: quantify this).
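
The queue-and-check cycle described in this section is typically wrapped in a polling loop. The sketch below relies only on the status codes listed above. `StatusPoller` is a hypothetical helper, not part of the client library, and the `Supplier<String>` stands in for a call such as `client.checkStatus(requestId).getStatus()`; the number of attempts and the pause are arbitrary.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.function.Supplier;

final class StatusPoller {

    /**
     * Polls until the status is "1" (ready) and returns the number of checks made.
     * Fails on "2" (error), on unknown codes, or when the attempts run out.
     */
    static int waitUntilReady(Supplier<String> statusSource, int maxAttempts, long pauseMillis) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            String code = statusSource.get();
            switch (code) {
                case "1": // the file is ready for download
                    return attempt;
                case "2": // an error occurred
                    throw new IllegalStateException("the server reported an error");
                case "0": // being processed
                case "3": // pending
                    pause(pauseMillis);
                    break;
                default:
                    throw new IllegalStateException("unexpected status code: " + code);
            }
        }
        throw new IllegalStateException("request not ready after " + maxAttempts + " checks");
    }

    private static void pause(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            throw new IllegalStateException("interrupted while waiting", e);
        }
    }

    public static void main(String[] args) {
        // Simulated server: pending, in progress, then ready
        Iterator<String> fake = Arrays.asList("3", "0", "1").iterator();
        int checks = waitUntilReady(fake::next, 10, 10);
        System.out.println("ready after " + checks + " checks"); // prints "ready after 3 checks"
    }
}
```

Once the loop returns, the download URL can be read from the last status with `getRemoteUri()`.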

Using the enhanced API

This download API allows downloading products of any size: it takes care of splitting the request into a number of smaller chunks and merging the responses into a single NetCDF file. Currently, only synchronous invocation is available for this operation.

  // download the product (synchronous) in a standard location (/tmp)
  File product = client.downloadProduct(request);
 
  // or download the product (synchronous) to a given local file
  File savedProduct = client.downloadProduct(request, localFile);

Under the hood

TODO

References

  1. https://en.wikipedia.org/wiki/Central_Authentication_Service