Difference between revisions of "Motu Client Java"

From Gcube Wiki

Latest revision as of 13:05, 16 January 2018

TODO

This client is part of the CMEMS Dataset Importer.

Overview

TODO

Access to CMEMS data is subject to user authentication. An account can be created at: http://marine.copernicus.eu/services-portfolio/register-now

Authentication is based on the CAS protocol[1]. The CAS server for the CMEMS data infrastructure is https://cmems-cas.cls.fr/cas/login?service=... Users do not need to configure the CAS endpoint, since the client reaches it via an HTTP redirect.

Features

TODO

How CMEMS data are published

  • CMEMS data are published through a number of different Motu servers, usually operated by different institutions.
  • A Motu server publishes a number of Products.
  • A Product is exposed as a set of Datasets.
  • A Dataset usually contains a subset of the variables in a Product, or a different time resolution (e.g. daily means or monthly means).

Note: CMEMS products and datasets are referred to in Motu terminology as services and products, respectively. This client library adopts the Motu terminology. Whenever the term dataset is used here, it is equivalent to Motu product.

Usage

Configure your project

Maven coordinates:

  <dependency>
    <groupId>org.gcube.dataanalysis</groupId>
    <artifactId>cmems-client</artifactId>
    <version>[1.0.0, 2.0.0)</version>
  </dependency>

The current version of the library is 1.0.0.

Create a client

As a first step, you need to create a MotuClient object providing enough information to connect to the corresponding server:

  MotuClient client = new MotuClient("server endpoint");
  client.setUsername("username");
  client.setPassword("password");

Setting the preferred download size

You can optionally specify the size of the chunks to be downloaded. If not provided, the client uses the maximum size allowed by the server (currently most servers allow between 1 and 2 GB). This setting is meaningful only when using the enhanced API, which splits a request into a number of smaller ones.

  client.setPreferredDownloadSize(20*SizeUtils.MB);

Retrieving the product catalogue

A full catalogue containing all services published by the Motu server can be obtained with the code below.

Note: building a catalogue for a Motu server takes some time (about 8 minutes for http://cmems-med-mfc.eu/motu-web/Motu), since it recursively goes into nested server resources. If you plan to deliver the catalogue in an interactive way (e.g. in a GUI), please strongly consider caching to improve the user experience. Conversely, if you only need some specific information (e.g. only the list of services), it's better to query for them individually, as described later in this document.

  // retrieve the catalogue (it might take some time)
  MotuCatalogue catalogue = client.getCatalogue();
 
  // get available services
  Collection<ServiceMetadata> services = catalogue.getServices();
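
The caching suggested in the note above could be sketched as a small time-based cache. This is only an illustration: ExpiringCache is a hypothetical helper, not part of the client library, and the time-to-live is an arbitrary choice.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.function.Supplier;

/** Hypothetical helper (not part of the client library): caches a slow result for a fixed time. */
final class ExpiringCache<T> {
    private final Supplier<T> loader;   // the slow operation, e.g. building the catalogue
    private final Duration timeToLive;  // how long a loaded value is considered fresh
    private T value;
    private Instant loadedAt;           // null until the first load

    ExpiringCache(Supplier<T> loader, Duration timeToLive) {
        this.loader = loader;
        this.timeToLive = timeToLive;
    }

    /** Returns the cached value, reloading it only when it is missing or older than timeToLive. */
    synchronized T get() {
        Instant now = Instant.now();
        if (loadedAt == null || Duration.between(loadedAt, now).compareTo(timeToLive) >= 0) {
            value = loader.get();
            loadedAt = now;
        }
        return value;
    }

    public static void main(String[] args) {
        // a stand-in loader; in practice this would wrap the catalogue retrieval
        ExpiringCache<String> cache = new ExpiringCache<>(() -> "catalogue", Duration.ofHours(6));
        System.out.println(cache.get()); // first call loads the value
        System.out.println(cache.get()); // second call is served from the cache
    }
}
```

In a real application the loader would wrap client.getCatalogue(); whether that call throws checked exceptions is not stated here, so the wrapping lambda may need to handle them.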

A service usually contains several products (datasets). Typically there are products for different time resolutions and/or different variables.

  // get the list of products for a service
  Collection<ProductMetadataInfo> products = service.getProducts();

Product metadata can be obtained with:

  ProductMetadataInfo product = ...;
 
  // get timestamps for which there are data available
  List<Calendar> times = product.getAvailableTimeCodes(); 
  // the oldest timestamp in the dataset
  Calendar start = product.getFirstAvailableTimeCode();
 
  // the most recent timestamp in the dataset
  Calendar end = product.getLastAvailableTimeCode();
 
  // get the time resolution of the dataset (in hours)
  Long hours = product.getTimeResolution();
 
  // get a list of depths for which there are data available
  List<Double> depths = product.getAvailableDepths();
 
  // get the lowest depth value (i.e. the level closest to the surface)
  Double firstDepth = product.getFirstAvailableDepth();
 
  // get the highest depth value (i.e. the deepest level)
  Double lastDepth = product.getLastAvailableDepth();
 
  // get the set of variables in the dataset
  Collection<Variable> variables = product.getVariables();
 
  // get the dimensions of the dataset (e.g. lat, lon, depth)
  Collection<Axis> axes = product.getDataGeospatialCoverage();
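
As an illustration of how these metadata can be combined, the sketch below estimates the number of time steps between the first and last timestamps from the time resolution in hours. It assumes regularly spaced timestamps, which is not guaranteed by anything stated here; TimeSteps is a hypothetical helper, not part of the client library.

```java
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.TimeZone;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper: estimates the number of time steps, assuming regular spacing. */
final class TimeSteps {
    /** Counts the timestamps from first to last (both included) at the given resolution. */
    static long count(Calendar first, Calendar last, long resolutionHours) {
        if (resolutionHours <= 0 || last.before(first)) {
            throw new IllegalArgumentException("invalid time range or resolution");
        }
        long hours = TimeUnit.MILLISECONDS.toHours(last.getTimeInMillis() - first.getTimeInMillis());
        return hours / resolutionHours + 1;
    }

    public static void main(String[] args) {
        Calendar first = new GregorianCalendar(TimeZone.getTimeZone("UTC"));
        first.clear();
        first.set(2014, Calendar.JANUARY, 1);
        Calendar last = (Calendar) first.clone();
        last.set(2014, Calendar.JANUARY, 30);
        // daily resolution (24 hours) between 1 and 30 January
        System.out.println(count(first, last, 24));
    }
}
```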

Getting the list of services

Instead of retrieving the full catalogue (which might be slow), you can get just the list of available services, without any nested product information:

  // get a shallow list of available services
  Collection<ServiceMetadata> services = client.listServices();

Getting product metadata

Similarly, once you know the service and product name, you can retrieve product metadata directly:

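  // get product metadata by service and product name
  // (describeProduct is assumed here to be a MotuClient method, like listServices())
  ServiceMetadata service = ...;
  ProductMetadataInfo product = client.describeProduct(service, "productName");
 
  // or, using the service name
  ProductMetadataInfo sameProduct = client.describeProduct("serviceName", "productName");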

Downloading products

Products can be downloaded using two approaches:

  • the legacy Motu API (asynchronous, with limited download size);
  • the enhanced gCube API (synchronous, without any limitation on the download size).

Building a download request

Once you have identified the product to download, prepare a download request that specifies the product, bounding box, time frame and variables.

  // build a download request
  DownloadRequest request = new DownloadRequest();
 
  // the service name is mandatory
  request.setService("MEDSEA_ANALYSIS_FORECAST_PHYS_006_001-TDS");
 
  // product name is also mandatory
  request.setProduct("cmemsv02-med-ingv-cur-an-fc-d");
 
  // optionally, set the longitude range. If not set, the whole extent is downloaded
  request.setxRange(15d, 20d);
 
  // optionally, set the latitude range. If not set, the whole extent is downloaded
  request.setyRange(35d, 40d);
 
  // optionally, set the depth range. If not set, the whole depth extent is downloaded
  request.setzRange(1.4721018075942993, 5334.64794921875);
 
  // optionally, set the time window. If not set, all timestamps are downloaded
  request.settRange("2014-01-01 00:00:00", "2014-01-30 00:00:00");
 
  // the script version (this seems to be mandatory)
  request.setScriptVersion("1.4.00-20170410143941999");
 
  // process the request in asynchronous mode.
  request.setMode("status");
 
  // the output format (no alternatives here)
  request.setOutput("netcdf");
 
  // optionally set which variables to download. If not set, all of them are downloaded
  request.addVariable("vomecrty");
  request.addVariable("vozocrtx");
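
The time window in the request above is passed as strings. The sketch below shows one way to produce such strings from date objects; the pattern is inferred from the example values on this page and is an assumption, and TimeRangeFormat is a hypothetical helper.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

/** Hypothetical helper: formats timestamps like the strings used in the request example. */
final class TimeRangeFormat {
    // pattern inferred from the example values "2014-01-01 00:00:00" (assumption)
    static final DateTimeFormatter REQUEST_TIME = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    static String format(LocalDateTime time) {
        return time.format(REQUEST_TIME);
    }

    public static void main(String[] args) {
        LocalDateTime start = LocalDateTime.of(2014, 1, 1, 0, 0);
        LocalDateTime end = start.plusDays(29);
        // the two strings could then be passed to request.settRange(...)
        System.out.println(format(start) + " .. " + format(end));
    }
}
```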

Getting the estimated download size

Motu has an operation to estimate the size of a download request:

  // get the estimated output size
  RequestSize requestSize = client.getSize(request);
 
  // here is the size (in bytes)
  Long actualSize = requestSize.getSizeInBytes();
 
  // here is also the maximum allowed download size
  Long maxSize = requestSize.getMaxAllowedSizeInBytes();

When using the legacy Motu download API, the estimated size must be smaller than the maximum allowed by the server. If it is larger, you need to shrink the request by reducing the bounding box, the time span and/or the set of variables.
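
The comparison between the two sizes is plain arithmetic and can be sketched as follows. DownloadSizeCheck is a hypothetical helper; the chunk count is only an estimate of how many smaller requests would be needed, not a description of how the library splits requests.

```java
/** Hypothetical helper: compares an estimated download size with the server limit. */
final class DownloadSizeCheck {
    /** True when the request can be served by the legacy API in a single download. */
    static boolean fitsLegacyApi(long estimatedBytes, long maxAllowedBytes) {
        return estimatedBytes < maxAllowedBytes;
    }

    /** Number of chunks of at most chunkBytes needed to cover estimatedBytes (rounded up). */
    static long chunksNeeded(long estimatedBytes, long chunkBytes) {
        if (estimatedBytes < 0 || chunkBytes <= 0) {
            throw new IllegalArgumentException("sizes must be non-negative and chunk size positive");
        }
        return estimatedBytes / chunkBytes + (estimatedBytes % chunkBytes == 0 ? 0 : 1);
    }

    public static void main(String[] args) {
        long gb = 1024L * 1024L * 1024L;
        System.out.println(fitsLegacyApi(5 * gb, 2 * gb)); // too large for a single download
        System.out.println(chunksNeeded(5 * gb, 2 * gb));  // chunks of at most 2 GB
    }
}
```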

Using the legacy Motu API

Once a request has been built and the result size is within allowed limits, you can queue the download request:

  // queue the request and return immediately (async call)
  StatusModeResponse submitStatus = client.queueProductDownload(request);

Then, the status of remote processing can be checked with:

  // fetch the id of the request
  String requestId = submitStatus.getRequestId();
 
  // ask the server for an updated status
  StatusModeResponse status = client.checkStatus(requestId);

The status of the request is encoded in the 'status' property, with the same values returned by the Motu server. Possible values are:

 "0": the request is being processed
 "1": the file is ready for download
 "2": an error occurred
 "3": the request is pending

The status can be checked with the following methods:

  // check for specific status
  boolean r = status.isReady();
  boolean e = status.isError();
  boolean p = status.isInProgress();
 
  // or get the status code
  String code = status.getStatus();

When the request reaches the 'ready' status, the dataset is ready for download. The download URL is included in the status message:

  URI uri = status.getRemoteUri();
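
Since the legacy API is asynchronous, a caller typically polls the status until the request is ready or has failed. The sketch below illustrates such a loop using the status codes listed above. StatusPolling, the pause and the attempt limit are hypothetical; the status supplier stands in for a call such as client.checkStatus(requestId).getStatus().

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.function.Supplier;

/** Hypothetical helper: polls a status code until the request is ready or failed. */
final class StatusPolling {
    /** Request states, mapped from the status codes "0" to "3". */
    enum State {
        IN_PROGRESS, READY, ERROR, PENDING;

        static State fromCode(String code) {
            switch (code) {
                case "0": return IN_PROGRESS;
                case "1": return READY;
                case "2": return ERROR;
                case "3": return PENDING;
                default: throw new IllegalArgumentException("unknown status code: " + code);
            }
        }
    }

    /** Asks for the status up to maxAttempts times, pausing between attempts. */
    static State waitForCompletion(Supplier<String> statusCode, long pauseMillis, int maxAttempts) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            State state = State.fromCode(statusCode.get());
            if (state == State.READY || state == State.ERROR) {
                return state;
            }
            try {
                Thread.sleep(pauseMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for the request", e);
            }
        }
        throw new IllegalStateException("request not completed after " + maxAttempts + " attempts");
    }

    public static void main(String[] args) {
        // simulated sequence of status codes: pending, in progress, ready
        Iterator<String> codes = Arrays.asList("3", "0", "1").iterator();
        System.out.println(waitForCompletion(codes::next, 10L, 5));
    }
}
```

Once the loop returns READY, the download URL can be read from the latest status object as shown above.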

Note: the file is usually available on the remote server for a limited amount of time (TODO: quantify this).

Using the enhanced API

This download API can download products of any size: it splits the request into a number of smaller chunks and merges the responses into a single NetCDF file. Currently, only synchronous invocation is available for this operation.

  // download the product (synchronous) to a standard location (/tmp)
  File product = client.downloadProduct(request);
 
  // or download the product (synchronous) to a given local file
  File localFile = new File("product.nc"); // hypothetical target path
  File savedProduct = client.downloadProduct(request, localFile);
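
How the enhanced API splits a request is not described on this page. One plausible approach, sketched below purely as an assumption, is to partition the available timestamps into consecutive groups and issue one smaller request per group, using the first and last timestamp of each group as its time window. TimestampChunks is a hypothetical helper.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper: partitions an ordered list of timestamps into consecutive chunks. */
final class TimestampChunks {
    /** Splits the list into consecutive sublists of at most maxPerChunk elements each. */
    static <T> List<List<T>> partition(List<T> timestamps, int maxPerChunk) {
        if (maxPerChunk <= 0) {
            throw new IllegalArgumentException("maxPerChunk must be positive");
        }
        List<List<T>> chunks = new ArrayList<>();
        for (int from = 0; from < timestamps.size(); from += maxPerChunk) {
            int to = Math.min(from + maxPerChunk, timestamps.size());
            chunks.add(new ArrayList<>(timestamps.subList(from, to)));
        }
        return chunks;
    }

    public static void main(String[] args) {
        // day numbers stand in for real timestamps
        List<Integer> days = Arrays.asList(1, 2, 3, 4, 5);
        for (List<Integer> chunk : partition(days, 2)) {
            // each chunk would become one sub-request covering first..last
            System.out.println(chunk.get(0) + " .. " + chunk.get(chunk.size() - 1));
        }
    }
}
```

Splitting along the time axis keeps each sub-request on the same grid and variables, which makes the pieces straightforward to concatenate; other axes could be used in the same way.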

Under the hood

TODO

References

  1. https://en.wikipedia.org/wiki/Central_Authentication_Service