GCube Data Catalogue for GRSF

** THIS PAGE IS UNDER CONSTRUCTION **

GCube Data Catalogue: support for GRSF

This page reports the relevant information about the GRSF Data Catalogue, which is available here. It extends the main gCube Data Catalogue guide, which you are advised to read before continuing.

The GRSF Data Catalogue stores, and allows the publication of, products of two types: Stock and Fishery. Apart from the default set of metadata, each product type has its own specific fields. Some of these fields automatically become tags of the product. The same applies to group associations: a set of groups is already available, and each product is automatically associated with the relevant ones during publication. Fields that trigger tag creation or group association are documented below.

Publication is performed through a RESTful service whose publish methods accept JSON objects. The service also allows publishing the records of the original sources (i.e., "RAM", "FishSource" and "FIRMS") on which the aggregated GRSF records are built.

Metadata

Base Metadata

The following table shows the core metadata, that is, the fields shared by both Stock and Fishery types. Some of them are filled in automatically. The values of some fields are automatically used to tag the product: check the 'Is Tag' column of the table below. Other fields have a controlled vocabulary (that is, they take values from a defined set), and the value assigned to them automatically determines the group to which the product is assigned: check the 'Is Group' column below.

The table also reports the column Type that shows how the field is mapped into the product, once it is published within the catalogue, i.e. as a simple field (key/value pair, possibly repeatable) or a resource (an object with a mandatory name, an optional description and a mandatory url).

IMPORTANT: Any other field that does not match one of those listed below is managed as an object with a key and a value, both of String type, and attached to the product as a simple field.

Name | Api Name (JSON) | Display Section | Is Tag | Is Group | Is Sensitive | Example | Guidelines/Comments
Title | stock_name or fishery_name | Top of the record | No | No | No
Description | N/A | Top of the record | No | No | No
License | license_id | Right panel | No | No | No | CC-BY-SA-4.0
Author | N/A | Management Info | No | No | No | | This field is automatically compiled by using the information of the caller entity.
Author contact | N/A | Management Info | No | No | No | joe.blogg@example.com | This field is automatically compiled by using the information of the caller entity.
Version | version | Management Info | No | No | No | 1.0
Maintainer | N/A | Management Info | No | No | No
Maintainer Contact | N/A | Management Info | No | No | No

Common Metadata

Besides the base metadata above, the following attributes are captured for both Stock and Fishery records.

Name | Api Name (JSON) | Display Section | Is Tag | Is Group | Is Sensitive | Type | Example | Guidelines/Comments
GRSF Type | grsf_type | Stock/Fishery Identity | Yes | No | No | Field
Short Name | short_name | Stock/Fishery Identity | No | No | No | Field
Database Source | database_sources | Stock/Fishery Identity | No | Yes in GRSF_Admin only | No | Field (repeatable) | "database_sources": [ "FishSource", "Fisheries and Resources Monitoring System (FIRMS)", "RAM Legacy Stock Assessment Database", "FAO SDG 14.4.1 Questionnaire" ]
Species | species | Stock/Fishery Identity | Yes | No | No | Field (repeatable)
Similar GRSF Record | similar_grsf_records | Stock/Fishery Identity | No | No | No | Field (repeatable)
Connected Stock Record / Connected Fishery Record | connected | Stock/Fishery Identity | No | No | No | Field (repeatable)
Data Owner | data_owner | Stock/Fishery Data | No | No | Yes | Field (repeatable)
Catch | catches | Stock/Fishery Data | No | Yes (Catch) | Yes | TimeSeries
Landing | landings | Stock/Fishery Data | No | Yes (Landing) | Yes | TimeSeries
SDG Flag | sdg_flag | Additional Info | No | Yes (only if the value is true) | No | Field
Status of the Record | status_grsf_record | Additional Info | No | No | No | Field
is "spatial" due to plugin constraint MUST BE "Geospatial" | spatial | Additional Info | No | No | No | Field
Domain | N/A | Additional Info | No | Yes (Stock or Fishery) | No | Field
GRSF UUID | grsf_uuid | Additional Info | No | No | No | Field
Citation | citation | Additional Info + Top Button | No | No | No | Field
Annotation | annotations | Additional Info | No | No | Yes | Field (repeatable)
N/A | source_of_information | Data and Resources | No | No | No | Resource
N/A | refers_to | Data and Resources | No | No | No | Resource
N/A | connections_indicator | N/A | Yes (connected, not connected) | No | No | N/A
N/A | similarities_indicator | N/A | Yes ("with similar records", "without similar records") | No | No | N/A
N/A | source from URL path | N/A | No | Yes - if(source == 'GRSF') 'grsf-group' else 'legacy-group' | No | N/A | | Example URLs: /stock/grsf, /stock/ram, /stock/firms, /stock/fishsource, /stock/sdg
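
The group rule in the last row of the table can be read as a simple conditional on the source segment of the URL path. The sketch below only illustrates that rule: the class and method names are hypothetical, and the case-insensitive comparison is an assumption, not documented service behaviour.

```java
import java.util.Locale;

/** Illustrative only: maps the source segment of the URL path to a group name. */
final class SourceGroupRule {

    // Group names as they appear in the table above.
    static final String GRSF_GROUP = "grsf-group";
    static final String LEGACY_GROUP = "legacy-group";

    /**
     * Returns "grsf-group" when the source is GRSF and "legacy-group" otherwise.
     * Comparing case-insensitively is an assumption of this sketch.
     */
    static String groupFor(String source) {
        return "grsf".equals(source.toLowerCase(Locale.ROOT)) ? GRSF_GROUP : LEGACY_GROUP;
    }
}
```

Under this reading, a record published through /stock/grsf lands in 'grsf-group', while one published through /stock/ram or /stock/firms lands in 'legacy-group'.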

Stock Metadata

The Stock product type also supports the following list of fields.

Name | Api Name (JSON) | Display Section | Is Tag | Is Group | Is Sensitive | Type | Example | Guidelines/Comments
GRSF Stock Name | stock_name | Stock Identity + Top of the record | No | No | No | Field
GRSF Semantic Identifier | grsf_semantic_identifier | Stock Identity + Top of the record | No | No | No | Field
Assessment Area | assessment_area | Stock Identity | Yes | No | No | Field (repeatable)
Intersecting FAO Major Fishing Areas | hidden_assessment_area | Stock Identity | Yes | No | Yes | Field (repeatable)
Assessment Method | assessment_methods | Stock Data | No | Yes (Group name: "Assessment Method") | Yes | TimeSeries
Abundance Level (FIRMS Standard) | firms_standard_abundance_level | Stock Data | No | Yes (Group name: "Abundance Level (FIRMS Standard)") | Yes | TimeSeries
Abundance Level | abundance_level | Stock Data | No | Yes (Group name: "Abundance Level") | Yes | TimeSeries
Biomass | biomass | Stock Data | No | Yes (Group name: "Biomass") | Yes | TimeSeries
Fishing Pressure (FIRMS Standard) | firms_standard_fishing_pressure | Stock Data | No | Yes (Group name: "Fishing Pressure (FIRMS Standard)") | Yes | TimeSeries
Fishing Pressure | fishing_pressure | Stock Data | No | Yes (Group name: "Fishing Pressure") | Yes | TimeSeries
State and Trend | state_and_trend_of_marine_resources | Stock Data | No | Yes (Group name: "State and Trend") | Yes | TimeSeries
FAO Stock Status Category | fao_categories | Stock Data | No | Yes (Group name: "FAO Stock Status Category") | Yes | TimeSeries
Scientific Advice | scientific_advice | Stock Data | No | Yes (Group name: "Scientific Advice") | Yes | TimeSeries
Assessor | assessor | Stock Data | No | No | Yes | Field

Fishery Metadata

The Fishery product type also supports the following list of fields.

Name | Api Name (JSON) | Display Section | Is Tag | Is Group | Is Sensitive | Type | Example | Guidelines/Comments
GRSF Fishery Name | fishery_name | Fishery Identity + Top of the record | No | No | No | Field
GRSF Semantic identifier | grsf_semantic_identifier | Fishery Identity + Top of the record | No | No | No | Field
Traceability Flag | traceability_flag | Fishery Identity | No | Yes (only if the value is true) | No | Field
Fishing Area | fishing_area | Fishery Identity | Yes | No | No | Field (repeatable)
Intersecting FAO Major Fishing Areas | hidden_fishing_area | Fishery Identity | Yes | No | Yes | Field (repeatable)
Jurisdiction Area | jurisdiction_area | Fishery Identity | Yes | No | No | Field (repeatable)
Flag State | flag_state | Fishery Identity | Yes | No | No | Field (repeatable)
Fishing Gear | fishing_gear | Fishery Identity | Yes | No | No | Field (repeatable)
Management Body/Authority | management_body_authorities | Fishery Identity | No | No | No | Field (repeatable)
Resources Exploited | resources_exploited | Fishery Identity | No | No | No | Field (repeatable)

GRSF Publication Web Service

Record publication is performed by means of a RESTful web service running over SmartGears. Almost every call to the service requires the security token of the user for the context in which they want to publish or use the other functionalities. Please note that publishing a product requires the user to have sufficient privileges. The list of roles and associated privileges for the catalogue users is reported here. The VRE Manager assigns them.

In order to retrieve your security token you can use the token generator portlet.

The address of the GRSF service in a given context can be discovered by means of the Information System [1]. You need the following parameters:

Service Name = GRSFPublisher
Service Class = Data-Catalogue
Entry Name = jersey-servlet

For testing purposes, a running instance can be contacted at the following address:

https://smart-grsf-d-d4s.d4science.org/grsf-publisher-ws/rest/  [GRSF_PUBLISHER_WS_BASE_URL]

The token for testing purposes can be retrieved from the VRE at this URL: https://next.d4science.org/group/nextnext/home (register yourself if needed). In the following, every request must specify the type of record it applies to, i.e., the request path must contain one of the following values:

  • grsf;
  • fishsource;
  • ram;
  • firms.

Please check the Troubleshooting section in case of errors during the publication phase.
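
Since every request path combines the base URL, a record type and a product type, a small helper can keep the pieces consistent. The following is a minimal sketch: the class, enum and method names are hypothetical, and only the path layout and the gcube-token parameter come from this page.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

/** Illustrative helper that assembles GRSF publisher endpoint URLs. */
final class GrsfEndpoints {

    /** Record types accepted in the request path, as listed above. */
    enum RecordType { grsf, fishsource, ram, firms }

    /** Product types exposed by the service. */
    enum ProductType { stock, fishery }

    /**
     * Builds [BASE_URL]/{recordType}/{productType}/{method}?gcube-token=TOKEN.
     * The token is URL-encoded as a precaution.
     */
    static String build(String baseUrl, RecordType record, ProductType product,
                        String method, String token) {
        // Accept base URLs with or without a trailing slash.
        String base = baseUrl.endsWith("/") ? baseUrl : baseUrl + "/";
        return base + record + "/" + product + "/" + method
                + "?gcube-token=" + URLEncoder.encode(token, StandardCharsets.UTF_8);
    }
}
```

For instance, GrsfEndpoints.build(baseUrl, RecordType.ram, ProductType.stock, "publish-product", token) yields the publish address for a RAM stock record.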

Check Service Availability

To check that the stock/fishery service is up and running, open the URL below in your browser:

[GRSF_PUBLISHER_WS_BASE_URL]/{...}/fishery/hello

Specify the record type by replacing {...}. The response should look like

Hello.. Fishery service is here

or

[GRSF_PUBLISHER_WS_BASE_URL]/{...}/stock/hello

and the response should look like

Hello.. Stock service is here
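
The same check can be scripted instead of using a browser. The sketch below uses the HTTP client bundled with Java 11 or later; the class and method names are hypothetical, and treating "HTTP 200 plus a body starting with Hello" as the success condition is an assumption based on the sample responses above.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

/** Illustrative availability check for the stock/fishery services. */
final class GrsfHealthCheck {

    /** Builds the hello URL, e.g. [BASE_URL]/grsf/stock/hello. */
    static URI helloUri(String baseUrl, String recordType, String productType) {
        String base = baseUrl.endsWith("/") ? baseUrl : baseUrl + "/";
        return URI.create(base + recordType + "/" + productType + "/hello");
    }

    /** Returns true when the service answers 200 with a greeting starting with "Hello". */
    static boolean isUp(HttpClient client, URI uri) throws IOException, InterruptedException {
        HttpRequest request = HttpRequest.newBuilder(uri)
                .timeout(Duration.ofSeconds(5))
                .GET()
                .build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        return response.statusCode() == 200 && response.body().trim().startsWith("Hello");
    }
}
```

A caller would typically pass HttpClient.newHttpClient() and the URI returned by helloUri to isUp.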

Retrieve the licenses list

The default license associated with a product, if none is specified, is CC-BY-SA-4.0. However, if it does not fit your needs, you can use one of the other available licenses, which can be retrieved by contacting the service(s) this way

[GRSF_PUBLISHER_WS_BASE_URL]/{...}/fishery/get-licenses?gcube-token=YOUR_TOKEN

or, for stock

[GRSF_PUBLISHER_WS_BASE_URL]/{...}/stock/get-licenses?gcube-token=YOUR_TOKEN

The response is a JSON object containing <license key, license name> pairs, which looks like

{
    "AFL-3.0": "Academic Free License 3.0",
    "RPSL-1.0": "RealNetworks Public Source License 1.0",
    "ODC-BY-1.0": "Open Data Commons Attribution License 1.0",
    "IPL-1.0": "IBM Public License 1.0",
    "ODbL-1.0": "Open Data Commons Open Database License 1.0",
    "PostgreSQL": "PostgreSQL License",
    "W3C": "W3C License",
    ....
}

During the publication phase, the identifier of the license chosen should be provided.
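
Before publishing, it can be useful to verify that the chosen license identifier is among the keys returned by get-licenses. The sketch below is a deliberately simplified illustration: the class and method names are hypothetical, and the regular expression only handles a flat object of plain string pairs without escaped quotes. A real client would normally use a JSON library.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative parser for the flat <license key, license name> JSON object. */
final class LicenseList {

    // Matches "key": "value" pairs; escaped quotes inside strings are not handled.
    private static final Pattern PAIR =
            Pattern.compile("\"([^\"\\\\]*)\"\\s*:\\s*\"([^\"\\\\]*)\"");

    /** Extracts the license pairs, preserving the order of the response. */
    static Map<String, String> parse(String json) {
        Map<String, String> licenses = new LinkedHashMap<>();
        Matcher matcher = PAIR.matcher(json);
        while (matcher.find()) {
            licenses.put(matcher.group(1), matcher.group(2));
        }
        return licenses;
    }

    /** Returns true when the given identifier is one of the returned license keys. */
    static boolean isKnownLicense(String json, String licenseId) {
        return parse(json).containsKey(licenseId);
    }
}
```

The key found this way is the value to pass as license_id in the publication request.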

Stock Publication Example

The publish method to invoke to publish a stock for firms, grsf, fishsource or ram is the following

[GRSF_PUBLISHER_WS_BASE_URL]/{...}/stock/publish-product?gcube-token=YOUR_TOKEN

The JSON object you must provide in input has the following structure (of course, not all fields are mandatory)

{
    "description":...,
    "license_id":...,
    "version":...,
    "maintainer":...,
    "maintainer_contact":...,
    "database_sources":[...],
    "source_of_information":[...],
    "uuid_knowledge_base":...,
    "traceability_flag":...,
    "data_owner":...,
    "type":...,
    "stock_name":...,
    "stock_id":...,
    "stock_uri":...,
    "species":[...],
    "assessment_distribution_area":[...],
    "exploiting_fishery":[...],
    "management_entity":...,
    "assessment_methods":...,
    "state_of_marine_resource":...,
    "standard_exploitation_rate":[...],
    "standard_abundance_level":[...],
    "water_area":[...],
    "narrative_state_and_trend":...,
    "scientific_advice":...,
    "reporting_entity":...,
    "reporting_year":...,
    "status":...,
    "short_title":...,
    "refers_to":[...]
}

The response of the method is a JSON object of this kind

{
   "id": ... , // identifier of the created product in the catalogue
   "knowledge_base_id": ..., // identifier of the product in the knowledge base
   "product_url": ..., // url of the created product
   "error": ... // in case of error, check this field
}

In case of success the HTTP code is 201 (CREATED) and the response contains the url and the unique identifier assigned to the product. In case of errors (BAD_REQUEST, INTERNAL_SERVER_ERROR, FORBIDDEN ... ) the "error" message of the above object reports what was wrong.

Example

A valid JSON, for example, is the following one

/*TODO*/

the response obtained from the service is

/*TODO*/

Fishery Publication Example

The publish method to invoke to publish a fishery product is the following

[GRSF_PUBLISHER_WS_BASE_URL]/{...}/fishery/publish-product?gcube-token=YOUR_TOKEN

The JSON object you must provide in input has the following structure (of course, not all fields are mandatory)

{
   "description": ...,
   "license_id": ...,
   "version": ...,
   "maintainer": ...,
   "maintainer_contact": ...,
   "catches_or_landings": [...],
   "database_sources": [...],
   "source_of_information": [...],
   "uuid_knowledge_base" : ...,
   "traceability_flag": ...,
   "short_title": ...,
   "refers_to":[...],
   "data_owner": ...,
   "reporting_year": ...,
   "type": ...,
   "fishery_name": ...,
   "fishery_id": ...,
   "scientific_name": ...,
   "fishing_area": [...],
   "exploited_stocks": [...],
   "management_entity": ...,
   "jurisdiction_area": ...,
   "production_system_type": ...,
   "flag_state": ...,
   "fishing_gear": ...,
   "status": ...,
   "environment": ...
}

The response of the method is a JSON object of this kind

{
   "id": ... , // identifier of the created product in the catalogue
   "knowledge_base_id": ..., // identifier of the product in the knowledge base
   "product_url": ..., // url of the created product
   "error": ... // in case of error, check this field
}

In case of success the HTTP code is 201 (CREATED) and the response contains the url and the unique identifier assigned to the product. In case of errors (BAD_REQUEST, INTERNAL_SERVER_ERROR, FORBIDDEN ... ) the "error" message of the above object reports what was wrong.
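
As an illustration of the call described above, the sketch below builds and sends the fishery publication request with the HTTP client bundled with Java 11 or later. The class and method names are hypothetical; the sketch assumes the token contains only URL-safe characters and treats any status other than 201 as a failure.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.time.Duration;

/** Illustrative client for the fishery publish-product method. */
final class FisheryPublisher {

    /** Builds the POST request; jsonBody is the fishery JSON object described above. */
    static HttpRequest publishRequest(String baseUrl, String recordType,
                                      String token, String jsonBody) {
        String base = baseUrl.endsWith("/") ? baseUrl : baseUrl + "/";
        URI uri = URI.create(base + recordType
                + "/fishery/publish-product?gcube-token=" + token);
        return HttpRequest.newBuilder(uri)
                .timeout(Duration.ofSeconds(30))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(jsonBody, StandardCharsets.UTF_8))
                .build();
    }

    /** Sends the request and returns the response JSON, failing on any status other than 201. */
    static String publish(HttpClient client, HttpRequest request) throws Exception {
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 201) {
            // The "error" field of the returned JSON explains what went wrong.
            throw new IllegalStateException("Publication failed with HTTP "
                    + response.statusCode() + ": " + response.body());
        }
        return response.body();
    }
}
```

The same structure applies to stock records by replacing the fishery path segment with stock.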

Example

A valid JSON, for example, is the following one

/* TODO */

the response obtained from the service is

/* TODO */

Delete a published product

If for some reason you need to delete a published product, you can invoke the following HTTP DELETE methods. For fishery it is

[GRSF_PUBLISHER_WS_BASE_URL]/{...}/fishery/delete-product?gcube-token=YOUR_TOKEN

whereas for stock

[GRSF_PUBLISHER_WS_BASE_URL]/{...}/stock/delete-product?gcube-token=YOUR_TOKEN

You must provide the identifier of the product (returned at creation time) in a JSON object that looks like

{"id": "product-to-delete-identifier"}

In case of success, the response status of the service is 200 (OK).
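
A DELETE request carrying a JSON body can be built as sketched below with the HTTP client bundled with Java 11 or later. The class and method names are hypothetical, and the Content-Type header is an assumption, since this page does not state which headers the delete method expects.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

/** Illustrative builder for the delete-product request. */
final class ProductDeleter {

    /** Builds the {"id": "..."} body, escaping backslashes and quotes in the identifier. */
    static String deleteBody(String productId) {
        String escaped = productId.replace("\\", "\\\\").replace("\"", "\\\"");
        return "{\"id\": \"" + escaped + "\"}";
    }

    /** Builds the DELETE request for a stock or fishery product of the given record type. */
    static HttpRequest deleteRequest(String baseUrl, String recordType, String productType,
                                     String token, String productId) {
        String base = baseUrl.endsWith("/") ? baseUrl : baseUrl + "/";
        URI uri = URI.create(base + recordType + "/" + productType
                + "/delete-product?gcube-token=" + token);
        return HttpRequest.newBuilder(uri)
                .header("Content-Type", "application/json") // assumption, see the note above
                .method("DELETE", HttpRequest.BodyPublishers.ofString(
                        deleteBody(productId), StandardCharsets.UTF_8))
                .build();
    }
}
```

The request can then be sent with HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()), checking that the returned status code is 200.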

How To Publish a GRSF product using JAVA

Below you find a simple Java application that publishes a GRSF product (in this case, a stock).

package [YOUR PACKAGE];

import org.apache.commons.httpclient.HttpClient;
import org.apache.commons.httpclient.HttpException;
import org.apache.commons.httpclient.MultiThreadedHttpConnectionManager;
import org.apache.commons.httpclient.methods.ByteArrayRequestEntity;
import org.apache.commons.httpclient.methods.PostMethod;
import org.apache.commons.httpclient.HttpStatus;
import org.apache.log4j.Logger;

/**
 * The Class GRSFPublishMetadata.
 *
 * @author Francesco Mangiacrapa francesco.mangiacrapa@isti.cnr.it
 * Oct 13, 2016
 */
public class GRSFPublishMetadata {

	public static final Logger logger = Logger.getLogger(GRSFPublishMetadata.class);
	private static final String GRSF_PUBLISHER_REST_SERVICE_BASE_URL = "https://next.d4science.org/grsf-publisher-ws/rest/";

	/**
	 * The Enum PRODUCT_TYPE.
	 *
	 * @author Francesco Mangiacrapa francesco.mangiacrapa@isti.cnr.it
	 * Oct 13, 2016
	 */
	private static enum PRODUCT_TYPE{stock, fishery}
	private static final String PUBLISH_PRODUCT_REQUEST = "publish-product";
	private static final String GCUBE_TOKEN_PARAMETER = "gcube-token";
	private static final String GCUBE_TOKEN_VALUE = "YOUR_TOKEN"; //***********SET YOUR TOKEN************
	private static final String CONTENTTYPE = "application/json";
	private HttpClient httpClient = null;
	public static final int TIME_OUT_REQUESTS = 5000; //5 sec

	/**
	 * Instantiates a new GRSF publish metadata.
	 */
	public GRSFPublishMetadata() {
		MultiThreadedHttpConnectionManager connectionManager = new MultiThreadedHttpConnectionManager();
		connectionManager.getParams().setSoTimeout(TIME_OUT_REQUESTS);
		this.httpClient = new HttpClient(connectionManager);

	}

	/**
	 * Publish product.
	 *
	 * @param type the type
	 * @param body the body
	 * @return the string
	 * @throws Exception the exception
	 */
	public String publishProduct(PRODUCT_TYPE type, String body) throws Exception {
		// Build the request URL. The base URL already ends with "/", so no extra separator is added.
		// NOTE: the endpoints documented above also carry a record-type segment (grsf, fishsource,
		// ram or firms) before the product type; adapt the URL if your deployment requires it.
		String buildURL = GRSF_PUBLISHER_REST_SERVICE_BASE_URL + type.toString() + "/" + PUBLISH_PRODUCT_REQUEST + "?" + GCUBE_TOKEN_PARAMETER + "=" + GCUBE_TOKEN_VALUE;
		PostMethod method = new PostMethod(buildURL);
		method.setRequestHeader("Content-type", CONTENTTYPE);
		logger.debug("call post to URI .... " + method.getURI());
		logger.debug("	the body is..." + body);
		// Encode the body explicitly as UTF-8 rather than relying on the platform default.
		method.setRequestEntity(new ByteArrayRequestEntity(body.getBytes("UTF-8")));
		try {
			// Execute the method and read the response body once.
			int statusCode = httpClient.executeMethod(method);
			byte[] rawBody = method.getResponseBody();
			String responseBody = rawBody == null ? "" : new String(rawBody, "UTF-8");

			if (statusCode != HttpStatus.SC_OK && statusCode != HttpStatus.SC_CREATED) {
				String message = "Method failed: " + method.getStatusLine() + "; Response body: " + responseBody;
				logger.error(message);
				throw new Exception(message);
			}
			return responseBody;
		} catch (HttpException e) {
			logger.error("Fatal protocol violation: ", e);
			throw new Exception("Fatal protocol violation: " + e.getMessage(), e);
		} catch (java.io.IOException e) {
			logger.error("Fatal transport error: ", e);
			throw new Exception("Fatal transport error: " + e.getMessage(), e);
		} finally {
			// Always release the connection, on success and on failure.
			method.releaseConnection();
		}
	}

	/**
	 * The main method.
	 *
	 * @param args the arguments
	 */
	public static void main(String[] args) {
		try {
			GRSFPublishMetadata grsfP = new GRSFPublishMetadata();
			String minimal_json_stock =
			   "{\"description\":\"This stock product was generated for testing purposes, to show how publication works. Please refer to https://wiki.gcube-system.org/gcube/GCube_Data_Catalogue_for_GRSF for more information\"," +
			   "\"license_id\":\"CC-BY-SA-4.0\"," +
			   "\"version\":1," +
			   "\"maintainer\":\"Francesco Mangiacrapa\","+
			   "\"maintainer_contact\":\"francesco.mangiacrapa@isti.cnr.it\","+
			   "\"catches_or_landings\":\"unknown\","+
			   "\"database_sources\":["+
			      "{\"name\":\"RAM\",\"url\":\"test url\"}" +
			      "]" +
			    ",\"source_of_information\":[{\"name\":\"the source of information\",\"url\":\"http://www.google.com\"}" +
			    "]," +
			   "\"data_owner\":\"IATTC\","+
			   "\"type\":\"Assessment Unit\","+
			   "\"stock_name\":\"Skipjack tuna - Western Pacific\","+ //YOU MUST CHANGE THE STOCK NAME FOR TESTING
			   "\"stock_id\":\"SKJ - EPO - TESTING\","+
			   "\"species_scientific_name\":\"SKJ\","+
			   "\"assessment_distribution_area\":\"Western Pacific\","+
			   "\"exploiting_fishery\":\"Tunas and billfishes fishery\","+
			   "\"management_entity\":\"DFO\","+
			   "\"assessment_methods\":\"Analytical assessment\","+
			   "\"state_of_marine_resource\":null,"+
			   "\"exploitation_rate\":\"High fishing mortality\","+
			   "\"abundance_level\":\"Intermediate abundance\","+
			   "\"narrative_state_and_trend\":\"Stock size and fishing pressure are considered to be close to their value at MSY.\","+
			   "\"scientific_advice\":\"The indices of abundance from two longline fleets available for this stock present divergent trends over the last few years, the differences observed in targeting are not fully explained.\","+
			   "\"reporting_entity\":\"GRP3\","+
			   "\"reporting_year\":2016,"+
			   "\"status\":\"pending\"}";

			String response = grsfP.publishProduct(PRODUCT_TYPE.stock, minimal_json_stock);
			logger.info("The Response: "+response);
		} catch (Exception e) {
			logger.error("Publication failed", e);
		}
	}

}

The log output should look like:

call post to URI .... https://next.d4science.org/grsf-publisher-ws/rest/stock/publish-product?gcube-token=[YOUR TOKEN]
the body is...{"description":"This stock product was generated for testing purposes, to show how publication works. Please refer to https://wiki.gcube-system.org/gcube/GCube_Data_Catalogue_for_GRSF for more information","license_id":"CC-BY-SA-4.0","version":1,"maintainer":"Francesco Mangiacrapa","maintainer_contact":"francesco.mangiacrapa@isti.cnr.it","catches_or_landings":"unknown","database_sources":[{"name":"RAM","url":"test url"}],"source_of_information":[{"name":"the source of information","url":"http://www.google.com"}],"data_owner":"IATTC","type":"Assessment Unit","stock_name":"Skipjack tuna - Western Pacific Ocean 4","stock_id":"SKJ - EPO - TESTING","species_scientific_name":"SKJ","assessment_distribution_area":"Western Pacific Ocean 4","exploiting_fishery":"Tunas and billfishes fishery","management_entity":"DFO","assessment_methods":"Analytical assessment","state_of_marine_resource":null,"exploitation_rate":"High fishing mortality","abundance_level":"Intermediate abundance","narrative_state_and_trend":"Stock size and fishing pressure are considered to be close to their value at MSY.","scientific_advice":"The indices of abundance from two longline fleets available for this stock present divergent trends over the last few years, the differences observed in targeting are not fully explained.","reporting_entity":"GRP3","reporting_year":2016,"status":"pending"}
The Response: {"id":"8426bce9-15b9-4c4a-a526-e829882b91ec","dataset_url":"https://next.d4science.org/group/nextnext/data-catalogue?path=/dataset/skipjack_tuna_-_western_pacific_ocean_4","error":null}

You must use the following dependencies (if you are using Maven):

<!-- COMMONS HTTP -->
<dependency>
    <groupId>commons-httpclient</groupId>
    <artifactId>commons-httpclient</artifactId>
    <version>3.1</version>
</dependency>
<!-- LOGS -->
<dependency>
  <groupId>log4j</groupId>
  <artifactId>log4j</artifactId>
  <version>1.2.16</version>
</dependency>

GRSF records - manage facility

GRSF records will be validated by experts, to check that they have been properly generated from the source records (i.e., FIRMS, RAM and FishSource). The GRSF Admin VRE is the environment dedicated to this activity. It will offer a manage facility to users with the rights of the GRSF Administrator role.

Clicking the "Manage Item" button when a GRSF record is selected opens the manage panel, which shows a summary of the item and currently allows you to:

  • change the current status;
  • add an annotation message for the ongoing change.

Updates to the record are notified to the Knowledge Base of GRSF records before any change is reflected in the catalogue.

Troubleshooting

This section reports solutions to some of the errors that may arise when you try to publish items. Generally, the error is specified in the message field of the returned JSON. If it is empty but the success field is still set to false, take the following into consideration:

  • Tags: they must have a length within the range [2, 100] characters
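
The tag length constraint can be checked on the client side before publishing. The sketch below is only an illustration: the class and method names are hypothetical, and it assumes the length is counted on the raw tag value, without trimming.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative pre-publication check for the tag length constraint. */
final class TagValidator {

    static final int MIN_LENGTH = 2;
    static final int MAX_LENGTH = 100;

    /** Returns true when the tag length is within [2, 100] characters. */
    static boolean isValid(String tag) {
        return tag != null && tag.length() >= MIN_LENGTH && tag.length() <= MAX_LENGTH;
    }

    /** Returns the tags that violate the constraint, in their original order. */
    static List<String> invalidTags(List<String> tags) {
        List<String> invalid = new ArrayList<>();
        for (String tag : tags) {
            if (!isValid(tag)) {
                invalid.add(tag);
            }
        }
        return invalid;
    }
}
```

Since several fields automatically become tags (see the 'Is Tag' column of the metadata tables), such a check would be applied to the values of those fields.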

References

  1. The Information System can be queried via ic-client.