Difference between revisions of "GCube Data Catalogue for GRSF"
Luca.frosini (Talk | contribs) (→Common Metadata) |
Revision as of 11:56, 3 April 2024
** THIS PAGE IS UNDER CONSTRUCTION **
GCube Data Catalogue: support for GRSF
This page reports the relevant information about the GRSF Data Catalogue, which is available here. It extends the main gCube Data Catalogue guide, which you should read before continuing.
The GRSF Data Catalogue stores, and allows the publication of, products of two types: Stock and Fishery. Apart from the default set of metadata, each product type also has specific fields. Some of these fields automatically become tags of the product. The same applies to group associations: a set of groups is already available, and each product is automatically associated with the relevant ones during publication. The fields that trigger tag creation or group association are documented below.
The publication phase is performed by means of a RESTful service whose publish methods accept JSON objects. The service also allows publishing the records of the original sources (i.e., "RAM", "FishSource" and "FIRMS") on which the aggregated GRSF records are built.
Metadata
Base Metadata
The following table shows the set of core metadata, that is, the fields shared by both Stock and Fishery types. Some of them are filled automatically. The values given to some fields are automatically used to tag the product: check the 'Is Tag' column of the table below. Other fields have a controlled vocabulary (that is, they can only assume values selected from a defined set), and the value assigned to these fields automatically determines the group to which the product is assigned: check the 'Is Group' column below.
The table also reports the column Type that shows how the field is mapped into the product, once it is published within the catalogue, i.e. as a simple field (key/value pair, possibly repeatable) or a resource (an object with a mandatory name, an optional description and a mandatory url).
IMPORTANT: Any other field that does not match one of those listed below is managed as an object having a key and a value, both of String type, and is attached to the product as a simple field.
Name | Api Name (JSON) | Display Section | Is Tag | Is Group | Is Sensitive | Example | Guidelines/Comments |
---|---|---|---|---|---|---|---|
Title | stock_name or fishery_name | Top of the record | No | No | No | ||
Description | N/A | Top of the record | No | No | No | ||
License | license_id | Right panel | No | No | No | CC-BY-SA-4.0 | |
Author | N/A | Management Info | No | No | No | | This field is automatically compiled by using the information of the caller entity. |
Author contact | N/A | Management Info | No | No | No | joe.blogg@example.com | This field is automatically compiled by using the information of the caller entity |
Version | version | Management Info | No | No | No | 1.0 | |
Maintainer | N/A | Management Info | No | No | No | ||
Maintainer Contact | N/A | Management Info | No | No | No |
Common Metadata
Besides the base metadata above, the following set of attributes is captured for both Stock and Fishery records.
Name | Api Name (JSON) | Display Section | Is Tag | Is Group | Is Sensitive | Type | Example | Guidelines/Comments |
---|---|---|---|---|---|---|---|---|
GRSF Type | grsf_type | Stock/Fishery Identity | Yes | No | No | Field | ||
Short Name | short_name | Stock/Fishery Identity | No | No | No | Field | ||
Database Source | database_sources | Stock/Fishery Identity | No | Yes in GRSF_Admin only | No | Field (repeatable) | "database_sources": [ "FishSource", "Fisheries and Resources Monitoring System (FIRMS)", "RAM Legacy Stock Assessment Database", "FAO SDG 14.4.1 Questionnaire" ] | |
Species | species | Stock/Fishery Identity | Yes | No | No | Field (repeatable) | ||
Similar GRSF Record | similar_grsf_records | Stock/Fishery Identity | No | No | No | Field (repeatable) | ||
Connected Stock Record / Connected Fishery Record | connected | Stock/Fishery Identity | No | No | No | Field (repeatable) | ||
Data Owner | data_owner | Stock/Fishery Data | No | No | Yes | Field (repeatable) | ||
Catch | catches | Stock/Fishery Data | No | Yes (Catch) | Yes | TimeSeries | ||
Landing | landings | Stock/Fishery Data | No | Yes (Landing) | Yes | TimeSeries | ||
SDG Flag | sdg_flag | Additional Info | No | Yes (only if the value is true) | No | Field | ||
Status of the Record | status_grsf_record | Additional Info | No | No | No | Field | ||
spatial | spatial | Additional Info | No | No | No | Field | Due to plugin constraints, the key must be spatial. The portlet visualizes it as "Geospatial" | |
Domain | N/A | Additional Info | No | Yes (Stock or Fishery). | No | Field | ||
GRSF UUID | grsf_uuid | Additional Info | No | No | No | Field | ||
Citation | citation | Additional Info + Top Button | No | No | No | Field | ||
Annotation | annotations | Additional Info | No | No | Yes | Field (repeatable) | ||
N/A | source_of_information | Data and Resources | No | No | No | Resource | ||
N/A | refers_to | Data and Resources | No | No | No | Resource | ||
N/A | connections_indicator | N/A | Yes (connected, not connected) | No | No | N/A | ||
N/A | similarities_indicator | N/A | Yes ("with similar records", "without similar records") | No | No | N/A | ||
N/A | source from URL path | N/A | No | Yes - if(source == 'GRSF') 'grsf-group' else 'legacy-group' | No | N/A | Example URLs: /stock/grsf, /stock/ram, /stock/firms, /stock/fishsource, /stock/sdg |
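The group rule in the last row of the table above can be sketched as follows. This is only an illustration of the rule as written in the table: the class and method names are hypothetical, the path layout follows the example URLs in that row, and the case-insensitive comparison is an assumption (the table writes 'GRSF' while the example URLs use lowercase).

```java
import java.util.Locale;

/** Illustrative sketch of the "source from URL path" group rule (names are hypothetical). */
final class SourceGroupRule {

    // Group names as written in the table above.
    static final String GRSF_GROUP = "grsf-group";
    static final String LEGACY_GROUP = "legacy-group";

    /** Extracts the source segment from a path such as "/stock/ram". */
    static String sourceFromPath(String path) {
        String[] parts = path.split("/");
        // parts[0] is empty because the path starts with "/".
        if (parts.length < 3 || parts[2].isEmpty()) {
            throw new IllegalArgumentException("Expected /<domain>/<source>, got: " + path);
        }
        return parts[2];
    }

    /** Applies the rule: GRSF records go to one group, every other source to the legacy one. */
    static String groupFor(String source) {
        // Assumption: the comparison ignores case.
        return "grsf".equals(source.toLowerCase(Locale.ROOT)) ? GRSF_GROUP : LEGACY_GROUP;
    }

    public static void main(String[] args) {
        for (String path : new String[] {"/stock/grsf", "/stock/ram", "/stock/firms"}) {
            System.out.println(path + " -> " + groupFor(sourceFromPath(path)));
        }
    }
}
```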
Stock Metadata
The Stock product type also supports the following list of fields.
Name | Api Name (JSON) | Display Section | Is Tag | Is Group | Is Sensitive | Type | Example | Guidelines/Comments |
---|---|---|---|---|---|---|---|---|
GRSF Stock Name | stock_name | Stock Identity + Top of the record | No | No | No | Field | ||
GRSF Semantic Identifier | grsf_semantic_identifier | Stock Identity + Top of the record | No | No | No | Field | ||
Assessment Area | assessment_area | Stock Identity | Yes | No | No | Field (repeatable) | ||
Intersecting FAO Major Fishing Areas | hidden_assessment_area | Stock Identity | Yes | No | Yes | Field (repeatable) | ||
Assessment Method | assessment_methods | Stock Data | No | Yes (Group name: "Assessment Method") | Yes | TimeSeries | ||
Abundance Level (FIRMS Standard) | firms_standard_abundance_level | Stock Data | No | Yes (Group name: "Abundance Level (FIRMS Standard)") | Yes | TimeSeries | ||
Abundance Level | abundance_level | Stock Data | No | Yes (Group name: "Abundance Level") | Yes | TimeSeries | ||
Biomass | biomass | Stock Data | No | Yes (Group name: "Biomass") | Yes | TimeSeries | ||
Fishing Pressure (FIRMS Standard) | firms_standard_fishing_pressure | Stock Data | No | Yes (Group name: "Fishing Pressure (FIRMS Standard)") | Yes | TimeSeries | ||
Fishing Pressure | fishing_pressure | Stock Data | No | Yes (Group name: "Fishing Pressure") | Yes | TimeSeries | ||
State and Trend | state_and_trend_of_marine_resources | Stock Data | No | Yes (Group name: "State and Trend") | Yes | TimeSeries | ||
FAO Stock Status Category | fao_categories | Stock Data | No | Yes (Group name: "FAO Stock Status Category") | Yes | TimeSeries | ||
Scientific Advice | scientific_advice | Stock Data | No | Yes (Group name: "Scientific Advice") | Yes | TimeSeries | ||
Assessor | assessor | Stock Data | No | No | Yes | Field |
Fishery Metadata
The Fishery product type also supports the following list of fields.
Name | Api Name (JSON) | Display Section | Is Tag | Is Group | Is Sensitive | Type | Example | Guidelines/Comments |
---|---|---|---|---|---|---|---|---|
GRSF Fishery Name | fishery_name | Fishery Identity + Top of the record | No | No | No | Field | ||
GRSF Semantic identifier | grsf_semantic_identifier | Fishery Identity + Top of the record | No | No | No | Field | ||
Traceability Flag | traceability_flag | Fishery Identity | No | Yes (only if the value is true) | No | Field | ||
Fishing Area | fishing_area | Fishery Identity | Yes | No | No | Field (repeatable) | ||
Intersecting FAO Major Fishing Areas | hidden_fishing_area | Fishery Identity | Yes | No | Yes | Field (repeatable) | ||
Jurisdiction Area | jurisdiction_area | Fishery Identity | Yes | No | No | Field (repeatable) | ||
Flag State | flag_state | Fishery Identity | Yes | No | No | Field (repeatable) | ||
Fishing Gear | fishing_gear | Fishery Identity | Yes | No | No | Field (repeatable) | ||
Management Body/Authority | management_body_authorities | Fishery Identity | No | No | No | Field (repeatable) | ||
Resources Exploited | resources_exploited | Fishery Identity | No | No | No | Field (repeatable) |
GRSF Publication Web Service
Record publication is performed by means of a RESTful web service running over SmartGears. Almost every call to the service requires the security token of the user for the context in which they want to publish or use the other functionalities. Please note that publishing a product requires the user to have sufficient privileges. The list of roles and associated privileges for the catalogue users is reported here. The VRE Manager assigns them.
In order to retrieve your security token you can use the token generator portlet.
The right address for contacting the GRSF service in a context can be discovered by means of the Information System [1]. You need the following parameters:
- Service Name = GRSFPublisher;
- Service Class = Data-Catalogue;
- Entry Name = jersey-servlet.
For testing purposes, a running instance can be contacted at the following address:
https://smart-grsf-d-d4s.d4science.org/grsf-publisher-ws/rest/ [GRSF_PUBLISHER_WS_BASE_URL]
The token for testing purposes can be retrieved from the VRE at this URL: https://next.d4science.org/group/nextnext/home (register yourself if needed). In every request described below you must specify the type of record it applies to, i.e., the request path must contain one of the following values:
- grsf;
- fishsource;
- ram;
- firms.
Please check the Troubleshooting section in case of errors during the publication phase.
Check Service Availability
To check that the stock/fishery service is up and running, just put the url below in your browser
[GRSF_PUBLISHER_WS_BASE_URL]/{...}/fishery/hello
Specify the record type by replacing {...}. The response should look like
Hello.. Fishery service is here
or
[GRSF_PUBLISHER_WS_BASE_URL]/{...}/stock/hello
and the response should look like
Hello.. Stock service is here
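The way the check URLs above are composed can be sketched as a small helper. This is a minimal sketch, assuming the test instance address given on this page as base URL; the class and method names are hypothetical, and the accepted record types are the four listed above.

```java
import java.util.List;

/** Illustrative helper that composes the availability-check URLs (names are hypothetical). */
final class GrsfServiceUrls {

    // Test instance address reported on this page.
    static final String BASE_URL = "https://smart-grsf-d-d4s.d4science.org/grsf-publisher-ws/rest/";

    // Record types that can replace {...} in the request path, as listed on this page.
    static final List<String> RECORD_TYPES = List.of("grsf", "fishsource", "ram", "firms");

    // The two product domains exposed by the service.
    static final List<String> DOMAINS = List.of("stock", "fishery");

    /** Builds [BASE_URL]/{recordType}/{domain}/hello after validating both path segments. */
    static String helloUrl(String recordType, String domain) {
        if (!RECORD_TYPES.contains(recordType)) {
            throw new IllegalArgumentException("Unknown record type: " + recordType);
        }
        if (!DOMAINS.contains(domain)) {
            throw new IllegalArgumentException("Unknown domain: " + domain);
        }
        // BASE_URL already ends with "/", so no extra separator is added here.
        return BASE_URL + recordType + "/" + domain + "/hello";
    }

    public static void main(String[] args) {
        for (String domain : DOMAINS) {
            System.out.println(helloUrl("grsf", domain));
        }
    }
}
```

The resulting URLs can be opened in a browser or requested with any HTTP client; no token is shown in the check URLs above, so none is appended here.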
Retrieve the licenses list
The default license associated with a product, if none is specified, is CC-BY-SA-4.0. However, if it doesn't fit your needs, you can use one of the other available licenses, which can be retrieved by contacting the service(s) this way
[GRSF_PUBLISHER_WS_BASE_URL]/{...}/fishery/get-licenses?gcube-token=YOUR_TOKEN
or, for stock
[GRSF_PUBLISHER_WS_BASE_URL]/{...}/stock/get-licenses?gcube-token=YOUR_TOKEN
The response is a JSON object containing <license key, license name> pairs, which looks like
{
  "AFL-3.0": "Academic Free License 3.0",
  "RPSL-1.0": "RealNetworks Public Source License 1.0",
  "ODC-BY-1.0": "Open Data Commons Attribution License 1.0",
  "IPL-1.0": "IBM Public License 1.0",
  "ODbL-1.0": "Open Data Commons Open Database License 1.0",
  "PostgreSQL": "PostgreSQL License",
  "W3C": "W3C License",
  ....
}
During the publication phase, the identifier of the license chosen should be provided.
Stock Publication Example
The method to invoke to publish a stock for firms, grsf, fishsource or ram is the following
[GRSF_PUBLISHER_WS_BASE_URL]/{...}/stock/publish-product?gcube-token=YOUR_TOKEN
The JSON object you must provide in input has the following structure (of course, not all fields are mandatory)
{
  "description": ...,
  "license_id": ...,
  "version": ...,
  "maintainer": ...,
  "maintainer_contact": ...,
  "database_sources": [...],
  "source_of_information": [...],
  "uuid_knowledge_base": ...,
  "traceability_flag": ...,
  "data_owner": ...,
  "type": ...,
  "stock_name": ...,
  "stock_id": ...,
  "stock_uri": ...,
  "species": [...],
  "assessment_distribution_area": [...],
  "exploiting_fishery": [...],
  "management_entity": ...,
  "assessment_methods": ...,
  "state_of_marine_resource": ...,
  "standard_exploitation_rate": [...],
  "standard_abundance_level": [...],
  "water_area": [...],
  "narrative_state_and_trend": ...,
  "scientific_advice": ...,
  "reporting_entity": ...,
  "reporting_year": ...,
  "status": ...,
  "short_title": ...,
  "refers_to": [...]
}
The response of the method is a JSON object of this kind
{
  "id": ..., // identifier of the created product in the catalogue
  "knowledge_base_id": ..., // identifier of the product in the knowledge base
  "product_url": ..., // url of the created product
  "error": ... // in case of error, check this field
}
In case of success the HTTP code is 201 (CREATED) and the response contains the url and the unique identifier assigned to the product. In case of errors (BAD_REQUEST, INTERNAL_SERVER_ERROR, FORBIDDEN ... ) the "error" message of the above object reports what was wrong.
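The publish call and the success check described above can be sketched with the java.net.http client available since Java 11. This is a minimal sketch under stated assumptions: the base URL is the test instance address given on this page, the POST method and the application/json content type are taken from the Java example further down this page, and the class and method names are hypothetical.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;
import java.time.Duration;

/** Illustrative builder for a stock publish request (names are hypothetical). */
final class StockPublishRequest {

    // Test instance address reported on this page.
    static final String BASE_URL = "https://smart-grsf-d-d4s.d4science.org/grsf-publisher-ws/rest/";

    /** Builds the POST request for [BASE_URL]/{recordType}/stock/publish-product. */
    static HttpRequest build(String recordType, String token, String jsonBody) {
        String url = BASE_URL + recordType + "/stock/publish-product?gcube-token="
                + URLEncoder.encode(token, StandardCharsets.UTF_8);
        return HttpRequest.newBuilder(URI.create(url))
                .timeout(Duration.ofSeconds(5)) // assumption: same 5 s timeout as the Java example on this page
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(jsonBody, StandardCharsets.UTF_8))
                .build();
    }

    /** Success is signalled by HTTP 201 (CREATED); any other code means the "error" field should be read. */
    static boolean isCreated(int statusCode) {
        return statusCode == 201;
    }
}
```

The request can then be sent with HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()), passing the returned status code to isCreated before reading the "id" and "product_url" fields.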
Example
A valid JSON, for example, is the following one
/*TODO*/
the response obtained from the service is
/*TODO*/
Fishery Publication Example
The publish method to invoke to publish a fishery product is the following
[GRSF_PUBLISHER_WS_BASE_URL]/{...}/fishery/publish-product?gcube-token=YOUR_TOKEN
The JSON object you must provide in input has the following structure (of course, not all fields are mandatory)
{
  "description": ...,
  "license_id": ...,
  "version": ...,
  "maintainer": ...,
  "maintainer_contact": ...,
  "catches_or_landings": [...],
  "database_sources": [...],
  "source_of_information": [...],
  "uuid_knowledge_base": ...,
  "traceability_flag": ...,
  "short_title": ...,
  "refers_to": [...],
  "data_owner": ...,
  "reporting_year": ...,
  "type": ...,
  "fishery_name": ...,
  "fishery_id": ...,
  "scientific_name": ...,
  "fishing_area": [...],
  "exploited_stocks": [...],
  "management_entity": ...,
  "jurisdiction_area": ...,
  "production_system_type": ...,
  "flag_state": ...,
  "fishing_gear": ...,
  "status": ...,
  "environment": ...
}
The response of the method is a JSON object of this kind
{
  "id": ..., // identifier of the created product in the catalogue
  "knowledge_base_id": ..., // identifier of the product in the knowledge base
  "product_url": ..., // url of the created product
  "error": ... // in case of error, check this field
}
In case of success the HTTP code is 201 (CREATED) and the response contains the url and the unique identifier assigned to the product. In case of errors (BAD_REQUEST, INTERNAL_SERVER_ERROR, FORBIDDEN ... ) the "error" message of the above object reports what was wrong.
Example
A valid JSON, for example, is the following one
/* TODO */
the response obtained from the service is
/* TODO */
Delete a published product
If for some reason you need to delete a published product, you can invoke the following HTTP delete methods. For fishery it is
[GRSF_PUBLISHER_WS_BASE_URL]/{...}/fishery/delete-product?gcube-token=YOUR_TOKEN
whereas for stock
[GRSF_PUBLISHER_WS_BASE_URL]/{...}/stock/delete-product?gcube-token=YOUR_TOKEN
You must provide the identifier of the product (returned at creation time) in a JSON object that looks like
{"id": "product-to-delete-identifier"}
In case of success, the response status of the service is 200 (OK).
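A delete request of this kind can be sketched with the java.net.http client available since Java 11. This is only a sketch: it assumes that the service expects the HTTP DELETE verb with the JSON object in the request body and an application/json content type, which this page does not state explicitly. The base URL is the test instance address given on this page, and the class and method names are hypothetical.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

/** Illustrative builder for a delete-product request (names are hypothetical). */
final class DeleteProductRequest {

    // Test instance address reported on this page.
    static final String BASE_URL = "https://smart-grsf-d-d4s.d4science.org/grsf-publisher-ws/rest/";

    /** Builds the {"id": "..."} body, escaping backslashes and double quotes in the identifier. */
    static String deleteBody(String productId) {
        String escaped = productId.replace("\\", "\\\\").replace("\"", "\\\"");
        return "{\"id\": \"" + escaped + "\"}";
    }

    /** Builds the request for [BASE_URL]/{recordType}/{domain}/delete-product. */
    static HttpRequest build(String recordType, String domain, String token, String productId) {
        String url = BASE_URL + recordType + "/" + domain + "/delete-product?gcube-token="
                + URLEncoder.encode(token, StandardCharsets.UTF_8);
        return HttpRequest.newBuilder(URI.create(url))
                .header("Content-Type", "application/json") // assumption, mirrors the publish call
                // Assumption: the verb is DELETE and the identifier travels in the body.
                .method("DELETE", HttpRequest.BodyPublishers.ofString(deleteBody(productId)))
                .build();
    }
}
```

A response status of 200 (OK) then indicates that the product was removed.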
How To Publish a GRSF product using Java
Below you find a simple Java application publishing a GRSF product (i.e. a stock).
package [YOUR PACKAGE];

import java.io.IOException;

import org.apache.commons.httpclient.HttpClient;
import org.apache.commons.httpclient.HttpException;
import org.apache.commons.httpclient.HttpStatus;
import org.apache.commons.httpclient.MultiThreadedHttpConnectionManager;
import org.apache.commons.httpclient.methods.ByteArrayRequestEntity;
import org.apache.commons.httpclient.methods.PostMethod;
import org.apache.log4j.Logger;

/**
 * The Class GRSFPublishMetadata.
 *
 * @author Francesco Mangiacrapa francesco.mangiacrapa@isti.cnr.it
 * Oct 13, 2016
 */
public class GRSFPublishMetadata {

    public static final Logger logger = Logger.getLogger(GRSFPublishMetadata.class);

    private static final String GRSF_PUBLISHER_REST_SERVICE_BASE_URL = "https://next.d4science.org/grsf-publisher-ws/rest/";

    /**
     * The Enum PRODUCT_TYPE.
     *
     * @author Francesco Mangiacrapa francesco.mangiacrapa@isti.cnr.it
     * Oct 13, 2016
     */
    private static enum PRODUCT_TYPE {stock, fishery}

    private static final String PUBLISH_PRODUCT_REQUEST = "publish-product";
    private static final String GCUBE_TOKEN_PARAMETER = "gcube-token";
    private static final String GCUBE_TOKEN_VALUE = "[YOUR TOKEN]"; //***********SET YOUR TOKEN************
    private static final String CONTENTTYPE = "application/json";
    private HttpClient httpClient = null;
    public static final int TIME_OUT_REQUESTS = 5000; //5 sec

    /**
     * Instantiates a new GRSF publish metadata.
     */
    public GRSFPublishMetadata() {
        MultiThreadedHttpConnectionManager connectionManager = new MultiThreadedHttpConnectionManager();
        connectionManager.getParams().setSoTimeout(TIME_OUT_REQUESTS);
        this.httpClient = new HttpClient(connectionManager);
    }

    /**
     * Publish product.
     *
     * @param type the type
     * @param body the body
     * @return the string
     * @throws Exception the exception
     */
    public String publishProduct(PRODUCT_TYPE type, String body) throws Exception {
        // Create a method instance. The base URL already ends with "/", so no separator is added before the type.
        String buildURL = GRSF_PUBLISHER_REST_SERVICE_BASE_URL + type.toString() + "/" + PUBLISH_PRODUCT_REQUEST
                + "?" + GCUBE_TOKEN_PARAMETER + "=" + GCUBE_TOKEN_VALUE;
        PostMethod method = new PostMethod(buildURL);
        method.setRequestHeader("Content-type", CONTENTTYPE);
        logger.debug("call post to URI .... " + method.getURI());
        logger.debug(" the body is..." + body);
        method.setRequestEntity(new ByteArrayRequestEntity(body.getBytes()));
        byte[] responseBody = null;
        try {
            // Execute the method.
            int statusCode = httpClient.executeMethod(method);
            if (statusCode != HttpStatus.SC_OK && statusCode != HttpStatus.SC_CREATED) {
                String failure = "Method failed: " + method.getStatusLine() + "; Response body: " + new String(method.getResponseBody());
                logger.error(failure);
                throw new Exception(failure);
            }
            // Read the response body.
            responseBody = method.getResponseBody();
        } catch (HttpException e) {
            logger.error("Fatal protocol violation: ", e);
            throw new Exception("Fatal protocol violation: " + e.getMessage());
        } catch (IOException e) {
            logger.error("Fatal transport error: ", e);
            throw new Exception("Fatal transport error: " + e.getMessage());
        } finally {
            // Always release the connection, on success and on failure.
            method.releaseConnection();
        }
        return new String(responseBody);
    }

    /**
     * The main method.
     *
     * @param args the arguments
     */
    public static void main(String[] args) {
        try {
            GRSFPublishMetadata grsfP = new GRSFPublishMetadata();
            String minimal_json_stock =
                "{\"description\":\"This stock product was generated for testing purposes, to show how publication works. Please refer to https://wiki.gcube-system.org/gcube/GCube_Data_Catalogue_for_GRSF for more information\"," +
                "\"license_id\":\"CC-BY-SA-4.0\"," +
                "\"version\":1," +
                "\"maintainer\":\"Francesco Mangiacrapa\"," +
                "\"maintainer_contact\":\"francesco.mangiacrapa@isti.cnr.it\"," +
                "\"catches_or_landings\":\"unknown\"," +
                "\"database_sources\":[" +
                "{\"name\":\"RAM\",\"url\":\"test url\"}" +
                "]" +
                ",\"source_of_information\":[{\"name\":\"the source of information\",\"url\":\"http://www.google.com\"}" +
                "]," +
                "\"data_owner\":\"IATTC\"," +
                "\"type\":\"Assessment Unit\"," +
                "\"stock_name\":\"Skipjack tuna - Western Pacific\"," + //YOU MUST CHANGE THE STOCK NAME FOR TESTING
                "\"stock_id\":\"SKJ - EPO - TESTING\"," +
                "\"species_scientific_name\":\"SKJ\"," +
                "\"assessment_distribution_area\":\"Western Pacific\"," +
                "\"exploiting_fishery\":\"Tunas and billfishes fishery\"," +
                "\"management_entity\":\"DFO\"," +
                "\"assessment_methods\":\"Analytical assessment\"," +
                "\"state_of_marine_resource\":null," +
                "\"exploitation_rate\":\"High fishing mortality\"," +
                "\"abundance_level\":\"Intermediate abundance\"," +
                "\"narrative_state_and_trend\":\"Stock size and fishing pressure are considered to be close to their value at MSY.\"," +
                "\"scientific_advice\":\"The indices of abundance from two longline fleets available for this stock present divergent trends over the last few years, the differences observed in targeting are not fully explained.\"," +
                "\"reporting_entity\":\"GRP3\"," +
                "\"reporting_year\":2016," +
                "\"status\":\"pending\"}";
            String response = grsfP.publishProduct(PRODUCT_TYPE.stock, minimal_json_stock);
            logger.info("The Response: " + response);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
The response should look like:
call post to URI .... https://next.d4science.org/grsf-publisher-ws/rest/stock/publish-product?gcube-token=[YOUR TOKEN]
 the body is...{"description":"This stock product was generated for testing purposes, to show how publication works. Please refer to https://wiki.gcube-system.org/gcube/GCube_Data_Catalogue_for_GRSF for more information","license_id":"CC-BY-SA-4.0","version":1,"maintainer":"Francesco Mangiacrapa","maintainer_contact":"francesco.mangiacrapa@isti.cnr.it","catches_or_landings":"unknown","database_sources":[{"name":"RAM","url":"test url"}],"source_of_information":[{"name":"the source of information","url":"http://www.google.com"}],"data_owner":"IATTC","type":"Assessment Unit","stock_name":"Skipjack tuna - Western Pacific Ocean 4","stock_id":"SKJ - EPO - TESTING","species_scientific_name":"SKJ","assessment_distribution_area":"Western Pacific Ocean 4","exploiting_fishery":"Tunas and billfishes fishery","management_entity":"DFO","assessment_methods":"Analytical assessment","state_of_marine_resource":null,"exploitation_rate":"High fishing mortality","abundance_level":"Intermediate abundance","narrative_state_and_trend":"Stock size and fishing pressure are considered to be close to their value at MSY.","scientific_advice":"The indices of abundance from two longline fleets available for this stock present divergent trends over the last few years, the differences observed in targeting are not fully explained.","reporting_entity":"GRP3","reporting_year":2016,"status":"pending"}
The Response: {"id":"8426bce9-15b9-4c4a-a526-e829882b91ec","dataset_url":"https://next.d4science.org/group/nextnext/data-catalogue?path=/dataset/skipjack_tuna_-_western_pacific_ocean_4","error":null}
You must use the following dependencies (if you are using Maven):
<!-- COMMONS HTTP -->
<dependency>
    <groupId>commons-httpclient</groupId>
    <artifactId>commons-httpclient</artifactId>
    <version>3.1</version>
</dependency>
<!-- LOGS -->
<dependency>
    <groupId>log4j</groupId>
    <artifactId>log4j</artifactId>
    <version>1.2.16</version>
</dependency>
GRSF records - manage facility
GRSF records will be validated by experts, in order to check that they have been properly generated starting from source (i.e., FIRMS, RAM and FishSource) records. The GRSF Admin VRE is the environment dedicated to this activity. It will offer a manage facility to users having the GRSF Administrator role's rights.
Pushing the "Manage Item" button when a GRSF record is selected opens the manage panel, which reports a summary of the item and currently allows you to:
- change the current status;
- add an annotation message for the ongoing change.
The updates to the record are notified to the Knowledge Base of GRSF records before any change is reflected in the catalogue.
Troubleshooting
This section reports solutions to some of the errors that may arise while you try to publish items. Generally, the error is specified in the message field of the returned JSON. In case it is empty but the success field is still set to false, you should take the following into consideration:
- Tags: they must have a length within the range [2, 100] characters
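The tag length constraint above can be checked on the client side before publishing. The following is a minimal sketch: the class and method names are hypothetical, and counting characters with String.length() is an assumption about how the catalogue measures tag length.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative client-side check of the tag length constraint (names are hypothetical). */
final class TagLengthCheck {

    // Bounds of the allowed range [2, 100], as stated above.
    static final int MIN_LENGTH = 2;
    static final int MAX_LENGTH = 100;

    /** Returns true when the tag length falls within the allowed range. */
    static boolean isValidTag(String tag) {
        if (tag == null) {
            return false;
        }
        // Assumption: length is measured in Java chars.
        int length = tag.length();
        return length >= MIN_LENGTH && length <= MAX_LENGTH;
    }

    /** Collects the tags that would be rejected, preserving their order. */
    static List<String> invalidTags(List<String> tags) {
        List<String> rejected = new ArrayList<>();
        for (String tag : tags) {
            if (!isValidTag(tag)) {
                rejected.add(tag);
            }
        }
        return rejected;
    }
}
```

Running invalidTags over the values of the fields marked 'Is Tag' in the tables on this page gives the list of values to fix before calling the publish method.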