ServiceManager Guide
'''NB : Please note that actual requirements for a specific service instance vary depending on which plugin extensions are deployed in that particular instance.'''

== Health Checks ==
Three types of health checks are implemented: ''Service'', ''Mongo'', ''Database''. They are reachable through a REST API whose responses comply with the https://microprofile.io/specifications/microprofile-health/ specification.

'''Service health :'''
checks if the `geoportal` service is up at
<pre>
https://{geoportal_endpoint}/geoportal-service/srv/health
</pre>

e.g. in DEV at https://geoportal.cloud-dev.d4science.org/geoportal-service/srv/health

'''Mongo health :'''
checks if the `geoportal` service is able to communicate with the MongoDB instance at
<pre>
https://{geoportal_endpoint}/geoportal-service/srv/health/mongo?context={GCUBE_CONTEXT}
or (to include the collections)
https://{geoportal_endpoint}/geoportal-service/srv/health/mongo?context={GCUBE_CONTEXT}&include_collections=true
</pre>

e.g. in DEV at
https://geoportal.cloud-dev.d4science.org/geoportal-service/srv/health/mongo?context=/gcube/devsec/devVRE
or (to include the collections)
https://geoportal.cloud-dev.d4science.org/geoportal-service/srv/health/mongo?context=/gcube/devsec/devVRE&include_collections=true

'''Database health :'''
checks if the `geoportal` service is able to communicate with the PostGIS instance at
<pre>
https://{geoportal_endpoint}/geoportal-service/srv/health/database?context={GCUBE_CONTEXT}
</pre>

e.g. in DEV at
https://geoportal.cloud-dev.d4science.org/geoportal-service/srv/health/database?context=/gcube/devsec/devVRE
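The three checks can be scripted. The following is only a minimal sketch: the function names `health_url` and `check_health` are made up for illustration, and it assumes `curl` is installed. It builds the URLs exactly as listed above and relies on `curl --fail` to return a non-zero exit status when the endpoint answers with an HTTP error.

```shell
#!/bin/sh
# Build the URL of a geoportal health check (hypothetical helper).
# $1 = geoportal host, $2 = check (service|mongo|database),
# $3 = gCube context (not needed for "service")
health_url() {
    base="https://$1/geoportal-service/srv/health"
    case "$2" in
        service)  printf '%s\n' "$base" ;;
        mongo)    printf '%s\n' "$base/mongo?context=$3" ;;
        database) printf '%s\n' "$base/database?context=$3" ;;
        *)        echo "unknown check: $2" >&2; return 1 ;;
    esac
}

# Query one check and print the response body.
# --fail turns an HTTP error status into a non-zero exit status.
check_health() {
    url=$(health_url "$@") || return 1
    curl --silent --show-error --fail "$url"
}
```

For example, `check_health geoportal.cloud-dev.d4science.org mongo /gcube/devsec/devVRE` queries the Mongo check in DEV.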
== Service Requirements ==
Latest revision as of 17:19, 28 October 2024
This part of the guide covers the installation and configuration of gCube services that are not mentioned in the Administration guide. It mainly refers to services that are not Enabling services and that can be installed dynamically by the Infrastructure/VO Managers. For each component, the list also includes known issues and specific configuration steps to follow.
Search (DISMISSED)
==Search V 2.xx (DISMISSED)==
The installation of a Search Node in gCube consists of the installation of 2 web services (in the minimal configuration):
- SearchSystemService
- ExecutionEngineService
This is the minimal installation scenario, but it is also possible to enable distributed search, which requires the installation and configuration of several ExecutionEngineServices
HW requirements
The minimal installation requirements for a Search node are a single CPU node with 2GB RAM, but it is strongly recommended to have at least 3GB RAM on the node dedicated to the GHN.
Configuration
The SearchSystemService and ExecutionEngineService have to be automatically/manually deployed in a VRE scope. In addition, to configure the SearchSystemService to exploit the local ExecutionEngineService to run the queries (minimal installation), the jndi service should be configured as follows:
- excludeLocal = false
- collocationThreshold = 0.3f
- complexPlanNumNodes = 800000
Search v 3.x.x (DISMISSED)
The 3.0 version has moved to Smartgears and tomcat.
The co-deployment with the Execution Engine Service is still required, so the Execution Engine Service v 2.0.0 has also been ported to SmartGears
HW requirements
The minimal installation requirements for a Search node are a single CPU node with 2GB RAM, but it is strongly recommended to have at least 3GB RAM on the node dedicated to the GHN.
Configuration
In order to fix an issue with datanucleus compatibility and java 7, there is a change to be included in the tomcat configuration:
- uncomment and modify the following line on the $CATALINA_HOME/bin/catalina.sh file:
JAVA_OPTS="$JAVA_OPTS -noverify -Dorg.apache.catalina.security.SecurityListener.UMASK=`umask`"
- The conf file $CATALINA_HOME/conf/infrastructure.properties containing infra and scope information needs to be present
# a single infrastructure
infrastructure=d4science.research-infrastructures.eu
# multiple scopes must be separated by a comma (e.g FARM,gCubeApps)
scopes=Ecosystem
clientMode=false
- The conf file $CATALINA_HOME/webapps/<search>/WEB-INF/classes/deploy.properties needs to be filled with this info:
hostname = xx
startScopes = xx
port=xx
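A missing entry in either properties file is easy to overlook. The sketch below shows one way to check for the required keys before restarting Tomcat; `has_props` is a hypothetical helper written for this guide, not part of gCube or SmartGears.

```shell
#!/bin/sh
# Return 0 if every key given after the file name is defined in the
# properties file. Both "key=value" and "key = value" are accepted;
# comment lines starting with "#" never match.
has_props() {
    file=$1; shift
    for key in "$@"; do
        grep -Eq "^[[:space:]]*$key[[:space:]]*=" "$file" || {
            echo "missing key '$key' in $file" >&2
            return 1
        }
    done
}
```

A possible use is `has_props "$CATALINA_HOME/conf/infrastructure.properties" infrastructure scopes clientMode`, and similarly `hostname startScopes port` for deploy.properties.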
Known Issues
Execution Engine (DISMISSED)
The 2.0 version has moved to Smartgears and tomcat.
HW requirements
The minimal installation requirements for an Execution Engine node are a single CPU node with 2GB RAM, but it is strongly recommended to have at least 3GB RAM on the node dedicated to the GHN.
Installation
Different packagings of the Execution engine are available depending on the service they are going to be co-deployed with and invoked:
- DTS : <artifactId>executionengineservice-dts</artifactId>
- Search: <artifactId>executionengineservice-search</artifactId>
Configuration
In order to fix an issue with datanucleus compatibility and java 7, there is a change to be included in the tomcat configuration:
- uncomment and modify the following line on the $CATALINA_HOME/bin/catalina.sh file:
JAVA_OPTS="$JAVA_OPTS -noverify -Dorg.apache.catalina.security.SecurityListener.UMASK=`umask`"
- The conf file $CATALINA_HOME/conf/infrastructure.properties containing infra and scope information needs to be present
# a single infrastructure
infrastructure=d4science.research-infrastructures.eu
# multiple scopes must be separated by a comma (e.g FARM,gCubeApps)
scopes=Ecosystem
clientMode=false
- The conf file $CATALINA_HOME/webapps/<execution-engine>/WEB-INF/classes/deploy.properties needs to be filled with this info:
hostname = xx
startScopes = xx
port=xx
pe2ng.port = 4000
- if the Execution Engine needs to call DTS, add the following to container.xml:
<property name='dts.execution' value='true' />
Executor and GenericWorker (DISMISSED)
HW requirements
The minimal installation requirements for an Executor node with a Generic Worker plugin are a single CPU node with 2GB RAM, but it is strongly recommended to have at least 3GB RAM on the node dedicated to the GHN.
Configuration
The following Software should be installed on the VM:
- R version 2.14.1
with the following components
- coda
- R2jags
- R2WinBUGS
- rjags
- bayesmix
- runjags
Known Issues
- The GenericWorker is exploited by the Statistical Manager service to run distributed computations. Given that the SM uses the root scope to discover instances of the GenericWorker, the plugin must be deployed at root scope level
- Given that the GenericWorker plugin depends on the Executor Service, when dynamically deploying the plugin the Executor Service is also deployed.
SmartExecutor
HW requirements
The minimal installation requirements for an Executor node with a Generic Worker plugin are a single CPU node with 2GB RAM, but it is strongly recommended to have at least 3GB RAM on the node dedicated to the vHN (Smartgears gHN).
Configuration
No specific configuration is needed for SmartExecutor
Known Issues
- When correctly started, the SmartExecutor publishes a ServiceEndpoint with <Category>VREManagement</Category> and <Name>SmartExecutor</Name>. You can check the availability of the plugin on that resource: there is one <AccessPoint> per plugin.
SmartGenericWorker
HW requirements
The minimal installation requirements for an Executor node with a Generic Worker plugin are a single CPU node with 2GB RAM, but it is strongly recommended to have at least 3GB RAM on the node dedicated to the vHN.
Configuration
The following Software should be installed on the VM:
- R version 2.14.1
with the following components
- coda
- R2jags
- R2WinBUGS
- rjags
- bayesmix
- runjags
Known Issues
- The SmartGenericWorker is exploited by the Statistical Manager service to run distributed computations. Given that the SM uses the root scope to discover instances of the SmartGenericWorker, the plugin must be deployed at root scope level
- To deploy SmartGenericWorker you need to copy the SmartGenericWorker jar-with-dependencies into the $CATALINA_HOME/webapps/smart-executor/WEB-INF/lib/ directory. A container restart is needed to load the new plugin.
- When the container is restarted, the plugin availability can be checked by looking at the Service Endpoint published by the SmartExecutor.
This simple script can help the deployment process.
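#!/bin/bash
# Abort early if CATALINA_HOME is unset, so that rm -rf never runs on a wrong path
: "${CATALINA_HOME:?CATALINA_HOME must be set}"
"$CATALINA_HOME"/bin/shutdown.sh -force
# Remove the old deployment and unpack the new war
rm -rf "$CATALINA_HOME"/webapps/smart-executor*
cp ~/smart-executor.war "$CATALINA_HOME"/webapps/
mkdir "$CATALINA_HOME"/webapps/smart-executor
unzip "$CATALINA_HOME"/webapps/smart-executor.war -d "$CATALINA_HOME"/webapps/smart-executor
# Add the plugin jar, then restart the container
cp ~/smart-generic-worker-*.jar "$CATALINA_HOME"/webapps/smart-executor/WEB-INF/lib/
sleep 5s
"$CATALINA_HOME"/bin/startup.sh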
DTS (ABANDONWARE / DISMISSED)
DTS v2.x
HW requirements
The minimal installation requirements for a DTS node are a single CPU node with 2GB RAM, but it is strongly recommended to have at least 3GB RAM on the node dedicated to the GHN.
Configuration
DTS uses Execution Engine to run the transformations so at least one Execution Engine should be deployed in the same scope as DTS and the related GHNLabels.xml file should contain:
<Variable> <Key>dts.execution</Key> <Value>true</Value> </Variable>
Known Issues
none
DTS v3.x
HW requirements
The minimal installation requirements for a DTS node with a Generic Worker plugin are a single CPU node with 2GB RAM, but it is strongly recommended to have at least 3GB RAM on the node dedicated to the GHN.
Configuration
- The conf file $CATALINA_HOME/conf/infrastructure.properties containing infra and scope information needs to be present
# a single infrastructure
infrastructure=d4science.research-infrastructures.eu
# multiple scopes must be separated by a comma (e.g FARM,gCubeApps)
scopes=Ecosystem
clientMode=false
- The conf file $CATALINA_HOME/webapps/<dts>/WEB-INF/classes/deploy.properties needs to be filled with this info:
hostname = xx
startScopes = xx
port=xx
DTS uses Execution Engine to run the transformations, so at least one Execution Engine should be deployed in the same scope as DTS and the related Smartgears conf file (container.xml) should have this property:
<property name='dts.execution' value='true' />
= Index (DISMISSED)=
Index Service (DISMISSED)
The Index Service is the latest released RESTful service running on Smartgears. It implements both FW and FT index functionalities
HW requirements
Given the co-deployment with ElasticSearch (embedded), a VM with at least 4GB RAM and 2 CPUs is recommended.
The open file limit should also be raised to 32000
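Whether the limit is actually in effect can be verified from the shell of the user that runs the container. The following is a minimal sketch; `nofile_ok` is a hypothetical helper name, and the check only looks at the soft limit of the current shell, which may differ from the limit of a service started by another mechanism.

```shell
#!/bin/sh
# Succeed when the given open-file limit satisfies the required minimum.
# "unlimited" (a value ulimit may print) always satisfies the requirement.
nofile_ok() {
    limit=$1
    required=${2:-32000}
    [ "$limit" = "unlimited" ] || [ "$limit" -ge "$required" ]
}

# Check the soft limit of the current shell.
if nofile_ok "$(ulimit -n)"; then
    echo "open file limit OK"
else
    echo "open file limit too low: $(ulimit -n) (need at least 32000)" >&2
fi
```

How the limit is raised depends on the OS and on how Tomcat is started, so that part is left out of the sketch.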
Configuration
Details on the Index Service configuration are available at https://gcube.wiki.gcube-system.org/gcube/index.php/Index_Management_Framework#Deployment_Instructions
ForwardIndexNode (DISMISSED)
The ForwardIndexNode service needs to be co-deployed with an instance of the Couchbase service
HW requirements
Given the co-deployment with Couchbase, a VM with at least 4GB RAM and 2 CPUs is recommended.
Configuration
The installation of Couchbase should be performed manually and depends on the OS (binary package, rpm, deb).
It is recommended to raise the open file limit on the VM (32000 minimum).
The configuration for the FWIndexNode that should be customized (jndi file):
- couchBaseIP = IP of the server hosting Couchbase ( so the same as the GHN)
- couchBaseUseName = the username set when configuring Couchbase
- couchBasePassword = the password set when configuring Couchbase
Once configured, Couchbase must be initialized using the cb_initialize_node.sh script contained in the service configuration folder.
Known Issues
- Sometimes the cb_initialize_node.sh script fails. This may mean that there is not enough memory to initialize the data bucket; try reducing the value of ramQuota in the jndi file.
= Statistical Manager (DISMISSED) =
Resources
Runtime Resources | Scope | Used by |
DataStorage/StorageManager | VO/VRE | StorageManager |
Database/Obis2Repository | VRE | Trendylyzer |
Database/StatisticalManagerDatabase | INFRA/VO/VRE | Statistical |
Database/AquamapsDB | VO/VRE | Algorithms |
Database/FishCodesConversion | VO/VRE | Algorithms |
Database/FishBase | VO/VRE | Algorithms - TaxaMatch |
DataStorage/Storage Manager | INFRA/VO/VRE | All |
Gis/Geoserver1..n | VRE | Maps Algorithms |
Gis/TimeSeriesDatastore | VO/VRE | Maps Algorithms |
Gis/GeoNetwork | VRE | Maps Algorithms |
Service/MessageBroker | VO | Service |
BiodiversityRepository/CatalogofLife | VO/VRE | Occurrence Algorithms |
BiodiversityRepository/GBIF | VO/VRE | Occurrence Algorithms |
BiodiversityRepository/ITIS | VO/VRE | Occurrence Algorithms |
BiodiversityRepository/WoRDSS | VO/VRE | Occurrence Algorithms |
BiodiversityRepository/WoRMS | VO/VRE | Occurrence Algorithms |
BiodiversityRepository/OBIS | VO/VRE | Occurrence Algorithms |
BiodiversityRepository/NCBI | VO/VRE | Occurrence Algorithms |
BiodiversityRepository/SpeciesLink | VO/VRE | Occurrence Algorithms |
DataAnalysis/Dataminer | VRE | Required if Dataminer is needed in the VRE |
Database/UsersGisTablesDB | VRE | Required if Dataminer and SDI are needed in the VRE |
WS Resources | Scope | Used by |
Workers | INFRA/VO | Parallel Computations |
Generic Resources | Scope | Used by |
ISO/MetadataConstants | VO/VRE | Maps Algorithms |
Known Issues
Tested on ghn 4.0.0 and StatisticalManager service 1.4.0:
- install the SM on the same network where the database and the used resources are located. Otherwise, direct access to those resources cannot be granted without restarting the production databases.
- remove lib axis-1.4.jar from gCore/lib
- replace the library hsqldb-1.8.jar with the library hsqldb-2.2.8.jar in gCore/lib
Additional Installation Steps
- create a suitable R environment[1]
- download the gebco file into /home/gcube/gCore/etc/statistical-manager-service-full-XXX/cfg and rename it to gebco_08.nc
- copy the gcube keys under /home/gcube/gCore/etc/statistical-manager-service-full-XXX/cfg/PARALLEL_PROCESSING
Services and Databases used by the Statistical Manager and Data Analysis facilities
GHN
gcube@statistical-manager1.d4science.org
gcube@statistical-manager2.d4science.org
gcube@statistical-manager3.d4science.org
gcube@statistical-manager4.d4science.org
gcube2@statistical-manager.d.d4science.org
TOMCAT
(root user)
thredds.research-infrastructures.eu
wps.statistical.d4science.org
rstudio.p.d4science.research-infrastructures.eu
geoserver.d4science.org
geoserver2.d4science.org
geoserver3.d4science.org
geoserver4.d4science.org
geoserver-dev.d4science-ii.research-infrastructures.eu
geoserver-dev2.d4science-ii.research-infrastructures.eu
geonetwork.geothermaldata.d4science.org
geonetwork.d4science.org
THIRD PARTY SERVICES
(root user)
rstudio.p.d4science.research-infrastructures.eu (sw rstudio, command: rstudio-server restart)
DATABASES
(root user)
geoserver-db.d4science.org
node49.p.d4science.research-infrastructures.eu
biodiversity.db.i-marine.research-infrastructures.eu
db1.p.d4science.research-infrastructures.eu
db5.p.d4science.research-infrastructures.eu
dbtest.research-infrastructures.eu
dbtest3.research-infrastructures.eu
geoserver.d4science-ii.research-infrastructures.eu
geoserver2.i-marine.research-infrastructures.eu
geoserver-db.d4science.org
geoserver-test.d4science-ii.research-infrastructures.eu
node50.p.d4science.research-infrastructures.eu
node49.p.d4science.research-infrastructures.eu
node59.p.d4science.research-infrastructures.eu
obis2.i-marine.research-infrastructures.eu
statistical-manager.d.d4science.org
WORKER NODES
(gcube2 user)
(production)
node3.d4science.org
node4.d4science.org
node11.d4science.org
node12.d4science.org
node13.d4science.org
node14.d4science.org
node15.d4science.org
node16.d4science.org
node18.d4science.org
node20.d4science.org
node21.d4science.org
node23.d4science.org
node27.d4science.org
node28.d4science.org
node29.d4science.org
node30.d4science.org
node31.d4science.org
node32.d4science.org
node33.d4science.org
node34.d4science.org
node35.d4science.org
node36.d4science.org
node37.d4science.org
node38.d4science.org
node39.d4science.org
(development)
node17.d4science.org
node19.d4science.org
node22.d4science.org
TESTING
Test plan for the Statistical Manager.
SDI / GIS Technologies
This section describes the configuration of gCube Spatial Data Infrastructure (SDI), responsible for handling GIS technologies and (meta)data. It comprises various technologies, both from gCube and from third party developers.
A brief summary :
- gCube technologies
- sdi-service : utility service for the management of SDI configuration in a context
- geonetwork : library for the interaction with GeoNetwork service
- gis-interface : library for the publication of dataset and related metadata
- ws-thredds [Deprecated] : library for the synchronization of a StorageHub folder with Thredds
- Gis-Viewer : GUI for the rendering of layers
- GeoExplorer : GUI for the browsing of metadata in GeoNetwork
- Third-party technologies
- GeoNetwork is used in contexts where ISO Metadata needs to be managed
- GeoServer is used for the registration of GIS datasets in certain formats (e.g. Shape files)
- Thredds is used for the registration of GIS datasets in certain formats (e.g. netcdf)
NB:
In order to handle GIS Technologies, developers should rely on java libraries geonetwork and gis-interface, both distributed under subsystem org.gcube.spatial.data. Please note that lots of scenarios do not involve java gCube libraries, so they directly contact third party services after getting context configuration from sdi-service.
For administration purposes, please note that reports on the current SDI configuration can be obtained by contacting the gCube sdi-service interfaces :
gCube Software
In this section we describe the IS resources needed by specific usages of gCube SDI software.
SDI Service
The aim of this service is to manage the third-party GIS technologies available in the VRE, so it relies on their proper registration. Please refer to #Third party technologies for more details.
geonetwork library
The geonetwork library relies on the presence of a GeoNetwork service in the context. Please refer to #GeoNetwork Service for further details.
Metadata Publication
In cases where geonetwork library is used to generate ISO metadata, the following Generic Resource must be defined in the current context and filled with common/default metadata values.
- Secondary Type : ISO
- Name : MetadataConstants
ISO Metadata are published with a resolver http link generated by "Uri Resolver Manager", so this needs to be configured with a Generic Resource with the following coordinates :
- Secondary Type : UriResolverMap
- Name : Uri-Resolver-Map
gis-interface library
Gis-interface publishes (meta)data in the gCube SDI. It is built on top of #geonetwork library, so that library needs to be properly configured. It also uses a GeoServer service in the context as repository, so such a service should be configured as well.
GeoExplorer
For the GeoExplorer portlet to work properly, you must copy the following resources from the root scope (/d4science.research-infrastructures.eu/) to the VRE where it must run:
- Transect
<Type>RuntimeResource</Type> <Category>Application</Category> <Name>Transect</Name>
- Gis Resolver
https://gcube.wiki.gcube-system.org/gcube/URI_Resolver#GIS_Resolver
<Type>RuntimeResource</Type> <Category>Service</Category> <Name>Gis-Resolver</Name>
- Gis Viewer Application
<Type>GenericResource</Type> <SecondaryType>ApplicationProfile</SecondaryType> <Name>Gis Viewer Application</Name>
and then you must edit the Generic Resource shown here: https://gcube.wiki.gcube-system.org/gcube/URI_Resolver#Generic_Resource_for_Gis_Viewer_Application
Third party technologies
gCube SDI relies heavily on the capabilities of third-party GIS services in order to handle GIS (meta)data. Each of these services has specific configuration needs that should be addressed in provisioning rules, so they go beyond the scope of this page.
In this section we describe what IS resources are needed in a gCube context in order to declare the availability of these services.
gCube SDI software will use these resources for the discovery and further management of these third-party service instances.
GeoNetwork Service
GeoNetwork services are registered in a gCube context with a Service Endpoint with the following coordinates:
- Category : Gis
- Platform/Name : geonetwork
NB The resource is expected to define credentials for admin user under an access point with the following characteristics (you can find more details here):
- Endpoint EntryName = geonetwork
- property priority (integer value)
- property suffixes (leave empty or blank)
GeoServer Service
GeoServer services are registered in a gCube context with a Service Endpoint with the following coordinates:
- Category : Gis
- Platform/Name : GeoServer
NB The resource is expected to define credentials for admin user under an access point with the following characteristics:
- Endpoint EntryName = geoserver
Thredds Service
Thredds services are registered in a gCube context with a Service Endpoint with the following coordinates:
- Category : Gis
- Platform/Name : thredds
=Tabular Data Manager (DISMISSED)=
Each service's operation may need a specific configuration. The following is a list of needed resources per operation module.
Operation View
The module requires GIS Technologies to be already configured in the operating scope. See Gis Technologies.
The module requires also the following Generic Resource :
- Secondary Type : TDMConfiguration
Since the operation needs to put data in a postgis database already connected with Geoserver, a Service Endpoint for such database must be present in the same scope. Constraints for retrieving such Service Endpoint are taken from the Generic Resource described above (values are indicated with their xml Element name as declared in the Generic Resource's body) :
- Category : <gisDBCategory>
- Platform/Name : <gisDBPlatformName>
- AccessPoint/<tdmDataStoreFlag> : true
Resource Catalogue
The steps needed to have a working catalogue running in a given scope, namely a VRE, are the following:
- add the CKAN Services to the scope (this step can be skipped if you use the gCat configuration APIs as described below)
  - add the CKAN Data Catalogue Service Endpoint to the scope (see CKan Data Catalogue instance);
  - add the CKAN Database Service Endpoint to the scope (see CKan Database);
  - (optional) add the Zenodo Service Endpoint (see Zenodo API);
- add the gCat Service to the scope;
  - add the scope to the file d4science-smartgears-services/group_vars/gcat_service_production/gcat_service_production.yml in ansible-playbook
  - run the playbook as follows:
    - ./run.sh gcat.yml -i inventory_production/hosts.openstack_isti_production -l gcat_service -e 'gcube_admin_token=<TOKEN>' -t smartgears_conf
      The problem with this command is that it starts a workflow on conductor for each scope defined in d4science-smartgears-services/group_vars/gcat_service_production/gcat_service_production.yml.
    - The alternative is to invoke this command:
      ./run.sh gcat.yml -i inventory_production/hosts.openstack_isti_production -l gcat_service_production -e "gcube_admin_token=<TOKEN>" -e 'smartgears_conductor_scope=<NEW_SCOPE>' --tags=smartgears_conf
      Please note that:
      - You must provide the input parameter smartgears_conductor_scope with the scope to be added. The scope must also be added to d4science-smartgears-services/group_vars/gcat_service_production/gcat_service_production.yml in ansible-playbook, as explained in the previous step.
      - You must provide the tag smartgears_conf. The playbook role also invokes the conductor with the tag smartgears_conf. This has been added to avoid adding a scope to the node without invoking the conductor role.
      - The conductor invocation runs the add_workspace_client_to_context workflow to enable gCat to interact with the workspace (e.g. for storing resources).
  - configure gCat by /configurations (see gCat Configuration API); this step allows us to avoid adding the resources to the scope as described in the first step (apart from the Zenodo part). The best way to create the configuration is to read the configuration from another VRE which uses the same Ckan instance. The obtained configuration must be copied and changed in the parts where it differs, e.g. the default_organization. Please note you must be Catalogue-Manager to be allowed to create/change the configuration.
- add the CKAN Connector to the scope (see CKAN Connector);
- add the URI Resolver Map Generic Resource to the scope (see Uri-Resolver-Map);
- create the CKan Portlet Generic Resource with the URL hosting the catalogue in the VRE (see CKan Portlet resource)
- (to configure a catalogue at VO or root VO level) configure the DataCatalogueMapScopesUrls Generic Resource with the URL hosting the catalogue in the VRE (see #DataCatalogueMapScopesUrls);
- (automatic) configure the Catalogue Resolver Generic Resource (see Catalogue-Resolver resource);
- (automatic) configure the Catalogue Generic Resource used by the social service (see Catalogue Resource);
- (automatic) configure the News Feed Generic Resource (see #News Feed & Catalogue);
- (optional) define any namespace needed to group extra fields (see Namespaces Resource);
- (optional) define the mappings driving the publishing to Zenodo
- ...
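The single-scope invocation of the playbook described in the steps above can be wrapped in a small function, so that the scope and the tag are never forgotten. This is only a sketch: `run_gcat_for_scope` and the `RUN_SH` override are hypothetical names introduced here and are not part of the playbook. The arguments passed to run.sh are the ones quoted in the list above.

```shell
#!/bin/sh
# Run the gCat playbook for ONE new scope only.
# $1 = gCube admin token
# $2 = scope to add (it must already be listed in gcat_service_production.yml)
# RUN_SH is a hypothetical override, handy for dry runs; it defaults to ./run.sh.
run_gcat_for_scope() {
    [ -n "${1:-}" ] && [ -n "${2:-}" ] || {
        echo "usage: run_gcat_for_scope <TOKEN> <NEW_SCOPE>" >&2
        return 2
    }
    "${RUN_SH:-./run.sh}" gcat.yml \
        -i inventory_production/hosts.openstack_isti_production \
        -l gcat_service_production \
        -e "gcube_admin_token=$1" \
        -e "smartgears_conductor_scope=$2" \
        --tags=smartgears_conf
}
```

Calling it as `RUN_SH=echo run_gcat_for_scope "$TOKEN" /gcube/devsec/devVRE` prints the arguments instead of running the playbook, which allows the command to be reviewed first.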
CKAN Connector
ServiceClass = DataAccess
ServiceName = CkanConnector
This is the service that allows login operations on CKAN to be performed from the Gateways. It runs on SmartGears, so once it is published in the context there is not much left to do. However, it is fundamental.
Generic Resource
The following Generic Resources affect the behaviour of the Catalogue Service.
CkanPortlet: this is the Portlet URL
SecondaryType = ApplicationProfile
Name = CkanPortlet
Description = The url of the gcube-ckan-datacatalog portlet for this scope
The content (body) of the resource has to report the url of the catalogue portlet for this context (VRE), e.g.
<url>https://services.research-infrastructures.eu/group/d4science-services-gateway/data-catalogue</url>
Catalogue-Resolver
SecondaryType = ApplicationProfile
Name = Catalogue-Resolver
Description = Used by Catalogue Resolver for mapping VRE NAME with its SCOPE so that resolve correctly URL of kind: https://[CATALOGUE_RESOLVER_SERVLET]/[VRE_NAME]/[entity_context value]/[entity_name value]
See wiki page at: CATALOGUE_Resolver
NOTE: the resource is automatically updated by the Catalogue Resolver
Catalogue
Update this configuration at ROOT VO level. It is used by the social service to support gCat social notifications/posts properly. Temporary solution: the resource is automatically updated by the Catalogue Portlet when the portlet deployed in a new VRE is accessed.
SecondaryType = ApplicationProfile Name = Catalogue Description = This is the Item Catalogue application profile for alerting items creation in the infrastructure catalogues <Body><AppId>service-account-gcat</AppId>...
The above Generic Resource stored at ROOT VO level must be updated by adding an entry of kind:
<EndPoint> <Scope>[THE SCOPE]</Scope> <URL>[THE PORTLET URL TO THE GATEWAY IN ACT FOR THE SCOPE]</URL> </EndPoint>
e.g. for /d4science.research-infrastructures.eu/D4OS/EOSCPillarServiceRegistry
<EndPoint> <Scope>/d4science.research-infrastructures.eu/D4OS/EOSCPillarServiceRegistry</Scope> <URL>https://eosc-pillar.d4science.org/group/eoscpillarserviceregistry/catalogue</URL> </EndPoint>
for the SCOPE where the gCat has been added
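The entry described above can be produced programmatically. The following is a minimal sketch, assuming the resource body has already been loaded as an XML element; the `<Body>` wrapper used here is a simplified, hypothetical stand-in for the real Generic Resource body, and the helper names `build_endpoint` and `add_endpoint` are illustrative only.

```python
import xml.etree.ElementTree as ET


def build_endpoint(scope: str, url: str) -> ET.Element:
    """Build an <EndPoint> element with <Scope> and <URL> children."""
    endpoint = ET.Element("EndPoint")
    ET.SubElement(endpoint, "Scope").text = scope
    ET.SubElement(endpoint, "URL").text = url
    return endpoint


def add_endpoint(body: ET.Element, scope: str, url: str) -> bool:
    """Append the entry unless the scope is already listed.

    Returns True when a new entry was appended, False otherwise.
    """
    for existing in body.findall("EndPoint"):
        if existing.findtext("Scope") == scope:
            return False
    body.append(build_endpoint(scope, url))
    return True


# Hypothetical, simplified body: the real resource contains more elements.
body = ET.fromstring("<Body><AppId>service-account-gcat</AppId></Body>")
added = add_endpoint(
    body,
    "/d4science.research-infrastructures.eu/D4OS/EOSCPillarServiceRegistry",
    "https://eosc-pillar.d4science.org/group/eoscpillarserviceregistry/catalogue",
)
print(ET.tostring(body, encoding="unicode"))
```

The duplicate check keeps the operation idempotent, so running it twice for the same scope does not add a second entry.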
CKan to Zenodo Mappings
A set of generic resources with SecondaryType = Ckan-Zenodo-Mappings is expected in order to enable the upload to Zenodo. Since each of these generic resources maps a precise CKAN item profile, the required set may vary depending on the VRE. The user requesting the VRE creation is expected to specify the minimum set of these resources to be registered in the context.
DataCatalogueMapScopesUrls
SecondaryType = ApplicationProfile Name = DataCatalogueMapScopesUrls Description = EndPoints that map url to scope for the data catalogue portlet instances
This resource is deployed at root level. It contains a list of "exceptions", i.e. how to manage catalogues at VO or root VO level.
DataCatalogueNamespace
SecondaryType = DataCatalogueNamespace Name = Namespaces Catalogue Categories Description = This resource defines namespaces for the catalogue categories
This resource has been created at root level. For gCat to work properly, it must be added to every scope where gCat is present.
Ckan
The organization to be assigned to the context must be created on Ckan via gCat by using the Create Organization API.
Only a Catalogue-Manager (see Catalogue Roles) can create an organization.
Please note that gCat can create the organization properly only if it has already been added to the context and properly configured.
If you don't create the organization, gCat will not be able to manage items.
You can check whether gCat is properly configured, and what its configuration is, by using the Read Catalogue Configuration API.
Catalogue Badge
Update the following GR at ROOT VO level. It is used by the Catalogue Badge.
SecondaryType = ApplicationProfile Name = DataCatalogueMapScopesUrls Description = EndPoints that map url to scope for the data catalogue portlet instances
You need to add an entry of kind:
<EndPoint> <Scope>[THE SCOPE]</Scope> <URL>https://[GATEWAY-HOSTNAME]/group/[GATEWAY-NAME]-gateway</URL> </EndPoint>
e.g. for /d4science.research-infrastructures.eu/SoBigData/TerritoriAperti
<EndPoint> <Scope>/d4science.research-infrastructures.eu/SoBigData/TerritoriAperti</Scope> <URL>https://territoriaperti.d4science.org/group/territoriaperti-gateway</URL> </EndPoint>
You need to add the above entry for the Gateway https://territoriaperti.d4science.org, where the Catalogue Badge is in action.
The URL https://territoriaperti.d4science.org/group/territoriaperti-gateway is built and used by ckan-util-library to get the VRE SCOPE (i.e. /d4science.research-infrastructures.eu/SoBigData/TerritoriAperti)
String clientURL = gatewaySiteURL+siteLandingPage; String appPerScopeURL = ApplicationProfileScopePerUrlReader.getScopePerUrl(clientURL);
The scope is needed to discover, at VRE level, the property `SOLR_INDEX_ADDRESS` stored in the ServiceEndpoint `CKanDataCatalogue`.
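The URL-to-scope lookup can be illustrated as follows. This is a sketch of the idea only, not the ckan-util-library implementation: it assumes the `<EndPoint>` entries are available inside a hypothetical `<Body>` wrapper, and the function name `scope_per_url` is made up for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified body of the DataCatalogueMapScopesUrls resource.
SAMPLE_BODY = """
<Body>
  <EndPoint>
    <Scope>/d4science.research-infrastructures.eu/SoBigData/TerritoriAperti</Scope>
    <URL>https://territoriaperti.d4science.org/group/territoriaperti-gateway</URL>
  </EndPoint>
</Body>
"""


def scope_per_url(body_xml: str, client_url: str):
    """Return the scope mapped to client_url, or None when no entry matches."""
    root = ET.fromstring(body_xml)
    for endpoint in root.iter("EndPoint"):
        url = (endpoint.findtext("URL") or "").strip()
        if url == client_url:
            return (endpoint.findtext("Scope") or "").strip()
    return None


# The client URL is the gateway site URL followed by the site landing page.
gateway_site_url = "https://territoriaperti.d4science.org"
site_landing_page = "/group/territoriaperti-gateway"
client_url = gateway_site_url + site_landing_page

print(scope_per_url(SAMPLE_BODY, client_url))
```

The match is an exact string comparison, so the URL stored in the resource must be written exactly as the concatenation produces it.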
Catalogue For GRSF
Update this configuration at ROOT VO level. This resource is used only to support GRSF social posts
SecondaryType = ApplicationProfile Name = Catalogue Description = This is the Item Catalogue application profile for alerting items creation in the infrastructure catalogues <Body><AppId>org.gcube.datacatalogue.ProductCatalogue</AppId>...
The above Generic Resource stored at ROOT VO level must be updated by adding an entry of kind:
<EndPoint> <Scope>[THE SCOPE]</Scope> <URL>[THE PORTLET URL TO THE GATEWAY IN ACT FOR THE SCOPE]</URL> </EndPoint>
e.g. for /d4science.research-infrastructures.eu/FARM/GRSF_Admin
<EndPoint> <Scope>/d4science.research-infrastructures.eu/FARM/GRSF_Admin</Scope> <URL>https://i-marine.d4science.org/group/grsf_admin/data-catalogue</URL> </EndPoint>
for the SCOPE where the Catalogue has been added.
Service Endpoint(s)
The following service endpoints are needed by the Service Catalogue to work.
CKanDataCatalogue
Category = Application Name = CKanDataCatalogue Description = A Tomcat Server hosting the ckan data catalogue
Among the other properties of the SE, these should be reported:
- HostedOn (in RunTime) is the url of the ckan instance, e.g. ckan-d4s.d4science.org;
- Username (in AccessData) is the username of the CKAN SYSAdmin;
- Property URL_RESOLVER, whose value is equal to the url of the URI-RESOLVER in the context;
- Encrypted property API_KEY, is the api key of the CKAN SYSAdmin;
- SOCIAL_POST: (true/false) instructs gCat to create the social post in the VRE. If this property is not present, it is assumed to be false. The value can be overridden by the gCat client in the item creation request.
- ALERT_USERS_ON_POST_CREATION: (true/false) instructs gCat to ask the social service to notify users about the generated social post. If this property is not present, it is assumed to be false.
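The defaulting and override rules for the two social properties can be sketched as below. This is an illustration of the rules as stated in the list above, not gCat code; the function names and the plain dictionary used to hold the Service Endpoint properties are assumptions.

```python
def property_as_bool(properties: dict, name: str) -> bool:
    """Read a true/false property; a missing property is treated as false."""
    return properties.get(name, "false").strip().lower() == "true"


def should_create_social_post(properties: dict, client_override=None) -> bool:
    """SOCIAL_POST can be overridden by the client on the creation request."""
    if client_override is not None:
        return client_override
    return property_as_bool(properties, "SOCIAL_POST")


def should_alert_users(properties: dict) -> bool:
    """ALERT_USERS_ON_POST_CREATION has no client-side override in this sketch."""
    return property_as_bool(properties, "ALERT_USERS_ON_POST_CREATION")


# Hypothetical property values, as they might be read from the Service Endpoint.
se_properties = {"SOCIAL_POST": "true"}

print(should_create_social_post(se_properties))
print(should_create_social_post(se_properties, client_override=False))
print(should_alert_users(se_properties))
```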
CKanDatabase
Category = Database Name = CKanDatabase Description = A Postgres Server hosting the ckan database
Among the other properties of the SE, these should be reported:
- HostedOn (in RunTime) is the machine hosting the postgres CKAN uses (e.g. ckan-pg-d4s.d4science.org);
- EndPoint (in AccessPoint) is the machine URL hosting the postgres CKAN uses followed by the port number (e.g., ckan-pg-d4s.d4science.org:5432);
- In AccessData please report the credentials (password must be encrypted) of the user allowed to access the database.
Please note that gCat needs to communicate directly with postgres, hence the gCat host must be allowed to connect in the postgres installation.
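Whether the gCat host can reach the database can be verified with a plain TCP connection attempt. The sketch below assumes the EndPoint value has the `host:port` form shown above; `parse_endpoint` and `can_connect` are hypothetical helper names, and a successful TCP connection does not prove that postgres also accepts the credentials.

```python
import socket


def parse_endpoint(endpoint: str):
    """Split a 'host:port' EndPoint value into (host, port)."""
    host, sep, port = endpoint.strip().rpartition(":")
    if not sep or not host or not port.isdigit():
        raise ValueError(f"expected 'host:port', got {endpoint!r}")
    return host, int(port)


def can_connect(endpoint: str, timeout: float = 5.0) -> bool:
    """Return True when a TCP connection to the endpoint can be opened."""
    host, port = parse_endpoint(endpoint)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


print(parse_endpoint("ckan-pg-d4s.d4science.org:5432"))
# Run this from the gCat host to check network-level reachability:
# can_connect("ckan-pg-d4s.d4science.org:5432")
```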
Zenodo API
Category = Repository Platform.Name = Zenodo
A service endpoint defining the Zenodo API address and credentials is expected in order to enable the "Upload to Zenodo" feature. Credentials may vary depending on the context.
Enable view per VRE
In order to enable this special view (which allows the catalogue portlet to render itself on a single organization), one should access the portal and as administrator enable a special custom field of the VRE. The custom field can be found, on the VRE Page, under "Admin > Pages > Configuration > Site Settings > Custom Field". Set it to true to enable the view.
gCat & SHUB & Catalogue
The GCat_Service must be authorized to operate in the VRE. Add the "gCat" user in the VRE as reported at https://gcube.wiki.gcube-system.org/gcube/StorageHub_REST_API#Add_User_To_Vre (this is a temporary solution; it will be replaced by WORKFLOW).
gCat & Uri-Resolver-Manager
In order to operate properly in a VRE, the GCat_Service uses the Uri-Resolver-Manager, so check that the required GR Uri-Resolver-Map is published at the VRE level.
News Feed & Catalogue
Once you have added all the GR/RR needed to serve a VRE with an instance of the D4Science Catalogue, an entry of the following kind must be added in order to publish social posts via the Social Networking Library:
<EndPoint> <Scope>[THE SCOPE]</Scope> <URL>[THE RELATIVE URL OF SCOPE SAVED IN THE GATEWAY]</URL> </EndPoint>
e.g.
<EndPoint> <Scope>/d4science.research-infrastructures.eu/D4OS/EOSCPillarServiceRegistry</Scope> <URL>/group/eoscpillarserviceregistry</URL> </EndPoint>
into the following Generic Resource:
SecondaryType = ApplicationProfile Name = News Feed
published at ROOT VO level.
News Feed & gCat
see at: DataCatalogueNamespace
Accounting Dashboard & Catalogue
see the wiki page at Add_Google_Analytics_to_the_Accounting_Dashboard
SocialNetworking service
see the wiki page at Social Networking Library
Known Issues
The socialnetworking service must be restarted when liferay is up
GFeed (ABANDONWARE / DISMISSED)
The following is a list of minimal requirements for the execution of gFeed Service.
- Database : the service needs a dedicated DB for its logic and looks in the current context for a DB registered as Service Endpoint with
- Category : Database
- Name : Feeder_DB
- Common configuration : the service loads default plugins configurations from the IS by looking for a Generic Resource registered as
- Secondary type : configuration
- Name : gcat-feeder
The following parameters need to be customized for every context in which the resource is published :
- DATAMINER_ALGORITHMS_COLLECTOR.GUI_BASE_URL : Expected value is the full url of the DataMiner GUI, e.g. https://aginfra.d4science.org/group/aginfraplusdev/data-miner
Please keep in mind that depending on deployed plugins these requirements may not be enough.
GeoPortal
The following instructions are meant in order to configure the "Geoportale Nazionale per l'Archeologia".
Interactions among the engines.
NB : Please note that actual requirements for a specific service instance vary depending on which plugins extensions are deployed in that particular instance.
Health Checks
Three types of health checks are implemented: Service, Mongo, Database. They are exposed via a REST API whose responses comply with the https://microprofile.io/specifications/microprofile-health/ specification.
Service health : checks if the `geoportal` service is up at
https://{geoportal_endpoint}/geoportal-service/srv/health
e.g. In DEV at https://geoportal.cloud-dev.d4science.org/geoportal-service/srv/health
Mongo health : checks if the `geoportal` service is able to communicate with MongoDB instance at
https://{geoportal_endpoint}/geoportal-service/srv/health/mongo?context={GCUBE_CONTEXT} or (to include the collections) https://{geoportal_endpoint}/geoportal-service/srv/health/mongo?context={GCUBE_CONTEXT}&include_collections=true
e.g. In DEV at https://geoportal.cloud-dev.d4science.org/geoportal-service/srv/health/mongo?context=/gcube/devsec/devVRE or (to include the collections) https://geoportal.cloud-dev.d4science.org/geoportal-service/srv/health/mongo?context=/gcube/devsec/devVRE&include_collections=true
Database health : checks if the `geoportal` service is able to communicate with PostGIS instance at
https://{geoportal_endpoint}/geoportal-service/srv/health/database?context={GCUBE_CONTEXT}
e.g. In DEV at https://geoportal.cloud-dev.d4science.org/geoportal-service/srv/health/database?context=/gcube/devsec/devVRE
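The three health URLs can be assembled from the endpoint host and the gCube context, as in the sketch below. The helper names `health_urls` and `is_up` are illustrative. The check on the `status` field follows the MicroProfile Health response format, where the overall status is `UP` or `DOWN`; the actual HTTP call is left out so that the example runs without network access.

```python
import json
from urllib.parse import urlencode


def health_urls(geoportal_endpoint: str, context: str,
                include_collections: bool = False) -> dict:
    """Build the service, mongo and database health check URLs."""
    base = f"https://{geoportal_endpoint}/geoportal-service/srv/health"
    mongo_params = {"context": context}
    if include_collections:
        mongo_params["include_collections"] = "true"
    # safe="/" keeps the slashes of the context readable, as in the examples.
    return {
        "service": base,
        "mongo": f"{base}/mongo?{urlencode(mongo_params, safe='/')}",
        "database": f"{base}/database?{urlencode({'context': context}, safe='/')}",
    }


def is_up(response_body: str) -> bool:
    """Interpret a MicroProfile Health JSON payload."""
    return json.loads(response_body).get("status") == "UP"


urls = health_urls("geoportal.cloud-dev.d4science.org", "/gcube/devsec/devVRE",
                   include_collections=True)
for name, url in urls.items():
    print(name, url)
```

Each URL can then be fetched with any HTTP client and the response body passed to `is_up`.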
Service Requirements
This section states the resources needed by a vanilla geoportal-service instance (meaning no extension is deployed). Specific requirements of plugins extensions are reported in dedicated subsections.
Document Store
Geoportal service relies on a mongoDB instance registered in the gCube IS with the following coordinates :
- Profile/Category : Database
- Profile/Platform/Name : mongodb
- Profile/AccessPoint//Property/Name : GNA_DB
- Profile/AccessPoint//Property/Value : internal-db
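The coordinates above can be read as paths inside the Service Endpoint profile. The following sketch checks a profile against them; the sample XML is a hypothetical, heavily reduced resource (the `<Resource>` and `<Properties>` element names are assumptions), and `matches_coordinates` is an illustrative helper, not part of any gCube library.

```python
import xml.etree.ElementTree as ET

# Hypothetical, reduced Service Endpoint used only to illustrate the paths.
SAMPLE_RESOURCE = """
<Resource>
  <Profile>
    <Category>Database</Category>
    <Platform><Name>mongodb</Name></Platform>
    <AccessPoint>
      <Properties>
        <Property><Name>GNA_DB</Name><Value>internal-db</Value></Property>
      </Properties>
    </AccessPoint>
  </Profile>
</Resource>
"""


def matches_coordinates(resource_xml: str, category: str, platform: str,
                        prop_name: str, prop_value: str) -> bool:
    """Check Profile/Category, Profile/Platform/Name and one AccessPoint property."""
    profile = ET.fromstring(resource_xml).find("Profile")
    if profile is None:
        return False
    if profile.findtext("Category") != category:
        return False
    if profile.findtext("Platform/Name") != platform:
        return False
    # Profile/AccessPoint//Property: any Property nested under an AccessPoint.
    for access_point in profile.findall("AccessPoint"):
        for prop in access_point.iter("Property"):
            if (prop.findtext("Name") == prop_name
                    and prop.findtext("Value") == prop_value):
                return True
    return False


print(matches_coordinates(SAMPLE_RESOURCE, "Database", "mongodb",
                          "GNA_DB", "internal-db"))
```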
Fileset Archive
Geoportal service relies on gCube StorageHub in order to archive registered Filesets. Please refer to the specific section in this page.
UCDs
Current UCD provider implementation relies on a Generic Resource with the following coordinates in order to assess the available UCDs in a VRE :
- Secondary Type : CMS
- Name : UCDs
It is expected to declare links to UCD documents, which are then loaded into the application. Its body must be like the following example :
<UCDs> <record label=.. ucdUrl=... /> <record label=.. ucdUrl=... /> ... </UCDs>
NB : this resource is strictly dependent on the context, since it declares supported projects collections
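A body of this shape can be read as follows. This is a sketch, not the UCD provider implementation: the labels and URLs in the sample are placeholders invented for the example, and `read_ucd_records` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Placeholder labels and URLs: real values depend on the context.
SAMPLE_BODY = """
<UCDs>
  <record label="Example UCD A" ucdUrl="https://example.org/ucds/a.json" />
  <record label="Example UCD B" ucdUrl="https://example.org/ucds/b.json" />
</UCDs>
"""


def read_ucd_records(body_xml: str) -> list:
    """Return (label, ucdUrl) pairs, rejecting records with a missing attribute."""
    records = []
    for record in ET.fromstring(body_xml).iter("record"):
        label = record.get("label")
        ucd_url = record.get("ucdUrl")
        if not label or not ucd_url:
            raise ValueError("each record needs both 'label' and 'ucdUrl'")
        records.append((label, ucd_url))
    return records


for label, ucd_url in read_ucd_records(SAMPLE_BODY):
    print(label, ucd_url)
```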
Service Plugins Requirements
In this section we report specific requirements introduced by plugins. Please note that plugins are optionally deployed in geoportal-service, so the following resources are needed only in specific cases.
SDI Plugins
NB : SDI Plugins exploit gCube SDI Resources available in the current context. In order to do this, the SDI should be properly configured in the context.
In summary :
- SDI Materializer uses a Geoserver instance enabled with gCube Data Transfer service.
- SDI Indexer uses a postgisDB and Geoserver
SDI Materializer
The SDI Materializer handles registered Filesets by creating layers in the context SDI's Geoserver. In order to do this, in most cases gCube Data Transfer service must be enabled in GeoServer. Please refer to the dedicated sections on SDI, GeoServer and gCube Data Transfer for further details.
SDI Indexer
SDI Indexer creates Geoserver layers in the current context SDI, representing centroids of some projects registered in geoportal-service. In order to do this, it needs a postgis database registered in the IS as a Service Endpoint with the following coordinates :
- Profile/Category : Database
- Profile/Platform/Name : postgis
- Profile/AccessPoint//Property/Name : GNA_DB
- Profile/AccessPoint//Property/Value : Concessioni
NB The postgis database will be registered in GeoServer by the plugin, so the database should be reachable from GeoServer.
Please refer to dedicated sections on SDI and GeoServer.
Notification Plugins
The plugin requires a proper configuration in the UCD. Please refer to notifications-plugins to create it.
Other requirements:
1. a service account for geoportal named service-account-geoportal
must be created at VRE level in KC (see more at https://support.d4science.org/issues/27108)
2. the clientId and the secret of the service-account-geoportal
must be registered in the SE with coordinates:
Name: geoportal Category: SystemWorkspaceClient
with the AccessPoint
<AccessPoint> <Description>service account credentials</Description> <Interface> <Endpoint EntryName="geoportal">none</Endpoint> </Interface> <AccessData> <Username>geoportal</Username> <Password>{ADD HERE THE SECRET}</Password> </AccessData> </AccessPoint>
3. in order to send VRE post via social service, the service-account-geoportal
requires a generic resource with coordinates:
Name: Geoportal SecondaryType: ApplicationProfile
and body
<AppId>service-account-geoportal</AppId> <ThumbnailURL>https://data.d4science.org/shub/E_OS9QOE5zcVl6UXJCcEsvUUFhMWFTRXY1OXh6TXhFbEplOERhNGhaZ1RLV1VBblErY3lxQW5RbXMrVEM1WC9UQQ==</ThumbnailURL> <EndPoint> <Scope>{SCOPE}</Scope> <URL>{GEOPORTAL_DATA_ENTY_URL_IN_THE_SCOPE}</URL> </EndPoint>
please refer to https://support.d4science.org/issues/27108
Catalogue Binding Plugins
The plugin requires a proper configuration in the UCD. Please refer to catalogue-binding-plugin to create it.
Other requirements:
1. a service account for geoportal named service-account-geoportal
is required at VRE level in KC (see 1. of the #Notification_Plugins)
2. the service-account-geoportal
must be able to operate with gCat at VRE level, so it must have the role of "Catalogue-Admin". Please assign it via KC.
Mapping from "Geoportal Project" to "Catalogue Dataset" for any UCD
- the plugin uses an Apache FreeMarker (https://freemarker.apache.org/) template to transform a (JSON) Geoportal project into a (JSON) Catalogue Item object. The template link must be added in the plugin configuration;
- a catalogue profile with type="UCD Title" must be created (per UCD) via gCat at VRE level (ex. see https://support.d4science.org/issues/27689#note-2)
Geoportal-Service requires the Geoportal_Resolver
The Geoportal-Service requires the Geoportal_Resolver published at VRE level. See the Geoportal_Resolver dependencies at Geoportal_Resolver or in the related ticket.
NB. The Geoportal_Resolver requires:
- the "URI-Resolver" (gCoreEndpoint) published at VO level with option "authorizeChild"
- the Runtime Resource named "HTTP-URL-Shortener-DL" at VRE level
Export (as PDF) Requirements
The Geoportal system provides the Export as PDF facility if it is properly configured in the scope. In order to configure it, see:
- https://support.d4science.org/issues/27308
- https://support.d4science.org/issues/26027
- https://support.d4science.org/issues/26062
GUI Requirements
The SE with the following coordinates has to be added in the proper VRE:
<Category>Service</Category> <Name>HTTP-URL-Shortener-DL</Name>
For geoportal-data-viewer-app:
The Service Endpoints with the following coordinates have to be added in the proper VRE.
1 - It is used by GNA Viewer to retrieve the list of base maps that should be displayed in the Viewer:
<Category>Application</Category> <Name>GNABaseMaps</Name>
2 - It is used by GNA Viewer to contact the Geoportal Service with guest/public access (from out of portal, no login required).
<Category>SystemClient</Category> <Name>geoportal-data-viewer-app</Name> <Description>IAM Client for geoportal-data-viewer-app</Description>
Generic Resources
For geoportal-data-entry-app:
1. All Generic Resources with
<SecondaryType>GeoNaMetadata</SecondaryType>
or
<SecondaryType>GeoportalMetadata</SecondaryType>
must be copied in the proper VRE. They are used by 'geoportal-data-entry-app' portlet (e.g. 'GNA-Data-Entry' in the context of GNA) to build dynamically the web-forms for data entries.
2. The Generic Resource with coordinates:
SecondaryType: ApplicationProfile Name: Geoportal-DataEntry-Configs
must be copied in the proper VRE. It is used by 'geoportal-data-entry-app' portlet (e.g. 'GNA-Data-Entry' in the context of GNA) to read the configurations: (i) the permissions on the operations for the roles (Data-Member, Data-Editor, Data-Manager), (ii) list of fields used by the searching facility.
For geoportal-data-viewer-app:
3. The Generic Resource (renamed from GeoNa-Viewer-Profile) with the following coordinates:
<SecondaryType>ApplicationProfile</SecondaryType> <Name>Geoportal-DataViewer-Configs</Name> /*having in the body the following AppId*/ <AppId>geoportal-data-viewer-app</AppId>
has to be copied in the proper VRE. It is used by the 'geoportal-data-entry-app' and 'geoportal-data-viewer-app' portlets to read several configurations: (i) common info like portlet URLs in the VRE, (ii) the URL of the centroid layer/s, (iii) list of fields used by the searching facility, and so on.
4. The Generic Resource named "Namespaces Catalogue Categories" must be added in the proper VRE; it is required for the Metadata Form Builder:
see at https://gcube.wiki.gcube-system.org/gcube/ServiceManager_Guide#DataCatalogueNamespace
Resolvers
These are the resources that must be updated when changing the URI-Resolver balancer and/or its hostname:
ServiceEndpoints:
<Category>Service</Category> <Name>HTTP-URI-Resolver</Name>
<Category>Service</Category> <Name>Gis-Resolver</Name>
<Category>Application</Category> <Name>Transect</Name>
<Category>Application</Category> <Name>CKanDataCatalogue</Name>
<Category>Service</Category> <Name>Analytics-Resolver</Name>
Generic Resources:
<SecondaryType>ApplicationProfile</SecondaryType> <Name>Workspace-Explorer-App</Name>
<SecondaryType>ApplicationProfile</SecondaryType> <Name>Gis Viewer Application</Name>