Difference between revisions of "Data Assessment, Harmonisation, and Certification Facilities"

From Gcube Wiki

Revision as of 17:29, 18 May 2012

Overview

gCube is a software suite equipped with a rich array of services capable of interfacing with data sources that differ both in the types of data they offer (e.g. document, statistical, biodiversity, and semantic data; see Data Access and Storage Facilities) and in the heterogeneity of data belonging to the same type.

The goal of the Data Assessment, Harmonisation, and Certification Facilities is to deal with the above heterogeneity and provide unified views over diverse data items through a number of dedicated services. To meet this goal a number of components have been designed.

This page outlines the design rationale and high-level architecture of these components.

Key Features

The components of this subsystem provide the following key features:

workflow-oriented tabular data manipulation
user-driven definition and execution of workflows composed of data manipulation steps
comprehensive reference-data management support
uniform model for reference-data representation including versioning and provenance
support for data curation and enrichment
mechanisms for enriching species occurrence data with environmental data dynamically acquired by data providers
data provenance tracking
standard-based data presentation
OGC standard-based Geospatial data presentation
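
The workflow-oriented tabular data manipulation feature listed above can be illustrated with a minimal sketch. It assumes that a table is a list of dictionaries and that each manipulation step is a plain function from table to table. The names `Step`, `trim_values`, `drop_empty`, and `run_workflow` are hypothetical and are not part of any gCube API.

```python
from typing import Callable, Dict, List

Row = Dict[str, str]
Table = List[Row]
Step = Callable[[Table], Table]  # hypothetical step type, not a gCube API


def trim_values(table: Table) -> Table:
    """Strip surrounding whitespace from every cell."""
    return [{k: v.strip() for k, v in row.items()} for row in table]


def drop_empty(column: str) -> Step:
    """Build a step that removes rows whose `column` is empty."""
    def step(table: Table) -> Table:
        return [row for row in table if row.get(column)]
    return step


def run_workflow(table: Table, steps: List[Step]) -> Table:
    """Apply the user-defined steps in order, feeding each output to the next step."""
    for step in steps:
        table = step(table)
    return table


# A user-defined workflow is just an ordered list of steps.
workflow = [trim_values, drop_empty("species")]
raw = [
    {"species": " Gadus morhua ", "count": "3"},
    {"species": "  ", "count": "1"},
]
cleaned = run_workflow(raw, workflow)
```

Because every step returns a new table, the input data is left untouched, which makes it easier to record what each step did.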

*** PREVIOUS FEATURES ***

Pluggable data consumption services
the data consumption services follow a plugin architecture in which each plugin provides access to one type of data source
Component reuse
Components are designed to be reused in different services
Geospatial data production in OGC standard format
Geospatial data can be returned to invoking clients in an OGC standard format
Species occurrence point enrichment, harmonization, and reconciliation
Species occurrence points can be associated with environmental data. Moreover, points coming from different data sources can be merged.
Single access point for geospatial data retrieval
A single access point service can be used to retrieve geospatial data.
General purpose tabular data processing
A set of services provides a tabular data flow mechanism, together with a set of components for tabular data visualization.
Data mining operations on species data
Occurrence data can be processed and clustered, and hidden information can be extracted by means of data mining operations.
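
The plugin architecture mentioned for the data consumption services can be pictured with a minimal sketch. It assumes a service that keeps one plugin per type of data source and delegates each query to the matching plugin. The names `DataSourcePlugin`, `InMemoryOccurrencePlugin`, and `ConsumptionService` are hypothetical and do not reflect the actual gCube interfaces.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class DataSourcePlugin(ABC):
    """Hypothetical plugin interface: one plugin per type of data source."""

    @abstractmethod
    def fetch(self, query: str) -> List[dict]:
        ...


class InMemoryOccurrencePlugin(DataSourcePlugin):
    """Toy plugin backed by a hard-coded list, standing in for a real source."""

    def __init__(self, records: List[dict]) -> None:
        self._records = records

    def fetch(self, query: str) -> List[dict]:
        # Return the records whose species name matches the query exactly.
        return [r for r in self._records if r["species"] == query]


class ConsumptionService:
    """Hypothetical service that dispatches queries to registered plugins."""

    def __init__(self) -> None:
        self._plugins: Dict[str, DataSourcePlugin] = {}

    def register(self, source_type: str, plugin: DataSourcePlugin) -> None:
        self._plugins[source_type] = plugin

    def query(self, source_type: str, query: str) -> List[dict]:
        if source_type not in self._plugins:
            raise KeyError(f"no plugin registered for {source_type!r}")
        return self._plugins[source_type].fetch(query)


service = ConsumptionService()
service.register("occurrence", InMemoryOccurrencePlugin([
    {"species": "Gadus morhua", "lat": 60.1, "lon": 5.3},
    {"species": "Salmo salar", "lat": 58.9, "lon": 5.7},
]))
hits = service.query("occurrence", "Gadus morhua")
```

Supporting a new type of data source then only requires registering another plugin; the service itself does not change.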

Main Components

Tabular Data
this family of components provides:
Time Series
this family of components provides:
  • Time Series: a service for performing assessment and harmonization on time series.
  • Codelist Manager: a library for performing import, harmonization and curation on code lists.
Biodiversity Data
this family of components provides: