Data Sources

CQL-enabled Data Sources

The information hosted in a gCube infrastructure is partitioned among different Data Sources. All Data Sources build on a common interface, defined by the Contextual Query Language (CQL), and are able to execute CQL queries. For example, consider the CQL query:

 Q : '((title any "marine pollution") AND (type = "report")) NOT (location within "Europe")'

The results of Q are the report documents whose title contains at least one of the words "marine" or "pollution" and whose location is not within Europe. Each Data Source implements only a subset of the CQL standards defined in the CQL Context Set 1.2. Depending on the technology underlying a Data Source, implementing some of these standards efficiently is impractical. Consequently, different Data Sources support different CQL standards.
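To make the semantics of Q concrete, the following minimal sketch evaluates it against a few in-memory records. Everything here is an assumption made for illustration: the sample documents, the helper names `title_any` and `matches_q`, and the treatment of `within` as a plain comparison against a region label. It does not reflect how any gCube Data Source actually evaluates CQL.

```python
# Hypothetical sample documents; the field names mirror the indices used in Q.
docs = [
    {"id": 1, "title": "Marine pollution in the Pacific", "type": "report", "location": "Asia"},
    {"id": 2, "title": "Marine pollution in the Baltic", "type": "report", "location": "Europe"},
    {"id": 3, "title": "Marine biology survey", "type": "article", "location": "Asia"},
    {"id": 4, "title": "Air pollution trends", "type": "report", "location": "Africa"},
]


def title_any(doc, words):
    """CQL `any`: true if the title contains at least one of the given words."""
    tokens = set(doc["title"].lower().split())
    return any(word.lower() in tokens for word in words.split())


def matches_q(doc):
    """(title any "marine pollution") AND (type = "report") NOT (location within "Europe")."""
    # Assumption: `within` is simplified to an equality test on a region label.
    in_europe = doc["location"] == "Europe"
    return title_any(doc, "marine pollution") and doc["type"] == "report" and not in_europe


result_ids = [doc["id"] for doc in docs if matches_q(doc)]
print(result_ids)  # [1, 4]
```

Document 2 is excluded by the NOT clause, and document 3 is excluded because it is not a report, even though its title matches.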

During the planning process, the Search Planner must determine which Data Sources need to be involved and what each contributes to the final result of the query. To do so, it takes into account the CQL capabilities of each Data Source and the information it hosts. The search space explored by the Planner comprises all queries equivalent to the initial one, together with their corresponding alternative plans. These alternatives may involve different sets of Data Sources but still produce the same final result for the initial query.
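The capability check described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the `DataSource` record, the source names, their capability sets, and the functions `candidates` and `alternative_plans` are all hypothetical and are not part of the gCube Search Planner API. Each clause of Q is matched against the sources that host the index and support the relation, and every combination of eligible sources is one alternative plan.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class DataSource:
    name: str
    relations: frozenset  # CQL relations the source can evaluate (assumed)
    indices: frozenset    # fields the source hosts (assumed)


# The three search clauses of Q, written as (index, relation) pairs.
CLAUSES = [("title", "any"), ("type", "="), ("location", "within")]


def candidates(sources, clauses):
    """Map each clause to the names of the sources able to evaluate it."""
    return {
        (index, relation): [
            s.name for s in sources
            if index in s.indices and relation in s.relations
        ]
        for index, relation in clauses
    }


def alternative_plans(candidate_map):
    """Enumerate every assignment of one eligible source per clause."""
    clauses = list(candidate_map)
    return [
        dict(zip(clauses, choice))
        for choice in product(*(candidate_map[c] for c in clauses))
    ]


# Hypothetical sources with made-up capabilities.
sources = [
    DataSource("text_source", frozenset({"any", "all", "="}), frozenset({"title", "type"})),
    DataSource("geo_source", frozenset({"within"}), frozenset({"location"})),
    DataSource("catalog_source", frozenset({"="}), frozenset({"title", "type"})),
]

plan_candidates = candidates(sources, CLAUSES)
plans = alternative_plans(plan_candidates)
print(len(plans))  # 2
```

In this toy setup only the `type = "report"` clause can be served by more than one source, so two alternative plans exist. A real planner would additionally compare the alternatives, for example by estimated cost, which this sketch leaves out.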

There are currently four types of CQL-enabled Data Sources in the gCube System. The following sections describe, for each type of Data Source, the design choices and the approach followed in implementing the CQL standards: