GCube Web Specifications and Standards Compliance

From Gcube Wiki
Revision as of 09:04, 27 September 2012 by Rena.tsantouli (Talk | contribs)

Jump to: navigation, search

This area collects the Standard Specifications supported by the gCube system APIs, as part of the WP11 activities and towards meeting the integration and interoperability objectives for promoting the openness of the e-Infrastructure to other neighbouring and external ones. The collection focuses on the widely used, HTTP-based Specifications and generic interchange protocols (data/content standards, metadata standards, Web interface standards, security standards, data sharing protocols) that service both disseminating and consuming system's needs. This analysis is conducted per functional category and addresses the use, need and relevance of the standards that fall under each gCube functional area.

Table of Protocols

Specification label Functional area Direction Adoption Status
OAI-PMH (Producer) Data Consumption Producer Completed
OAI-ORE (Producer) Data Consumption Producer Completed
OpenSearch (Consumer) Data Consumption Consumer Completed
OpenSearch (Producer) Data Consumption Producer Completed
SRU (Consumer) Data Consumption Consumer Planned


  • Functional areas: Data Consumption / Data Production / Computation Consumption / Infrastructure Management
  • Direction: Producer / Consumer
  • Adoption Status: Completed / On going / Planned

OAI-PMH

Specification Description

The Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) is a well-established standard in the content management and library science worlds that is gaining in importance. It provides a low-barrier mechanism for repository interoperability and defines the following parties and software components:

  • Data Providers are repositories that expose structured metadata via OAI-PMH. A 'Data Provider' such as an academic library runs a Repository that supports OAI-PMH as a means of exposing metadata information about resources, for instance academic publications.
  • Service Providers then make OAI-PMH service requests to harvest that metadata. A 'Service Provider' uses Harvester software to harvest metadata from Data Providers. The harvested metadata can then be used to provide valued-added services, such as a website that allows browsing and searching through their catalog.

OAI-PMH is a set of six verbs or services that are invoked within HTTP. An implementation of OAI-PMH must support representing metadata in Dublin Core, but may also support additional representations.

gCube Use/Need/Relevance

Through OAI-PMH protocol, gCube infrastructure acts as a 'Data Provider' and disseminates the hosted metadata records in a standard fashion, thus allowing for interoperation with other data e-Infrastructures that run autonomously. Other infrasturctures can harvest the metadata descriptions of gCube content in archives so that their services can exploit the collections. The protocol provides an application-independent interoperability framework for metadata exchange between the online parties.

Functional Category

Data Consumption

Direction

  • Producer

gCube Adoption Status

The protocol has already been integrated in the gCube system, from the 'Data Provider' perspective. The description of the adopted methodology towards the integration is described here.

Related gCube components

  • aslHttp OAI_PMH: the http front end for the protocol
  • applicationSupportLayer_OAI_PMH: business logic back-end component for the protocol

References


OAI-ORE (Producer)

Specification Description

Open Archives Initiative Object Reuse and Exchange (OAI-ORE) defines standards for the description and exchange of aggregations of Web resources. These aggregations, sometimes called compound digital objects, may combine distributed resources with multiple media types including text, images, data, and video. The goal of these standards is to expose the rich content in these aggregations to applications that support authoring, deposit, exchange, visualization, reuse, and preservation. Although a motivating use case for the work is the changing nature of scholarship and scholarly communication, and the need for cyberinfrastructure to support that scholarship, the intent of the effort is to develop standards that generalize across all web-based information.

In the physical world we create, use, and refer to aggregations of things all the time. We collect pictures in a photo album, read journals that are collections of articles, and burn CDs of our favorite songs. This practice of aggregating extends to the Web. We accumulate URL's in bookmarks or favorites lists in our browser, collect photos into sets in popular sites, browse over multiple page documents that are linked together through "prev" and "next" tags, and talk about Web sites as if they had some real existence beyond the set of pages of which they consist. Despite our frequent use of these aggregations, their existence on the Web is quite ephemeral. One reason for this is that there is no standard way to identify an aggregation. We often use the URI of one page of an aggregation to identify the whole aggregation. For example, we use the URI of the first page of a multi-page Web document to identify the whole document, or we use the URI of the HTML page that provides access to a collection of images to identify the entire set of images. But those URIs really just identify those specific pages, and not the union of pages that makes up the whole document, or the union of all images in a Flickr set, respectively. In essence, the problem is that there is no standard way to describe the constituents or boundary of an aggregation, and this is what OAI-ORE aims to provide.

gCube Use/Need/Relevance

gCube Content Model aims to provide high-level functionality for manipualtion of content over the Grid-based environments. Content in gCube is stored and organized following a graph-based data model, the Information Object Model, that allows finer control of content, by incorporating the possibility to annotate content with arbitrary properties an to relate different content unities via arbitrary relationships.

Starting from this model a document model has been built, in which complex documents, composed of various, eventually nested subparts, are represented as chains of Information Objects linked via appropriate relationships. For instance, an HTML document that includes a number of images may be modelled as a complex object that provides references to Information Objects (containing the images). In this respect, gCube documents are managed as compound objects comprising metadata, annotations, alternative representations and multiple parts. The notion of gCube documents is implemented and mangaged by the gCube Information Organisation Services family of subsystems that include storage services, access services, plugins and a number of distinguished clients that can be internal or external to the system.

The aggregated information that constructs a gCube document can be transfered through the solution provided by OAI-ORE, without the need for clients to rely on the API's of the individual system architectures and their definition of document boundaries. The gCube ORE Provider allows the dissemination of the digital objects stored in gCube repository as OAI-ORE Resource Maps.

Functional Category

Data Consumption

Direction

Producer

gCube Adoption Status

The protocol has been recently integrated in the gCube System, from the producer perspective. The description of the adopted methodology towards the adoption and implementation is described here.

Components affected / relevant

  • aslHttp ORE_Provider: the http front end for the protocol
  • applicationSupportLayer_OAI_ORE: business logic back-end component for the protocol

References


Protocol XX

Specification Description

Description and useful information about the Specification.

gCube Use/Need/Relevance

Describe the use/need/relevance of the specification in respect to the functional area of the system.

Functional Category

The functional category under which the services underlying the protocol fall.

Direction

The direction towards the system (Producer/consumer), along with any information to clarify the perspective of the interpretation as one or the other or both, if needed

gCube Adoption Status

Information about status of adoption of Specification within Our system. Whether the specification has already been integrated and supported within the system, or it is under implementation, or soon to be implemented.

Components affected / relevant

  • component X: role

References

Tentative Compliance

Add here specifications that are not there, neither the project commits yet into supporting them, along with the need and relevance.

  • LDAP: Support integration of infrastructure structure with other systems (e.g. harvesting external infrastructure resources, or publishing D4Science infrastructure resources ).
  • WSDM: Provide standard's compliant web API for infrastructure management.