GCore Based Information System Specification
Overview
The Information System (IS) is the core subsystem connecting producers and consumers of resources. It acts as a registry of the infrastructure, offering global and partial views of its resources and their current status, together with notification instruments.
The approach taken by the IS strongly supports the dynamic deployment capabilities and the interoperability solutions offered by the Resource Management facilities.
Key features
- Resource Publication, Access and Discovery
  - the IS is the connecting point among the resources of the e-Infrastructure
- Consistency with the new Resource Model
  - the IS grants publication of and access to resources compliant with the Resource Model (2nd generation)
- Production level QoS - Responsiveness
  - each query served in milliseconds, thousands of queries served each hour
- Production level QoS - Scalability
  - infrastructures with more than 10K resources successfully powered
- Production level QoS - Permanent and Uninterrupted Functioning
  - IS instances have been continuously up for more than one year without human intervention
- Support to Standards - WS-DAIX Specification v1.0
  - full implementation of WS-DAIX v1.0, a widely accepted standard defining a set of data access interfaces for XML data resources
- Support to Standards - XQuery 1.0
  - resource discovery can be performed through expressions compliant with XQuery 1.0
- Support to Standards - WS-Notifications
  - consumers of resources can subscribe to the IS to receive WS-Notifications about any change that occurred in the resources they are interested in
- Flexible deployment scenarios
  - IS components can be deployed in several ways, to best fit the needs of an infrastructure or a specific VO
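To make the discovery feature more concrete, the following minimal sketch filters a handful of XML resource profiles in Python. It is not the IS-Client API: the profile layout (`Resource`, `ID`, `Type`, `Status`), the sample values and the `discover` helper are assumptions made for illustration, and the limited XPath support of the Python standard library stands in for a real XQuery 1.0 engine.

```python
import xml.etree.ElementTree as ET

# Hypothetical resource profiles; the element names are illustrative only
# and do not reproduce the actual gCube Resource Model.
PROFILES = """
<Profiles>
  <Resource><ID>r1</ID><Type>node</Type><Status>ready</Status></Resource>
  <Resource><ID>r2</ID><Type>service</Type><Status>ready</Status></Resource>
  <Resource><ID>r3</ID><Type>node</Type><Status>down</Status></Resource>
</Profiles>
"""


def discover(xml_text, resource_type, status):
    """Return the IDs of the profiles matching the given type and status."""
    root = ET.fromstring(xml_text)
    # Chained child-text predicates are part of ElementTree's XPath subset.
    path = f"./Resource[Type='{resource_type}'][Status='{status}']"
    return [resource.findtext("ID") for resource in root.findall(path)]


ready_nodes = discover(PROFILES, "node", "ready")
print(ready_nodes)
```

An XQuery engine could express the same filter as a FLWOR expression evaluated on the server side, so that only the matching profiles travel back to the client.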
Design
Philosophy
The IS has been designed and implemented to:
- rely on standards
- maximize distribution and support replication wherever possible
- abstract clients from the deployment scenario
A central role of the Information System is also to publicly expose resources and connect them to their consumers. A consistent Resource Model was created at the beginning of gCube development and served for many years as a solid basis for the gCube core facilities. With the increasing openness of the system, a second generation of the model has been shaped and is being integrated into the IS.
Architecture
To deliver the required quality of service and performance and to handle growing amounts of information (scalability), the Information System is composed of a set of Web Services and client libraries.
Together they deliver the following functionalities with respect to the information handled:
- production and publication
- collection, indexing and storage
- discovery and consumption
The components belonging to the production and publication phase are:
- IS-Registry: this service exposes an API for publishing/un-publishing profiles of resources compliant with the Resource Model of both first and second generation;
- IS-Notifier: this service builds on top of WS-Notification to deliver notifications about changes occurring in the resources registered in the IS-Registry; it also supports other services in subscribing/unsubscribing to topics produced by the various Services; by decoupling the actual producer of a topic from its actual consumer, it allows producers to be relocated;
- IS-gLiteBridge: this service publishes and unpublishes resources gathered from a gLite based infrastructure that gCube services may access;
- IS-Publisher: a library available to gCube services for publishing/un-publishing information in the IS
The component supporting the collection, indexing and storage phase is:
- IS-InformationCollector: a service that collects and makes available information related to the actual state of the gCube infrastructure and/or of an assigned subset of it; it exposes APIs compliant with WS-DAIX for feeding and then accessing indexed resources
The components supporting the discovery and consumption phase are:
- IS-Client: a library available to gCube services for discovering information published in the IS
- IS-Notification: a library available to gCube services that provides a publication/subscription/notification mechanism, compliant with WS-Notification, for Topics produced and consumed by any actor of the infrastructure
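The decoupling between topic producers and consumers described for the IS-Notifier can be sketched with a small in-memory broker. This is only an illustration of the brokering idea: the `TopicBroker` class, its method names and the topic name are hypothetical, and no WS-Notification messages are exchanged.

```python
from collections import defaultdict


class TopicBroker:
    """Hypothetical in-memory stand-in for the brokering role of IS-Notifier."""

    def __init__(self):
        self._subscribers = defaultdict(list)  # topic -> list of callbacks

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def unsubscribe(self, topic, callback):
        if callback in self._subscribers[topic]:
            self._subscribers[topic].remove(callback)

    def notify(self, topic, message):
        """Deliver the message to every subscriber; return how many got it."""
        callbacks = list(self._subscribers[topic])
        for callback in callbacks:
            callback(topic, message)
        return len(callbacks)


broker = TopicBroker()
received = []


def consumer(topic, message):
    received.append((topic, message))


broker.subscribe("resource-updated", consumer)
# The producer only talks to the broker, so it can move to another host
# without the consumer having to subscribe again.
delivered = broker.notify("resource-updated", "profile r1 changed")
broker.unsubscribe("resource-updated", consumer)
after_unsubscribe = broker.notify("resource-updated", "profile r1 changed again")
```

Because consumers hold a reference to the broker rather than to the producer, relocating a producer changes nothing on the consumer side.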
Deployment
As for the client libraries, the deployment scheme is trivial: they reside on each node of the infrastructure equipped with the gCore platform, in order to allow hosted services to interact with the IS services. Their main role is in fact to hide the actual deployment scenario of the services, which can be distributed and replicated across many hosts in various ways.
The distribution criterion is the scope, meaning that each of the IS services can manage a single scope or aggregate multiple scopes. Each service can also adopt a different distribution policy: for instance, an IS-InformationCollector may work in scopes A and B, while two different IS-Registry instances independently manage A and B. Hence, the number of possible distribution schemes grows with the complexity of the infrastructure.
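The example in the paragraph above can be written down as a small lookup table. This is a sketch only: the instance names and the `instances_for_scope` helper are hypothetical, and real scopes are not single letters.

```python
# Hypothetical deployment map: service instance -> scopes it manages.
DEPLOYMENT = {
    "IS-InformationCollector-1": {"A", "B"},  # aggregates both scopes
    "IS-Registry-1": {"A"},                   # manages scope A only
    "IS-Registry-2": {"B"},                   # manages scope B only
}


def instances_for_scope(deployment, scope):
    """Return the sorted names of the instances that manage the given scope."""
    return sorted(name for name, scopes in deployment.items() if scope in scopes)


scope_a = instances_for_scope(DEPLOYMENT, "A")
scope_b = instances_for_scope(DEPLOYMENT, "B")
print(scope_a)
print(scope_b)
```

A client library hiding the deployment scenario would perform a lookup of this kind on behalf of the hosted service.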
The replication criteria are the type of the resources handled and, again, the scope. Some IS services may be configured to accept only certain resources (such as nodes) and others configured for different resources. What matters is that, taken together, the available services cover all the resource types. Replication, however, is supported only by IS-InformationCollector and IS-Registry; IS-gLiteBridge and IS-Notifier do not support it.
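The two rules stated above (every resource type must be covered, and only some services may be replicated) lend themselves to a simple sanity check. The sketch below assumes a single scope; the resource type names other than "node" and both helper functions are hypothetical.

```python
from collections import Counter

# Services that support replication, as stated in the text above.
REPLICABLE = {"IS-InformationCollector", "IS-Registry"}


def uncovered_types(required_types, instances):
    """Return the resource types that no instance is configured to accept."""
    covered = set()
    for _service, accepted in instances:
        covered |= set(accepted)
    return set(required_types) - covered


def illegal_replicas(instances):
    """Return the services deployed more than once although not replicable."""
    counts = Counter(service for service, _accepted in instances)
    return sorted(s for s, n in counts.items() if n > 1 and s not in REPLICABLE)


# Hypothetical resource types and instances, each as (service, accepted types).
REQUIRED = {"node", "service", "collection"}
INSTANCES = [
    ("IS-Registry", {"node"}),
    ("IS-Registry", {"service", "collection"}),
    ("IS-InformationCollector", {"node", "service", "collection"}),
]

missing = uncovered_types(REQUIRED, INSTANCES)
bad = illegal_replicas(INSTANCES + [("IS-Notifier", set()), ("IS-Notifier", set())])
```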
As for temporal constraints, IS-InformationCollector has to be deployed first, then IS-Registry and finally IS-Notifier and IS-gLiteBridge (these two in no particular order).
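The ordering above can be encoded as "must be deployed after" edges and resolved with a topological sort. This is a sketch using the Python standard library (`graphlib`, available from Python 3.9); the `DEPLOY_AFTER` mapping is an assumption that simply restates the sentence above.

```python
from graphlib import TopologicalSorter  # Python 3.9+

# Each service maps to the set of services that must be deployed before it.
DEPLOY_AFTER = {
    "IS-InformationCollector": set(),
    "IS-Registry": {"IS-InformationCollector"},
    "IS-Notifier": {"IS-Registry"},
    "IS-gLiteBridge": {"IS-Registry"},
}

# static_order() yields one valid order; the relative position of
# IS-Notifier and IS-gLiteBridge is not constrained.
deployment_order = list(TopologicalSorter(DEPLOY_AFTER).static_order())
print(deployment_order)
```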
As a final remark, all the IS services must be hosted on dedicated nodes, i.e. no other service (apart from the Resource Management services working at node level) may be co-deployed with them.
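The dedicated-node rule can be checked mechanically against a planned layout. The sketch below reads the rule as allowing several IS services on the same node while forbidding any unrelated service next to them; the host names, `NodeManager` and `SearchService` are hypothetical placeholders, not actual gCube service names.

```python
IS_SERVICES = {
    "IS-InformationCollector",
    "IS-Registry",
    "IS-Notifier",
    "IS-gLiteBridge",
}


def violations(layout, allowed_companions):
    """Return {node: offending services} for nodes breaking the rule."""
    result = {}
    for node, services in layout.items():
        if services & IS_SERVICES:  # the node hosts at least one IS service
            extra = services - IS_SERVICES - allowed_companions
            if extra:
                result[node] = sorted(extra)
    return result


# Hypothetical name for a Resource Management service working at node level.
RM_NODE_SERVICES = {"NodeManager"}

# Hypothetical layout: node -> services deployed on it.
LAYOUT = {
    "host-1": {"IS-InformationCollector", "NodeManager"},
    "host-2": {"IS-Registry", "IS-Notifier"},
    "host-3": {"IS-gLiteBridge", "SearchService"},
    "host-4": {"SearchService"},
}

problems = violations(LAYOUT, RM_NODE_SERVICES)
print(problems)
```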
Large deployment
To obtain a balanced trade-off between scalability and resource consumption, a scheme with IS instances distributed at VO level and at infrastructure level gives the best results in most cases. VREs are usually adequately served by the instances at VO level. Regarding co-deployment of IS services, IS-InformationCollector and IS-Registry are heavily contacted services and compete for the container's threads serving incoming calls; therefore, they perform best when deployed on different hosts. IS-Notifier is a less stressed service that might be co-deployed with the IS-Registry to reduce the number of exclusively dedicated nodes.
Small deployment
To stay conservative in terms of resource consumption, a single instance of all the IS services may be deployed. How such a scheme affects the responsiveness of the system depends on how many resources compose the infrastructure.
Alternative deployment schemes may aggregate several VOs in the same IS instances.
Use Cases
The subsystem has been conceived to support a number of use cases; moreover, it will be used to serve a number of scenarios. This area will collect these "success stories".
Well suited Use Cases
The Information System has long been used to serve the purposes of the e-infrastructure. Producers and consumers of resources belonging to VOs and VREs with thousands of resources have been successfully connected through it over the years.
Because of the adoption of widely recognized standards, the IS is today an open system that can be exploited even by components that are not native to gCube.
The flexibility of its deployment solution offers great opportunities for (re-)configuration towards the optimal scheme.
Less well suited Use Cases
Resource discovery is still only partially compatible with WS-DAIX: it is close to the specification, but not 100% compatible. Therefore, clients that want to query the IS-InformationCollector on their own (rather than through the abstraction facilities offered by the IS-Client) must comply with its interface (still XQuery based).