Reference Model

From Gcube Wiki
Revision as of 16:35, 13 March 2009 by Leonardo.candela

According to [1] ‘A reference model is an abstract framework for understanding significant relationships among the entities of some environment. It enables the development of specific reference or concrete architectures using consistent standards or specifications supporting that environment. A reference model consists of a minimal set of unifying concepts, axioms and relationships within a particular problem domain, and is independent of specific standards, technologies, implementations, or other concrete details’. This section introduces the concepts characterising D4Science as a system. Before proceeding, it is important to clarify that the D4Science “system” actually comprises two different materialisations of the term: the “living system” users interact with (a.k.a. the D4Science Infrastructure) and the “software system” supporting the deployment and operation of the D4Science Infrastructure (a.k.a. gCube). These systems are closely related: the software system is the enabling technology of the living system. This guide focuses on the “software system”, but the concepts presented in this reference model characterise both “systems”.

The concepts constituting the D4Science Reference Model are organised in different domains: the resource domain represents the entities managed by the gCube system; the user domain represents the entities, both human and inanimate, interacting with the system; the policy domain represents the rules governing the system behaviour and functionality; the architecture domain represents the concepts and relations needed to characterise a gCube-based architecture; the VRE domain represents the concepts and relations needed to characterise a Virtual Research Environment.

Resource Domain

The gCube system is a software system conceived to manage an infrastructure consisting of a set of heterogeneous entities.

All such heterogeneous resources share some commonalities (gCubeResource):

  • Each gCube resource has a unique identifier (ID);
  • Each gCube resource has a type (Type), which characterises the role the resource plays in the infrastructure (e.g. Collection, gLite Resource, Running Instance);
  • Each gCube resource has multiple scopes (Scopes), which characterise the contexts (VO/VRE) in which the resource is supposed to operate;
  • Each gCube resource has a profile (Profile) to capture the distinguishing features of the resource to support resource discovery and usage.
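
The four commonalities above can be pictured as a small data structure. The following Python sketch is only an illustration and not gCube code: the class name `GCubeResource`, its field names, the `discover` helper and the scope strings used in the usage note are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from uuid import uuid4


@dataclass
class GCubeResource:
    """Illustrative record holding the four properties shared by every resource."""
    resource_type: str                              # role in the infrastructure, e.g. "Collection"
    profile: dict = field(default_factory=dict)     # distinguishing features used for discovery
    scopes: set = field(default_factory=set)        # VO/VRE contexts the resource operates in
    resource_id: str = field(default_factory=lambda: str(uuid4()))  # unique identifier

    def in_scope(self, scope: str) -> bool:
        """True if the resource is supposed to operate in the given VO/VRE."""
        return scope in self.scopes


def discover(resources, scope, **features):
    """Return the resources visible in `scope` whose profile matches every given feature."""
    return [
        r for r in resources
        if r.in_scope(scope)
        and all(r.profile.get(key) == value for key, value in features.items())
    ]
```

For instance, `discover(resources, "/vo/vre1", name="Fish")` would return only the resources that operate in the hypothetical scope `/vo/vre1` and whose profile carries `name == "Fish"`.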

Figure 1 presents the class diagram capturing this domain.

Figure 1. D4Science Resource Model

Two abstract classes characterise this domain: the SoftwareResource and the SystemResource. The former captures the resources forming the software managed by the gCube system; the latter captures the rest of the resources managed by the gCube system, e.g. the hosting nodes and the running services. This distinction is justified by one of the distinguishing features of gCube, i.e. the capability to dynamically deploy software components to produce new resources by relying on other existing resources. Thus, the software forming gCube becomes itself a resource managed through gCube.

As for SoftwareResources, the following resource typologies exist:

  • Service, a SoftwareResource delivering its expected functionality through a web-based interface. In a Service Oriented Architecture (SOA) it is a constituent unit of the system. gCube exploits the SOA paradigm and implements it by relying on the WSRF framework. Each service comprises a piece of software (ServiceLogic) implementing the service-specific business logic and zero or more SoftwareLibrary items acting as helper software implementing non-service-specific logic, i.e. pieces of software implementing general-purpose functions, e.g. XML parsing functions.
  • SoftwareLibrary, a SoftwareResource delivering its expected functionality in a stand-alone manner via a programmatic API. It is important to model such a piece of code as a resource in order to promote reuse.
  • ThirdPartyLibrary, a SoftwareLibrary delivering its expected functionality in a stand-alone manner via a programmatic API. This specialisation of SoftwareLibrary is needed to capture the peculiarity of such software at deployment time, i.e. the fact that such a piece of software has its own deployment procedures.

As for SystemResources, the following resource typologies exist:

  • gLiteResource, a SystemResource representing a gLite resource, i.e. a placeholder in the gCube infrastructure for a resource forming a gLite-based infrastructure. It is further specialised into gLiteService, gLiteCE and gLiteSE to capture the main types a gLiteResource can be.
  • gHN, a SystemResource representing the hosting machine on which gCube can dynamically deploy a Service (along with all the needed SoftwareLibraries) to create a RunningInstance.
  • RunningInstance, a SystemResource representing a Service deployed in a gHN. It is the runtime manifestation of a Service and consequently the runtime implementation of the expected Service facilities.
  • ExternalRunningInstance, a RunningInstance representing an instance of a Service running outside the direct control and management of gCube, i.e. (re-)deployment of such a Service is not allowed since gCube does not manage the Service. An example of this kind of RunningInstance is an up-and-running Web Service, e.g. one of the services forming the G-POD application[1], whose facilities are needed in a VRE.
  • ApplicationSpecificResource, a SystemResource representing a resource created and managed by a specific Service, e.g. a Collection managed by the CollectionService, a TransformationProgram managed by a MetadataBroker Service.
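
The typologies listed above form a class hierarchy, and the relation between Service, gHN and RunningInstance can be read as a deployment operation. The Python sketch below is a minimal rendering of that reading, not the gCube implementation: the `abstract` marker, the `managed` flag, the `deploy` method and the `endpoint` attribute are hypothetical names introduced for illustration.

```python
class GCubeResource:
    """Root of the hierarchy; classes that set `abstract` cannot be instantiated."""
    abstract = True

    def __init__(self, name):
        # Only a class that declares `abstract` in its own body is abstract.
        if type(self).__dict__.get("abstract", False):
            raise TypeError(f"{type(self).__name__} is abstract")
        self.name = name


class SoftwareResource(GCubeResource):
    abstract = True


class SystemResource(GCubeResource):
    abstract = True


class SoftwareLibrary(SoftwareResource):
    """Stand-alone software exposing a programmatic API."""


class ThirdPartyLibrary(SoftwareLibrary):
    """A library that comes with its own deployment procedure."""


class Service(SoftwareResource):
    """Service logic plus zero or more helper libraries."""

    def __init__(self, name, libraries=()):
        super().__init__(name)
        self.libraries = list(libraries)


class GLiteResource(SystemResource):
    """Placeholder for a gLite service, CE or SE."""


class ApplicationSpecificResource(SystemResource):
    """Resource created and managed by a specific Service, e.g. a Collection."""


class RunningInstance(SystemResource):
    """Runtime manifestation of a Service deployed on a gHN."""
    managed = True

    def __init__(self, service, ghn):
        super().__init__(service.name)
        self.service, self.ghn = service, ghn


class ExternalRunningInstance(RunningInstance):
    """A service running outside gCube's control; it cannot be (re-)deployed."""
    managed = False

    def __init__(self, name, endpoint):
        SystemResource.__init__(self, name)
        self.service, self.ghn, self.endpoint = None, None, endpoint


class GHN(SystemResource):
    """Hosting node on which Services are dynamically deployed."""

    def __init__(self, name):
        super().__init__(name)
        self.instances = []

    def deploy(self, service):
        """Deploy a Service (with its libraries) and return the new RunningInstance."""
        instance = RunningInstance(service, self)
        self.instances.append(instance)
        return instance
```

With these definitions, `GHN("node-1").deploy(Service("CollectionService"))` yields a RunningInstance bound to both the node and the Service, while an ExternalRunningInstance carries no gHN because gCube does not manage its deployment.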

User Domain

In gCube the term “user” identifies entities entitled to interact with the infrastructure. This definition includes not only human users (external to the infrastructure) but also services (part of the infrastructure) acting autonomously in the system. Batch services (e.g. monitoring services) are an example of such users: they interact with other entities either in reaction to status changes or on a temporal basis. Thus, some gCube services need to be identified and managed as users from the authentication and authorization point of view.

The gCube architecture exploits the Public Key Infrastructure (PKI) paradigm to uniquely identify users in the infrastructure. gCube users are provided with a private key and a public certificate that can be used to authenticate to other entities. Private keys and public certificates are issued to users, and revoked, by a trusted entity named Certification Authority (CA). The gCube infrastructure is designed to support CAs acknowledged by the International Grid Trust Federation (IGTF) as well as an infrastructure-specific CA, named D4Science CA.

VO-Management components provide functionalities to manage gCube users and their credentials.

Policy Domain

In gCube, the Virtual Organization (VO) concept is used to define authorization policies in the infrastructure. A Virtual Organization is a dynamic pool of distributed resources shared by dynamic sets of users from one or more real organizations. Resource Providers (RP) usually make resources available to other parties under certain sharing rules. Users are allowed to use resources under the conditions set by the Resource Provider and in compliance with a set of VO policies.

Following this approach, in the gCube VO model a policy is defined as a permission for a user to perform an operation on a specific resource. In the model, resources are uniquely identified through a resource id and must belong to a resource type. Each resource type is associated with a set of logical operations, which can be performed on resources of that type. It is worth noting that operations in the VO model are just identifiers for logical operations that can be performed on resources (e.g. read, modify, delete). They do not necessarily identify methods exposed by the resource implementation (e.g. get, put). Logical operations describe what a resource exposes; they map to the actual methods provided by the resource implementation.
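
The definition of a policy as a (user, operation, resource) permission, constrained by the operations of the resource type, can be sketched as follows. This is an illustrative Python fragment under stated assumptions: the `OPERATIONS` and `METHODS` tables, the pairing of `read` with `get` and `modify` with `put`, and all function names are invented for the example and are not part of the gCube VO model specification.

```python
from typing import NamedTuple

# Logical operations defined for each resource type (identifiers only).
OPERATIONS = {"Collection": {"read", "modify", "delete"}}

# Assumed mapping from logical operations to implementation methods.
METHODS = {("Collection", "read"): "get", ("Collection", "modify"): "put"}


class Resource(NamedTuple):
    resource_id: str
    resource_type: str


class Policy(NamedTuple):
    user: str
    operation: str
    resource_id: str


def make_policy(user, operation, resource):
    """Create a policy, rejecting operations the resource type does not define."""
    if operation not in OPERATIONS.get(resource.resource_type, set()):
        raise ValueError(
            f"{operation!r} is not a logical operation of {resource.resource_type}"
        )
    return Policy(user, operation, resource.resource_id)


def is_permitted(policies, user, operation, resource):
    """True if a matching permission exists for this user, operation and resource."""
    return Policy(user, operation, resource.resource_id) in policies
```

Keeping `METHODS` separate from `OPERATIONS` mirrors the point made above: policies are written against logical operations, and the translation to concrete methods is a concern of the resource implementation.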

The gCube VO model also leverages the concept of role, as introduced by the RBAC model, to decouple the association between users and permissions. Furthermore, roles are organized in hierarchies, which offer a natural way to capture organizational lines of authority and responsibility. Role hierarchies are not constrained to be trees: each role can have several ancestors, with the only constraint that cycles are not allowed in the structure.
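
A role hierarchy in which a role may have several ancestors but cycles are forbidden is a directed acyclic graph. The Python sketch below shows one way to enforce that constraint and to resolve permissions through the hierarchy. It assumes that a role inherits the permissions granted to its ancestors; the text above does not fix the direction of inheritance, and the class and method names are hypothetical.

```python
class RoleHierarchy:
    """Roles arranged in a directed acyclic graph, with permissions attached to roles."""

    def __init__(self):
        self.parents = {}       # role -> set of direct ancestor roles
        self.permissions = {}   # role -> set of (operation, resource_id)

    def add_role(self, role, parents=()):
        self.parents.setdefault(role, set())
        self.permissions.setdefault(role, set())
        for parent in parents:
            self.add_parent(role, parent)

    def ancestors(self, role):
        """All roles reachable through parent links, excluding `role` itself."""
        seen, stack = set(), list(self.parents.get(role, ()))
        while stack:
            current = stack.pop()
            if current not in seen:
                seen.add(current)
                stack.extend(self.parents.get(current, ()))
        return seen

    def add_parent(self, role, parent):
        """Link `role` to `parent`, refusing any link that would close a cycle."""
        if parent == role or role in self.ancestors(parent):
            raise ValueError(f"linking {role!r} to {parent!r} would create a cycle")
        self.parents.setdefault(parent, set())
        self.parents.setdefault(role, set()).add(parent)

    def grant(self, role, operation, resource_id):
        self.permissions.setdefault(role, set()).add((operation, resource_id))

    def effective_permissions(self, role):
        """Permissions of the role plus those of all its ancestors (assumed semantics)."""
        roles = {role} | self.ancestors(role)
        return set().union(*(self.permissions.get(r, set()) for r in roles))

    def is_permitted(self, user_roles, operation, resource_id):
        """True if any role held by the user carries the requested permission."""
        return any(
            (operation, resource_id) in self.effective_permissions(r)
            for r in user_roles
        )
```

Because users are associated with roles rather than with permissions directly, changing what a role may do updates every user holding that role, or a role derived from it, without touching the user records.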

The gCube VO model is shown in Figure 2.

Figure 2. The gCube VO Model

The model takes into account two different actors managing policies over resources:

  • Resource Providers – users defining sharing rules, i.e. policies for accessing resources that apply to a Virtual Organization as a whole;
  • VO Managers – users defining permissions, i.e. policies for accessing resources that apply to individual VO members. Permissions have to comply with the sharing rules defined by Resource Providers for a specific VO.
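
The interplay between the two actors can be illustrated with a small Python sketch in which a permission is accepted only if a sharing rule already covers it. It is a simplification under stated assumptions: sharing rules are reduced to (operation, resource id) pairs valid for the whole VO, and the class and method names are hypothetical.

```python
class VirtualOrganization:
    """Holds the sharing rules set by Resource Providers and the permissions set by VO Managers."""

    def __init__(self, name):
        self.name = name
        self.sharing_rules = set()   # (operation, resource_id) granted to the VO as a whole
        self.permissions = set()     # (member, operation, resource_id)

    def share(self, operation, resource_id):
        """Resource Provider action: allow the operation on the resource for this VO."""
        self.sharing_rules.add((operation, resource_id))

    def grant(self, member, operation, resource_id):
        """VO Manager action: the permission must comply with an existing sharing rule."""
        if (operation, resource_id) not in self.sharing_rules:
            raise ValueError(
                f"{operation!r} on {resource_id!r} is not shared with VO {self.name!r}"
            )
        self.permissions.add((member, operation, resource_id))

    def is_permitted(self, member, operation, resource_id):
        """A member may act only if both the sharing rule and the permission hold."""
        return (
            (operation, resource_id) in self.sharing_rules
            and (member, operation, resource_id) in self.permissions
        )
```

Checking the sharing rule again in `is_permitted` is a design choice of this sketch: it keeps a permission from outliving a sharing rule that a Resource Provider later withdraws.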

Architecture Domain

The gCube system relies on a component-oriented architectural model which subsumes a ‘component-based approach’, i.e. a kind of application development in which:

  • The system is assembled from discrete executable components, which are developed and deployed somewhat independently of one another, and potentially by different players.
  • The system may be upgraded with smaller increments, i.e. by upgrading some of the constituent components only. In particular, this aspect is one of the key points for achieving interoperability, as upgrading the appropriate constituents of a system enables it to interact with other systems.
  • Components may be shared by systems; this creates opportunities for reuse, which contributes significantly to lowering the development and maintenance costs and the time to market.
  • Though not strictly related to their being component-based, component-based systems tend to be distributed.

The components characterising a gCube-based system from an architectural point of view have been introduced in the Resource Domain section: all the gCubeResource types, with the exception of the ApplicationSpecificResource, are considered first-class citizens in a component-oriented architecture. However, the Architecture Domain itself can be described through different, focused views, each capturing a different facet of this multifaceted domain. Each of these views potentially uses a subset of the architectural components. For instance, if the goal of the view is to describe the current running instance of the gCube-based Infrastructure serving D4Science, the concepts primarily used are the notions of gHN (to capture the machines implementing the Infrastructure), RunningInstance and ExternalRunningInstance (to capture the running services supporting the operation of the whole), and gLiteResource (to capture the nodes of a gLite-based infrastructure conceptually contributing to the whole). On the other hand, if the goal of the view is to describe the system from a static point of view, i.e. its software constituents, the notions of Service, SoftwareLibrary and ThirdPartyLibrary are used.
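
The two example views described above amount to selecting resources by type. A minimal Python sketch follows, assuming resources are given as (name, type) pairs; the constant and function names, and the sample resource names in the usage note, are hypothetical.

```python
# Resource types used by each architectural view, as listed in the text.
RUNTIME_VIEW = {"gHN", "RunningInstance", "ExternalRunningInstance", "gLiteResource"}
STATIC_VIEW = {"Service", "SoftwareLibrary", "ThirdPartyLibrary"}


def view(resources, types):
    """Return the names of the resources whose type belongs to the given view."""
    return [name for name, resource_type in resources if resource_type in types]
```

Given `[("node-1", "gHN"), ("Search", "Service"), ("fish", "ApplicationSpecificResource")]`, the runtime view keeps only `node-1` and the static view keeps only `Search`; the ApplicationSpecificResource appears in neither, consistent with its exclusion from the architectural components.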

VRE Domain

The notion of Virtual Research Environment is a cornerstone of the whole D4Science Infrastructure.

From a user perspective, a Virtual Research Environment is an integrated and coordinated working environment providing participants with the resources (data, instruments, processing power, communication tools, etc.) they need to accomplish the envisaged task. This environment is expected to be particularly dynamic because of the potential dynamicity of the participating community, in terms of both its constituents and their requirements. In fact, it is envisioned to serve the e-Science vision, which demands the realisation of research environments enabling scientists to collaborate on common research challenges through seamless access to all the resources they need, regardless of their physical location. The resources shared can be of very different natures and vary across application and institutional domains. Usually they include content resources, application services that manipulate these content resources to produce new knowledge, and computational resources, which physically store the content and support the processing of the services.

From a system point of view, a VRE is a pool of gCubeResources dynamically aggregated to behave as a unit with respect to the application context the VRE is expected to serve. Each VRE is a view over the potentially unlimited pool of resources made available through the Infrastructure that (i) is regulated by the needs of the user community and the resource sharing policies and (ii) produces a new VO constraining the scope and usage of resources to which the actors playing in the VRE are subject.
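
The system-level reading of a VRE as a scoped view over a shared pool can be sketched as follows. The Python fragment is illustrative only: it represents the pool as a mapping from resource id to the set of scopes the resource belongs to, it assumes a path-like naming of scopes, and it models the sharing policies by the simplifying assumption that only resources already shared with the parent VO may join the VRE. None of these choices is taken from gCube itself.

```python
def create_vre(name, parent_vo, pool, selected_ids):
    """Aggregate the selected resources under a new scope nested in `parent_vo`.

    `pool` maps each resource id to the set of scopes it belongs to.
    """
    scope = f"{parent_vo}/{name}"   # hypothetical scope naming convention
    # Assumed sharing policy: a resource must already be shared with the parent VO.
    not_shared = [rid for rid in selected_ids if parent_vo not in pool[rid]]
    if not_shared:
        raise ValueError(f"not shared with {parent_vo}: {not_shared}")
    for rid in selected_ids:
        pool[rid].add(scope)
    return scope


def resources_in(pool, scope):
    """The view a VRE offers: the ids of the resources carrying its scope."""
    return sorted(rid for rid, scopes in pool.items() if scope in scopes)
```

Validating the whole selection before touching the pool means a rejected request leaves every resource's scopes unchanged.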

Notes

  1. MacKenzie, M.; Laskey, K.; McCabe, F.; Brown, P.; Metz, R. Reference Model for Service Oriented Architecture 1.0. OASIS Committee Specification, August 2006