Resource Management Specification

Overview

Resource Management focuses on the optimal exploitation of software and hardware resources. Software resources range from packages ready to be deployed to already activated services, while hardware resources model the platforms that host services and can be exploited by the e-infrastructure.

Resource Management covers the entire lifecycle of such resources through a set of services and libraries. This document outlines their design rationale, key features, and high-level architecture as well as the opportunities they collectively offer.

Key features

Dynamic Deployment
First-class features for deploying gCube software and external software packaged according to the gCube policies; support for deploying software hosted on recognized Maven repositories
Optimal Resource Allocation
Optimal selection and usage of resources during the deployment phase
Monitoring
Tools for proactively monitoring the current state of nodes and active services. Local information is collected and published in the Information System as well as returned on demand
Easy Remote Administration
Remote reconfiguration of nodes and active services according to the needs of the e-infrastructure and its managers
Failover capabilities
Self-recovery from erroneous states
Extensible Bridging over Virtual Platforms
A flexible model for transparently interfacing a potentially unlimited number of hosting environments
Maven Integration
Any software available in a Maven repository is eligible to become part of the Data e-Infrastructure
Coordination and Elastic Management
New services can be fired up on demand and through self-government
Cloud-orientation
Gateway to Cloud platforms with Java support, looking ahead to the proposed Topology and Orchestration Specification for Cloud Applications (TOSCA) standard
Scalability
e-Infrastructures composed of thousands of nodes and software components can be managed concurrently

Design

Philosophy

The Data e-Infrastructure requires dynamic allocation of services and nodes and a strong capability to adapt to changes occurring during its operation. The components of the Resource Management subsystem address these requirements by providing a robust foundation for the whole lifetime of the infrastructure.

Software takes different forms at different levels, ranging from a tarball of binaries to an activated instance hosted on a platform. Resource Management has to be able to deal with all of these forms, and the designed architecture reflects this role.

Different components have different perceptions of the resources to manage. Some components operate at a logical level (VO and VRE) and are concerned only with optimizing the functionality delivered by the VO/VRE, replicating services when they see the need. These components do not care about the physical form of the software and do not need to change when that form changes. Other components work at service level (the constituents of the VO/VRE) and know how to optimize the allocation of services. Finally, some components operate at local level (node) and manage only locally deployed software.

Other principles on which the design of the presented architecture is built are:

  • components must be generic with respect to the managed resources
  • fault recovery is a crucial feature: nodes and services must be guaranteed to stay in a consistent state under any condition
  • external software has to be managed, as transparently as possible, in the same way as gCube software

Architecture

Resource Management Architecture
  • VRE-Creation-Wizard: this graphical wizard guides a VRE-manager in the creation of a VRE in a step-by-step fashion
  • VRE-Modeler: this service receives the high-level requirements and criteria produced by the wizard and translates them into a concrete set of services to deploy or join
  • Resource-Management-Portlet: integrated in the portal and accessible to authorized users, this portlet exposes a graphical visualization of the current state of the resources belonging to the e-infrastructure and provides instruments for corrective and management actions (such as restarting a node, adding/removing nodes and services to/from a VO/VRE, etc.)
  • Resource-Manager: this service sits on top of many other services and coordinates them to satisfy the VRE-Modeler requests. More specifically, it manages the resources belonging to the Scope and creates new service instances within it. It interacts with the Software Gateway (for dependency resolution), the Resource Broker (for allocating services on gHNs), the Deployer (for deploying services), the gHN Manager (for managing a node's scopes) and the Information System (for publishing and retrieving resource profiles)
  • Resource-Broker: this service promotes and supports the optimal selection and usage of resources during the deployment phase. In particular, it is invoked to select the most appropriate pool of gHNs to be used, among those available in the context of the deployment operation
  • Resource-Broker-Serialization: a collection of structures and definitions needed to (de-)serialize data managed by the Resource-Broker
  • Software-Gateway: a gateway over a cluster of Maven Repositories granting access to the stored information and software for deployment purposes
  • gHN-Manager: a service providing operations for the remote management of a node
  • Deployer: a service providing facilities for deploying, managing and un-deploying software components on a node
  • Virtual-Platform: a library with classes and interfaces to be extended/implemented to bridge and manage the lifecycle of software running on a different platform than gCore
  • Virtual-Platform for Tomcat6: an implementation of the Virtual-Platform model to manage Web Applications on Apache Tomcat 6.x
  • RainyCloud: an interface to the VENUS-C platform
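
The way the Resource-Manager coordinates the Software-Gateway, the Resource-Broker and the Deployer can be sketched as follows. This is only an illustration of the roles listed above, written in Python for brevity; it is not the gCube API. Every name in it (GHN, DEPENDENCIES, resolve_dependencies, select_ghns, deploy_service) is hypothetical, the collaborating services are replaced by in-memory stand-ins, and ranking nodes by free memory is an assumed selection criterion that the specification does not state.

```python
from dataclasses import dataclass, field

# All names below are hypothetical; they only mirror the roles described above.

@dataclass
class GHN:
    name: str
    free_memory_mb: int                      # assumed ranking criterion
    deployed: list = field(default_factory=list)

# Stand-in for the Software-Gateway: maps a service to the packages it needs.
DEPENDENCIES = {
    "search-service": ["search-service", "index-lib"],
}

def resolve_dependencies(service):
    """Software-Gateway role: return the packages to deploy for a service."""
    return DEPENDENCIES.get(service, [service])

def select_ghns(candidates, count):
    """Resource-Broker role: pick the `count` nodes with the most free memory."""
    ranked = sorted(candidates, key=lambda n: n.free_memory_mb, reverse=True)
    if len(ranked) < count:
        raise ValueError("not enough gHNs available in this scope")
    return ranked[:count]

def deploy_service(service, scope_nodes, instances=1):
    """Resource-Manager role: coordinate gateway, broker and deployer."""
    packages = resolve_dependencies(service)
    targets = select_ghns(scope_nodes, instances)
    for node in targets:
        # Deployer role, reduced here to simple bookkeeping on the node.
        node.deployed.extend(packages)
    return [node.name for node in targets]

nodes = [GHN("ghn-a", 512), GHN("ghn-b", 2048), GHN("ghn-c", 1024)]
print(deploy_service("search-service", nodes, instances=2))
```

In this sketch the broker is a pure function over the candidate nodes, so a different selection policy can be swapped in without touching the coordination logic.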

Deployment

Because of their nature and responsibilities, Resource Management components have an almost fixed deployment scheme, with a few variations.

Large deployment

VRE creation has its natural place inside a VO. Therefore, the VRE-Wizard, VRE-Modeler and Resource Broker, which each play a role in VRE creation, must be deployed in every VO that aims to support dynamic VREs.

Deploying the Resource-Management-Portlet at VO level is also recommended. A VO view is narrower than a view of the whole infrastructure and can be administered more effectively by VO managers.

Each organized subset of resources (VO or VRE) then requires exactly one instance of the Resource Manager, which is in charge of moving resources into and out of the subset.

The Software Gateway is an entry point for Maven Repositories, hence one instance at infrastructure level is enough.

Finally, the components working at Package/Node level (gHN-Manager, Deployer, Virtual Platforms) are best deployed on each gHN supporting their respective platform (gCore, Tomcat, VENUS-C or future bridged platforms). This way, the functionality they expose is available on every node, which maximizes the possible targets for deployments, replica creation and balancing, whether decided by managers or by the self-government of the system.

As a general constraint, all instances at VO level must be statically deployed, while those at VRE level can be dynamically deployed (by their parents at VO level). No ordering restriction applies to the deployment sequence.
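
Two of the constraints above (exactly one Resource Manager per VO or VRE, and static deployment for every VO-level instance) lend themselves to a mechanical check. The following is a minimal sketch under stated assumptions: the plan format, a list of (component, scope, level, mode) tuples, and the scope names are invented for illustration and do not correspond to any gCube profile or configuration file.

```python
from collections import Counter

# Hypothetical plan entries: (component, scope, level, mode)
#   level is "VO" or "VRE"; mode is "static" or "dynamic".

def check_plan(plan):
    """Return a list of violations of the two constraints described above."""
    problems = []

    # Constraint 1: each VO or VRE has exactly one Resource-Manager.
    managers = Counter(scope for comp, scope, _, _ in plan
                       if comp == "Resource-Manager")
    scopes = sorted({scope for _, scope, _, _ in plan})
    for scope in scopes:
        if managers[scope] != 1:
            problems.append(f"{scope}: expected exactly 1 Resource-Manager, "
                            f"found {managers[scope]}")

    # Constraint 2: instances at VO level must be statically deployed.
    for comp, scope, level, mode in plan:
        if level == "VO" and mode != "static":
            problems.append(f"{scope}: {comp} at VO level must be static")

    return problems

plan = [
    ("Resource-Manager", "vo1", "VO", "static"),
    ("Resource-Broker", "vo1", "VO", "dynamic"),        # violates constraint 2
    ("Resource-Manager", "vo1/vre1", "VRE", "dynamic"),
    ("Resource-Manager", "vo1/vre1", "VRE", "dynamic"),  # violates constraint 1
]
for problem in check_plan(plan):
    print(problem)
```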

Typical Deployment Scenario

Small deployment

There are a few variations of the optimal scheme presented above. They can be adopted in specific situations where a subset of the management functions is not required or not supported.

No support for dynamic deployment

If a VO does not want to support dynamic deployment, several services may be excluded from the deployment scheme:
  • Deployers do not need to stay on the nodes, since nothing will ever be deployed there by the system (manual deployments are always possible)
  • VRE-Wizard, VRE-Modeler, Software Gateway and Resource Broker lose their roles

Limited or no Web Applications available

If a VO does not have any Web Application to deploy, Virtual Platforms for Tomcat do not need to be deployed. If the VO has Web Applications that do not support replication or distribution, the same cardinality of nodes supporting the Tomcat platform should be made available.

VENUS-C interoperability

RainyCloud provides an interface to the VENUS-C platform. If this is not exploited by the current configuration of a VO, there is no point in having it deployed.
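
The three variations above amount to a small decision table. The sketch below encodes it as a function that returns the components to deploy for a VO. The function name, its flags, and the choice of Resource-Manager and gHN-Manager as the baseline set are assumptions made for illustration; the specification does not define such a procedure.

```python
def components_to_deploy(dynamic_deployment=True, web_applications=True,
                         venus_c=True):
    """Return the component names to deploy, following the variations above."""
    # Assumed baseline: components no variation above makes optional.
    components = {"Resource-Manager", "gHN-Manager"}

    if dynamic_deployment:
        # Without dynamic deployment these lose their roles.
        components |= {"VRE-Wizard", "VRE-Modeler", "Resource-Broker",
                       "Software-Gateway", "Deployer"}
    if web_applications:
        components.add("Virtual-Platform for Tomcat6")
    if venus_c:
        components.add("RainyCloud")

    return sorted(components)

# A VO with no dynamic deployment, no Web Applications and no VENUS-C usage.
print(components_to_deploy(dynamic_deployment=False,
                           web_applications=False,
                           venus_c=False))
```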

Use Cases

Resource Management has been conceived to serve a number of use cases. This section collects these cases.

Well suited Use Cases

Virtual Research Environment creation is the use case best served by Resource Management. Within a few clicks on the VRE-Wizard, even a non-skilled user is able to create a fully functional environment by aggregating existing resources or creating new ones according to their needs. Such a complex activity exploits all the key features of the subsystem and gives the most tangible demonstration of its capabilities.

Moreover, one-shot deployment and undeployment operations (through the Resource Management Portlet) are among the most common tasks requested and satisfied.

The supervision of a VO is also a task accomplished by the subsystem. Following the VO manager's input and creation criteria, resources are moved in and out of the VO and continuously monitored.

Controlling Web Applications on the Tomcat6 platform is another well-served use case. This case proves the quality of the work done in the Virtual Platform layer. With a few extensions to the provided classes, it has been possible to manage Web Applications (packaged as WAR, a format unsupported until then) on a new hosting platform, greatly widening the hosting capabilities of the e-infrastructure to a broader class of supported applications.
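
The extension mechanism described above can be pictured as an abstract contract plus one concrete bridge per platform. The sketch below is illustrative only: it is written in Python, and the class names (VirtualPlatform, InMemoryTomcatPlatform) and method names (deploy, activate, undeploy) are hypothetical rather than taken from the Virtual-Platform library. The Tomcat bridge is simulated in memory and does not talk to a real Tomcat instance.

```python
from abc import ABC, abstractmethod

class VirtualPlatform(ABC):
    """Hypothetical contract that a platform bridge would implement."""

    @abstractmethod
    def deploy(self, package):
        """Install a package and return the name of the deployed application."""

    @abstractmethod
    def activate(self, name):
        """Start a previously deployed application."""

    @abstractmethod
    def undeploy(self, name):
        """Remove an application from the platform."""

class InMemoryTomcatPlatform(VirtualPlatform):
    """Simulated bridge for WAR packages; keeps application state in a dict."""

    def __init__(self):
        self.apps = {}

    def deploy(self, package):
        if not package.endswith(".war"):
            raise ValueError("this platform only accepts WAR packages")
        name = package[:-len(".war")]
        self.apps[name] = "deployed"
        return name

    def activate(self, name):
        if name not in self.apps:
            raise KeyError(name)
        self.apps[name] = "active"

    def undeploy(self, name):
        del self.apps[name]

platform = InMemoryTomcatPlatform()
app = platform.deploy("portal.war")
platform.activate(app)
print(platform.apps)
```

Because callers depend only on the abstract contract, supporting a further hosting environment means adding another subclass, which is the kind of extensibility the use case above credits to the Virtual Platform layer.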

Less well suited Use Cases

Hot deployment (adding new components to a running server without stopping and restarting the application server process) is not possible on the gCore platform, which does not support it. However, the subsystem itself is able to handle hot deployment, and the feature is available on the new platforms the system targets (such as Tomcat 6) or will target next.