Difference between revisions of "Data Transfer Scheduler & Agent components"
Andrea.manzi (Talk | contribs) (→Deployment)
Revision as of 12:12, 24 April 2012
THIS PAGE IS UNDER CONSTRUCTION
Overview
This class of components manages data transfer among gCube infrastructure nodes. In particular, though not exclusively, it handles data transfer between Data Sources and Data Storages, exploiting the interfaces and services implemented under the Data Access and Storage Facilities subsystem.
This document outlines the design rationale, key features, and high-level architecture of these components, the options for their deployment, and some use cases.
Key features
The components belonging to this class are responsible for:
- reliable data transfer between Infrastructure Data Sources and Data Storages
  - by exploiting the uniform access interfaces provided by gCube and standard transfer protocols
- structured and unstructured data transfer
  - both Tree-based and File-based transfers are supported, to cover all possible iMarine use cases
- transfers to local nodes for data staging
  - data staging for particular use cases can be enabled on each node of the infrastructure
- advanced transfer scheduling and transfer optimization
  - a dedicated gCube service is responsible for data transfer scheduling, combined with transfer optimization at the level of protocols and access interfaces
- transfer statistics availability
  - transfers are traced by the system and made available to interested consumers
- transfer shares per scope and user
  - a management interface is used to configure transfer shares per scope and user at the level of Data Sources and Storages
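The page does not define what a transfer share is or how it is enforced. As a minimal sketch, assuming a share is a cap on the number of concurrent transfers for a given scope and user, the bookkeeping behind such a management interface could look like the following. The class `TransferShares`, its methods, and the default limit are hypothetical names chosen for illustration and are not part of the gCube API.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical model of transfer shares per scope and user (not the gCube API). */
final class TransferShares {
    // Assumption: a share is the maximum number of concurrent transfers per "scope|user".
    private final Map<String, Integer> limits = new HashMap<>();
    // Number of transfers currently running per "scope|user".
    private final Map<String, Integer> running = new HashMap<>();
    private final int defaultLimit;

    TransferShares(int defaultLimit) {
        this.defaultLimit = defaultLimit;
    }

    private static String key(String scope, String user) {
        return scope + "|" + user;
    }

    /** Management operation: configure the share for one scope and user. */
    synchronized void setShare(String scope, String user, int maxConcurrent) {
        if (maxConcurrent < 0) {
            throw new IllegalArgumentException("share must be >= 0");
        }
        limits.put(key(scope, user), maxConcurrent);
    }

    /** Reserves a slot and returns true if the share is not exhausted. */
    synchronized boolean tryAcquire(String scope, String user) {
        String k = key(scope, user);
        int limit = limits.getOrDefault(k, defaultLimit);
        int current = running.getOrDefault(k, 0);
        if (current >= limit) {
            return false;
        }
        running.put(k, current + 1);
        return true;
    }

    /** Frees a slot when a transfer finishes; does nothing if none is running. */
    synchronized void release(String scope, String user) {
        running.computeIfPresent(key(scope, user), (k, v) -> v > 1 ? v - 1 : null);
    }
}
```

In this sketch a scheduler would call `tryAcquire` before starting a transfer and `release` when it ends. Other interpretations of a share, such as a bandwidth quota, would need a different model.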
Design
Philosophy
Data transfer on a distributed infrastructure must first of all guarantee transfer reliability and optimize resource usage (minimizing network load without overloading storage). In addition, unlike most existing data transfer solutions, this design has to support not only standard "unstructured" data transfer (file transfer) but also the "structured" data transfer peculiar to the iMarine data infrastructure.
Architecture
The main components forming this class of data transfer facilities are two gCube services plus the related libraries.
- The Data Transfer Scheduler service (gDT Scheduler)
  - The service is responsible for transfer scheduling, delegating the transfer logic to the gDT Agents deployed on the infrastructure. It relies on Messaging to consume transfer results from the gDT Agents. The service has two main port types: one for transfer scheduling and one for managing transfer shares per scope and user.
- The Data Transfer Scheduler DB interface (gDT Scheduler DB interface)
  - A component that models the gDT Scheduler DB, abstracting the particular DB technology underneath.
- The Data Transfer Agent service (gDT Agent)
  - The component implementing the transfer logic. It accesses Data Sources through the interfaces made available by the gCube Data Access facilities and stores the data locally on the infrastructure by relying on Data Storages. It handles several transfer protocols by exploiting the facilities provided by the gCube Result Set components. It relies on Messaging to publish transfer results, which are consumed by the gDT Scheduler. The service is also responsible for data staging to infrastructure nodes.
- The Data Transfer Scheduler Library (gDT Scheduler Lib)
  - The library that clients use to schedule data transfers among Data Sources and Storages. It offers a uniform interface for both structured and unstructured data transfers.
- The Data Transfer Agent Library (gDT Agent Lib)
  - The library that can be used to contact a gDT Agent and stage data on an infrastructure node.
- The Data Transfer widget (gDT Widget)
  - A component that can be integrated into any gCube Portlet to enable data transfer. It relies on both the gDT Scheduler library and the gDT Agent library.
The following diagram depicts the dependencies among the described components:
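The interaction between the scheduler and the agents can be sketched in a few lines of Java. This is only an illustration under stated assumptions: every class and method name below is hypothetical, an in-memory `BlockingQueue` stands in for the Messaging layer, the round-robin choice of agent is an assumption (the page does not say how the scheduler picks an agent), and the actual transfer is replaced by a stub that always reports success.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical sketch of the scheduler/agent interaction (not the gCube API). */
final class GdtSketch {
    /** One request type covers both structured (TREE) and unstructured (FILE) transfers. */
    enum Kind { TREE, FILE }

    static final class TransferRequest {
        final String id;
        final Kind kind;
        final String source;   // identifier of the Data Source (illustrative)
        final String storage;  // identifier of the Data Storage (illustrative)

        TransferRequest(String id, Kind kind, String source, String storage) {
            this.id = id;
            this.kind = kind;
            this.source = source;
            this.storage = storage;
        }
    }

    static final class TransferResult {
        final String transferId;
        final String agentName;
        final boolean success;

        TransferResult(String transferId, String agentName, boolean success) {
            this.transferId = transferId;
            this.agentName = agentName;
            this.success = success;
        }
    }

    /** Stand-in for a gDT Agent: runs the transfer and publishes the result. */
    static final class Agent {
        private final String name;
        private final BlockingQueue<TransferResult> results;

        Agent(String name, BlockingQueue<TransferResult> results) {
            this.name = name;
            this.results = results;
        }

        void execute(TransferRequest request) {
            // A real agent would read from the Data Source and write to the Data Storage here.
            results.add(new TransferResult(request.id, name, true));
        }
    }

    /** Stand-in for the gDT Scheduler: delegates to agents and consumes their results. */
    static final class Scheduler {
        private final List<Agent> agents;
        private final BlockingQueue<TransferResult> results;
        private int next = 0;

        Scheduler(List<Agent> agents, BlockingQueue<TransferResult> results) {
            this.agents = agents;
            this.results = results;
        }

        /** Assumption: agents are chosen in round-robin order. */
        void schedule(TransferRequest request) {
            Agent agent = agents.get(next);
            next = (next + 1) % agents.size();
            agent.execute(request);
        }

        /** Collects all results published so far. */
        List<TransferResult> drainResults() {
            List<TransferResult> out = new ArrayList<>();
            results.drainTo(out);
            return out;
        }
    }
}
```

The point of the sketch is the direction of the calls: clients talk only to the scheduler, the scheduler delegates to an agent, and results flow back through the queue rather than through a direct return value.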
Deployment
The components of the subsystem can be deployed, according to their nature, in different execution environments:
- The gDT Widget can be included in any Web Application, so it is deployable on a Tomcat container.
- The gDT Scheduler and gDT Agent services can be deployed on a gCube-enabled container.
- The gDT Scheduler and gDT Agent libraries can be integrated into other gCube services or run as standalone libraries.
Large Deployment
[Figure: Data Transfer Scheduler & Agent Large Deployment schema]
Small Deployment
figure