Execution Engine Specification
Latest revision as of 21:43, 30 April 2012
Overview
The Execution Engine is designed to execute arbitrarily complex Execution Plans. An Execution Plan describes the invocation of code components (also called invocables, e.g. services, binary executables, scripts, …) and, by defining the flow of data and/or control, ensures that prerequisite data are prepared and delivered to their consumers. The initial Execution Plans submitted to the Execution Engine originate from a WorkflowEngine instance. In addition, because the Execution Engine supports distributed execution, it can forward subplans of its initial Execution Plan to other Execution Engine instances. In this way, one can execute any kind of workflow on top of a distributed computational infrastructure.
When the Workflow Engine is acting in the context of the gCube platform, it provides a gCube-compliant Web Service interface. This Web Service is not only the front end to the Execution facilities but also the "face" of the component with respect to the gCube platform. The Running Instance profile of the service is the placeholder where the Execution Environment Providers of the underlying Execution Engine instance push information that needs to be made available to other engine instances. Configuration properties used throughout the Workflow Engine instance are retrieved by the appropriate technology-specific constructs and are used to initialise services and providers once the service starts.
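The idea that a plan orders invocations so that prerequisite data reach their consumers can be illustrated with a minimal sketch. It is not the Execution Engine's plan language: the function `run_plan`, the dictionary layout and the three step names are hypothetical, and the ordering simply relies on Python's standard `graphlib` module.

```python
from graphlib import TopologicalSorter


def run_plan(invocables, dependencies):
    """Run each invocable after all of its prerequisites have produced data.

    invocables:   name -> callable taking a dict of prerequisite outputs
    dependencies: name -> set of names whose outputs the step consumes
    (Both structures are illustrative assumptions, not the real plan format.)
    """
    results = {}
    order = []
    # static_order() yields every node after all of its predecessors.
    for name in TopologicalSorter(dependencies).static_order():
        inputs = {dep: results[dep] for dep in dependencies.get(name, ())}
        results[name] = invocables[name](inputs)
        order.append(name)
    return results, order


# Hypothetical three-step plan: fetch -> transform -> store
invocables = {
    "fetch": lambda inputs: [1, 2, 3],
    "transform": lambda inputs: [x * 2 for x in inputs["fetch"]],
    "store": lambda inputs: len(inputs["transform"]),
}
dependencies = {
    "fetch": set(),
    "transform": {"fetch"},
    "store": {"transform"},
}
results, order = run_plan(invocables, dependencies)
```

Here the data flow alone determines the execution order; a real plan would also carry control-flow constructs and could hand subplans to other engine instances.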
Key features
- Control and monitoring of a processing flow execution.
- The Execution Engine provides progress reporting and control constructs, based on an event mechanism, for tasks running on both the D4Science infrastructure and external infrastructures.
- Handling of data streaming among computational elements.
- PE2ng exploits the high-throughput, point-to-point, on-demand communication facilities offered by gRS2.
- Expressive and powerful execution plan language
- The execution plan elements that make up the language can invoke virtually any kind of executable. In addition, the Execution Engine is technology-agnostic with respect to the components it invokes: it handles in the same uniform manner SOAP Web Services and WSRF services, HTTP APIs (RESTful WS), various executables (including shell scripts), Java objects, etc.
- Multiple ways of invoking executables
- In-process: ultra-high performance, no security boundary crossing, low need for data exchanges
- Inter-process: high throughput and performance, local security boundaries crossed
- Inter-node: low throughput (depending on network), organisational security boundaries crossed
- Advanced error handling support through contingency reaction
- Each Execution Plan element which invokes executables can be annotated with contingency reaction triggers.
- Unbound extensibility via providers for integration with different environments.
- The system is designed in an extensible manner, allowing the transparent integration with a variety of providers for storage, resource registries, reporting systems, etc.
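Two of the features above, event-based progress reporting and contingency reaction triggers, can be sketched together. The class `PlanElement`, its `retries` parameter and the event names are hypothetical illustrations; the source does not specify which contingency reactions exist, so a simple retry is assumed here.

```python
class PlanElement:
    """Hypothetical plan element wrapping a single invocable."""

    def __init__(self, name, invocable, retries=0, listeners=None):
        self.name = name
        self.invocable = invocable
        self.retries = retries            # assumed contingency reaction: retry
        self.listeners = listeners or []  # callables receiving progress events

    def _emit(self, event, detail=None):
        for listener in self.listeners:
            listener(self.name, event, detail)

    def execute(self):
        attempts = self.retries + 1
        for attempt in range(1, attempts + 1):
            self._emit("started", attempt)
            try:
                result = self.invocable()
            except Exception as error:
                self._emit("failed", repr(error))
                if attempt == attempts:
                    raise  # contingency exhausted, propagate the error
            else:
                self._emit("completed", result)
                return result


# An invocable that fails once and then succeeds.
calls = {"count": 0}


def flaky():
    calls["count"] += 1
    if calls["count"] == 1:
        raise RuntimeError("transient failure")
    return "ok"


events = []
element = PlanElement(
    "flaky-step",
    flaky,
    retries=1,
    listeners=[lambda name, event, detail: events.append((name, event))],
)
outcome = element.execute()
```

The listener sees the failed attempt as well as the successful retry, which is the kind of information a monitoring client would consume.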
Design
Philosophy
The Execution Engine is designed to support an expressive, feature-rich workflow language. It aims to enable the execution of arbitrarily complex workflows of any kind by offering a wide array of constructs, the Execution Plan Elements, which can be used either to invoke any kind of executable or to group collections of elements into execution flow structures. Because the Execution Engine handles all such constructs uniformly, they can be composed into arbitrarily complex workflows.
As a constituent part of PE2ng, the Execution Engine is designed with a layered architecture that decouples the business domain, the infrastructure-specific logic and the core execution functionality. This allows the core to be reused across a multitude of use cases and avoids the sub-optimal compromises of strictly agnostic solutions.
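The uniform handling of Execution Plan Elements described in this section resembles the composite pattern: elements that invoke an executable and elements that group other elements share one interface. The sketch below assumes that reading; the class names `Invoke`, `Sequence` and `Flow` are hypothetical and are not taken from the actual element set.

```python
from concurrent.futures import ThreadPoolExecutor


class Invoke:
    """Leaf element: calls one invocable with the shared context."""

    def __init__(self, invocable):
        self.invocable = invocable

    def execute(self, context):
        return self.invocable(context)


class Sequence:
    """Grouping element: runs its children one after another."""

    def __init__(self, *elements):
        self.elements = elements

    def execute(self, context):
        return [element.execute(context) for element in self.elements]


class Flow:
    """Grouping element: runs its children concurrently."""

    def __init__(self, *elements):
        self.elements = elements

    def execute(self, context):
        with ThreadPoolExecutor() as pool:
            futures = [pool.submit(e.execute, context) for e in self.elements]
            # Results are collected in submission order, not completion order.
            return [future.result() for future in futures]


# Groups nest freely because every element exposes the same execute() method.
plan = Sequence(
    Invoke(lambda ctx: ctx["x"] + 1),
    Flow(
        Invoke(lambda ctx: ctx["x"] * 2),
        Invoke(lambda ctx: ctx["x"] * 3),
    ),
)
result = plan.execute({"x": 2})
```

Since a group is itself an element, plans of arbitrary depth need no special handling in the code that runs them.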
Architecture
The Execution Engine comprises a single component whose internal architecture mirrors the constructs it provides. These constructs can be grouped as follows:
- Execution Elements
- Data Types
- Events
- Contingencies
Deployment
The Execution Engine, in its service-wrapped version, should be deployed on:
- Each node that should participate in the execution of Execution Plans of local or remote origin.
- Each node that is intended to act as a gateway to external infrastructures.
Large deployment
When demand for computational power is high, the Execution Engine should be deployed on as many nodes as possible, so that the Workflow Engine instances that contact it can reach a large number of nodes and distribute the computational load evenly across the infrastructure.
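As a rough illustration of spreading load evenly over many nodes, the sketch below assigns subplans to nodes in round-robin order. Round-robin is an assumption made for this example only; the source does not state which distribution policy the Workflow Engine uses, and `assign_subplans` and the node names are hypothetical.

```python
from itertools import cycle


def assign_subplans(subplans, nodes):
    """Assign subplans to nodes in round-robin order (assumed policy)."""
    if not nodes:
        raise ValueError("at least one Execution Engine node is required")
    assignment = {node: [] for node in nodes}
    # cycle() repeats the node list, so zip() stops when subplans run out.
    for subplan, node in zip(subplans, cycle(nodes)):
        assignment[node].append(subplan)
    return assignment


assignment = assign_subplans(
    ["p1", "p2", "p3", "p4", "p5"],
    ["node-a", "node-b"],
)
```

With more nodes available, each node receives fewer subplans, which is the effect the large deployment aims for.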
Small deployment
If the processing requirements in the infrastructure are low and/or there is no need to contact external infrastructures, the Execution Engine can be deployed only on the node that also hosts the Workflow Engine and acts as the entry point for incoming workflow processing requests. Execution then takes place locally, on that node only. In this minimal deployment scenario, it is sufficient to deploy the Execution Engine as a library.
Use Cases
Well suited Use Cases
The Execution Engine has been used successfully to execute all workflows involved in the use cases of the Workflow Engine, acting as the enabling element of the latter.
Less well suited Use Cases
Because the Execution Engine aims to provide a generic facility for executing workflows, it cannot know the semantics of its input and output data. Applications that need this kind of data comprehension should instead implement special adaptors for the Workflow Engine.