Workflow Layer

From Gcube Wiki

Overview

The Workflow Layer provides a high-level interface through which tasks can be described and handled. It allows flexible workflows to be composed by hiding the differences between the underlying execution engines, so that heterogeneous engines can cooperate. To this end, the Workflow Layer exploits PE2ng's ability to execute web service calls. Every job described in the workflow is matched, depending on the execution engine it targets, to a web service call that will eventually use the corresponding adaptor of the Workflow Engine. The Workflow Layer translates all the jobs of the workflow into a new plan, which describes these service calls and is submitted to PE2ng through the JDL adaptor. The plan is described using a JDL variant, namely gJDL, which was introduced to enable some needed functionality, such as the execution of web service calls and the description of the execution environment. An example of such a plan is presented below.

Usage example

This section presents an example of the possibilities the Workflow Layer offers to users. In order to use the Workflow Layer, one must compose a resource file containing the following:

  • Scope
  • gJDL description of the workflow
  • Various execution parameters (chokeProgressEvents, chokePerformanceEvents)
  • Resource files for each Workflow Layer job given as local input files

An example of such a resource file is the following:

A sample resource file

scope # /gcube/devNext
jdl # /home/someuser/somedirectory/gJDLFile
chokeProgressEvents # false
chokePerformanceEvents # false
inData # resource0 # local # /home/someuser/somedirectory/resource0
inData # resource1 # local # /home/someuser/somedirectory/resource1

The first line defines the scope to be used, and the second gives the path to the gJDL file. After that, two parameters are set to false. Finally, two input files, which are the resources for the jobs described in the gJDL file, are passed to the service with their local paths.
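The layout of the sample resource file can be illustrated with a small parser. This is only a sketch inferred from the sample above: it assumes that fields are separated by `#`, that `inData` lines carry exactly a key, an access type and a path, and that the remaining two-field lines are boolean parameters. The `ResourceFile` class and the `parse_resource_file` function are hypothetical helpers, not part of the Workflow Layer.

```python
from dataclasses import dataclass, field


@dataclass
class ResourceFile:
    """Parsed view of a resource file (hypothetical helper, not a gCube class)."""
    scope: str = ""
    jdl: str = ""
    flags: dict = field(default_factory=dict)   # e.g. chokeProgressEvents -> bool
    inputs: dict = field(default_factory=dict)  # key -> (access type, path)


def parse_resource_file(text: str) -> ResourceFile:
    """Parse '#'-separated lines; assumes no field itself contains '#'."""
    result = ResourceFile()
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        parts = [p.strip() for p in line.split("#")]
        key = parts[0]
        if key == "scope" and len(parts) == 2:
            result.scope = parts[1]
        elif key == "jdl" and len(parts) == 2:
            result.jdl = parts[1]
        elif key == "inData" and len(parts) == 4:
            # inData # <key used in the gJDL Arguments> # <access type> # <path>
            result.inputs[parts[1]] = (parts[2], parts[3])
        elif len(parts) == 2 and parts[1] in ("true", "false"):
            result.flags[key] = parts[1] == "true"
        else:
            raise ValueError(f"unrecognised line: {raw!r}")
    return result


SAMPLE = """\
scope # /gcube/devNext
jdl # /home/someuser/somedirectory/gJDLFile
chokeProgressEvents # false
chokePerformanceEvents # false
inData # resource0 # local # /home/someuser/somedirectory/resource0
inData # resource1 # local # /home/someuser/somedirectory/resource1
"""

parsed = parse_resource_file(SAMPLE)
print(parsed.scope)
print(sorted(parsed.inputs))
```

The keys collected in `inputs` (`resource0`, `resource1`) are the values that the gJDL file refers to in its Arguments properties.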

A sample workflow described in a gJDL file is the following:

A sample workflow

 [
     Type = "DAG";
     ParsingMode = "Plan";
     ConnectionMode = "Callback";
     NodesCollocation = false;
     Max_Running_Nodes = 1;

     Nodes =
     [
         node0 =
         [
             Description =
             [
                 JobType = "EE";
                 Executable = "PE2NG";
                 Arguments = "resource0";
                 RetryCount = 0;
                 RetryInterval = 5000;
             ];
         ];
         node1 =
         [
             Description =
             [
                 JobType = "EE";
                 Executable = "GRID";
                 Arguments = "resource1";
                 RetryCount = 0;
                 RetryInterval = 5000;
             ];
         ];
     ];
     Dependencies = {{node1,node0}};
 ]

gJDL

Information about the syntax of the above sample file can also be found here. However, the JDL language has been slightly extended to allow the description of jobs for different execution engines. A new job type named "EE" has been introduced, and the value of its Executable property defines which underlying processing infrastructure should handle the corresponding job. The possible values are "PE2NG", "GRID", "HADOOP" and "CONDOR". In addition, the value of the Arguments property defines the resource file to be used; this value is the key assigned to the file in the resource file above. Using the Dependencies feature, this example describes a sequential execution of the two jobs, i.e. the job described in node0 starts only after the job described in node1 has terminated.

The syntax of the two input resource files is unchanged: it follows the syntax described in the above link.
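The effect of the Executable values and of the Dependencies entry in the sample workflow can be sketched as follows. The sketch assumes that each pair in Dependencies lists the node that runs first followed by the node that waits for it, as in the description of the sample. The dictionaries and the `execution_order` function are illustrative only and do not reflect how the Workflow Layer parses gJDL.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Executable values listed for the "EE" job type.
ALLOWED_ENGINES = {"PE2NG", "GRID", "HADOOP", "CONDOR"}

# Hand-written view of the sample workflow: node name -> Executable value.
nodes = {"node0": "PE2NG", "node1": "GRID"}

# Dependencies = {{node1,node0}}: node1 runs first, node0 waits for it.
dependencies = [("node1", "node0")]


def execution_order(nodes, dependencies):
    """Validate the engines and return the nodes in a valid run order."""
    for name, engine in nodes.items():
        if engine not in ALLOWED_ENGINES:
            raise ValueError(f"{name}: unknown execution engine {engine!r}")
    sorter = TopologicalSorter({name: set() for name in nodes})
    for first, then in dependencies:
        if first not in nodes or then not in nodes:
            raise ValueError(f"dependency refers to an unknown node: {first}, {then}")
        sorter.add(then, first)  # 'then' may only start after 'first'
    return list(sorter.static_order())


order = execution_order(nodes, dependencies)
print(order)  # ['node1', 'node0']
```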

Workflow Layer Service

To expose the Workflow Layer, a web service has been developed. Using the description above, this Workflow Layer Service creates an execution plan that is then submitted to the Execution Engine. For each of the jobs described in the workflow, the service creates a SOAPPlanElement that executes a call to the operation of the WorkflowEngineService that handles the corresponding processing infrastructure.
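The one-call-per-job translation described above can be pictured with a short sketch. It is a conceptual illustration only: `PlannedCall` and `plan_calls` are hypothetical names, the `operation` field holds a placeholder label rather than a real WorkflowEngineService operation name, and nothing here models the actual SOAPPlanElement class.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlannedCall:
    """Stand-in for one plan element (hypothetical, not SOAPPlanElement)."""
    node: str
    engine: str
    resource_key: str
    operation: str  # placeholder label, NOT a real operation name


def plan_calls(jobs):
    """Create one planned service call per workflow job."""
    calls = []
    for node, (engine, resource_key) in jobs.items():
        # Each job is matched to the operation that handles its engine.
        label = f"<operation handling {engine}>"
        calls.append(PlannedCall(node, engine, resource_key, label))
    return calls


# Jobs of the sample workflow: node -> (Executable, Arguments).
jobs = {"node0": ("PE2NG", "resource0"), "node1": ("GRID", "resource1")}
calls = plan_calls(jobs)
for call in calls:
    print(call.node, "->", call.operation, "using", call.resource_key)
```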