Difference between revisions of "Common-accounting-model ABANDONED"

Revision as of 13:32, 24 May 2013

Scope

This library contains the definition of the resource accounting record.

Data-model

A generic accounting record (Usage Record, UR) is composed of a set of fields common to all resource types:

id : a unique identifier for the UR
consumerId : the user actually consuming the resource (optional, for future purposes)
createTime : when the UR was created
startTime, endTime : the time window the UR refers to
resourceType : the type of resource the UR tracks
scope : the scope of the resource
resourceOwner : who owns the resource and/or who creates the UR

In addition, each UR has a section holding the properties specific to its resource type, expressed as key-value pairs.
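As an illustration, the common fields and the per-type key-value section could be represented as follows. This is a minimal sketch, not the library's actual definition: the class name `UsageRecord`, the Python types, the UUID-based `id`, the UTC `createTime`, and the check that `endTime` does not precede `startTime` are all assumptions.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, Optional
import uuid


@dataclass
class UsageRecord:
    """Hypothetical in-memory form of a UR (not the library's real class)."""
    resource_type: str                  # resourceType
    scope: str                          # scope
    resource_owner: str                 # resourceOwner
    start_time: datetime                # startTime
    end_time: datetime                  # endTime
    consumer_id: Optional[str] = None   # consumerId (optional)
    # Assumption: a random UUID is an acceptable unique id.
    id: str = field(default_factory=lambda: str(uuid.uuid4()))
    # Assumption: createTime is taken in UTC when the object is built.
    create_time: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc))
    # Resource-type-specific properties, stored as key-value pairs.
    properties: Dict[str, str] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Assumption: a time window that ends before it starts is invalid.
        if self.end_time < self.start_time:
            raise ValueError("endTime precedes startTime")


# Example with made-up values.
ur = UsageRecord(
    resource_type="storage",
    scope="/example/scope",
    resource_owner="example-owner",
    start_time=datetime(2013, 5, 24, 13, 0, tzinfo=timezone.utc),
    end_time=datetime(2013, 5, 24, 13, 32, tzinfo=timezone.utc),
    properties={"operationType": "PUT"},
)
```

Keeping the type-specific part as a plain string map means a new resource type needs no change to the common record, at the cost of validating keys elsewhere.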

Resource Types

The resource types we've identified are: Execution, Service, Data-access and Storage.

Execution

This specification accounts for services that run jobs on the infrastructure (Workflow Engine, Execution Engine, Statistical Manager, Aquamaps).

For this resource type, there are two sub-types:

Job

Contains the information about the overall job, which is partitioned into N Tasks.

Specific Job properties:

jobId : a unique identifier for the job
jobQualifier : qualifies the job in terms of algorithm type or job type (e.g. search, data-transformation, etc)
jobName : name of the job
jobStart : the instant the job starts running
jobEnd : the instant the job ends its execution
jobStatus : completed/failed
vmsUsed : number of VMs (gHNs) used by the job
wallDuration : time elapsed between the instant the job starts running and the instant it ends its execution
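Since wallDuration is defined purely in terms of jobStart and jobEnd, it can be derived instead of supplied separately. The sketch below builds the Job-specific key-value section under that reading. The helper name `job_properties`, the ISO 8601 timestamp encoding, and seconds as the unit of wallDuration are assumptions; the page does not specify any of them.

```python
from datetime import datetime
from typing import Dict

ALLOWED_JOB_STATUS = ("completed", "failed")


def job_properties(job_id: str, job_qualifier: str, job_name: str,
                   job_start: datetime, job_end: datetime,
                   job_status: str, vms_used: int) -> Dict[str, str]:
    """Build the Job-specific key-value section (hypothetical helper)."""
    if job_status not in ALLOWED_JOB_STATUS:
        raise ValueError(f"unknown jobStatus: {job_status!r}")
    if job_end < job_start:
        raise ValueError("jobEnd precedes jobStart")
    # wallDuration is computed from jobStart/jobEnd, never passed in.
    wall_seconds = (job_end - job_start).total_seconds()
    return {
        "jobId": job_id,
        "jobQualifier": job_qualifier,
        "jobName": job_name,
        "jobStart": job_start.isoformat(),   # assumption: ISO 8601 strings
        "jobEnd": job_end.isoformat(),
        "jobStatus": job_status,
        "vmsUsed": str(vms_used),
        "wallDuration": str(wall_seconds),   # assumption: unit is seconds
    }


# Example with made-up values: a 32-minute job.
props = job_properties("job-1", "search", "example-job",
                       datetime(2013, 5, 24, 13, 0),
                       datetime(2013, 5, 24, 13, 32),
                       "completed", 3)
```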

Task

Contains the information about one slice of the overall Job.

Specific Task properties:

jobId : reference to the Job that generated this Task
refHost : hostname of the virtual machine (gHN)
refVM : virtual machine id (gHN)
usageStart : the earliest usage time of the Task
usageEnd : the latest usage time of the Task
usagePhase : completed/failed
inputFilesNumber : number of input files to the Task
inputFilesSize : size of the input files to the Task
outputFilesNumber : number of output files from the Task
outputFilesSize : size of the output files from the Task
overallNetworkIn : overhead of the input traffic over the network to the Task
overallNetworkOut : overhead of the output traffic over the network from the Task
cores : number of cores per Task
processors : number of processors per Task
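Because every Task carries the jobId of the Job that generated it and the refVM it ran on, Job-level figures can in principle be cross-checked against the Tasks. The sketch below groups Task property maps by jobId and counts distinct refVM values. It assumes that vmsUsed equals the number of distinct VMs among a Job's Tasks, which the page does not state; the function name and sample values are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Mapping, Set


def vms_used_per_job(tasks: Iterable[Mapping[str, str]]) -> Dict[str, int]:
    """Count distinct refVM values per jobId (assumed meaning of vmsUsed)."""
    vms: Dict[str, Set[str]] = defaultdict(set)
    for task in tasks:
        vms[task["jobId"]].add(task["refVM"])
    return {job_id: len(vm_ids) for job_id, vm_ids in vms.items()}


# Made-up Task property maps: job-1 runs on two VMs, job-2 on one.
tasks = [
    {"jobId": "job-1", "refVM": "ghn-a", "usagePhase": "completed"},
    {"jobId": "job-1", "refVM": "ghn-b", "usagePhase": "completed"},
    {"jobId": "job-1", "refVM": "ghn-a", "usagePhase": "failed"},
    {"jobId": "job-2", "refVM": "ghn-c", "usagePhase": "completed"},
]
counts = vms_used_per_job(tasks)
```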

Service

This specification accounts for service invocations.

Specific Service properties:

callerIP : IP address that originated the service call
invocationCount : number of invocations (aggregated information)
averageInvocationTime : average invocation time (aggregated information)
serviceClass : name of the service class
serviceName : name of the service
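The two aggregated properties can be illustrated by folding individual calls into one record per caller and service. This is only a sketch: grouping by (callerIP, serviceClass, serviceName) is an assumption, the page does not say over which key or time window the aggregation happens, and the unit of the durations is left unspecified here as it is in the text.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Call = Tuple[str, str, str, float]  # callerIP, serviceClass, serviceName, duration


def aggregate_invocations(calls: Iterable[Call]) -> List[Dict[str, object]]:
    """Fold single calls into aggregated Service properties (hypothetical)."""
    # For each grouping key keep [invocation count, summed duration].
    totals = defaultdict(lambda: [0, 0.0])
    for caller_ip, service_class, service_name, duration in calls:
        entry = totals[(caller_ip, service_class, service_name)]
        entry[0] += 1
        entry[1] += duration
    return [
        {
            "callerIP": ip,
            "serviceClass": cls,
            "serviceName": name,
            "invocationCount": count,
            "averageInvocationTime": total / count,
        }
        for (ip, cls, name), (count, total) in totals.items()
    ]


# Made-up calls: two from one caller, one from another.
calls = [
    ("10.0.0.1", "ExampleClass", "ExampleService", 100.0),
    ("10.0.0.1", "ExampleClass", "ExampleService", 300.0),
    ("10.0.0.2", "ExampleClass", "ExampleService", 50.0),
]
records = aggregate_invocations(calls)
```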

Data-access

Specific Data-access properties:

sourceId : the identifier of the Tree Manager source which is the target of a read/write operation
operation : the name of the read/write operation performed via the Tree Manager over a given source
treeId : the identifier of a tree within the data source which is the target of a given read/write operation performed via the Tree Manager
treeCount : the number of trees within the data source which are accessed/written as the result of a given read/write operation performed via the Tree Manager

Storage

This specification accounts for storage resources and for timeseries/services that use a DB backend.

Specific Storage properties:

operationType : GET, PUT, UPDATE, DELETE
targetResource : URI representing the storage resource
fileDimension : size of the storage resource
hostname : hostname of the host where the storage library is invoked
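Since operationType is restricted to four values, a producer of Storage records could reject anything else before the record is created. The sketch below shows one way to do that. The names `OperationType` and `storage_properties`, the example URI, and treating fileDimension as a non-negative integer are assumptions made for illustration.

```python
from enum import Enum
from typing import Dict


class OperationType(Enum):
    """The four operation types listed for Storage records."""
    GET = "GET"
    PUT = "PUT"
    UPDATE = "UPDATE"
    DELETE = "DELETE"


def storage_properties(operation_type: str, target_resource: str,
                       file_dimension: int, hostname: str) -> Dict[str, str]:
    """Build the Storage-specific key-value section (hypothetical helper)."""
    # Enum lookup by value raises ValueError for an unknown operation type.
    operation = OperationType(operation_type)
    # Assumption: fileDimension is a non-negative integer size.
    if file_dimension < 0:
        raise ValueError("fileDimension must not be negative")
    return {
        "operationType": operation.value,
        "targetResource": target_resource,
        "fileDimension": str(file_dimension),
        "hostname": hostname,
    }


# Example with made-up values; the URI is purely illustrative.
storage_props = storage_properties(
    "PUT", "storage://example/folder/file.txt", 1024, "node1.example.org")
```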

@@ Line 19: / Line 19: @@
 === Execution ===
-Regarding the Execution resource type, there are two sub-types, according to the PE2ng's structure which is composed by two main layers. There is the Workflow layer that is more abstract, constructing workflow plans, supporting various adaptors and is aware of jobs as a whole. There is also the Execution layer, also a Service, where the actual execution takes place and is aware of more detailed stuff.
+This specification will be used to take into account information about services running jobs on the infrastructure (Workflow Engine, Execution Engine, Statistical Manager, Aquamaps).
-Discriminating those layers:
+For this resource type, there are two sub-types:
-* Workflow layer is aware of:
-Number of jobs submitted and adaptor that were used
-Execution nodes that will be used (scale out) per job
-* Execution layer is aware of:
+==== Job ====
-Statuses of execution jobs (success/fail)
+Contains the information about the overall job, that will be partitioned in N Tasks.
-also GHN hosting node information of every execution node is available to Workflow, harvested through Registry, containing info such as location, cpu load (week, day, hour,...), memory, disk space etc.
+Specific Job properties:
-==== Plan ====
-Specific Plan properties:
-* adaptorInUse : adaptor in use for the job (e.g. search, data-transformation, etc.)
-* vmsUsed : number of the VMs used by the job.
 * jobId : an unique identifier for the job
+* jobQualifier : qualifies the job in terms of algorithm type or job type (e.g. search, data-transformation, etc)
 * jobName : name of the job
 * jobStart : the instant the job start running
 * jobEnd : the instant the job ends its execution
 * jobStatus: completed/failed
+* vmsUsed : number of the VMs (gHNs) used by the job.
 * wallDuration : duration between the instant the job start running and the instant the job ends its execution.
-* cores : number of available cores per job.
-* processors : number of available processors per job.
-==== Execution Engine ====
-Specific Execution Engine properties:
+==== Task ====
+Contains the information about one slice of the overall Job.
+Specific Task properties:
-* refHost : hostname of the vm
+* jobId : reference to the Job that generated this Task
-* refVM : Execution Engine resource id or gHN id
+* refHost : hostname of the virtual machine (gHN)
-* usageStart : the earlier usage time of the Execution Engine
+* refVM : virtual machine id (gHN)
-* usageEnd: the latest usage time of the Execution Engine
+* usageStart : the earlier usage time of the Task
+* usageEnd: the latest usage time of the Task
 * usagePhase: completed/failed
-* inputFilesNumber : number of input files to the Execution Engine
+* inputFilesNumber : number of input files to the Task
-* inputFilesSize : dimension of input files to the Execution Engine
+* inputFilesSize : dimension of input files to the Task
-* outputFilesNumber : number of output files from the Execution Engine
+* outputFilesNumber : number of output files from the Task
-* outputFilesSize : dimension of output files from the Execution Engine
+* outputFilesSize : dimension of output files from the Task
-* overallNetworkIn : overhead of the input traffic over the network to the Execution Engine
+* overallNetworkIn : overhead of the input traffic over the network to the Task
-* overallNetworkOut : overhead of the output traffic over the network from the Execution Engine
+* overallNetworkOut : overhead of the output traffic over the network from the Task
+* cores : number of cores per Task.
+* processors : number of processors per Task.
 === Service ===
