Messaging Infrastructure

From Gcube Wiki
Revision as of 10:26, 17 August 2010 by Andrea.manzi (Talk | contribs) (Configuration)


gCube Messaging Architecture

Monitoring the infrastructure and gathering accounting data are important tasks within infrastructure operation. As a consequence, D4Science decided to implement:

  • A monitoring tool based on a messaging system, to complement the monitoring tools already available based on the gCube IS;
  • An accounting tool, also based on a messaging system, to satisfy the need to provide accounting information.

These monitoring and accounting tools have been implemented under a common gCube subsystem called gCube Messaging. This section presents the architecture and core components of this subsystem.

The gCube Messaging subsystem is composed of the following components:

  • Message Broker – receives and dispatches messages;
  • Local Producer – provides facilities to send messages from each node;
  • Node Monitoring Probes – produces monitoring info for each node;
  • Node Accounting Probes – produces accounting info for each node;
  • Portal Accounting Probes – produces accounting info for the portal;
  • Messages – defines the messages to exchange;
  • Messaging Consumer – subscribes for messages from the message broker, checks metrics, stores messages, and notifies administrators;
  • Messaging Consumer Library – hides the Consumer DB details, helping clients to query for accounting and monitoring information;
  • Portal Accounting portlet – a GWT-based portlet that shows portal usage information to infrastructure managers;
  • Node Accounting portlet – a GWT-based portlet that shows service usage information to infrastructure managers.

Message Broker

Following the work done by the [1]WLCG Monitoring group at CERN on monitoring with MoM (message-oriented middleware) systems, and to make the EGEE and D4Science monitoring solutions potentially interoperable, the [2]EGEE MSG Broker component has been adopted in D4Science as the standard Message Broker service.

The EGEE MSG Broker is based on the [3]Apache ActiveMQ message broker, an open-source solution with the following main features:

  • Message Channels
    • Publish-Subscribe (Topics)
    • Point-to-Point (Queue)
    • Virtual Destination, WildCards
    • Synchronous, Asynchronous sending
  • Wide range of supported client protocols
    • OpenWire for high performance clients
    • STOMP
    • REST, JMS
  • Extremely good performance and reliability
    • The [4] Performance Test executed by the WLCG Monitoring group can be consulted for details.

Installation

The installation instructions for the EGEE MSG Broker can be found on the EGEE MSG Wiki [5].

Local Producer

The Local Producer is the entity deployed on each node of the infrastructure that is responsible for message exchange. It defines the methods to communicate with the Message Broker and is activated at node start-up (if configured to do so).


gCube local Producer


The Local Producer is structured in two main components:

  1. An abstract Local Producer interface. This interface is part of the gCore Framework (gCF) and models a local producer, a local probe, and the base message.
  2. An implementation of the abstract Local Producer. This implementation class has been named GCUBELocalProducer.

The GCUBELocalProducer, at node start-up, sets up two types of connections towards the Message Broker:

  1. Queue connections: exploited by accounting probes that produce messages consumed by only one consumer;
  2. Topic connections: exploited by monitoring probes that produce messages consumed by multiple consumers.
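
The difference between the two connection types can be illustrated with a small in-memory model. This is only a sketch of the delivery semantics, not of the GCUBELocalProducer or of ActiveMQ; the class names Queue and Topic and their methods are hypothetical, and the round-robin choice of the receiving consumer is an assumption made for the example.

```python
class Topic:
    """Publish-subscribe: every subscriber receives every message."""

    def __init__(self):
        self.subscribers = []

    def subscribe(self, inbox):
        self.subscribers.append(inbox)

    def send(self, message):
        for inbox in self.subscribers:
            inbox.append(message)


class Queue:
    """Point-to-point: each message is delivered to exactly one consumer."""

    def __init__(self):
        self.consumers = []
        self._next = 0

    def add_consumer(self, inbox):
        self.consumers.append(inbox)

    def send(self, message):
        # Pick a single receiving consumer (round-robin, by assumption).
        inbox = self.consumers[self._next % len(self.consumers)]
        self._next += 1
        inbox.append(message)


# Monitoring-style delivery: both consumers see the same message.
a, b = [], []
topic = Topic()
topic.subscribe(a)
topic.subscribe(b)
topic.send("disk test result")

# Accounting-style delivery: each record reaches only one consumer.
c, d = [], []
queue = Queue()
queue.add_consumer(c)
queue.add_consumer(d)
queue.send("accounting record 1")
queue.send("accounting record 2")
```

With a topic, a and b both end up holding the monitoring message; with a queue, the two accounting records are split between c and d, so no record is processed twice.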

Configuration

In order to configure the GHN to run the gCube Local Monitor, at least one MessageBroker (an ActiveMQ endpoint) must be configured in one of the ServiceMaps related to the GHN scope, as follows:

<ServiceMap>
        <Service name ="ISICAllQueryPT" endpoint ="http://dlib01.isti.cnr.it:8080/wsrf/services/diligentproject/informationservice/disic/DISICService"/>
        <Service name ="ISICAllRegistrationPT" endpoint ="http://dlib01.isti.cnr.it:8080/wsrf/services/diligentproject/informationservice/disic/DISICRegistrationService"/>        
         ......................
        <Service name ="MessageBroker" endpoint ="tcp://ui.grid.research-infrastructures.eu:6166"/>
</ServiceMap>


One parameter can also be added to the [6]GHN configuration:

  • testInterval: the interval in seconds between test executions (default = 1800)

If no MessageBroker entry is present in the GHN ServiceMaps, the gCube Local Monitor is not enabled on the GHN.
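
The lookup described above can be sketched in a few lines. This is an illustration only, not the gCF code: the function names broker_endpoints and local_monitor_enabled are hypothetical, and the ServiceMap is shortened to the MessageBroker entry from the example.

```python
import xml.etree.ElementTree as ET

# Shortened ServiceMap; the endpoint is the one from the example above.
SERVICE_MAP = """
<ServiceMap>
    <Service name="MessageBroker" endpoint="tcp://ui.grid.research-infrastructures.eu:6166"/>
</ServiceMap>
"""


def broker_endpoints(service_map_xml):
    """Return the endpoints of all MessageBroker entries in a ServiceMap."""
    root = ET.fromstring(service_map_xml.strip())
    return [
        service.get("endpoint")
        for service in root.iter("Service")
        if service.get("name") == "MessageBroker"
    ]


def local_monitor_enabled(service_maps):
    """The monitor is enabled only if at least one ServiceMap names a broker."""
    return any(broker_endpoints(xml) for xml in service_maps)
```

A ServiceMap without a MessageBroker entry yields an empty list, which corresponds to the case where the Local Monitor stays disabled.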

Monitoring Probes

The monitoring probes can be of two types: gHN and Running Instance (RI). A gHN probe produces messages related to the gHN node itself while the RI probe produces messages concerning the gCube services running on the gHN. The following probes are currently available:

  1. GHNDiskProbe – monitors the local available disk space;
  2. GHNLoadProbe – monitors the CPU load of the gHN;
  3. GHNMemoryProbe – monitors the memory available on the gHN;
  4. GHNInformationProbe – gathers information related to the gHN HW;
  5. GHNNotificationProbe – subscribes for local gHN events (scope changed, scope added, node start, etc);
  6. RINotificationProbe – subscribes for local RI events (scope changed, scope added, deployment, etc).

All the above probes exploit the GCUBELocalProducer to contact the Message Broker and send messages.

Probes can also be grouped according to types of message that they produce and according to their behaviour:

  • Test Probes – Perform local tests on the gHN and send messages containing the test results. Probes 1 to 4 above.
  • Notification Probes – Exploit the gCF local event mechanism to consume events related to GHN/RI actions (GHN Ready, RI/GHN scope changed, etc). Probes 5 and 6 above.

Node Accounting Probe

The Node Accounting Probe is in charge of collecting information about local usage of gCube services. The probe is a library deployed on each gHN that exploits the mechanisms offered by gCF to understand the usage of the services on the infrastructure. For each incoming method call, gCF produces a record log as follows:

END CALL FROM (146.48.85.127) TO (Messaging:Consumer:queryAccountingDB),/d4science.research-infrastructures.eu,Thread[ServiceThread-1039,5,main],[0.847]

Each “END CALL” line contains information about:

  • Time
  • Running Instance invoked
  • Method invoked
  • Caller scope
  • Caller IP
  • Invocation Time
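
As a rough illustration, the sample "END CALL" line above can be split into its fields with a regular expression. This is a sketch and not the probe's actual parser: the pattern is derived from that single sample line only, reading "Messaging:Consumer:queryAccountingDB" as service class, service name and method is an assumption, and the timestamp is not handled because it does not appear in the sample.

```python
import re

# Pattern inferred from the one sample line; other log layouts may differ.
END_CALL = re.compile(
    r"END CALL FROM \((?P<caller_ip>[^)]+)\) "
    r"TO \((?P<service_class>[^:]+):(?P<service_name>[^:]+):(?P<method>[^)]+)\),"
    r"(?P<scope>[^,]+),"
    r"Thread\[(?P<thread>[^\]]*)\],"
    r"\[(?P<invocation_time>[0-9.]+)\]"
)

SAMPLE = (
    "END CALL FROM (146.48.85.127) TO (Messaging:Consumer:queryAccountingDB),"
    "/d4science.research-infrastructures.eu,"
    "Thread[ServiceThread-1039,5,main],[0.847]"
)


def parse_end_call(line):
    """Return the fields of an END CALL line as a dict, or None if no match."""
    match = END_CALL.search(line)
    if match is None:
        return None
    record = match.groupdict()
    record["invocation_time"] = float(record["invocation_time"])
    return record


record = parse_end_call(SAMPLE)
```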

The probe parses this type of log files at the end of each day and aggregates information per running instance. In particular the information is aggregated following this schema: RI -> CallerScope -> CallerIP -> Number of Invocations and Average Invocation Time.
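
The aggregation schema can be sketched as a nested dictionary. The following is a minimal illustration, assuming the log lines have already been reduced to (RI, caller scope, caller IP, invocation time) tuples; the function name aggregate and the sample values are hypothetical.

```python
from collections import defaultdict


def aggregate(calls):
    """Aggregate (ri, caller_scope, caller_ip, invocation_time) tuples.

    Returns {ri: {scope: {ip: (number_of_invocations, average_time)}}}.
    """
    sums = defaultdict(lambda: [0, 0.0])  # key -> [count, total time]
    for ri, scope, ip, elapsed in calls:
        entry = sums[(ri, scope, ip)]
        entry[0] += 1
        entry[1] += elapsed

    result = {}
    for (ri, scope, ip), (count, total) in sums.items():
        result.setdefault(ri, {}).setdefault(scope, {})[ip] = (count, total / count)
    return result


calls = [
    ("Messaging:Consumer", "/gcube", "10.0.0.1", 1.0),
    ("Messaging:Consumer", "/gcube", "10.0.0.1", 3.0),
    ("Messaging:Consumer", "/gcube", "10.0.0.2", 0.5),
]
summary = aggregate(calls)
```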

The information about the invoked method is not parsed since it has been decided not to expose this granularity of information.

At the end of the aggregation process, the probe creates node accounting messages that are sequentially sent to the Message Broker using the Local Producer.

For this particular type of message, a queue receiver is exploited on the Message Broker side.

IS Integration

The Node Accounting Probe is able to publish some aggregated accounting information on the gCube IS. To allow the publication of this information, the gCube Running Instance profile has been extended as in the following example:

<Accounting>
<ScopedAccounting scope="/d4science.research-infrastructures.eu">
<TotalINCalls>2353</TotalINCalls>
<AverageINCalls interval="10800" average="0.0" />
<AverageINCalls interval="3600" average="0.0" />
<AverageINCalls interval="18000" average="0.0" />
<AverageInvocationTime interval="10800" average="1.786849415204678" />
<AverageInvocationTime interval="3600" average="5.031631578947368" />
<AverageInvocationTime interval="18000" average="2.3127022417153995" />
<TopCallerGHN avgHourlyCalls="30.4142155570835507" avgDailyCalls="730.0" totalCalls="730">
<GHNName>137.138.102.215</GHNName>
</TopCallerGHN>
</ScopedAccounting>
</Accounting>

The Accounting section contains, for each RI caller scope:

  • Total Number of calls
  • Average number of incoming calls (calculated over three different intervals)
  • Average invocation time (calculated over three different intervals)
  • Top caller GHN
    • average daily calls
    • average hourly calls
    • total calls
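
A client reading this profile section could extract the values as sketched below. This is an illustration only: the function name summarize is hypothetical, and the XML is an abridged copy of the example above.

```python
import xml.etree.ElementTree as ET

# Abridged copy of the Accounting section shown above.
ACCOUNTING = """<Accounting>
<ScopedAccounting scope="/d4science.research-infrastructures.eu">
<TotalINCalls>2353</TotalINCalls>
<AverageInvocationTime interval="3600" average="5.031631578947368" />
<TopCallerGHN avgHourlyCalls="30.4142155570835507" avgDailyCalls="730.0" totalCalls="730">
<GHNName>137.138.102.215</GHNName>
</TopCallerGHN>
</ScopedAccounting>
</Accounting>"""


def summarize(accounting_xml):
    """Return {scope: {...}} with totals, averages per interval and top caller."""
    summary = {}
    for scoped in ET.fromstring(accounting_xml).iter("ScopedAccounting"):
        top = scoped.find("TopCallerGHN")
        summary[scoped.get("scope")] = {
            "total_calls": int(scoped.findtext("TotalINCalls")),
            "avg_invocation_time": {
                int(e.get("interval")): float(e.get("average"))
                for e in scoped.findall("AverageInvocationTime")
            },
            "top_caller": top.findtext("GHNName") if top is not None else None,
        }
    return summary


info = summarize(ACCOUNTING)["/d4science.research-infrastructures.eu"]
```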

Configuration

Node Accounting configuration is driven by the NodeAccounting.properties file contained in the $GLOBUS_LOCATION/config folder:

#Configuration for NodeAccounting probe

#the probing interval ( logs Aggregation interval in sec and  IS publication interval)
PROBING_INTERVAL=3600

#publication of accounting info for each RI on the IS
PUBLISH_ON_IS=true

Portal Accounting Probe

The Portal Accounting Probe is in charge of aggregating information about portal usage. As for the node accounting, the portal (and in particular the ASL library) produces a log record, describing the following operations:

  • Login
  • Browse Collection
  • Simple Search
  • Advanced Search
  • Content Retrieval

This is an example of one log record produced by the D4science portal:

2009-09-03 12:22:37, VRE -> EM/GCM, USER -> andrea.manzi, ENTRY_TYPE -> Simple_Search, MESSAGE -> collectionName = Earth images AND collectionID = 12345 | collectionName = Landsat 7 AND collectionID = 54321 | term = satellite

All types of log record share the following information:

  • Time
  • User
  • VRE
  • OperationType
  • Message
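
The sample record above can be split into these common fields as follows. This is a sketch based on that single sample line, not the probe's parser; the function name parse_portal_record is hypothetical, and the ", " and " -> " separators are inferred from the sample.

```python
from datetime import datetime

SAMPLE = (
    "2009-09-03 12:22:37, VRE -> EM/GCM, USER -> andrea.manzi, "
    "ENTRY_TYPE -> Simple_Search, MESSAGE -> collectionName = Earth images "
    "AND collectionID = 12345 | collectionName = Landsat 7 AND "
    "collectionID = 54321 | term = satellite"
)


def parse_portal_record(line):
    """Split a portal log record into time, VRE, user, entry type and message."""
    # At most 4 splits, so that commas inside the message part are preserved.
    timestamp, *fields = line.split(", ", 4)
    record = {"Time": datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S")}
    for field in fields:
        key, _, value = field.partition(" -> ")
        record[key] = value
    return record


record = parse_portal_record(SAMPLE)
```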

The message part differs for each type of log record; for example, the simple search record contains information about the collections included in the operation and the term searched. The portal accounting probe aggregates portal accounting information by creating a number of Portal Accounting messages aggregated by: User -> VRE -> OperationType. Each message contains a certain number of records of one OperationType.

As in node accounting, at the end of the aggregation process the probe sequentially sends a number of portal accounting messages to the Message Broker using the Local Producer. For this particular type of message, a queue receiver is exploited on the Message Broker side.

Messages

Different message types are defined for monitoring and for accounting.

Portal Accounting portlet

Monitoring Messages

The monitoring probes (GHN and RI probes) exchange with the Message Broker particular types of messages (extensions of the base GCUBEMessage), named GHNMessage and RIMessage respectively. Both contain an object named “Test” that represents either the test performed on the GHN (together with its result) or a notification:

  • TestType – either TEST or NOTIFICATION;
  • Description – the test/notification description;
  • TestNumber – a unique Identifier;
  • TestResult – object that stores the TEST results (in case of NOTIFICATION no results are expected);
  • Priority – either HIGH or LOW.

The RIMessage also contains information about the ServiceClass and the ServiceName of the Running Instance where the probe is running. At message creation time, depending on the type of messages and type of probes, different combinations of topic names and message selectors are possible:

  • GHN Message, TEST probe:
    • scope.MONITORING.GHN.sourceGHN/MessageType='TEST'
  • GHN Message, NOTIFICATION probe:
    • scope.MONITORING.GHN.sourceGHN/MessageType='NOTIFICATION'
  • RI Message, TEST probe:
    • scope.MONITORING.RI.sourceGHN/MessageType='TEST'
  • RI Message, NOTIFICATION probe:
    • scope.MONITORING.RI.sourceGHN/MessageType='NOTIFICATION'

The monitoring probes, following the above topic structure, send messages for each scope of the GHN/RI. For example, on a gHN running on node pcd4science.cern.ch and port 8080 that belongs to both the /gcube and /gcube/devsec scopes, the GHNDiskProbe probe will send two messages with the following topic names:

  • gcube.MONITORING.GHN.pcd4science_cern_ch:8080
  • gcube.devsec.MONITORING.GHN.pcd4science_cern_ch:8080
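
The mapping from scope and node address to topic name can be sketched as follows. The rules used here (slashes in the scope become dots, dots in the host name become underscores) are inferred from the two example names above only; the function names are hypothetical and this is not the GCUBELocalProducer code.

```python
def scope_prefix(scope):
    """Turn a scope such as '/gcube/devsec' into 'gcube.devsec'."""
    return scope.strip("/").replace("/", ".")


def monitoring_topic(scope, kind, host, port):
    """Build '<scope>.MONITORING.<kind>.<host>:<port>', kind being GHN or RI."""
    source = f"{host.replace('.', '_')}:{port}"
    return f"{scope_prefix(scope)}.MONITORING.{kind}.{source}"


topics = [
    monitoring_topic(scope, "GHN", "pcd4science.cern.ch", 8080)
    for scope in ("/gcube", "/gcube/devsec")
]
```

Replacing the dots in the host name keeps the source gHN as a single element of the dot-separated topic name, which is what allows one wildcard element to match any node.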

Accounting Messages

Node and portal accounting probes use particular types of messages, named respectively NodeAccountingMessage and PortalAccountingMessage.

The NodeAccountingMessage is a specialization of the generic GCUBEMessage. It’s used to transfer the details about the invocations received by a RI on a particular scope. It includes:

  • RI service name and class
  • Caller scope
  • Caller IP
  • Invocation date
  • Hourly records composed of:
    • Time frame
    • Service invocation number
    • Average invocation time

For accounting messages, the JMS destination is a queue. Instead of a topic naming structure, the message follows a queue naming structure:

scope.ACCOUNTING.GHN.SourceGHN

For example:

  • gcube.ACCOUNTING.GHN.pcd4science_cern_ch:8080
  • gcube.devsec.ACCOUNTING.GHN.pcd4science_cern_ch:8080

The PortalAccountingMessage is a specialization of the generic GCUBEMessage. The type of information to transport is rich and can vary considerably. The basic fields are User and VRE. The message is then structured to contain a list of basic records, specialized as:

  • LoginRecord
  • AdvancedSearchRecord
  • SimpleSearchRecord
  • QuickSearch
  • GoogleSearch
  • BrowseRecord
  • ContentRecord
  • GenericRecord (for generic operation logs)

All of the above records have in common only timestamp information. For these messages too, there is a queue naming structure, as follows:

scope.ACCOUNTING.PORTAL.SourceGHN

For example:

  • gcube.ACCOUNTING.PORTAL.pcd4science_cern_ch:8080


Messaging Consumer

The Consumer Monitor is a gCube WSRF service that is deployed on the infrastructure to consume messages coming from Message Brokers. The main features of the service are:

  • Subscribe to monitoring/accounting messages for different scopes;
  • Check monitoring message test result against metrics;
  • Store monitoring/accounting messages on local database;
  • Send email notifications to admins in case of abnormal tests results;
  • Provide a GUI with summary information and query facilities.

This WSRF service exposes public operations to allow queries to the underlying database and to export information outside the infrastructure.

gCube Messaging consumer

Following the message topic structure, the Messaging Consumer creates at start-up time (1) durable subscriptions towards topics and (2) queue receivers towards queues. Once a client has formally subscribed, the Message Broker server holds messages for it: durable topic subscriptions receive messages published while the subscriber is not active. Subsequent subscriber objects specifying the identity of the durable subscription can resume the subscription in the state in which it was left by the previous subscriber. This means that, by using the same subscription ID, the Messaging Consumer can resume the receipt of messages from the Message Broker server, which is useful in case of a node crash or service re-deployment.

The Messaging Consumer also embeds a Message Broker for testing purposes. However, in the production environment a dedicated Message Broker is deployed.

The Messaging Consumer can dynamically run in one or more scopes. According to the topic/queue structure defined, when a scope is added to its RI the service automatically subscribes for the following topics/queues:

  • <scope>.MONITORING.GHN.*
  • <scope>.MONITORING.RI.*
  • <scope>.ACCOUNTING.GHN.*
  • <scope>.ACCOUNTING.PORTAL.*

The Consumer Service can be configured, using the “subscriptions” configuration variable, to subscribe only to a subset of the available information. In addition, the Messaging Consumer can be configured to use JMS message selectors. This means that for each scope 2*nOfSelectors durable subscribers are created using the wildcard (.*) syntax for TopicNames (all topic names of the same scope and type are subscribed for).
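
The 2*nOfSelectors count can be made concrete with a short sketch: two wildcard monitoring topics per scope, each combined with every configured selector. The function names are hypothetical, and splitting the selector string on commas is an assumption based on the form of the MessageSelectors value in the sample JNDI file.

```python
def parse_selectors(value):
    """Split a comma-separated selector string (assumed format)."""
    return [selector.strip() for selector in value.split(",") if selector.strip()]


def durable_subscriptions(scope, selectors):
    """Return (topic pattern, selector) pairs for one scope."""
    prefix = scope.strip("/").replace("/", ".")
    topics = [f"{prefix}.MONITORING.GHN.*", f"{prefix}.MONITORING.RI.*"]
    return [(topic, selector) for topic in topics for selector in selectors]


selectors = parse_selectors("MessageType = 'TEST', MessageType = 'NOTIFICATION'")
subscriptions = durable_subscriptions("/gcube/devsec", selectors)
```

With the two selectors above, the scope /gcube/devsec yields four durable subscriptions, i.e. 2*nOfSelectors with nOfSelectors = 2.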

An important functionality of the Messaging Consumer is the capability to send notifications and daily reports to administrators by elaborating on the stored incoming messages. The administrators are selected through a local configuration file, directly retrieved from VOMS, or through a configuration file stored on the IS. The Messaging Consumer is configured to send email notifications in two situations:

  • When a message of type NOTIFICATION with HIGH priority (e.g. gHN start, shutdown) is received;
  • When a message of type TEST is received and the test result exceeds some threshold parameters (e.g. CPU usage, disk quota).
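
The two conditions can be summarised in a small decision function. This is a sketch only: the function name should_notify, its parameters, and the use of a single numeric threshold per test are assumptions, since the metric checks performed by the service are not detailed here.

```python
def should_notify(test_type, priority=None, value=None, threshold=None):
    """Decide whether an incoming monitoring message triggers an email."""
    if test_type == "NOTIFICATION":
        # Only high-priority notifications (e.g. gHN start, shutdown).
        return priority == "HIGH"
    if test_type == "TEST":
        # Only test results that exceed the configured threshold.
        if value is None or threshold is None:
            return False
        return value > threshold
    return False
```

For example, should_notify("TEST", value=95.0, threshold=90.0) evaluates to True, while a LOW priority notification does not trigger an email.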

The Messaging Consumer embeds a Jetty web server in order to give access to the database content (for debugging purposes) and to publish the daily report. A number of servlets show the DB content grouped by gHN name. The first version of the report GUI allows the admin to navigate through reports grouped by day, scope, and gHN name, and shows the related messages consumed by the service. In order to include daily/monthly graphs, a first integration with Google Chart has been developed.

The Consumer Service can be configured (like any other gCube service) by adding or changing configuration parameters in the [7]JNDI service file. The following table describes the service parameters.


Parameter | Type | Description
DBFile | String | The name of the file containing the DB structure
MailRecipients | String | The file containing the list of fixed administrator mail addresses; if present, the list of admin mail addresses is not downloaded from VOMS periodically
NotifiybyMail | Boolean | Specifies whether the mail notification feature has to be turned on
startScopes | String | List of scopes the service belongs to
httpServerBasePath | String | The container-related base path for the embedded [8]Jetty web server
httpServerPort | String | The port for the embedded [9]Jetty web server
monitorRoleString | String | The role on the VOMS related to the Site/VO Admin (to be used when the service downloads information from VOMS)
UseEmbeddedBroker | Boolean | The service can run an embedded ActiveMQ instance (to be configured only for testing purposes, not suggested for production environments)
DailySummary | Boolean | Specifies whether the service has to create a daily report containing the messages received for each scope
MessageSelectors | String | Specifies whether to use message selectors on broker subscriptions


A sample JNDI configuration:

   <!-- DB Structure file -->
      <environment
         name="DBFile" 
         value="dbqueries.file" 
         type="java.lang.String"
         override="false" />      
      
      <environment
         name="MailRecipients" 
         value="recipients.txt" 
         type="java.lang.String"
         override="false" />
         
         
      <!-- Notify By Mail-->
      <environment
         name="NotifiybyMail" 
         value="true" 
         type="java.lang.Boolean"
         override="false" />
         
      <environment 
         name="startScopes" 
          value="/gcube/devsec" 
          type="java.lang.String"
          override="false" />
         
         
      <environment
           name="httpServerBasePath" 
           value="jetty/webapps" 
           type="java.lang.String"
           override="false" />

      <environment
           name="httpServerPort" 
           value="6900" 
           type="java.lang.String"
           override="false" />
         
      <environment
         name="monitorRoleString" 
         value="Role=VO-Admin" 
         type="java.lang.String"
         override="false" />
         
      <environment
         name="UseEmbeddedBroker" 
         value="false" 
         type="java.lang.Boolean"
         override="false" />
      
      <environment
         name="MailSummary" 
         value="true" 
         type="java.lang.Boolean"
         override="false" />

   <environment
    name="MessageSelectors" 
    value="MessageType = 'TEST', MessageType = 'NOTIFICATION'" 
    type="java.lang.String"
    override="false" />

DB Structure

The database that stores the information related to messages is composed of three tables:

  • VO
  • GHNMESSAGE
  • RIMESSAGE

The VO table stores the VOs that the service monitors. The RIMESSAGE table stores information about Running Instance messages, which can be of both NOTIFICATION and TEST type:

  • MessageId
  • ServiceName
  • ServiceClass
  • GHNName
  • description
  • testType
  • result
  • scope
  • date
  • time

The GHNMESSAGE table structure contains the same fields except for ServiceClass and ServiceName.
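
The relationship between the two message tables can be sketched with an in-memory SQLite database. This is purely illustrative: only the column names come from the list above, while the use of SQLite, the TEXT column type and the helper name create_tables are assumptions. The VO table is left out because its columns are not listed here.

```python
import sqlite3

# Column names taken from the field list above; types are assumed.
COMMON = ["MessageId", "GHNName", "description", "testType",
          "result", "scope", "date", "time"]
RI_ONLY = ["ServiceName", "ServiceClass"]


def create_tables(conn):
    """Create GHNMESSAGE and RIMESSAGE with the documented columns."""
    def ddl(table, columns):
        cols = ", ".join(f'"{name}" TEXT' for name in columns)
        return f'CREATE TABLE "{table}" ({cols})'

    conn.execute(ddl("GHNMESSAGE", COMMON))
    conn.execute(ddl("RIMESSAGE", COMMON + RI_ONLY))


def column_names(conn, table):
    """Return the column names of a table, in declaration order."""
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]


conn = sqlite3.connect(":memory:")
create_tables(conn)
```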


Software Dependencies

The Service depends on the following list of Third-party libraries: