GRS2
From Gcube Wiki
<!-- CATEGORIES -->
[[Category:Developer's Guide]]
<!-- END CATEGORIES -->

== Introduction ==
 
The goal of the gRS framework is to enable point-to-point producer-consumer communication. To achieve this and connect collocated or remotely located actors, the framework shields the two parties from technology-specific knowledge and limitations. Additionally, upon request, it can provide value-adding functionality to enhance the clients' communication capabilities.
 
In modern systems the situation frequently arises where the producer's output is already available for access through some protocol. The gRS has a very extensible design as to the way the producer serves its payload. A number of protocols can be used (FTP, HTTP, network storage, local filesystem) and many more can be implemented easily and modularly.
In all producer-consumer scenarios the common denominator is the synchronization of the two parties. The gRS handles this synchronization transparently to the clients, who can either use the framework's mechanisms to block while waiting for the desired circumstances, or use events to be notified when some criterion is met. The whole design of the framework is event based. It is possible to have a producer-consumer scenario that bases all interaction between the two clients solely on the events they exchange, attaching custom objects and information to them. In addition to making the framework more versatile at its core, events also allow a two-way flow of communication, not only for control events but also as feedback from the consumer to the producer and vice versa, beyond the usual control signals of common producer-consumer scenarios.
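Since the library itself cannot be embedded in a self-contained snippet, the following minimal sketch illustrates the idea with plain JDK types only: two channels carry events with attached custom objects, giving a two-way flow between producer and consumer. All names here (TwoWayFlow, Event, Channel) are illustrative and not part of the gRS2 API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

//Conceptual sketch only, NOT gRS2 code: a purely event-driven, two-way
//producer/consumer exchange in which events carry attached custom objects
//in both directions.
public class TwoWayFlow {
  static class Event {
    final String type;
    final Object attachment; //custom object the sender attaches to the event
    Event(String type, Object attachment) { this.type = type; this.attachment = attachment; }
  }

  static class Channel {
    private final List<Consumer<Event>> listeners = new ArrayList<>();
    void register(Consumer<Event> l) { listeners.add(l); }
    void emit(Event e) { for (Consumer<Event> l : listeners) l.accept(e); }
  }

  public static List<String> run() {
    List<String> log = new ArrayList<>();
    Channel toConsumer = new Channel(); //producer -> consumer control events
    Channel toProducer = new Channel(); //consumer -> producer feedback events

    toConsumer.register(e -> {          //the consumer reacts to new records...
      log.add("consumed:" + e.attachment);
      toProducer.emit(new Event("ack", e.attachment)); //...and sends feedback upstream
    });
    toProducer.register(e -> log.add("producer saw " + e.type + ":" + e.attachment));

    toConsumer.emit(new Event("itemAdded", "record-1"));
    return log;
  }

  public static void main(String[] args) {
    System.out.println(run());
  }
}
```

The same pattern, with richer event types and transport in between, underlies the control and feedback flow described above.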
The gRS2 component is available in our Maven repositories with the following coordinates:

<source lang="xml">
<groupId>org.gcube.execution</groupId>
<artifactId>grs2library</artifactId>
<version>...</version>
</source>
  
== TCP Connection Manager ==

In case the created gRS is to be shared with a remote party, or in general with one in a different address space than the writer's, a TCPConnectionManager should be used. The TCPConnectionManager is static in the context of the address space it is used in and is initialized through a singleton pattern. One can therefore safely initialize it in any context where it is useful, without worrying about other components also trying to initialize it. An example of initializing the TCPConnectionManager is shown in the following snippet.
<source lang="java">
List<PortRange> ports=new ArrayList<PortRange>(); //The ports that the TCPConnectionManager should use
ports.add(new PortRange(3000, 3050));             //Any port in the range between 3000 and 3050
ports.add(new PortRange(3055, 3055));             //Port 3055
TCPConnectionManager.Init(
  new TCPConnectionManagerConfig("localhost", //The hostname by which the machine is reachable
    ports,                                    //The ports that can be used by the connection manager
    true                                      //If no port ranges were provided, or none of them could be used, use a random available port
));
TCPConnectionManager.RegisterEntry(new TCPConnectionHandler());      //Register the handler for the gRS2 incoming requests
TCPConnectionManager.RegisterEntry(new TCPStoreConnectionHandler()); //Register the handler for the gRS2 store incoming requests
</source>
  
A gCube service could use its onInit event to perform this initialization. The TCPConnectionManager is located in the MadgikCommons package.
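As a side note, the port-selection behaviour requested by such a configuration (try the declared ranges in order and, if allowed, fall back to a random free port) can be sketched with plain JDK sockets. This is an illustration of the semantics only, assuming nothing about the actual TCPConnectionManager implementation; PortPicker is a hypothetical name.

```java
import java.io.IOException;
import java.net.ServerSocket;

//Illustrative sketch, not the gRS2 implementation: pick a listening port from
//configured ranges, optionally falling back to a random free port.
public class PortPicker {
  public static ServerSocket bind(int[][] ranges, boolean useRandomOnFailure) throws IOException {
    for (int[] range : ranges) {
      for (int port = range[0]; port <= range[1]; port++) {
        try { return new ServerSocket(port); }        //first free configured port wins
        catch (IOException busy) { /* port unavailable, try the next one */ }
      }
    }
    if (useRandomOnFailure) return new ServerSocket(0); //0 asks the OS for any free port
    throw new IOException("no configured port available");
  }

  public static void main(String[] args) throws IOException {
    try (ServerSocket s = bind(new int[][]{{3000, 3050}, {3055, 3055}}, true)) {
      System.out.println("listening on " + s.getLocalPort());
    }
  }
}
```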
  
== Definitions ==

Every gRS needs to have the definitions of the records it holds explicitly declared. The record definitions in turn contain the field definitions of the fields they hold. Every record subsequently handled by the specific gRS must comply with one of the definitions initially provided to it. For example, creating a gRS that handles only one type of record, which in turn contains only a single type of field, is presented in the following snippet.
  
<source lang="java">
//The gRS will contain only one type of record which in turn contains only a single field
RecordDefinition[] defs=new RecordDefinition[]{          //A gRS can contain a number of different record definitions
  new GenericRecordDefinition((new FieldDefinition[] {   //A record can contain a number of different field definitions
    new StringFieldDefinition("ThisIsTheField")          //The definition of the field
  }))
};
</source>
  
== Record Writer Initialization ==

To initialize a gRS2 writer one simply creates a new instance of one of the available writers. Depending on the configuration details one wishes to set, a number of different constructors are available.

<source lang="java">
writer=new RecordWriter<GenericRecord>(
  proxy, //The proxy that defines the way the writer can be accessed
  defs   //The definitions of the records the gRS handles
);

writer=new RecordWriter<GenericRecord>(
  proxy, //The proxy that defines the way the writer can be accessed
  defs,  //The definitions of the records the gRS handles
  50,    //The capacity of the underlying synchronization buffer
  2,     //The maximum number of records that can be concurrently accessed on partial transfer
  0.5f   //The maximum fraction of the buffer that should be transferred during mirroring
);

writer=new RecordWriter<GenericRecord>(
  proxy,           //The proxy that defines the way the writer can be accessed
  defs,            //The definitions of the records the gRS handles
  50,              //The capacity of the underlying synchronization buffer
  2,               //The maximum number of records that can be concurrently accessed on partial transfer
  0.5f,            //The maximum fraction of the buffer that should be transferred during mirroring
  60,              //The timeout in time units after which an inactive gRS can be disposed
  TimeUnit.SECONDS //The time unit of the timeout after which an inactive gRS can be disposed
);
</source>
  
For convenience, if one wishes to create a writer with the same configuration as an already available reader, one can use one of the following constructors.
  
<source lang="java">
writer=new RecordWriter<GenericRecord>(
  proxy, //The proxy that defines the way the writer can be accessed
  reader //The reader to copy configuration from
);

writer=new RecordWriter<GenericRecord>(
  proxy,  //The proxy that defines the way the writer can be accessed
  reader, //The reader to copy configuration from
  50,     //The capacity of the underlying synchronization buffer
  2,      //The maximum number of records that can be concurrently accessed on partial transfer
  0.5f    //The maximum fraction of the buffer that should be transferred during mirroring
);

writer=new RecordWriter<GenericRecord>(
  proxy,           //The proxy that defines the way the writer can be accessed
  reader,          //The reader to copy configuration from
  50,              //The capacity of the underlying synchronization buffer
  2,               //The maximum number of records that can be concurrently accessed on partial transfer
  0.5f,            //The maximum fraction of the buffer that should be transferred during mirroring
  60,              //The timeout in time units after which an inactive gRS can be disposed
  TimeUnit.SECONDS //The time unit of the timeout after which an inactive gRS can be disposed
);
</source>

In the first of the above examples, all configuration is copied from the reader. In the other two, the extra configuration parameters override those of the reader, meaning that in the last example only the record definitions are duplicated from the reader configuration.
  
== Custom Records ==

All records that are handled through gRS2 must extend the base class gr.uoa.di.madgik.grs.record.Record. Extending this class means implementing a couple of required methods which are mainly used to properly manage the transport of the custom record and any additional information it might carry. Although the actual payload of the record is to be managed through fields, one can also attach additional information to the record itself. In the example that follows, a custom record with its respective definition is created. The record hides the fields that it manages by exposing direct getters and setters over a known definition that it provides and should be used with. Additionally, it defines extra information that describes the record as a whole and, instead of using additional fields to do so, it declares placeholders within the record itself.
  
=== Definition ===

<source lang="java">
import java.io.DataInput;
import java.io.DataOutput;
import gr.uoa.di.madgik.grs.buffer.IBuffer.TransportDirective;
import gr.uoa.di.madgik.grs.record.GRS2RecordSerializationException;
import gr.uoa.di.madgik.grs.record.RecordDefinition;
import gr.uoa.di.madgik.grs.record.field.FieldDefinition;
import gr.uoa.di.madgik.grs.record.field.StringFieldDefinition;

public class MyRecordDefinition extends RecordDefinition
{
  public static final String PayloadFieldName="payload";
  public static final boolean DefaultCompress=false;

  public MyRecordDefinition()
  {
    StringFieldDefinition def=new StringFieldDefinition(MyRecordDefinition.PayloadFieldName);
    def.setTransportDirective(TransportDirective.Full);
    def.setCompress(MyRecordDefinition.DefaultCompress);
    this.Fields = new FieldDefinition[]{def};
  }

  public MyRecordDefinition(boolean compress)
  {
    StringFieldDefinition def=new StringFieldDefinition(MyRecordDefinition.PayloadFieldName);
    def.setTransportDirective(TransportDirective.Full);
    def.setCompress(compress);
    this.Fields = new FieldDefinition[]{def};
  }

  @Override
  public boolean extendEquals(Object obj)
  {
    if(!(obj instanceof MyRecordDefinition)) return false;
    return true;
  }

  @Override
  public void extendDeflate(DataOutput out) throws GRS2RecordSerializationException
  {
    //nothing to deflate
  }

  @Override
  public void extendInflate(DataInput in) throws GRS2RecordSerializationException
  {
    //nothing to inflate
  }
}
</source>
  
=== Record ===

<source lang="java">
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import gr.uoa.di.madgik.grs.GRS2Exception;
import gr.uoa.di.madgik.grs.buffer.GRS2BufferException;
import gr.uoa.di.madgik.grs.record.GRS2RecordDefinitionException;
import gr.uoa.di.madgik.grs.record.GRS2RecordSerializationException;
import gr.uoa.di.madgik.grs.record.Record;
import gr.uoa.di.madgik.grs.record.field.Field;
import gr.uoa.di.madgik.grs.record.field.StringField;

public class MyRecord extends Record
{
  private String id="none";
  private String collection="none";

  public MyRecord()
  {
    this.setFields(new Field[]{new StringField()});
  }

  public MyRecord(String payload)
  {
    this.setFields(new Field[]{new StringField(payload)});
  }

  public MyRecord(String id,String collection)
  {
    this.setId(id);
    this.setCollection(collection);
    this.setFields(new Field[]{new StringField()});
  }

  public MyRecord(String id,String collection,String payload)
  {
    this.setId(id);
    this.setCollection(collection);
    this.setFields(new Field[]{new StringField(payload)});
  }

  public void setId(String id) { this.id = id; }

  public String getId() { return id; }

  public void setCollection(String collection) { this.collection = collection; }

  public String getCollection() { return collection; }

  public void setPayload(String payload) throws GRS2Exception
  {
    this.getField(MyRecordDefinition.PayloadFieldName).setPayload(payload);
  }

  public String getPayload() throws GRS2Exception
  {
    return this.getField(MyRecordDefinition.PayloadFieldName).getPayload();
  }

  @Override
  public StringField getField(String name) throws GRS2RecordDefinitionException, GRS2BufferException
  {
    return (StringField)super.getField(name);
  }

  @Override
  public void extendSend(DataOutput out) throws GRS2RecordSerializationException
  {
    this.extendDeflate(out);
  }

  @Override
  public void extendReceive(DataInput in) throws GRS2RecordSerializationException
  {
    this.extendInflate(in, false);
  }

  @Override
  public void extendDeflate(DataOutput out) throws GRS2RecordSerializationException
  {
    try
    {
      out.writeUTF(this.id);
      out.writeUTF(this.collection);
    }catch(IOException ex)
    {
      throw new GRS2RecordSerializationException("Could not deflate record", ex);
    }
  }

  @Override
  public void extendInflate(DataInput in, boolean reset) throws GRS2RecordSerializationException
  {
    try
    {
      this.id=in.readUTF();
      this.collection=in.readUTF();
    }catch(IOException ex)
    {
      throw new GRS2RecordSerializationException("Could not inflate record", ex);
    }
  }

  @Override
  public void extendDispose()
  {
    //nothing to dispose
  }
}
</source>
+
    this.id=null;
* FieldPayloadAddedEvent – This event is send as a response to a FieldPayloadRequestedEvent when more data has been made available to the consumer as per request as is sketched above.
+
    this.collection=null;
* FieldNoMorePayloadEvent – This event is send as a response to a FieldPayloadRequestedEvent when no more data can be send to the consumer as per request because all the available data is already there, as is sketched above.
+
  }
* LocalPoolItemAddedEvent – This event indicate when a new item has been added to the local pool. Local to the component that catches the event. For example a reader catching the event will interpret it as new item created by the writer, and made available to the reader side as a response to an ItemRequestedEvent that he send. This event is used by the reader and writer classes to update their buffer policies.
+
}
* LocalPoolItemRemovedEvent – This event indicate when an item has been retrieved from the local pool. Local to the component that catches the event. For example a writer catching the event will interpret it as an item requested previously by the reader and created and added to the pool by the writer has been transferred to the reader who consumed it. This event is used by the reader and writer classes to update their buffer policies.
+
</source>
* ItemHibernateEvent – A record during its lifetime can be considered as needed to be kept for later use by a consumer but for the time being it is not needed and should be somehow hidden to free up resources. For this purpose any record and its included fields which is already contained in a record pool can be hibernated. The record is persisted to disk and can be subsequently waken up in its initial state. When a record is marked to be hibernated this event is emitted and both the pools that may contain it, producer and consumer pool if the pools are remote and synchronized, will catch the event and hibernate the record. This event processing may be asynchronous.  
+
 
* ItemAwakeEvent – At some point a hibernated record may be needed again. At this point both the record in the consumer and in the producer side must be awaken and restored to the exact state that is was before the hibernation. This event is emitted and both pools will resurrect the record and its fields from the disk buffer.
+
== Simple Example ==
* DisposePoolEvent – This event is emitted when the record pool and all its records are no longer needed. The dispose method is called either because the producer decided that he wants to revoke the produced record pool or because the reader decided that he is done consuming the record pool. Since the communication is point to point, once the consumer stops consuming, the record pool is fully disposed. The record pool disposal disposes each record, and if the record is hibernated it wakes it up before it is disposed to make sure each field performs its cleanup. Once this event is emitted all state associated with the record pool is cleaned up whether it is data within the record pool it self or registry entries, open sockets, etc
+
In the following snippets, a simple writer and respective reader is presented to show the full usage of the presented elements

=== Writer ===
<source lang="java">
import java.net.URI;
import java.util.concurrent.TimeUnit;
import gr.uoa.di.madgik.grs.buffer.IBuffer.Status;
import gr.uoa.di.madgik.grs.proxy.IWriterProxy;
import gr.uoa.di.madgik.grs.record.GenericRecord;
import gr.uoa.di.madgik.grs.record.GenericRecordDefinition;
import gr.uoa.di.madgik.grs.record.RecordDefinition;
import gr.uoa.di.madgik.grs.record.field.Field;
import gr.uoa.di.madgik.grs.record.field.FieldDefinition;
import gr.uoa.di.madgik.grs.record.field.StringField;
import gr.uoa.di.madgik.grs.record.field.StringFieldDefinition;
import gr.uoa.di.madgik.grs.writer.GRS2WriterException;
import gr.uoa.di.madgik.grs.writer.RecordWriter;

public class StringWriter extends Thread
{
  private RecordWriter<GenericRecord> writer=null;

  public StringWriter(IWriterProxy proxy) throws GRS2WriterException
  {
    //The gRS will contain only one type of records which in turn contains only a single field
    RecordDefinition[] defs=new RecordDefinition[]{        //A gRS can contain a number of different record definitions
      new GenericRecordDefinition((new FieldDefinition[] { //A record can contain a number of different field definitions
        new StringFieldDefinition("ThisIsTheField")        //The definition of the field
      }))
    };
    writer=new RecordWriter<GenericRecord>(
      proxy, //The proxy that defines the way the writer can be accessed
      defs   //The definitions of the records the gRS handles
    );
  }

  public URI getLocator() throws GRS2WriterException
  {
    return writer.getLocator();
  }

  public void run()
  {
    try
    {
      for(int i=0;i<500;i+=1)
      {
        //while the reader hasn't stopped reading
        if(writer.getStatus()!=Status.Open) break;
        GenericRecord rec=new GenericRecord();
        //Only a string field is added to the record as per the definition
        rec.setFields(new Field[]{new StringField("Hello world "+i)});
        //if the buffer stays at maximum capacity for the specified interval, don't wait any longer
        if(!writer.put(rec,60,TimeUnit.SECONDS)) break;
      }
      //if the reader hasn't already disposed the buffer, close it to notify the reader that no more records will be provided
      if(writer.getStatus()!=Status.Dispose) writer.close();
    }catch(Exception ex)
    {
      ex.printStackTrace();
    }
  }
}
</source>

=== Reader ===
<source lang="java">
import gr.uoa.di.madgik.grs.reader.ForwardReader;
import gr.uoa.di.madgik.grs.reader.GRS2ReaderException;
import gr.uoa.di.madgik.grs.record.GenericRecord;
import gr.uoa.di.madgik.grs.record.field.StringField;
import java.net.URI;

public class StringReader extends Thread
{
  ForwardReader<GenericRecord> reader=null;

  public StringReader(URI locator) throws GRS2ReaderException
  {
    reader=new ForwardReader<GenericRecord>(locator);
  }

  public void run()
  {
    try
    {
      for(GenericRecord rec : reader)
      {
        //In case a timeout occurs while optimistically waiting for more records from a still open writer
        if(rec==null) break;
        //Retrieve the required field of the type available in the gRS definitions
        System.out.println(((StringField)rec.getField("ThisIsTheField")).getPayload());
      }
      //Close the reader to release and dispose any resources on both the reader and writer sides
      reader.close();
    }catch(Exception ex)
    {
      ex.printStackTrace();
    }
  }
}
</source>

=== Main ===
<source lang="java">
import java.util.ArrayList;
import java.util.List;
import gr.uoa.di.madgik.commons.server.PortRange;
import gr.uoa.di.madgik.commons.server.TCPConnectionManager;
import gr.uoa.di.madgik.commons.server.TCPConnectionManagerConfig;
import gr.uoa.di.madgik.grs.proxy.tcp.TCPConnectionHandler;
import gr.uoa.di.madgik.grs.proxy.tcp.TCPStoreConnectionHandler;
import gr.uoa.di.madgik.grs.proxy.tcp.TCPWriterProxy;

public class MainTest
{
  public static void main(String []args) throws Exception
  {
    List<PortRange> ports=new ArrayList<PortRange>(); //The ports that the TCPConnection manager should use
    ports.add(new PortRange(3000, 3050));             //Any port in the range between 3000 and 3050
    ports.add(new PortRange(3055, 3055));             //Port 3055
    TCPConnectionManager.Init(
      new TCPConnectionManagerConfig("localhost", //The hostname by which the machine is reachable
        ports,                                    //The ports that can be used by the connection manager
        true                                      //If no port ranges were provided, or none of them could be used, use a random available port
    ));
    TCPConnectionManager.RegisterEntry(new TCPConnectionHandler());      //Register the handler for the gRS2 incoming requests
    TCPConnectionManager.RegisterEntry(new TCPStoreConnectionHandler()); //Register the handler for the gRS2 store incoming requests

    // gr.uoa.di.madgik.grs.proxy.local.LocalWriterProxy could be used instead for local only, all in memory, access
    StringWriter writer=new StringWriter(new TCPWriterProxy());
    StringReader reader=new StringReader(writer.getLocator());

    writer.start();
    reader.start();

    writer.join();
    reader.join();
  }
}
</source>
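
The flow control of the example above can be mimicked with nothing but the JDK. The following sketch is only an analogy, not gRS2 code: an ArrayBlockingQueue plays the role of the bounded record pool, a timed offer stands in for writer.put with its timeout, a timed poll for the reader's optimistic wait, and a sentinel value for the writer's close() notification. All names (QueueAnalog, CLOSE, runPipeline) are illustrative and not part of any gRS2 API.

<source lang="java">
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class QueueAnalog
{
  //Stands in for the writer's close() notification
  private static final String CLOSE="<closed>";

  //Produces 'total' records through a bounded buffer and consumes them; returns the number consumed
  public static int runPipeline(final int total) throws InterruptedException
  {
    final BlockingQueue<String> buffer=new ArrayBlockingQueue<String>(50); //bounded, like the record pool
    final AtomicInteger consumed=new AtomicInteger(0);

    Thread writer=new Thread(new Runnable(){ public void run(){
      try
      {
        for(int i=0;i<total;i+=1)
        {
          //if the buffer stays full for the whole interval, give up, as writer.put does
          if(!buffer.offer("Hello world "+i, 60, TimeUnit.SECONDS)) return;
        }
        buffer.put(CLOSE); //plays the role of writer.close()
      }catch(InterruptedException ex){ ex.printStackTrace(); }
    }});

    Thread reader=new Thread(new Runnable(){ public void run(){
      try
      {
        while(true)
        {
          //a null here plays the role of the reader timing out on a still open writer
          String rec=buffer.poll(60, TimeUnit.SECONDS);
          if(rec==null || rec.equals(CLOSE)) break;
          consumed.incrementAndGet();
        }
      }catch(InterruptedException ex){ ex.printStackTrace(); }
    }});

    writer.start();
    reader.start();
    writer.join();
    reader.join();
    return consumed.get();
  }

  public static void main(String[] args) throws InterruptedException
  {
    System.out.println("consumed "+runPipeline(500)+" records");
  }
}
</source>

The analogy covers only the producer/consumer flow control; it offers none of the record, event, or remote transport semantics that the framework provides.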

== Events ==
As was already mentioned, gRS2 serves as a bidirectional communication channel: events can be propagated either from the writer to the reader or from the reader to the writer, alongside the normal data flow. These events can serve any purpose the application logic requires, from control signals to full data transport, although in the latter case the payload size should be kept moderate, as event payloads are always handled in memory and cannot use services such as partial transfer, nor be declared using internally managed files.

An example that takes this to the extreme follows. In it, a writer and a reader use only events to propagate their data. Note that the communication protocol is now up to the reader and writer clients to implement; in this example it is handled crudely, using sleep timeouts. Additionally, it should be noted that besides the basic key-value event placeholder provided internally, an extendable event placeholder is available to facilitate transport of any type of custom object.

=== Writer ===
<source lang="java">
import java.net.URI;
import gr.uoa.di.madgik.grs.buffer.IBuffer.Status;
import gr.uoa.di.madgik.grs.events.BufferEvent;
import gr.uoa.di.madgik.grs.events.KeyValueEvent;
import gr.uoa.di.madgik.grs.events.ObjectEvent;
import gr.uoa.di.madgik.grs.proxy.IWriterProxy;
import gr.uoa.di.madgik.grs.record.GenericRecord;
import gr.uoa.di.madgik.grs.record.GenericRecordDefinition;
import gr.uoa.di.madgik.grs.record.RecordDefinition;
import gr.uoa.di.madgik.grs.record.field.FieldDefinition;
import gr.uoa.di.madgik.grs.record.field.StringFieldDefinition;
import gr.uoa.di.madgik.grs.writer.GRS2WriterException;
import gr.uoa.di.madgik.grs.writer.RecordWriter;

public class EventWriter extends Thread
{
  private RecordWriter<GenericRecord> writer=null;

  public EventWriter(IWriterProxy proxy) throws GRS2WriterException
  {
    //The gRS will contain only one type of records which in turn contains only a single field
    RecordDefinition[] defs=new RecordDefinition[]{        //A gRS can contain a number of different record definitions
      new GenericRecordDefinition((new FieldDefinition[] { //A record can contain a number of different field definitions
        new StringFieldDefinition("ThisIsTheField")        //The definition of the field
      }))
    };
    writer=new RecordWriter<GenericRecord>(
      proxy, //The proxy that defines the way the writer can be accessed
      defs   //The definitions of the records the gRS handles
    );
  }

  public URI getLocator() throws GRS2WriterException
  {
    return writer.getLocator();
  }

  public void receiveWaitKeyValue() throws GRS2WriterException
  {
    while(true)
    {
      BufferEvent ev=writer.receive();
      if(ev==null) try{ Thread.sleep(100); }catch(Exception ex){}
      else
      {
        System.out.println("Received event from reader : "+((KeyValueEvent)ev).getKey()+" - "+((KeyValueEvent)ev).getValue());
        break;
      }
    }
  }

  public void receiveWaitObject() throws GRS2WriterException
  {
    while(true)
    {
      BufferEvent ev=writer.receive();
      if(ev==null) try{ Thread.sleep(100); }catch(Exception ex){}
      else
      {
        System.out.println("Received event from reader : "+((ObjectEvent)ev).getItem().toString());
        break;
      }
    }
  }

  public void run()
  {
    try
    {
      writer.emit(new KeyValueEvent("init", "Hello"));
      for(int i=0;i<5;i+=1) this.receiveWaitKeyValue();
      this.receiveWaitObject();
      if(writer.getStatus()!=Status.Dispose) writer.close();
    }catch(Exception ex)
    {
      ex.printStackTrace();
    }
  }
}
</source>

=== Reader ===
<source lang="java">
import gr.uoa.di.madgik.grs.events.BufferEvent;
import gr.uoa.di.madgik.grs.events.KeyValueEvent;
import gr.uoa.di.madgik.grs.events.ObjectEvent;
import gr.uoa.di.madgik.grs.reader.ForwardReader;
import gr.uoa.di.madgik.grs.reader.GRS2ReaderException;
import gr.uoa.di.madgik.grs.record.GenericRecord;
import java.net.URI;

public class EventReader extends Thread
{
  ForwardReader<GenericRecord> reader=null;

  public EventReader(URI locator) throws GRS2ReaderException
  {
    reader=new ForwardReader<GenericRecord>(locator);
  }

  public void run()
  {
    try
    {
      BufferEvent ev=reader.receive();
      System.out.println("Received event from writer : "+((KeyValueEvent)ev).getKey()+" - "+((KeyValueEvent)ev).getValue());
      reader.emit(new KeyValueEvent("info", "Hello..."));
      reader.emit(new KeyValueEvent("info", "Hello Again..."));
      reader.emit(new KeyValueEvent("info", "Hello And Again..."));
      reader.emit(new KeyValueEvent("info", "And again..."));
      reader.emit(new KeyValueEvent("info", "Bored already ?"));
      reader.emit(new ObjectEvent(new MyEvent("This is a custom object event", 234)));
      reader.close();
    }catch(Exception ex)
    {
      ex.printStackTrace();
    }
  }
}
</source>

=== MyEvent ===
<source lang="java">
import java.io.DataInput;
import java.io.DataOutput;
import gr.uoa.di.madgik.grs.record.GRS2RecordSerializationException;
import gr.uoa.di.madgik.grs.record.IPumpable;

public class MyEvent implements IPumpable
{
  private String message=null;
  private int count=0;

  public MyEvent(String message,int count)
  {
    this.message=message;
    this.count=count;
  }

  public String toString()
  {
    return "event payload message is : '"+this.message+"' with count '"+this.count+"'";
  }

  public void deflate(DataOutput out) throws GRS2RecordSerializationException
  {
    try
    {
      out.writeUTF(message);
      out.writeInt(count);
    }catch(Exception ex)
    {
      throw new GRS2RecordSerializationException("Could not serialize event", ex);
    }
  }

  public void inflate(DataInput in) throws GRS2RecordSerializationException
  {
    try
    {
      this.message=in.readUTF();
      this.count=in.readInt();
    }catch(Exception ex)
    {
      throw new GRS2RecordSerializationException("Could not deserialize event", ex);
    }
  }

  public void inflate(DataInput in, boolean reset) throws GRS2RecordSerializationException
  {
    this.inflate(in);
  }
}
</source>

=== Main ===
As in the [[GRS2#Main | Simple Main]], using the new writer and reader classes

<source lang="java">
  /*...*/
  EventWriter writer=new EventWriter(new TCPWriterProxy());
  EventReader reader=new EventReader(writer.getLocator());
  /*...*/
</source>
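
The deflate/inflate contract that MyEvent implements can be exercised in isolation with plain JDK streams. The following self-contained sketch (PumpableRoundTrip and its nested DemoEvent class are illustrative stand-ins, not gRS2 types) serializes an event's state to an in-memory byte array and restores it into a fresh instance, mirroring the requirement that state be written and read back in the same fixed order:

<source lang="java">
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;

public class PumpableRoundTrip
{
  //Illustrative stand-in for a class implementing the IPumpable deflate/inflate pair
  static class DemoEvent
  {
    private String message=null;
    private int count=0;

    DemoEvent(String message,int count){ this.message=message; this.count=count; }

    public String toString(){ return this.message+" / "+this.count; }

    //State must be written in a fixed order...
    void deflate(DataOutput out) throws IOException
    {
      out.writeUTF(this.message);
      out.writeInt(this.count);
    }

    //...and read back in exactly the same order
    void inflate(DataInput in) throws IOException
    {
      this.message=in.readUTF();
      this.count=in.readInt();
    }
  }

  public static void main(String[] args) throws IOException
  {
    DemoEvent original=new DemoEvent("This is a custom object event", 234);

    //deflate into an in memory buffer, standing in for the wire
    ByteArrayOutputStream bout=new ByteArrayOutputStream();
    original.deflate(new DataOutputStream(bout));

    //inflate a fresh instance from the buffered bytes
    DemoEvent restored=new DemoEvent(null, 0);
    restored.inflate(new DataInputStream(new ByteArrayInputStream(bout.toByteArray())));

    System.out.println(restored); //echoes the original message and count
  }
}
</source>

This is the same round trip the framework performs when an ObjectEvent crosses the wire, which is why both sides must agree on the field order.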

== Multi Field Records ==
=== Writer ===
<source lang="java">
import java.io.File;
import java.net.URI;
import java.net.URL;
import java.util.concurrent.TimeUnit;
import gr.uoa.di.madgik.grs.buffer.IBuffer.Status;
import gr.uoa.di.madgik.grs.proxy.IWriterProxy;
import gr.uoa.di.madgik.grs.record.GenericRecord;
import gr.uoa.di.madgik.grs.record.GenericRecordDefinition;
import gr.uoa.di.madgik.grs.record.RecordDefinition;
import gr.uoa.di.madgik.grs.record.field.Field;
import gr.uoa.di.madgik.grs.record.field.FieldDefinition;
import gr.uoa.di.madgik.grs.record.field.FileField;
import gr.uoa.di.madgik.grs.record.field.FileFieldDefinition;
import gr.uoa.di.madgik.grs.record.field.StringField;
import gr.uoa.di.madgik.grs.record.field.StringFieldDefinition;
import gr.uoa.di.madgik.grs.record.field.URLField;
import gr.uoa.di.madgik.grs.record.field.URLFieldDefinition;
import gr.uoa.di.madgik.grs.writer.GRS2WriterException;
import gr.uoa.di.madgik.grs.writer.RecordWriter;

public class MultiWriter extends Thread
{
  private RecordWriter<GenericRecord> writer=null;

  public MultiWriter(IWriterProxy proxy) throws GRS2WriterException
  {
    //The gRS will contain only one type of records, which in turn contains four fields
    RecordDefinition[] defs=new RecordDefinition[]{        //A gRS can contain a number of different record definitions
      new GenericRecordDefinition((new FieldDefinition[] { //A record can contain a number of different field definitions
        new StringFieldDefinition("MyStringField"),        //The definition of a string field
        new StringFieldDefinition("MyOtherStringField"),   //The definition of a second string field
        new URLFieldDefinition("MyHTTPField"),             //The definition of an http field
        new FileFieldDefinition("MyFileField")             //The definition of a file field
      }))
    };
    writer=new RecordWriter<GenericRecord>(
      proxy, //The proxy that defines the way the writer can be accessed
      defs   //The definitions of the records the gRS handles
    );
  }

  public URI getLocator() throws GRS2WriterException
  {
    return writer.getLocator();
  }

  public void run()
  {
    try
    {
      for(int i=0;i<500;i+=1)
      {
        //while the reader hasn't stopped reading
        if(writer.getStatus()!=Status.Open) break;
        GenericRecord rec=new GenericRecord();
        //All four fields are added to the record as per the definition
        Field[] fs=new Field[4];
        fs[0]=new StringField("Hello world "+i);
        fs[1]=new StringField("Hello world of Diligent");
        fs[2]=new URLField(new URL("http://www.d4science.eu/files/d4science_logo.jpg"));
        fs[3]=new FileField(new File("/bin/ls"));
        rec.setFields(fs);
        //if the buffer stays at maximum capacity for the specified interval, don't wait any longer
        if(!writer.put(rec,60,TimeUnit.SECONDS)) break;
      }
      //if the reader hasn't already disposed the buffer, close it to notify the reader that no more records will be provided
      if(writer.getStatus()!=Status.Dispose) writer.close();
    }catch(Exception ex)
    {
      ex.printStackTrace();
    }
  }
}
</source>

=== Reader ===
<source lang="java">
import gr.uoa.di.madgik.grs.reader.ForwardReader;
import gr.uoa.di.madgik.grs.reader.GRS2ReaderException;
import gr.uoa.di.madgik.grs.record.GenericRecord;
import gr.uoa.di.madgik.grs.record.field.FileField;
import gr.uoa.di.madgik.grs.record.field.StringField;
import gr.uoa.di.madgik.grs.record.field.URLField;
import java.net.URI;

public class MultiReader extends Thread
{
  ForwardReader<GenericRecord> reader=null;

  public MultiReader(URI locator) throws GRS2ReaderException
  {
    reader=new ForwardReader<GenericRecord>(locator);
  }

  public void run()
  {
    try
    {
      for(GenericRecord rec : reader)
      {
        //In case a timeout occurs while optimistically waiting for more records from a still open writer
        if(rec==null) break;
        //Retrieve the required fields of the types available in the gRS definitions
        System.out.println(((StringField)rec.getField("MyStringField")).getPayload());
        System.out.println(((StringField)rec.getField("MyOtherStringField")).getPayload());
        System.out.println(((URLField)rec.getField("MyHTTPField")).getPayload());
        //In case of in memory communication, the target file will be the same as the original. Otherwise
        //the target file will be a local copy of the original file and, depending on the transport directive
        //set by the writer, it will be fully available, or it will gradually become available as data is read
        System.out.println(((FileField)rec.getField("MyFileField")).getPayload()+" : "+((FileField)rec.getField("MyFileField")).getOriginalPayload());
      }
      //Close the reader to release and dispose any resources on both the reader and writer sides
      reader.close();
    }catch(Exception ex)
    {
      ex.printStackTrace();
    }
  }
}
</source>

=== Main ===
As in the [[GRS2#Main | Simple Main]], using the new writer and reader classes

<source lang="java">
  /*...*/
  MultiWriter writer=new MultiWriter(new LocalWriterProxy());
  MultiReader reader=new MultiReader(writer.getLocator());
  /*...*/
</source>

== Retrieving Field Payload ==
As mentioned, depending on the transport directive set by the writer and the type of proxy used for the locator creation, the payload of some fields may not be fully available to the reader. However, the reader's clients should not have to deal with the details of this, and should be able to access the payload of any field uniformly. Additionally, each field, depending on its characteristics, may support different transport directives, and may carry its payload internally or keep only a reference to it. Therefore, only the field itself can know how to access its own payload.

For example, for a URLField that points to a remotely accessible HTTP location, the field opens an HTTP connection and passes the stream to the client, so that from then on the client does not need to know the payload's actual location. For a FileField that is accessed remotely from the original file's location, the data requested by the client is fetched remotely, if not already available, and provided to the client through an intermediate local file. All these actions are performed in the background, without the client having to do anything.

This is achieved through the MediationInputStream, an input stream implementation that wraps around the protocol specific input stream provided by the Field, and from then on handles any remote transport that needs to be made. Each field provides its own payload mediating input stream. Using the mediating stream, data is transferred on demand, in chunks of the size defined in the field definition, so that the transport is buffered. This of course means that, depending on the size of the payload, a number of round trips will be made. It allows less data to be transferred when only a part of the payload is needed, and lets processing start on the initial parts while more data is retrieved. On the other hand, if the full payload is needed, the on demand transfer imposes additional overhead. In such cases, a specific field, or in fact all the fields of a record, can be requested to be made fully available, if supported, in one blocking operation.

In the [[GRS2#Multi_Field_Records | Multi Field]] example, if someone wants to read the HTTP field and the file field, the following snippet in the reader is all that is needed
<source lang="java">
BufferedInputStream bin=new BufferedInputStream(rec.getField("MyHTTPField").getMediatingInputStream());
byte[] buffer = new byte[1024];
int bytesRead = 0;
while ((bytesRead = bin.read(buffer)) != -1) { /*...*/ }

/*...*/

bin=new BufferedInputStream(rec.getField("MyFileField").getMediatingInputStream());
while ((bytesRead = bin.read(buffer)) != -1) { /*...*/ }
</source>

If a specific field needs to be read in its entirety, one can, as previously mentioned, request the entire payload to be retrieved beforehand rather than on demand. For this, the following line needs to be added before the client starts reading the field payload as shown above
<source lang="java">
rec.getField("MyFileField").makeAvailable();
</source>
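
The benefit of the on demand, chunked transfer described above can be modeled with plain JDK streams. In the sketch below, a counting wrapper tracks how many bytes are actually pulled from the source while a client reads only a prefix of a large payload; the buffered stream's 1024 byte buffer plays the role of the field definition's chunk size. ChunkedReadModel and CountingStream are illustrative names, not gRS2 classes; the mediating stream performs equivalent bookkeeping internally.

<source lang="java">
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

public class ChunkedReadModel
{
  //Counts the bytes actually fetched from the underlying source
  static class CountingStream extends FilterInputStream
  {
    long pulled=0;

    CountingStream(InputStream in){ super(in); }

    public int read() throws IOException
    {
      int b=super.read();
      if(b!=-1) pulled+=1;
      return b;
    }

    public int read(byte[] buf,int off,int len) throws IOException
    {
      int n=super.read(buf, off, len);
      if(n>0) pulled+=n;
      return n;
    }
  }

  //Reads 'chunks' chunks of 1024 bytes from a 1MB payload; returns {bytes consumed, bytes pulled}
  public static long[] readPrefix(int chunks) throws IOException
  {
    CountingStream source=new CountingStream(new ByteArrayInputStream(new byte[1<<20]));
    BufferedInputStream bin=new BufferedInputStream(source, 1024); //1024 plays the role of the chunk size
    byte[] buffer=new byte[1024];
    long consumed=0;
    for(int i=0;i<chunks;i+=1)
    {
      int n=bin.read(buffer);
      if(n==-1) break;
      consumed+=n;
    }
    bin.close();
    return new long[]{consumed, source.pulled};
  }

  public static void main(String[] args) throws IOException
  {
    long[] r=readPrefix(4);
    //only (roughly) the consumed prefix, not the whole 1MB payload, was pulled from the source
    System.out.println("consumed "+r[0]+" bytes, pulled "+r[1]+" from the source");
  }
}
</source>

Reading the full field through the mediating stream would instead pull the whole payload one chunk per round trip, which is where a prior makeAvailable() call pays off.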

== Random Access Reader ==
The examples presented thus far targeted a forward only reader. A random access reader is also provided as an alternative. This reader allows random seeks, forward and backward, on the reader side. Since, as was already mentioned, a record is disposed by the writer once it has been accessed by the reader, moving back to a previously accessed record, which has already been disposed by the writer, requires that all past records be maintained locally on the reader side. To avoid bloating the reader's in memory structures, already read records are persisted. When the client then wants to move back to a record it has already accessed, the desired record, as well as a configurable record window ahead of it, is restored from the persistency store. The window is always in front of the requested record, on the assumption that forward movement is the predominant one. Note that in the case of completely random seeks, a good strategy for maximum performance is to disable the window by setting its size to 1. The persistency store in which past records are kept is internally configured. The only one currently available is a disk backed Random Access File. This of course imposes an overhead that is not negligible for large data. Short term plans include using a different persistency backend to make the operation more performant.

One important detail is that every record persisted on the reader side must be fully available once restored. All payload that was previously served by the writer side must then be served directly from the reader side. For this reason, whenever a record is persisted, the makeAvailable method is invoked on it, ensuring that any data that would be lost once the record is disposed on the writer side will from then on be directly accessible from the reader side.

In the following example, we assume a gRS created as in the case of the [[GRS2#Writer | Simple Writer]] previously described. We present two ways to access the gRS in a random fashion, one using a list iterator, and one that provides random seeks.

=== Bidirectional Iterator ===
<source lang="java">
import gr.uoa.di.madgik.grs.reader.GRS2ReaderException;
import gr.uoa.di.madgik.grs.reader.RandomReader;
import gr.uoa.di.madgik.grs.record.GenericRecord;
import gr.uoa.di.madgik.grs.record.field.StringField;
import java.net.URI;
import java.util.ListIterator;

public class RandomIteratorReader extends Thread
{
  RandomReader<GenericRecord> reader=null;

  public RandomIteratorReader(URI locator) throws GRS2ReaderException
  {
    reader=new RandomReader<GenericRecord>(locator);
  }

  public void run()
  {
    try
    {
      ListIterator<GenericRecord> iter=reader.listIterator();
      //move forward through all the available records
      while(iter.hasNext())
      {
        GenericRecord rec=iter.next();
        if(rec==null) break;
        System.out.println(((StringField)rec.getField("ThisIsTheField")).getPayload());
      }
      //move backward to the beginning
      while(iter.hasPrevious())
      {
        GenericRecord rec=iter.previous();
        if(rec==null) break;
        System.out.println(((StringField)rec.getField("ThisIsTheField")).getPayload());
      }
      //and forward again
      while(iter.hasNext())
      {
        GenericRecord rec=iter.next();
        if(rec==null) break;
        System.out.println(((StringField)rec.getField("ThisIsTheField")).getPayload());
      }
      reader.close();
    }catch(Exception ex)
    {
      ex.printStackTrace();
    }
  }
}
</source>

=== Random Access ===
<source lang="java">
import gr.uoa.di.madgik.grs.buffer.GRS2BufferException;
import gr.uoa.di.madgik.grs.buffer.IBuffer.Status;
import gr.uoa.di.madgik.grs.reader.GRS2ReaderException;
import gr.uoa.di.madgik.grs.reader.RandomReader;
import gr.uoa.di.madgik.grs.record.GRS2RecordDefinitionException;
import gr.uoa.di.madgik.grs.record.GenericRecord;
import gr.uoa.di.madgik.grs.record.field.StringField;
import java.net.URI;
import java.util.concurrent.TimeUnit;

public class RandomAccessReader extends Thread
{
  RandomReader<GenericRecord> reader=null;

  public RandomAccessReader(URI locator) throws GRS2ReaderException
  {
    reader=new RandomReader<GenericRecord>(locator);
  }

  private void read(int count) throws GRS2ReaderException, GRS2RecordDefinitionException, GRS2BufferException
  {
    for(int i=0;i<count;i+=1)
    {
      //stop if the buffer is disposed, or closed with no more records to serve
      if(reader.getStatus()==Status.Dispose || (reader.getStatus()==Status.Close && reader.availableRecords()==0)) break;
      GenericRecord rec=reader.get(60, TimeUnit.SECONDS);
      if(rec==null) break;
      System.out.println(((StringField)rec.getField("ThisIsTheField")).getPayload());
    }
  }

  public void run()
  {
    try
    {
      this.read(20);    //read the first 20 records
      reader.seek(-10); //move 10 records back
      this.read(9);
      reader.seek(-5);
      this.read(3);
      reader.seek(10);  //move 10 records ahead
      this.read(5);
      reader.close();
    }catch(Exception ex)
    {
      ex.printStackTrace();
    }
  }
}
</source>
 +
 
 +
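As the sequence of calls above suggests, <code>seek</code> appears to move a cursor relative to the current position over the locally persisted records. The following self-contained sketch models that cursor behaviour; <code>CursorSketch</code> and its methods are illustrative stand-ins for the idea, not the actual <code>RandomReader</code> API.

```java
import java.util.List;

//Illustrative model of a random access reader: records are persisted locally
//as they arrive and a cursor moves over them; not the gRS2 RandomReader API
class CursorSketch<T>
{
  private final List<T> persisted;
  private int pos=0;

  public CursorSketch(List<T> persisted) { this.persisted=persisted; }

  //Return the record under the cursor and advance, or null past the end
  public T get() { return pos<persisted.size() ? persisted.get(pos++) : null; }

  //Move the cursor by the given (possibly negative) offset, clamped to the record range
  public void seek(int offset) { pos=Math.max(0, Math.min(persisted.size(), pos+offset)); }
}
```

With such semantics, reading two records and then seeking by -2 makes the next read return the first record again.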
=== Main ===
As in [[GRS2#Main | Simple Main]], using the respective new reader classes

<source lang="java">
  StringWriter writer=new StringWriter(new LocalWriterProxy());
  RandomIteratorReader reader1=new RandomIteratorReader(writer.getLocator());
  RandomAccessReader reader2=new RandomAccessReader(writer.getLocator());
</source>

== Store ==
As already mentioned, the gRS2 is point to point only, and once something has been read it is disposed. Furthermore, once a reader has been initialized, any additional request from a different reader seeking to utilize the same locator to access the previously accessed gRS2 will fail. However, there are cases where one might want a created gRS2 buffer to be reused by multiple readers. This case is different from the one described in [[GRS2#Random_Access_Reader | Random Access Reader]]: there, the gRS2 is read once from its source, is persisted in local storage from then on, and is accessible by the same reader multiple times. In contrast, with the Store, every accessor of the stored gRS2 is a new reader, which can be either a random access or a forward-only one.

When creating a gRS2 Store, a client can control its behaviour through some parameterization:
* The Store input is provided explicitly and is not necessarily limited to a single incoming gRS2. Any number of incoming gRS2 instances can be provided, in which case the content of all of them will be multiplexed into a single new gRS2
* The type of multiplexing that will take place between the available inputs is a matter of parametrization; currently two policies are available, FIFO and First Available
* The amount of time that the created gRS2 will be maintained while inactive and not accessed by any reader is also provided at initialization time

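To make the two policies concrete, the following self-contained sketch shows one plausible reading of them over plain in-memory queues standing in for the incoming buffers; <code>MultiplexSketch</code> and its method names are illustrative assumptions, not the actual gRS2 Store API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

//Illustrative multiplexing policies over stand-ins for incoming gRS2 buffers
class MultiplexSketch
{
  //FIFO: exhaust each input fully, in the order the inputs were registered
  static <T> List<T> fifo(List<Queue<T>> inputs)
  {
    List<T> out=new ArrayList<T>();
    for(Queue<T> q : inputs) while(!q.isEmpty()) out.add(q.poll());
    return out;
  }

  //First Available: repeatedly take one record from each input that currently
  //has one, interleaving the sources instead of draining them in turn
  static <T> List<T> firstAvailable(List<Queue<T>> inputs)
  {
    List<T> out=new ArrayList<T>();
    boolean progress=true;
    while(progress)
    {
      progress=false;
      for(Queue<T> q : inputs) if(!q.isEmpty()) { out.add(q.poll()); progress=true; }
    }
    return out;
  }
}
```

For inputs [a1, a2] and [b1], the FIFO policy would yield a1, a2, b1, while the interleaving policy would yield a1, b1, a2.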
The gRS2 Store is meant to be used as a checkpoint utility and should be employed only in appropriate use cases because of the performance penalties and resource consumption it imposes. These can be identified in the following two points:
* A persistency store is utilized by the gRS2 Store in order to have all content of the incoming locators locally available and to be able to serve it without relying on the possibly remote, or even local, input buffers, which will be disposed and become unavailable. For this reason everything is brought under the management of the gRS2 Store and persisted. The only store backend currently available is one based on a Random Access File, which imposes non-negligible performance penalties. A new, more efficient backend is planned for the short term evolution of the gRS
* Although the gRS2 aims to minimize unnecessary record creation and movement, keeping it at most equal to the buffer capacity, the gRS2 Store operates on a different hypothesis. As already mentioned, its primary use is as part of a checkpointing operation. To this end, the Store, once initialized, will greedily read its inputs until the end, no matter whether some reader is accessing the stored buffer

As already mentioned, when using the gRS2 Store, there is no need to modify anything in either the writer or the reader. The Store is a utility and thus can be used externally without affecting the implicated components. The following example demonstrates the needed additions to persist the output of a number of writers and have a number of readers accessing the multiplexed output.

<source lang="java">
  StringWriter writer1=new StringWriter(new TCPWriterProxy());
  StringWriter writer2=new StringWriter(new LocalWriterProxy());
  StringWriter writer3=new StringWriter(new TCPWriterProxy());
  StringWriter writer4=new StringWriter(new LocalWriterProxy());

  URI []locators=new URI[]{writer1.getLocator(),writer2.getLocator(),writer3.getLocator(),writer4.getLocator()};
  //The respective TCPStoreWriterProxy.store operation could be used to create a locator with a different usage scope
  URI locator=LocalStoreWriterProxy.store(locators, IBufferStore.MultiplexType.FIFO, 60*60, TimeUnit.SECONDS);

  StringReader reader1=new StringReader(locator);
  RandomAccessReader reader2=new RandomAccessReader(locator);
  StringReader reader3=new StringReader(locator);
  RandomIteratorReader reader4=new RandomIteratorReader(locator);
  StringReader reader5=new StringReader(locator);

  //Start everything
  writer1.start();
  /*...*/
  reader5.start();

  //Join everything
  writer1.join();
  /*...*/
  reader5.join();
</source>

Since through the gRS2 Store the writer and reader are disconnected, any events emitted by the writer will reach the reader either at the time they are emitted, if the reading is performed while the store is actively persisting its inputs, or at initialization time of the reader, when all the events that have been emitted up until that point are delivered all at once. Events that are emitted by the reader will never reach the writers whose output is persisted.

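The delivery semantics described above can be modelled as an event log that forwards events live to attached listeners and replays the backlog to a listener that attaches later. The following sketch is an illustrative stand-in for this behaviour, not the gRS2 event API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

//Illustrative model of the Store's event semantics; not the gRS2 event API
class EventReplaySketch
{
  private final List<String> log=new ArrayList<String>();
  private final List<Consumer<String>> listeners=new ArrayList<Consumer<String>>();

  //Called while the store is persisting its inputs: record and deliver live
  public void emit(String event)
  {
    log.add(event);
    for(Consumer<String> l : listeners) l.accept(event);
  }

  //A reader attaching later first receives all past events at once, then live ones
  public void attach(Consumer<String> listener)
  {
    for(String e : log) listener.accept(e); //replay at initialization time
    listeners.add(listener);
  }
}
```

A listener that attaches after "e1" was emitted still observes "e1" (replayed) followed by any later events in order.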
== Utilities and Convenience Tools ==
=== Record Import ===
It is quite common that a consumer might wish to forward a record to a producer using the same record definition, either completely unaltered or with minor modifications.
To avoid explicitly copying the record, thus wasting valuable computational resources, one has the option of importing records directly to a producer.
This can be done with the <code>importRecord</code> methods, which have the same signatures as the <code>put</code> methods used to add entirely new records to a writer.

The following snippet demonstrates how one can forward all records read from a reader to a writer without making any modifications
<source lang="java">
for(GenericRecord rec: reader) {
  if(writer.getStatus() == Status.Close || writer.getStatus() == Status.Dispose)
      break;
  if(!writer.importRecord(rec, 60, TimeUnit.SECONDS))
      break;
}
</source>

The following snippet demonstrates how one can modify a known record field, leaving all other fields unchanged. In fact, one can be unaware of the existence of fields other than those in which one is interested.

<source lang="java">
for(GenericRecord rec: reader) {
  ((StringField)rec.getField("payloadField")).setPayload("Modified payload");
  if(writer.getStatus() == Status.Close || writer.getStatus() == Status.Dispose)
      break;
  if(!writer.importRecord(rec, 60, TimeUnit.SECONDS))
      break;
}
</source>

=== Keep-Alive Functionality ===
Both forward and random access readers can have keep-alive functionality added to them using a typical decorator pattern. This functionality can be useful if a consumer wishes to delay the consumption of records for long periods of time while the producer has no indication of this behavior. In simple cases, when implementing a transport protocol using events is not desirable, a keep-alive reader can be used to keep the communication channel open by periodically reading records. Keep-alive functionality is transparent to the client: each time a <code>get</code> method is called, the record either originates from the set of prefetched records or is actually read directly from the underlying reader. The order of records is guaranteed to be the same as if the underlying reader were used on its own.
The following snippet demonstrates the creation of a keep-alive reader with a record-retrieval period of 30 seconds:
<source lang="java">
IRecordReader<Record> reader = new ForwardReader<Record>(this.locator);
reader = new KeepAliveReader<Record>(reader, 30, TimeUnit.SECONDS);
</source>
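The prefetch-and-replay behaviour described above can be sketched independently of the gRS2 API. In this illustrative stand-in, a timer-driven <code>tick</code> keeps the channel busy by pulling a record into a queue, and <code>get</code> serves queued records first so the caller observes the original order; <code>KeepAliveSketch</code> is an assumption for illustration, not the actual <code>KeepAliveReader</code> implementation.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Iterator;

//Illustrative decorator showing the keep-alive idea; not the gRS2 KeepAliveReader
class KeepAliveSketch<T>
{
  private final Iterator<T> source;
  private final Deque<T> prefetched=new ArrayDeque<T>();

  public KeepAliveSketch(Iterator<T> source) { this.source=source; }

  //Invoked periodically: read one record just to keep the channel active
  public void tick() { if(source.hasNext()) prefetched.addLast(source.next()); }

  //Serve prefetched records first, then fall through to the underlying source
  public T get()
  {
    if(!prefetched.isEmpty()) return prefetched.pollFirst();
    return source.hasNext() ? source.next() : null;
  }
}
```

Whether a record arrives via a periodic <code>tick</code> or directly from the source, the sequence returned by <code>get</code> is the same as reading the source alone, which is the transparency property the decorator provides.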

=== Locator Utilities ===
The <code>Locators</code> class provides locator-related convenience methods.
Currently, the class provides a Local-to-TCP converter utility, which can be useful when it cannot be known in advance whether a produced resultset should be available only locally or be exposed remotely through TCP. The producer can then produce its results locally, and the results can later be exposed through TCP by converting the locator to a TCP locator if this is deemed necessary.
The exact way of converting a local locator to a TCP one is shown in the following snippet:
<source lang="java">
URI localLocator = ... //An operation returning a local locator
URI TCPLocator = Locators.localToTCP(localLocator);
</source>

== HTTP Protocol ==
gRS can also work on top of HTTP instead of TCP. The transferred messages are in XML format.
There are corresponding ''HTTP'' classes for the ConnectionManager, the Proxies, etc., that can be used instead of the ''TCP'' classes to enable this functionality.

'''Note''' that both ends need to use the same protocol in order to be able to communicate.

Latest revision as of 19:02, 11 December 2013


The gRS2 component is available in our Maven repositories with the following coordinates:

<source lang="xml">
<groupId>org.gcube.execution</groupId>
<artifactId>grs2library</artifactId>
<version>...</version>
</source>

== TCP Connection Manager ==

In case the created gRS is to be shared with a remote party, or in general with one in a different address space than the writer's, a TCPConnectionManager should be used. A TCPConnectionManager is static in the context of the address space it is used in and is initialized through a singleton pattern. Therefore, one can safely initialize it in any context where it is useful, without worrying whether other components are also trying to initialize it. An example of initializing the TCPConnectionManager is shown in the following snippet.

<source lang="java">
List<PortRange> ports=new ArrayList<PortRange>(); //The ports that the TCPConnection manager should use
ports.add(new PortRange(3000, 3050));             //Any in the range between 3000 and 3050
ports.add(new PortRange(3055, 3055));             //Port 3055
TCPConnectionManager.Init(
  new TCPConnectionManagerConfig("localhost", //The hostname by which the machine is reachable 
    ports,                                    //The ports that can be used by the connection manager
    true                                      //If no port ranges were provided, or none of them could be used, use a random available port
));
TCPConnectionManager.RegisterEntry(new TCPConnectionHandler());      //Register the handler for the gRS2 incoming requests
TCPConnectionManager.RegisterEntry(new TCPStoreConnectionHandler()); //Register the handler for the gRS2 store incoming requests
</source>

A gCube service could use the onInit event to perform this initialization. The TCPConnectionManager is located in the MadgikCommons package.

== Definitions ==

Every gRS needs to explicitly declare the definitions of the records it holds; the record definitions in turn contain the definitions of their fields. Every record that is subsequently handled by the specific gRS must comply with one of the definitions initially provided to it. For example, creating a gRS that handles only one type of record, which in turn contains only a single type of field, is presented in the following snippet

<source lang="java">
//The gRS will contain only one type of records which in turn contains only a single field 
RecordDefinition[] defs=new RecordDefinition[]{          //A gRS can contain a number of different record definitions
    new GenericRecordDefinition((new FieldDefinition[] { //A record can contain a number of different field definitions
    new StringFieldDefinition("ThisIsTheField")          //The definition of the field
  }))
};
</source>

== Record Writer Initialization ==

To initialize a gRS2 writer one must simply create a new instance of one of the available writers. Depending on the level of configuration detail one wishes to set, a number of different constructors are available

<source lang="java">
writer=new RecordWriter<GenericRecord>(
  proxy, //The proxy that defines the way the writer can be accessed
  defs   //The definitions of the records the gRS handles
);

writer=new RecordWriter<GenericRecord>(
  proxy, //The proxy that defines the way the writer can be accessed
  defs,  //The definitions of the records the gRS handles
  50,    //The capacity of the underlying synchronization buffer
  2,     //The maximum number of parallel records that can be concurrently accessed on partial transfer  
  0.5f   //The maximum fraction of the buffer that should be transferred during mirroring  
);

writer=new RecordWriter<GenericRecord>(
  proxy,           //The proxy that defines the way the writer can be accessed
  defs,            //The definitions of the records the gRS handles
  50,              //The capacity of the underlying synchronization buffer
  2,               //The maximum number of parallel records that can be concurrently accessed on partial transfer  
  0.5f,            //The maximum fraction of the buffer that should be transferred during mirroring 
  60,              //The timeout in time units after which an inactive gRS can be disposed  
  TimeUnit.SECONDS //The time unit in timeout after which an inactive gRS can be disposed
);
</source>

For convenience, if one wishes to create a writer with the same configuration as an already available reader, one can use one of the following constructors

<source lang="java">
writer=new RecordWriter<GenericRecord>(
  proxy, //The proxy that defines the way the writer can be accessed
  reader //The reader to copy configuration from
);

writer=new RecordWriter<GenericRecord>(
  proxy,  //The proxy that defines the way the writer can be accessed
  reader, //The reader to copy configuration from
  50,     //The capacity of the underlying synchronization buffer
  2,      //The maximum number of parallel records that can be concurrently accessed on partial transfer  
  0.5f    //The maximum fraction of the buffer that should be transferred during mirroring  
);

writer=new RecordWriter<GenericRecord>(
  proxy,           //The proxy that defines the way the writer can be accessed
  reader,          //The reader to copy configuration from
  50,              //The capacity of the underlying synchronization buffer
  2,               //The maximum number of parallel records that can be concurrently accessed on partial transfer  
  0.5f,            //The maximum fraction of the buffer that should be transferred during mirroring  
  60,              //The timeout in time units after which an inactive gRS can be disposed  
  TimeUnit.SECONDS //The time unit in timeout after which an inactive gRS can be disposed
);
</source>

In the first of the above examples, all configuration is copied from the reader. The extra configuration parameters override the ones of the reader, meaning that in the last example only the record definitions are duplicated from the reader configuration.

== Custom Records ==

All records handled through gRS2 must extend the base class gr.uoa.di.madgik.grs.record.Record. Extending this class means implementing a couple of required methods which are mainly used to properly manage the transport of the custom record and any additional information it might carry. Although the actual payload of the record is meant to be managed through fields, one can also assign additional information to the record itself. In the example that follows, a custom record with its respective definition is created. The record hides the fields that it manages by exposing direct getters and setters over a known definition that it provides and should be used with. Additionally, it defines extra information that describes the record as a whole; instead of using additional fields to do this, it defines placeholders within the record itself.

=== Definition ===

<source lang="java">
import java.io.DataInput;
import java.io.DataOutput;
import gr.uoa.di.madgik.grs.buffer.IBuffer.TransportDirective;
import gr.uoa.di.madgik.grs.record.GRS2RecordSerializationException;
import gr.uoa.di.madgik.grs.record.RecordDefinition;
import gr.uoa.di.madgik.grs.record.field.FieldDefinition;
import gr.uoa.di.madgik.grs.record.field.StringFieldDefinition;

public class MyRecordDefinition extends RecordDefinition
{
  public static final String PayloadFieldName="payload";
  public static final boolean DefaultCompress=false;

  public MyRecordDefinition()
  {
    StringFieldDefinition def=new StringFieldDefinition(MyRecordDefinition.PayloadFieldName);
    def.setTransportDirective(TransportDirective.Full);
    def.setCompress(MyRecordDefinition.DefaultCompress);
    this.Fields = new FieldDefinition[]{def};
  }

  public MyRecordDefinition(boolean compress)
  {
    StringFieldDefinition def=new StringFieldDefinition(MyRecordDefinition.PayloadFieldName);
    def.setTransportDirective(TransportDirective.Full);
    def.setCompress(compress);
    this.Fields = new FieldDefinition[]{def};
  }

  @Override
  public boolean extendEquals(Object obj)
  {
    if(!(obj instanceof MyRecordDefinition)) return false;
    return true;
  }

  @Override
  public void extendDeflate(DataOutput out) throws GRS2RecordSerializationException
  {
    //nothing to deflate
  }

  @Override
  public void extendInflate(DataInput in) throws GRS2RecordSerializationException
  {
    //nothing to inflate
  }
}
</source>

=== Record ===

<source lang="java">
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import gr.uoa.di.madgik.grs.GRS2Exception;
import gr.uoa.di.madgik.grs.buffer.GRS2BufferException;
import gr.uoa.di.madgik.grs.record.GRS2RecordDefinitionException;
import gr.uoa.di.madgik.grs.record.GRS2RecordSerializationException;
import gr.uoa.di.madgik.grs.record.Record;
import gr.uoa.di.madgik.grs.record.field.Field;
import gr.uoa.di.madgik.grs.record.field.StringField;

public class MyRecord extends Record
{
  private String id="none";
  private String collection="none";

  public MyRecord()
  {
    this.setFields(new Field[]{new StringField()});
  }

  public MyRecord(String payload)
  {
    this.setFields(new Field[]{new StringField(payload)});
  }

  public MyRecord(String id,String collection)
  {
    this.setId(id);
    this.setCollection(collection);
    this.setFields(new Field[]{new StringField()});
  }

  public MyRecord(String id,String collection,String payload)
  {
    this.setId(id);
    this.setCollection(collection);
    this.setFields(new Field[]{new StringField(payload)});
  }

  public void setId(String id) { this.id = id; }

  public String getId() { return id; }

  public void setCollection(String collection) { this.collection = collection; }

  public String getCollection() { return collection; }

  public void setPayload(String payload) throws GRS2Exception
  {
    this.getField(MyRecordDefinition.PayloadFieldName).setPayload(payload);
  }

  public String getPayload() throws GRS2Exception
  {
    return this.getField(MyRecordDefinition.PayloadFieldName).getPayload();
  }

  @Override
  public StringField getField(String name) throws GRS2RecordDefinitionException, GRS2BufferException
  {
    return (StringField)super.getField(name);
  }

  @Override
  public void extendSend(DataOutput out) throws GRS2RecordSerializationException
  {
    this.extendDeflate(out);
  }

  @Override
  public void extendReceive(DataInput in) throws GRS2RecordSerializationException
  {
    this.extendInflate(in, false);
  }

  @Override
  public void extendDeflate(DataOutput out) throws GRS2RecordSerializationException
  {
    try
    {
      out.writeUTF(this.id);
      out.writeUTF(this.collection);
    }catch(IOException ex)
    {
      throw new GRS2RecordSerializationException("Could not deflate record", ex);
    }
  }

  @Override
  public void extendInflate(DataInput in, boolean reset) throws GRS2RecordSerializationException
  {
    try
    {
      this.id=in.readUTF();
      this.collection=in.readUTF();
    }catch(IOException ex)
    {
      throw new GRS2RecordSerializationException("Could not inflate record", ex);
    }
  }

  @Override
  public void extendDispose()
  {
    this.id=null;
    this.collection=null;
  }
}
</source>

== Simple Example ==

In the following snippets, a simple writer and a respective reader are presented to show the full usage of the presented elements

=== Writer ===

<source lang="java">
import java.net.URI;
import java.util.concurrent.TimeUnit;
import gr.uoa.di.madgik.grs.buffer.IBuffer.Status;
import gr.uoa.di.madgik.grs.proxy.IWriterProxy;
import gr.uoa.di.madgik.grs.record.GenericRecord;
import gr.uoa.di.madgik.grs.record.GenericRecordDefinition;
import gr.uoa.di.madgik.grs.record.RecordDefinition;
import gr.uoa.di.madgik.grs.record.field.Field;
import gr.uoa.di.madgik.grs.record.field.FieldDefinition;
import gr.uoa.di.madgik.grs.record.field.StringField;
import gr.uoa.di.madgik.grs.record.field.StringFieldDefinition;
import gr.uoa.di.madgik.grs.writer.GRS2WriterException;
import gr.uoa.di.madgik.grs.writer.RecordWriter;

public class StringWriter extends Thread
{
  private RecordWriter<GenericRecord> writer=null;

  public StringWriter(IWriterProxy proxy) throws GRS2WriterException
  {
    //The gRS will contain only one type of records which in turn contains only a single field 
    RecordDefinition[] defs=new RecordDefinition[]{          //A gRS can contain a number of different record definitions
        new GenericRecordDefinition((new FieldDefinition[] { //A record can contain a number of different field definitions
        new StringFieldDefinition("ThisIsTheField")          //The definition of the field
      }))
    };
    writer=new RecordWriter<GenericRecord>(
      proxy, //The proxy that defines the way the writer can be accessed
      defs   //The definitions of the records the gRS handles
    );
  }

  public URI getLocator() throws GRS2WriterException
  {
    return writer.getLocator();
  }

  public void run()
  {
    try
    {
      for(int i=0;i<500;i+=1)
      {
        //while the reader hasn't stopped reading
        if(writer.getStatus()!=Status.Open) break;
        GenericRecord rec=new GenericRecord();
        //Only a string field is added to the record as per definition
        rec.setFields(new Field[]{new StringField("Hello world "+i)});
        //if the buffer is at maximum capacity for the specified interval, don't wait any more
        if(!writer.put(rec,60,TimeUnit.SECONDS)) break;
      }
      //if the reader hasn't already disposed the buffer, close to notify the reader that no more records will be provided 
      if(writer.getStatus()!=Status.Dispose) writer.close();
    }catch(Exception ex)
    {
      ex.printStackTrace();
    }
  }
}
</source>

=== Reader ===

<source lang="java">
import gr.uoa.di.madgik.grs.reader.ForwardReader;
import gr.uoa.di.madgik.grs.reader.GRS2ReaderException;
import gr.uoa.di.madgik.grs.record.GenericRecord;
import gr.uoa.di.madgik.grs.record.field.StringField;
import java.net.URI;

public class StringReader extends Thread
{
  ForwardReader<GenericRecord> reader=null;

  public StringReader(URI locator) throws GRS2ReaderException
  {
    reader=new ForwardReader<GenericRecord>(locator);
  }

  public void run()
  {
    try
    {
      for(GenericRecord rec : reader)
      {
        //In case a timeout occurs while optimistically waiting for more records from an originally open writer
        if(rec==null) break;
        //Retrieve the required field of the type available in the gRS definitions
        System.out.println(((StringField)rec.getField("ThisIsTheField")).getPayload());
      }
      //Close the reader to release and dispose any resources on both reader and writer sides
      reader.close();
    }catch(Exception ex)
    {
      ex.printStackTrace();
    }
  }
}
</source>

=== Main ===

<source lang="java">
import java.util.ArrayList;
import java.util.List;
import gr.uoa.di.madgik.commons.server.PortRange;
import gr.uoa.di.madgik.commons.server.TCPConnectionManager;
import gr.uoa.di.madgik.commons.server.TCPConnectionManagerConfig;
import gr.uoa.di.madgik.grs.proxy.tcp.TCPConnectionHandler;
import gr.uoa.di.madgik.grs.proxy.tcp.TCPStoreConnectionHandler;
import gr.uoa.di.madgik.grs.proxy.tcp.TCPWriterProxy;

public class MainTest
{
  public static void main(String []args) throws Exception
  {
    List<PortRange> ports=new ArrayList<PortRange>(); //The ports that the TCPConnection manager should use
    ports.add(new PortRange(3000, 3050));             //Any in the range between 3000 and 3050
    ports.add(new PortRange(3055, 3055));             //Port 3055
    TCPConnectionManager.Init(
      new TCPConnectionManagerConfig("localhost", //The hostname by which the machine is reachable 
        ports,                                    //The ports that can be used by the connection manager
        true                                      //If no port ranges were provided, or none of them could be used, use a random available port
    ));
    TCPConnectionManager.RegisterEntry(new TCPConnectionHandler());      //Register the handler for the gRS2 incoming requests
    TCPConnectionManager.RegisterEntry(new TCPStoreConnectionHandler()); //Register the handler for the gRS2 store incoming requests

    // gr.uoa.di.madgik.grs.proxy.local.LocalWriterProxy could be used instead for local only, all in memory, access
    StringWriter writer=new StringWriter(new TCPWriterProxy());
    StringReader reader=new StringReader(writer.getLocator());

    writer.start();
    reader.start();

    writer.join();
    reader.join();
  }
}
</source>

== Events ==

As already mentioned, the gRS2 serves as a bidirectional communication channel through events that can be propagated either from the writer to the reader or from the reader to the writer, along the normal data flow. These events can serve any purpose fitting the application logic, from control signals to full data transport, although in the latter case the payload size should be kept moderate, as it is always handled in memory and cannot use services such as partial transfer, nor be declared using internally managed files.

An example taken to the extreme follows. In this example, a writer and a reader use only events to propagate their data. One should notice that the communication protocol is now up to the reader and writer clients to implement; in the example case this has been handled in a quick and dirty way using sleep timeouts. Additionally, it should be noted that, other than the basic key-value event placeholder provided internally, an extendable event placeholder has been provided to facilitate the transport of any type of custom object.

=== Writer ===

import java.net.URI;
import gr.uoa.di.madgik.grs.buffer.IBuffer.Status;
import gr.uoa.di.madgik.grs.events.BufferEvent;
import gr.uoa.di.madgik.grs.events.KeyValueEvent;
import gr.uoa.di.madgik.grs.events.ObjectEvent;
import gr.uoa.di.madgik.grs.proxy.IWriterProxy;
import gr.uoa.di.madgik.grs.record.GenericRecord;
import gr.uoa.di.madgik.grs.record.GenericRecordDefinition;
import gr.uoa.di.madgik.grs.record.RecordDefinition;
import gr.uoa.di.madgik.grs.record.field.FieldDefinition;
import gr.uoa.di.madgik.grs.record.field.StringFieldDefinition;
import gr.uoa.di.madgik.grs.writer.GRS2WriterException;
import gr.uoa.di.madgik.grs.writer.RecordWriter;
 
public class EventWriter extends Thread
{
  private RecordWriter<GenericRecord> writer=null;
 
  public EventWriter(IWriterProxy proxy) throws GRS2WriterException
  {
    //The gRS will contain only one type of records which in turn contains only a single field 
    RecordDefinition[] defs=new RecordDefinition[]{ //A gRS can contain a number of different record definitions
        new GenericRecordDefinition((new FieldDefinition[] { //A record can contain a number of different field definitions
        new StringFieldDefinition("ThisIsTheField") //The definition of the field
      }))
    };
    writer=new RecordWriter<GenericRecord>(
      proxy, //The proxy that defines the way the writer can be accessed
      defs //The definitions of the records the gRS handles
    );
  }
 
  public URI getLocator() throws GRS2WriterException
  {
    return writer.getLocator();
  }
 
  public void receiveWaitKeyValue() throws GRS2WriterException
  {
    while(true)
    {
      BufferEvent ev=writer.receive();
      if(ev==null) try{ Thread.sleep(100);}catch(Exception ex){}
      else 
      {
        System.out.println("Received event from reader : "+((KeyValueEvent)ev).getKey()+" - "+((KeyValueEvent)ev).getValue());
        break;
      }
    }
  }
 
  public void receiveWaitObject() throws GRS2WriterException
  {
    while(true)
    {
      BufferEvent ev=writer.receive();
      if(ev==null) try{ Thread.sleep(100);}catch(Exception ex){}
      else 
      {
        System.out.println("Received event from reader : "+((ObjectEvent)ev).getItem().toString());
        break;
      }
    }
  }
 
  public void run()
  {
    try
    {
      writer.emit(new KeyValueEvent("init", "Hello"));
      for(int i=0;i<5;i+=1) this.receiveWaitKeyValue();
      this.receiveWaitObject();
      if(writer.getStatus()!=Status.Dispose) writer.close();
    }catch(Exception ex)
    {
      ex.printStackTrace();
    }
  }
}

==== Reader ====

import gr.uoa.di.madgik.grs.events.BufferEvent;
import gr.uoa.di.madgik.grs.events.KeyValueEvent;
import gr.uoa.di.madgik.grs.events.ObjectEvent;
import gr.uoa.di.madgik.grs.reader.ForwardReader;
import gr.uoa.di.madgik.grs.reader.GRS2ReaderException;
import gr.uoa.di.madgik.grs.record.GenericRecord;
import java.net.URI;
 
public class EventReader extends Thread
{
  ForwardReader<GenericRecord> reader=null;
 
  public EventReader(URI locator) throws GRS2ReaderException
  {
    reader=new ForwardReader<GenericRecord>(locator);
  }
 
  public void run()
  {
    try
    {
      BufferEvent ev=reader.receive();
      System.out.println("Received event from writer : "+((KeyValueEvent)ev).getKey()+" - "+((KeyValueEvent)ev).getValue());
      reader.emit(new KeyValueEvent("info", "Hello..."));
      reader.emit(new KeyValueEvent("info", "Hello Again..."));
      reader.emit(new KeyValueEvent("info", "Hello And Again..."));
      reader.emit(new KeyValueEvent("info", "And again..."));
      reader.emit(new KeyValueEvent("info", "Bored already ?"));
      reader.emit(new ObjectEvent(new MyEvent("This is a custom object event", 234)));
      reader.close();
    }catch(Exception ex)
    {
      ex.printStackTrace();
    }
  }
}

==== MyEvent ====

import java.io.DataInput;
import java.io.DataOutput;
import gr.uoa.di.madgik.grs.record.GRS2RecordSerializationException;
import gr.uoa.di.madgik.grs.record.IPumpable;
 
public class MyEvent implements IPumpable
{
  private String message=null;
  private int count=0;
 
  public MyEvent(String message,int count)
  {
    this.message=message;
    this.count=count;
  }
 
  public String toString()
  {
    return "event payload message is : '"+this.message+"' with count '"+this.count+"'";
  }
 
  public void deflate(DataOutput out) throws GRS2RecordSerializationException
  {
    try
    {
      out.writeUTF(message);
      out.writeInt(count);
    }catch(Exception ex)
    {
      throw new GRS2RecordSerializationException("Could not serialize event", ex);
    }
  }
 
  public void inflate(DataInput in) throws GRS2RecordSerializationException
  {
    try
    {
      this.message=in.readUTF();
      this.count=in.readInt();
    }catch(Exception ex)
    {
      throw new GRS2RecordSerializationException("Could not deserialize event", ex);
    }
  }
 
  public void inflate(DataInput in, boolean reset) throws GRS2RecordSerializationException
  {
    this.inflate(in);
  }
}

==== Main ====

As in Simple Main, using the new writer and reader classes

  /*...*/
  EventWriter writer=new EventWriter(new TCPWriterProxy());
  EventReader reader=new EventReader(writer.getLocator());
  /*...*/

=== Multi Field Records ===

==== Writer ====

import java.io.File;
import java.net.URI;
import java.net.URL;
import java.util.concurrent.TimeUnit;
import gr.uoa.di.madgik.grs.buffer.IBuffer.Status;
import gr.uoa.di.madgik.grs.proxy.IWriterProxy;
import gr.uoa.di.madgik.grs.record.GenericRecord;
import gr.uoa.di.madgik.grs.record.GenericRecordDefinition;
import gr.uoa.di.madgik.grs.record.RecordDefinition;
import gr.uoa.di.madgik.grs.record.field.Field;
import gr.uoa.di.madgik.grs.record.field.FieldDefinition;
import gr.uoa.di.madgik.grs.record.field.FileField;
import gr.uoa.di.madgik.grs.record.field.FileFieldDefinition;
import gr.uoa.di.madgik.grs.record.field.StringField;
import gr.uoa.di.madgik.grs.record.field.StringFieldDefinition;
import gr.uoa.di.madgik.grs.record.field.URLField;
import gr.uoa.di.madgik.grs.record.field.URLFieldDefinition;
import gr.uoa.di.madgik.grs.writer.GRS2WriterException;
import gr.uoa.di.madgik.grs.writer.RecordWriter;
 
public class MultiWriter extends Thread
{
  private RecordWriter<GenericRecord> writer=null;
 
  public MultiWriter(IWriterProxy proxy) throws GRS2WriterException
  {
    //The gRS will contain only one type of record which in turn contains four fields
    RecordDefinition[] defs=new RecordDefinition[]{ //A gRS can contain a number of different record definitions
      new GenericRecordDefinition((new FieldDefinition[] { //A record can contain a number of different field definitions
        new StringFieldDefinition("MyStringField"), //The definition of a string field
        new StringFieldDefinition("MyOtherStringField"), //The definition of a string field
        new URLFieldDefinition("MyHTTPField"), //The definition of an http field
        new FileFieldDefinition("MyFileField") //The definition of a file field
      }))
    };
    writer=new RecordWriter<GenericRecord>(
      proxy, //The proxy that defines the way the writer can be accessed
      defs //The definitions of the records the gRS handles
    );
  }
 
  public URI getLocator() throws GRS2WriterException
  {
    return writer.getLocator();
  }
 
  public void run()
  {
    try
    {
      for(int i=0;i<500;i+=1)
      {
        //while the reader hasn't stopped reading
        if(writer.getStatus()!=Status.Open) break;
        GenericRecord rec=new GenericRecord();
        //Four fields are added to the record as per the definition
        Field[] fs=new Field[4];
        fs[0]=new StringField("Hello world "+i);
        fs[1]=new StringField("Hello world of Diligent");
        fs[2]=new URLField(new URL("http://www.d4science.eu/files/d4science_logo.jpg"));
        fs[3]=new FileField(new File("/bin/ls"));
        rec.setFields(fs);
        //if the buffer is at maximum capacity for the specified interval, don't wait any more
        if(!writer.put(rec,60,TimeUnit.SECONDS)) break;
      }
      //if the reader hasn't already disposed the buffer, close to notify reader that no more will be provided 
      if(writer.getStatus()!=Status.Dispose) writer.close();
    }catch(Exception ex)
    {
      ex.printStackTrace();
    }
  }
}

==== Reader ====

import gr.uoa.di.madgik.grs.reader.ForwardReader;
import gr.uoa.di.madgik.grs.reader.GRS2ReaderException;
import gr.uoa.di.madgik.grs.record.GenericRecord;
import gr.uoa.di.madgik.grs.record.field.FileField;
import gr.uoa.di.madgik.grs.record.field.StringField;
import gr.uoa.di.madgik.grs.record.field.URLField;
import java.net.URI;
 
public class MultiReader extends Thread
{
  ForwardReader<GenericRecord> reader=null;
 
  public MultiReader(URI locator) throws GRS2ReaderException
  {
    reader=new ForwardReader<GenericRecord>(locator);
  }
 
  public void run()
  {
    try
    {
      for(GenericRecord rec : reader)
      {
        //In case a timeout occurs while optimistically waiting for more records from an originally open writer
        if(rec==null) break;
        //Retrieve the required field of the type available in the gRS definitions
        System.out.println(((StringField)rec.getField("MyStringField")).getPayload());
        System.out.println(((StringField)rec.getField("MyOtherStringField")).getPayload());
        System.out.println(((URLField)rec.getField("MyHTTPField")).getPayload());
        //In case of in-memory communication, the target file will be the same as the original. Otherwise,
        //the target file will be a local copy of the original file and, depending on the transport directive
        //set by the writer, it will be fully available or will gradually become available as data are read
        System.out.println(((FileField)rec.getField("MyFileField")).getPayload() +" : "+((FileField)rec.getField("MyFileField")).getOriginalPayload());
      }
      //Close the reader to release and dispose any resources in both reader and writer sides
      reader.close();
    }catch(Exception ex)
    {
      ex.printStackTrace();
    }
  }
}

==== Main ====

As in Simple Main, using the new writer and reader classes

  /*...*/
  MultiWriter writer=new MultiWriter(new LocalWriterProxy());
  MultiReader reader=new MultiReader(writer.getLocator());
  /*...*/

=== Retrieving Field Payload ===

As mentioned, depending on the transport directive set by the writer and the type of proxy used for the locator creation, the payload of some fields may not be fully available to the reader. However, the reader's clients should not have to deal with the details of this and should be able to uniformly access the payload of any field. Additionally, each field, depending on its characteristics, may support different transport directives, and may carry its payload internally or keep only a reference to it. Therefore, only the field itself can know how to access its own payload.

For example, for a URLField that points to a remotely accessible HTTP location, the field opens an HTTP connection and passes the stream to the client, who from then on does not need to know the payload's actual location. For a FileField that is accessed remotely from the original file location, the data requested by the client are fetched remotely, if not already available, and are provided to the client through an intermediate local file. All these actions are performed in the background without the client having to perform any action.

This is achieved through the MediationInputStream, an input stream implementation that wraps around the field-provided, protocol-specific input stream and from then on handles any remote transport that needs to be performed. Each field provides its own payload-mediating input stream. Using the mediating stream, data are transferred on demand, using the chunk size defined in the field definition to make the transport buffered. This of course means that, depending on the size of the payload, a number of round trips may be needed. It allows less data to be transferred when only part of the payload is needed, and processing can start on the initial parts while more data is retrieved. On the other hand, if the full payload is needed, the on-demand transfer imposes additional overhead. In such cases, a specific field, or in fact all the fields of a record, can be requested to be made fully available, if supported, in one blocking operation.

In the Multi Field example, if someone wants to read the MyHTTPField and MyFileField payloads, the following snippet in the reader is all that is needed

BufferedInputStream bin=new BufferedInputStream(rec.getField("MyHTTPField").getMediatingInputStream());
byte[] buffer = new byte[1024];
int bytesRead = 0;
while ((bytesRead = bin.read(buffer)) != -1) { /*...*/ }
 
/*...*/
 
//reuse the buffer and counter declared above for the file field's mediating stream
bin=new BufferedInputStream(rec.getField("MyFileField").getMediatingInputStream());
while ((bytesRead = bin.read(buffer)) != -1) { /*...*/ }

If a specific field needs to be read in its entirety, one can, as previously mentioned, request the entire payload to be retrieved beforehand rather than on demand. For this, the following line needs to be added before the client starts reading the field payload as shown above

rec.getField("MyFileField").makeAvailable();
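If all the fields of a record are needed in their entirety, the same request can be made at record level in one blocking operation, as noted earlier. The following is only a sketch: it assumes the record exposes a makeAvailable method mirroring the per-field one shown above, which should be verified against the gRS distribution.

```java
//Hypothetical sketch: a record-level makeAvailable() is assumed to mirror the
//per-field call shown above, blocking until all field payloads are locally available
rec.makeAvailable();
```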

=== Random Access Reader ===

The examples presented thus far target a forward-only reader. A random access reader is also provided as an alternative. This reader allows random seeks, forward and backward, on the reader side. Since, as already mentioned, a record is disposed by the writer after it has been accessed by the reader, moving back to a previously accessed record, which has already been disposed by the writer, requires that all past records be maintained locally on the reader side. To avoid bloating the reader's in-memory structures, already read records are persisted. When the client then wants to move back to a record he has already accessed, the desired record, as well as a configurable record window ahead of it, is restored from the persistency store. The window is always in front of the requested record, under the assumption that forward movement is the predominant one. Note that in the case of completely random seeks, a good strategy to achieve maximum performance is to disable the window by setting its size equal to 1. The persistency store in which the past records are kept is internally configured. The only one currently available is a disk-backed random access file. This of course imposes an overhead that is not negligible for large data. Short term plans include the usage of a different persistency backend to make the operation more performant.

One important detail here is that every record persisted on the reader side must be fully available once restored. That is, all payload that was previously served by the writer side must now be served directly from the reader side. Therefore, whenever a record is persisted, the makeAvailable method is invoked on it, ensuring that any data that would be lost when the record is disposed on the writer side will now be directly accessible from the reader side.

In the following example, we assume a gRS created as in the case of the Simple Writer previously described. We present two ways to access the gRS in a random fashion: one using a list iterator, and one using random seeks.

==== Bidirectional Iterator ====

import gr.uoa.di.madgik.grs.reader.GRS2ReaderException;
import gr.uoa.di.madgik.grs.reader.RandomReader;
import gr.uoa.di.madgik.grs.record.GenericRecord;
import gr.uoa.di.madgik.grs.record.field.StringField;
import java.net.URI;
import java.util.ListIterator;
 
public class RandomIteratorReader extends Thread
{
  RandomReader<GenericRecord> reader=null;
 
  public RandomIteratorReader(URI locator) throws GRS2ReaderException
  {
    reader=new RandomReader<GenericRecord>(locator);
  }
 
  public void run()
  {
    try
    {
      ListIterator<GenericRecord> iter= reader.listIterator();
      while(iter.hasNext())
      {
        GenericRecord rec=iter.next();
        if(rec==null) break;
        System.out.println(((StringField)rec.getField("ThisIsTheField")).getPayload());
      }
      while(iter.hasPrevious())
      {
        GenericRecord rec=iter.previous();
        if(rec==null) break;
        System.out.println(((StringField)rec.getField("ThisIsTheField")).getPayload());
      }
      while(iter.hasNext())
      {
        GenericRecord rec=iter.next();
        if(rec==null) break;
        System.out.println(((StringField)rec.getField("ThisIsTheField")).getPayload());
      }
      reader.close();
    }catch(Exception ex)
    {
      ex.printStackTrace();
    }
  }
}

==== Random Access ====

import gr.uoa.di.madgik.grs.buffer.GRS2BufferException;
import gr.uoa.di.madgik.grs.buffer.IBuffer.Status;
import gr.uoa.di.madgik.grs.reader.GRS2ReaderException;
import gr.uoa.di.madgik.grs.reader.RandomReader;
import gr.uoa.di.madgik.grs.record.GRS2RecordDefinitionException;
import gr.uoa.di.madgik.grs.record.GenericRecord;
import gr.uoa.di.madgik.grs.record.field.StringField;
import java.net.URI;
import java.util.concurrent.TimeUnit;
 
public class RandomAccessReader extends Thread
{
  RandomReader<GenericRecord> reader=null;
 
  public RandomAccessReader(URI locator) throws GRS2ReaderException
  {
    reader=new RandomReader<GenericRecord>(locator);
  }
 
  private void read(int count) throws GRS2ReaderException, GRS2RecordDefinitionException, GRS2BufferException
  {
    for(int i=0;i<count;i+=1)
    {
      if(reader.getStatus()==Status.Dispose || (reader.getStatus()==Status.Close && reader.availableRecords()==0)) break;
      GenericRecord rec=reader.get(60, TimeUnit.SECONDS);
      if(rec==null) break;
      System.out.println(((StringField)rec.getField("ThisIsTheField")).getPayload());
    }
  }
 
  public void run()
  {
    try
    {
      this.read(20);
      reader.seek(-10);
      this.read(9);
      reader.seek(-5);
      this.read(3);
      reader.seek(10);
      this.read(5);
      reader.close();
    }catch(Exception ex)
    {
      ex.printStackTrace();
    }
  }
}

==== Main ====

As in Simple Main, using the respective new reader classes

  StringWriter writer=new StringWriter(new LocalWriterProxy());
  RandomIteratorReader reader1=new RandomIteratorReader(writer.getLocator());
  RandomAccessReader reader2=new RandomAccessReader(writer.getLocator());

=== Store ===

As already mentioned, gRS2 is point to point only, and once something has been read it is disposed. Furthermore, once a reader has been initialized, any additional request from a different reader seeking to use the same locator to access the previously accessed gRS2 will fail. However, there are cases where one might want a created gRS2 buffer to be reused by multiple readers. This case differs from the one described in Random Access Reader: in the latter, the gRS2 is read once from its source and is persisted in local storage from then on, and it remains accessible multiple times, but only by the same reader. In contrast, in the former case, every accessor of the stored gRS2 is a new reader, who can in turn use either random or forward-only access.

When creating a gRS2 store, a client can control its behaviour through some parameterization:

  • The Store input is provided at creation time and is not necessarily limited to a single incoming gRS2. Any number of incoming gRS2 buffers can be provided, in which case the content of all of them will be multiplexed into a single new gRS2
  • The type of multiplexing that will take place among the available inputs is a matter of parameterization; currently two policies are available, FIFO and First Available
  • The amount of time the created gRS2 will be maintained while inactive, i.e. not accessed by any reader, is also provided at initialization time

The gRS2 Store is meant to be used as a checkpoint utility and should only be employed in appropriate use cases, because of the performance penalties and resource consumption it imposes. These can be identified in the following two points:

  • A persistency store is utilized by the gRS2 Store in order to have all the content of the incoming locators locally available and be able to serve it without relying on the possibly remote, or even local, input buffers, which will be disposed and become unavailable. For this reason everything is brought under the management of the gRS2 Store and persisted. The only store backend currently available is one based on a random access file, which imposes non-negligible performance penalties. A new, more efficient backend is planned for the short term evolution of the gRS
  • Although gRS2 aims to minimize unnecessary record creation and movement, keeping it at most equal to the buffer capacity, the gRS2 Store operates on a different hypothesis. As already mentioned, its primary use is as part of a checkpointing operation. To this end, the Store, once initialized, will greedily read its inputs to the end, no matter whether any reader is accessing the stored buffer

As already mentioned, when using the gRS2 Store there is no need to modify anything in either the writer or the reader. The Store is a utility and can thus be used externally, without affecting the implicated components. The following example demonstrates the additions needed to persist the output of a number of writers and have a number of readers access the multiplexed output.

  StringWriter writer1=new StringWriter(new TCPWriterProxy());
  StringWriter writer2=new StringWriter(new LocalWriterProxy());
  StringWriter writer3=new StringWriter(new TCPWriterProxy());
  StringWriter writer4=new StringWriter(new LocalWriterProxy());
 
  URI []locators=new URI[]{writer1.getLocator(),writer2.getLocator(),writer3.getLocator(),writer4.getLocator()};
  //The respective TCPStoreWriterProxy.store operation could be used to create a locator with a different usage scope
  URI locator=LocalStoreWriterProxy.store(locators, IBufferStore.MultiplexType.FIFO, 60*60, TimeUnit.SECONDS);
 
  StringReader reader1=new StringReader(locator);
  RandomAccessReader reader2=new RandomAccessReader(locator);
  StringReader reader3=new StringReader(locator);
  RandomIteratorReader reader4=new RandomIteratorReader(locator);
  StringReader reader5=new StringReader(locator);
 
  //Start everything
  writer1.start();
  /*...*/
  reader5.start();
 
  //Join everything
  writer1.join();
  /*...*/
  reader5.join();

Since the gRS2 Store disconnects the writer and the reader, any event emitted by a writer will reach the reader either at the time it is emitted, if the reading is performed while the store is actively persisting its inputs, or at the reader's initialization time, when all the events that have been emitted up to that point are delivered at once. Events emitted by a reader will never reach the writers whose output is persisted.

=== Utilities and Convenience tools ===

==== Record Import ====

It is quite common that a consumer might wish to forward a record to a producer using the same record definition, either completely unaltered or with minor modifications. To avoid explicitly copying the record, thus wasting valuable computational resources, one has the option of importing records directly to a producer. This can be done with the importRecord methods, which have the same signatures as the put methods used to add entirely new records to a writer.

The following snippet demonstrates how one can forward all records read from a reader to a writer without making any modification

for(GenericRecord rec: reader) {
   if(writer.getStatus() == Status.Close || writer.getStatus() == Status.Dispose)
      break;
   if(!writer.importRecord(rec, 60, TimeUnit.SECONDS))
      break;
}

The following snippet demonstrates how one can modify a known record field, leaving all other fields unchanged. In fact, one can be unaware of the existence of fields other than those in which one is interested.

for(GenericRecord rec: reader) {
   ((StringField)rec.getField("payloadField")).setPayload("Modified payload");
   if(writer.getStatus() == Status.Close || writer.getStatus() == Status.Dispose)
      break;
   if(!writer.importRecord(rec, 60, TimeUnit.SECONDS))
      break;
}

==== Keep-Alive Functionality ====

Both forward and random access readers can have keep-alive functionality added to them using a typical decorator pattern. This functionality can be useful if a consumer wishes to delay the consumption of records for long periods of time and the producer has no indication of this behavior. In simple cases when the implementation of a transport protocol using events is not desirable, a keep-alive reader can be used in order to keep the communication channel open, by periodically reading records. Keep-alive functionality is transparent to the client, as each time a get method is called, the record either originates from the set of prefetched records, or it is actually read directly from the underlying reader. The order of records is guaranteed to be the same as if the underlying reader was used on its own. The following snippet demonstrates the creation of a keep-alive reader with a period of record retrieval of 30 seconds:

IRecordReader<Record> reader = new ForwardReader<Record>(this.locator);
reader = new KeepAliveReader<Record>(reader, 30, TimeUnit.SECONDS);

==== Locator Utilities ====

The Locators class provides locator-related convenience methods. Currently, the class provides a Local-to-TCP converter utility which can be useful if it cannot be known in advance whether a produced resultset should be available only locally or be exposed remotely through TCP. The result producer could then produce its results locally and the results can be exposed through TCP by converting the locator to a TCP locator if it is deemed necessary. The exact way of converting a local locator to TCP is shown in the following snippet:

URI localLocator = ... //An operation returning a local locator
URI TCPLocator = Locators.localToTCP(localLocator);

=== HTTP Protocol ===

gRS can also work on top of HTTP instead of TCP. The transferred messages are in XML format. There are corresponding HTTP classes for the ConnectionManager, the proxies, etc., which can be used instead of the TCP classes to enable this functionality.

Note that both ends need to use the same protocol in order to be able to communicate.
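As a sketch, and assuming the HTTP proxy classes follow the naming of their TCP counterparts (the exact class names should be checked against the gRS distribution), switching the earlier Simple example to HTTP would only change the proxy passed to the writer:

```java
//Hypothetical sketch: HTTPWriterProxy is assumed to be the HTTP counterpart of
//TCPWriterProxy; the reader side is driven entirely by the locator and needs no change
StringWriter writer=new StringWriter(new HTTPWriterProxy());
StringReader reader=new StringReader(writer.getLocator());
writer.start();
reader.start();
writer.join();
reader.join();
```

The design choice here mirrors the rest of the framework: the transport is selected solely through the proxy on the writer side, so producer and consumer code stays protocol-agnostic as long as both ends use the same protocol.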