GDL Streams (2.0)


In some of its operations, the gDL relies on streams to model, process, and transfer large-scale data collections. Streams may consist of document descriptions, document identifiers, and document updates. More generally, they may consist of the outcomes of operations that in turn take large-scale collections as input. Streamed processing makes efficient use of both local and remote resources, from local memory to network bandwidth, promoting the overall responsiveness of clients and services through reduced latencies.

Clients that use these operations will need to route streams towards and across the operations of the gDL, converting between different stream interfaces, often injecting application logic in the process. As a common example, a client may need to:

  • route a remote result set of document identifiers towards the read operations of the gDL;
  • process the document descriptions returned by the read operations, e.g. in order to update some of their properties;
  • feed the modified document descriptions to the write operations of the gDL, so as to commit the changes;
  • inspect commit outcomes, so as to report or otherwise handle the failures that may have occurred in the process.

Throughout the workflow, it is important that the client remains within the paradigm of streamed processing, avoiding the accumulation of data in memory in all cases but where strictly required. All the flows are active at once: document identifiers stream in from the remote location of the original result set while document descriptions flow back from yet another remote location, updated document descriptions leave towards that same location, and failures steadily come back for handling.

Streaming raises significant opportunities for clients, as well as non-trivial challenges. In recognition of the difficulties, the gDL includes a set of general-purpose facilities for stream conversion that simplify the tasks of filtering, transforming, or otherwise processing streams. These facilities are cast as the sentences of the Stream DSL, an Embedded Domain-Specific Language (EDSL).


Standard and Remote Iterators

As all the sentences of the Stream DSL take and return streams, we begin by looking at how streams are represented in the gDL.

Streams have the interface of iterators, i.e. they yield elements on demand and are typically consumed within loops. There are two such interfaces:

  • Iterator<T>, the standard Java interface for iterations.
  • RemoteIterator<T>, a variation over Iterator<T> which marks explicitly the remote origin of the stream.

In particular, a RemoteIterator differs from a standard Iterator in two respects:

  • the method next() may throw a checked Exception. This reflects the fact that iterating over the stream involves fallible I/O operations;
  • there is a method locator() that returns a reference to the remote stream as a plain String in some implementation-specific syntax.

Locators aside, the key difference between the two interfaces is in their assumptions about the possibility of iteration failures. A standard Iterator does not present failures to its clients other than for requests made past the end of the stream (an unchecked NoSuchElementException). This may be because failures do not occur at all, e.g. the iteration is over an in-memory collection; it may also be because the iterator knows how to handle failures when these occur. In this sense, Iterator<T> may well be defined over external, even remote collections, but it assumes that all failure handling policies are the responsibility of its implementations.

In contrast, RemoteIterator<T> makes it clear that:

  • failures are likely to occur;
  • clients are expected to handle them.
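
The contrast can be made concrete with a minimal sketch. The interface below is reconstructed from the description above and is not the library's actual declaration: only next() and locator() are described in the text, while hasNext(), the class RemoteIteratorDemo, and the simulated stream are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Assumed shape of the interface: next() may fail, locator() names the remote stream.
interface RemoteIterator<T> {
    boolean hasNext();
    T next() throws Exception;
    String locator();
}

class RemoteIteratorDemo {

    // A stand-in "remote" stream of three elements whose second element fails to arrive.
    static RemoteIterator<String> flakyStream() {
        return new RemoteIterator<String>() {
            private int position = 0;
            public boolean hasNext() { return position < 3; }
            public String next() throws Exception {
                position++;
                if (position == 2) throw new Exception("simulated I/O failure");
                return "doc-" + position;
            }
            public String locator() { return "demo://stream"; }
        };
    }

    // The client, not the iterator, decides what to do with a failure:
    // here it records the failure and carries on with the next element.
    static List<String> consume(RemoteIterator<String> stream) {
        List<String> seen = new ArrayList<String>();
        while (stream.hasNext()) {
            try {
                seen.add(stream.next());
            } catch (Exception e) {
                seen.add("failed: " + e.getMessage());
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        // prints [doc-1, failed: simulated I/O failure, doc-3]
        System.out.println(consume(flakyStream()));
    }
}
```

With a standard Iterator the try/catch in consume() would disappear, and any equivalent decision would have to live inside the iterator's implementation.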

The operations of the gDL make use of both interfaces:

  • when they take streams, they expect them as standard Iterators;
  • when they return streams, they provide them as RemoteIterators.

This choice emphasises two points:

  • streams that are provided by clients are of unknown origin, whereas those provided by the library originate in remote services of the gCube Content Management infrastructure;
  • all fault handling policies are in the hands of clients, where they should be. When clients provide an Iterator to the library, they will have embedded a fault handling policy in its implementation. When they receive a RemoteIterator from the library, they will apply a fault handling policy when consuming the stream.

Simple Conversions

The sentences of the DSL begin with verbs, which can be statically imported from the Streams class:

import static org.gcube.contentmanagement.gcubedocumentlibrary.streams.dsl.Streams.*;
...

The verb convert introduces the simplest of sentences, those that convert between Iterators and RemoteIterators. The following example shows the conversion of an Iterator into a RemoteIterator:

import static org.gcube.contentmanagement.gcubedocumentlibrary.streams.dsl.Streams.*;
...
Iterator<SomeType> it = ...
RemoteIterator<SomeType> rit = convert(it);

The result is a RemoteIterator that promises to return failures but never does. The implementation is just a wrapper around the standard Iterator which returns it.toString() as the locator of the underlying collection.
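
A wrapper of this kind might look like the following sketch. It is not the library's implementation: the RemoteIterator interface is re-declared here with an assumed hasNext() method, and ConvertSketch is a hypothetical name.

```java
import java.util.Arrays;
import java.util.Iterator;

// Assumed shape of the interface (hasNext() is not spelled out in the text).
interface RemoteIterator<T> {
    boolean hasNext();
    T next() throws Exception;
    String locator();
}

class ConvertSketch {

    // Wraps a standard Iterator: the wrapper never throws a checked Exception,
    // and reports it.toString() as the locator of the underlying collection.
    static <T> RemoteIterator<T> convert(final Iterator<T> it) {
        return new RemoteIterator<T>() {
            public boolean hasNext() { return it.hasNext(); }
            public T next() { return it.next(); }
            public String locator() { return it.toString(); }
        };
    }

    public static void main(String[] args) throws Exception {
        Iterator<String> it = Arrays.asList("a", "b").iterator();
        RemoteIterator<String> rit = convert(it);
        while (rit.hasNext()) {
            System.out.println(rit.next());
        }
        System.out.println(rit.locator().equals(it.toString()));
    }
}
```

Note that callers still have to catch or declare Exception when they invoke next() through the RemoteIterator type, even though this particular wrapper cannot raise one.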

Converting a RemoteIterator to an Iterator is more interesting because it requires the encapsulation of a fault handling policy. The following example shows the possibilities:

import static org.gcube.contentmanagement.gcubedocumentlibrary.streams.dsl.Streams.*;
...
RemoteIterator<SomeType> rit = ...
 
//iterator will ignore any fault raised by the remote iterator
Iterator<SomeType> it1 = convert(rit).with(IGNORE_POLICY); 
//iterator will stop at the first fault raised by the remote iterator
Iterator<SomeType> it2 = convert(rit).with(FAILFAST_POLICY); 
//iterator will handle fault as specified by given policy
FaultPolicy policy = new FaultPolicy() {...}; 
Iterator<SomeType> it3 = convert(rit).with(policy);

In this example, the clause with() introduces the fault handling policy to encapsulate in the resulting Iterator. Two common policies are predefined and can be named directly, as shown for it1 and it2 above:

  • IGNORE_POLICY: any faults raised by the RemoteIterator are discarded by the resulting Iterator, which will ensure that hasNext() and next() behave as if they had not occurred;
  • FAILFAST_POLICY: the first fault raised by the RemoteIterator halts the resulting Iterator, which will ensure that hasNext() and next() behave as if the stream had reached its natural end.

Custom policies can be defined by implementing the interface FaultPolicy:

public interface FaultPolicy ... {
 
	boolean onFault(Exception e, int count); 
}

In onFault(), clients are passed the fault raised by the RemoteIterator, as well as the count of faults raised so far during the iteration (this will be greater than 1 only if the policy has tolerated some previous faults during the iteration). Clients apply the policy and return true if the fault should be tolerated and the iteration should continue, false if they instead wish the iteration to stop. Here's an example of a fault handling policy that tolerates only the first error and uses two aliases for the boolean values to improve the legibility of the policy (CONTINUE and STOP, also defined in the Streams class and statically imported):

import static org.gcube.contentmanagement.gcubedocumentlibrary.streams.dsl.Streams.*;
...
FaultPolicy policy = new FaultPolicy() {
 
       public boolean onFault(Exception e, int count) {
             //tolerate the first fault only
             if (count == 1) {
                   // ...dealing with fault...
                   return CONTINUE;
             }
             else
                   return STOP;
       }
};
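
How a policy might be consulted during iteration can be sketched as follows. This is one possible design under stated assumptions, not the gDL's implementation: the adapter PolicyIterator, the demo class, and the three policy constants are hypothetical names, and the interfaces are re-declared locally with an assumed hasNext() on RemoteIterator. The adapter reads one element ahead, so that hasNext() already reflects the outcome of the policy.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

interface RemoteIterator<T> { boolean hasNext(); T next() throws Exception; String locator(); }
interface FaultPolicy { boolean onFault(Exception e, int count); }

// Hypothetical adapter from RemoteIterator to Iterator that encapsulates a FaultPolicy.
class PolicyIterator<T> implements Iterator<T> {
    private final RemoteIterator<T> source;
    private final FaultPolicy policy;
    private T lookahead;
    private boolean ready = false;    // true when lookahead holds an undelivered element
    private boolean stopped = false;  // true once the policy has asked to stop
    private int faults = 0;

    PolicyIterator(RemoteIterator<T> source, FaultPolicy policy) {
        this.source = source;
        this.policy = policy;
    }

    public boolean hasNext() {
        while (!ready && !stopped && source.hasNext()) {
            try {
                lookahead = source.next();
                ready = true;
            } catch (Exception e) {
                // true = tolerate and keep going, false = end the iteration here
                if (!policy.onFault(e, ++faults)) stopped = true;
            }
        }
        return ready;
    }

    public T next() {
        if (!hasNext()) throw new NoSuchElementException();
        ready = false;
        return lookahead;
    }

    public void remove() { throw new UnsupportedOperationException(); }
}

class PolicyDemo {
    static final FaultPolicy TOLERATE_FIRST_ONLY = new FaultPolicy() {
        public boolean onFault(Exception e, int count) { return count == 1; }
    };
    static final FaultPolicy TOLERATE_ALL = new FaultPolicy() {      // behaves like IGNORE_POLICY
        public boolean onFault(Exception e, int count) { return true; }
    };
    static final FaultPolicy STOP_AT_FIRST = new FaultPolicy() {     // behaves like FAILFAST_POLICY
        public boolean onFault(Exception e, int count) { return false; }
    };

    // Stand-in remote stream over 1..5 that fails on even positions.
    static RemoteIterator<Integer> flaky() {
        return new RemoteIterator<Integer>() {
            private int n = 0;
            public boolean hasNext() { return n < 5; }
            public Integer next() throws Exception {
                n++;
                if (n % 2 == 0) throw new Exception("fault at " + n);
                return n;
            }
            public String locator() { return "demo://flaky"; }
        };
    }

    static List<Integer> drain(FaultPolicy policy) {
        List<Integer> out = new ArrayList<Integer>();
        Iterator<Integer> it = new PolicyIterator<Integer>(flaky(), policy);
        while (it.hasNext()) out.add(it.next());
        return out;
    }

    public static void main(String[] args) {
        System.out.println(drain(TOLERATE_FIRST_ONLY)); // [1, 3]
        System.out.println(drain(TOLERATE_ALL));        // [1, 3, 5]
        System.out.println(drain(STOP_AT_FIRST));       // [1]
    }
}
```

The second fault ends the iteration under the first policy because count is then 2, which matches the description of the count parameter given above.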

Note also that the IGNORE_POLICY is the default policy for conversion to standard iterators. Clients can use the clause withDefaults() to avoid naming it.

import static org.gcube.contentmanagement.gcubedocumentlibrary.streams.dsl.Streams.*;
...
RemoteIterator<SomeType> rit = ...
 
//iterator will handle faults with the default policy: IGNORE_POLICY
Iterator<SomeType> it = convert(rit).withDefaults();

Finally, note that stream conversions may also be applied between RemoteIterators, so as to change their FaultPolicy:

import static org.gcube.contentmanagement.gcubedocumentlibrary.streams.dsl.Streams.*;
...
RemoteIterator<SomeType> rit1 = ...
 
//remote iterator will ignore faults, as it is explicitly configured with the IGNORE_POLICY
RemoteIterator<SomeType> rit2 = convert(rit1).withRemote(IGNORE_POLICY);

Here, the clause withRemote() introduces a fault policy for the RemoteIterator in output. Fault policies for RemoteIterator are a superset of those that can be configured on standard Iterators. In particular, they implement the interface RemoteFaultPolicy:

public interface RemoteFaultPolicy ... { 
	boolean onFault(Exception e, int count) throws Exception; 
}

Note that the only difference between a FaultPolicy and a RemoteFaultPolicy is that the latter has the additional option to raise a fault of its own in onFault(). Thus, when a fault occurs during iteration, the RemoteIterator can continue iterating, stop the iteration, but also re-throw the same or another fault to the iterating client, which is indeed what makes a RemoteIterator different from a standard Iterator.

In particular, the Stream DSL predefines a third policy which is available only for RemoteIterators:

  • RETHROW_POLICY: any faults raised during iteration will be immediately propagated to clients;

This is in fact the default policy for RemoteIterators, and clients can use the clause withRemoteDefaults() to avoid naming it. We will see examples of this with verbs other than convert(), for which it makes little sense.
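
The additional option of a RemoteFaultPolicy can be sketched as follows. This is a hypothetical design rather than the library's code: the interfaces are re-declared locally (hasNext() is assumed), and PolicyRemoteIterator, RethrowDemo, and RETHROW are illustrative names. A fault that the policy decides to propagate is parked until the client calls next(), which is the only method that is allowed to throw.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NoSuchElementException;

interface RemoteIterator<T> { boolean hasNext(); T next() throws Exception; String locator(); }
interface RemoteFaultPolicy { boolean onFault(Exception e, int count) throws Exception; }

// Hypothetical adapter between RemoteIterators that encapsulates a RemoteFaultPolicy.
class PolicyRemoteIterator<T> implements RemoteIterator<T> {
    private final RemoteIterator<T> source;
    private final RemoteFaultPolicy policy;
    private T lookahead;
    private Exception pending;        // fault that the policy chose to propagate
    private boolean ready = false;
    private boolean stopped = false;
    private int faults = 0;

    PolicyRemoteIterator(RemoteIterator<T> source, RemoteFaultPolicy policy) {
        this.source = source;
        this.policy = policy;
    }

    public boolean hasNext() {
        while (!ready && pending == null && !stopped && source.hasNext()) {
            try {
                lookahead = source.next();
                ready = true;
            } catch (Exception fault) {
                try {
                    if (!policy.onFault(fault, ++faults)) stopped = true;
                } catch (Exception propagated) {
                    pending = propagated;   // delivered by the following call to next()
                }
            }
        }
        return ready || pending != null;
    }

    public T next() throws Exception {
        if (!hasNext()) throw new NoSuchElementException();
        if (pending != null) {
            Exception e = pending;
            pending = null;
            throw e;
        }
        ready = false;
        return lookahead;
    }

    public String locator() { return source.locator(); }
}

class RethrowDemo {
    // Stand-in for RETHROW_POLICY: hands every fault straight back to the client.
    static final RemoteFaultPolicy RETHROW = new RemoteFaultPolicy() {
        public boolean onFault(Exception e, int count) throws Exception { throw e; }
    };

    // Stand-in remote stream over 1..5 that fails on even positions.
    static RemoteIterator<Integer> flaky() {
        return new RemoteIterator<Integer>() {
            private int n = 0;
            public boolean hasNext() { return n < 5; }
            public Integer next() throws Exception {
                n++;
                if (n % 2 == 0) throw new Exception("fault at " + n);
                return n;
            }
            public String locator() { return "demo://flaky"; }
        };
    }

    static List<String> consume(RemoteIterator<Integer> stream) {
        List<String> log = new ArrayList<String>();
        while (stream.hasNext()) {
            try { log.add(String.valueOf(stream.next())); }
            catch (Exception e) { log.add(e.getMessage()); }
        }
        return log;
    }

    public static void main(String[] args) {
        // prints [1, fault at 2, 3, fault at 4, 5]
        System.out.println(consume(new PolicyRemoteIterator<Integer>(flaky(), RETHROW)));
    }
}
```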

In summary, the Stream DSL allows clients to formulate the following sentences for simple stream conversion:

  • convert(Iterator): converts a standard Iterator into a RemoteIterator;
  • convert(RemoteIterator).with(FaultPolicy): converts a RemoteIterator into a standard Iterator that encapsulates a given FaultPolicy;
  • convert(RemoteIterator).withDefaults(): converts a RemoteIterator into a standard Iterator that encapsulates the IGNORE_POLICY for faults;
  • convert(RemoteIterator).withRemote(RemoteFaultPolicy): converts a RemoteIterator into another RemoteIterator that encapsulates a given RemoteFaultPolicy;
  • convert(RemoteIterator).withRemoteDefaults(): converts a RemoteIterator into another RemoteIterator that encapsulates the RETHROW_POLICY for faults;


ResultSet Conversions

A different but very common form of conversion is between gCube result sets and RemoteIterators. While result sets are the preferred way of modelling remote streams within the system, their iterators do not natively implement the RemoteIterator<T> interface, which has been independently defined in the CML as an abstraction over an underlying result set mechanism. The CML defines an initial set of facilities to perform the conversion from result sets of untyped string payloads to RemoteIterators of typed objects. The Stream DSL builds on these facilities to cater for a few common conversions:


  • payloadsIn(RSLocator): converts an arbitrary result set into a RemoteIterator<String> defined over the record payloads in the result set;
  • urisIn(RSLocator): converts a result set of URI serialisations into a RemoteIterator<URI>;
  • documentsIn(RSLocator): converts a result set of GCubeDocument serialisations into a RemoteIterator<GCubeDocument>;
  • metadataIn(RSLocator): converts a result set of GCubeDocument serialisations into a RemoteIterator<GCubeMetadata> defined over the metadata elements of the GCubeDocuments in the result set;
  • annotationsIn(RSLocator): converts a result set of GCubeDocument serialisations into a RemoteIterator<GCubeAnnotation> defined over the annotations of the GCubeDocuments in the result set;
  • partsIn(RSLocator): converts a result set of GCubeDocument serialisations into a RemoteIterator<GCubePart> defined over the parts of the GCubeDocuments in the result set;
  • alternativesIn(RSLocator): converts a result set of GCubeDocument serialisations into a RemoteIterator<GCubeAlternative> defined over the alternatives of the GCubeDocuments in the result set;


Essentially, documentsIn() adapts the result set to a RemoteIterator<GCubeDocument> that parses documents as it iterates over their serialisations. The methods that follow it in the list do the same, but extract the corresponding GCubeElements from the GCubeDocuments obtained from parsing. All the methods are based on the first one, payloadsIn(), which is also immediately useful to feed result sets of GCubeDocument identifiers to the read operations of the gDL that perform stream-based document lookups.

note: all the conversions above produce RemoteIterators that return the locator of the original result set from invocations of locator(). Clients can use the locator to process the stream with standard set-based APIs, as usual.
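
The layering of typed conversions over untyped payloads can be illustrated with a self-contained sketch. Nothing here uses the actual result set API: the in-memory list stands in for a result set, and payloads(), uris(), hosts(), and PayloadSketch are hypothetical names chosen to mirror payloadsIn() and urisIn().

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

interface RemoteIterator<T> { boolean hasNext(); T next() throws Exception; String locator(); }

class PayloadSketch {

    // Stand-in for payloadsIn(RSLocator): the "result set" is just an in-memory list.
    static RemoteIterator<String> payloads(final String locator, List<String> records) {
        final Iterator<String> it = records.iterator();
        return new RemoteIterator<String>() {
            public boolean hasNext() { return it.hasNext(); }
            public String next() { return it.next(); }
            public String locator() { return locator; }
        };
    }

    // Stand-in for urisIn(RSLocator): parses each payload lazily, as the stream is consumed,
    // and keeps returning the locator of the original result set.
    static RemoteIterator<URI> uris(final RemoteIterator<String> payloads) {
        return new RemoteIterator<URI>() {
            public boolean hasNext() { return payloads.hasNext(); }
            public URI next() throws Exception { return new URI(payloads.next()); }
            public String locator() { return payloads.locator(); }
        };
    }

    // Consumes the typed stream; payloads that cannot be parsed surface as faults in next().
    static List<String> hosts(RemoteIterator<URI> uris) {
        List<String> out = new ArrayList<String>();
        while (uris.hasNext()) {
            try { out.add(uris.next().getHost()); }
            catch (Exception e) { out.add("unparsable"); }
        }
        return out;
    }

    public static void main(String[] args) {
        RemoteIterator<URI> stream =
            uris(payloads("rs://demo", Arrays.asList("http://example.org/a", "not a uri")));
        System.out.println(stream.locator());   // rs://demo
        System.out.println(hosts(stream));      // [example.org, unparsable]
    }
}
```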

The usage pattern is straightforward and combines with the previous conversions. The following example illustrates:

import static org.gcube.contentmanagement.gcubedocumentlibrary.streams.dsl.Streams.*;
...
RSLocator rs = ...
Iterator<GCubeDocument> it = convert(documentsIn(rs)).with(FAILFAST_POLICY);

Piped Conversions

The conversions introduced above do not alter the original streams, i.e. the output iterators produce the same elements as the input iterators. The exception is with result set-based conversions: documentsIn() parses the untyped payloads of the input result sets into typed objects, while methods such as metadataIn() extract GCubeMetadata elements from GCubeDocuments. Parsing and extraction are only examples of the kind of post-processing that clients may wish to apply to the elements of an existing stream to produce a new stream of post-processed elements. All the remaining sentences of the Stream DSL are dedicated precisely to this kind of conversion.

Sentences introduced by the verb pipe take a stream and return a second stream that applies an arbitrary filter to the elements of the first stream, encapsulating a fault handling policy in the process. The following example illustrates basic usage:

import static org.gcube.contentmanagement.gcubedocumentlibrary.streams.dsl.Streams.*;
...
Iterator<GCubeDocument> it1 = ...
 
Filter<GCubeDocument,String> filter = new Filter<GCubeDocument,String>() {
                  //maps each document onto its name
                  public String apply(GCubeDocument doc) throws Exception {
                          return doc.name();
                  }
};
 
Iterator<String> it2 = pipe(it1).through(filter).withDefaults();

Here, a standard Iterator of GCubeDocuments is piped through a filter that extracts the names of GCubeDocuments. The result is another standard Iterator that produces document names from the original stream. The clause through() introduces the filter on the output stream and, as already discussed for conversion methods, the clause withDefaults() configures the default IGNORE_POLICY for it.

As shown in the example, filters are implementations of the Filter<FROM,TO> interface. The method apply() is self-explanatory: clients are given the elements of the unfiltered stream as the filtered stream is being iterated over, and they have the onus to produce and return an element of the filtered stream. The only point worth stressing is that apply() can throw a fault if it cannot produce an element of the filtered stream. The filtered stream passes these faults to the FaultPolicy configured for it. In this example, faults clearly cannot occur. If they did, however, the configured policy would simply ignore them, i.e. the problematic elements of the input stream would not contribute to the contents of the filtered stream.
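
The interplay between apply() and the configured policy can be sketched as follows. The sketch is an illustration under assumptions, not the gDL's code: Filter and FaultPolicy are re-declared locally from their descriptions, while PipedIterator, PipeDemo, and parseAll() are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

interface Filter<FROM, TO> { TO apply(FROM element) throws Exception; }
interface FaultPolicy { boolean onFault(Exception e, int count); }

// Hypothetical piped stream: applies the filter lazily and hands its faults to the policy.
class PipedIterator<FROM, TO> implements Iterator<TO> {
    private final Iterator<FROM> source;
    private final Filter<FROM, TO> filter;
    private final FaultPolicy policy;
    private TO lookahead;
    private boolean ready = false;
    private boolean stopped = false;
    private int faults = 0;

    PipedIterator(Iterator<FROM> source, Filter<FROM, TO> filter, FaultPolicy policy) {
        this.source = source;
        this.filter = filter;
        this.policy = policy;
    }

    public boolean hasNext() {
        while (!ready && !stopped && source.hasNext()) {
            FROM element = source.next();
            try {
                lookahead = filter.apply(element);
                ready = true;
            } catch (Exception e) {
                // a tolerated fault means the element contributes nothing to the output
                if (!policy.onFault(e, ++faults)) stopped = true;
            }
        }
        return ready;
    }

    public TO next() {
        if (!hasNext()) throw new NoSuchElementException();
        ready = false;
        return lookahead;
    }

    public void remove() { throw new UnsupportedOperationException(); }
}

class PipeDemo {
    // Pipes strings through a parsing filter, ignoring the elements that cannot be parsed.
    static List<Integer> parseAll(List<String> input) {
        Filter<String, Integer> parse = new Filter<String, Integer>() {
            public Integer apply(String s) { return Integer.valueOf(s); }
        };
        FaultPolicy ignore = new FaultPolicy() {
            public boolean onFault(Exception e, int count) { return true; }
        };
        List<Integer> out = new ArrayList<Integer>();
        Iterator<Integer> it = new PipedIterator<String, Integer>(input.iterator(), parse, ignore);
        while (it.hasNext()) out.add(it.next());
        return out;
    }

    public static void main(String[] args) {
        System.out.println(parseAll(java.util.Arrays.asList("1", "x", "3"))); // [1, 3]
    }
}
```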

In the example the input stream and the filtered one are both standard Iterators. The construct, however, is generic and can be used to filter any form of stream into any other. In this sense, the construct embeds stream conversions within its clauses. As an example, consider the common case in which a RemoteIterator is filtered into a standard Iterator:

import static org.gcube.contentmanagement.gcubedocumentlibrary.streams.dsl.Streams.*;
...
RemoteIterator<GCubeDocument> rit = ...
 
Filter<GCubeDocument,SomeType> filter = ....;
 
Iterator<SomeType> it = pipe(rit).through(filter).with(FAILFAST_POLICY);

Here, filter is applied to the elements of a RemoteIterator to produce a standard Iterator that stops as soon as the input stream raises a fault. Conversely, consider the following example:

import static org.gcube.contentmanagement.gcubedocumentlibrary.streams.dsl.Streams.*;
...
RemoteIterator<GCubeDocument> rit1 = ...
 
Filter<GCubeDocument,SomeType> filter = ....;
 
RemoteIterator<SomeType> rit2 = pipe(rit1).through(filter).withRemote(IGNORE_POLICY);

Here, filter is applied to the elements of a RemoteIterator to produce yet another RemoteIterator that ignores any fault raised by the input iterator.


To conclude with pipe-based sentences, note that the Stream DSL includes Processor<T>, a base implementation of Filter<FROM,TO> that simplifies the declaration of filters that simply mutate the input and return it. The following example illustrates usage:

import static org.gcube.contentmanagement.gcubedocumentlibrary.streams.dsl.Streams.*;
...
RemoteIterator<GCubeDocument> rit1 = ...
 
Processor<GCubeDocument> processor = new Processor<GCubeDocument>() {
            //updates each document in place
            public void process(GCubeDocument doc) throws Exception {
                    doc.setName(doc.name()+"-modified");
            }
};
 
RemoteIterator<GCubeDocument> rit2 = pipe(rit1).through(processor).withRemoteDefaults();

Here, the processor simply updates the GCubeDocuments in the input stream by changing their name. The output stream thus returns the same elements of the input stream, albeit updated. During iteration, its policy is simply to re-throw any fault that may be raised by the input iterator.
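
One way to picture the relationship between Processor and Filter is the sketch below. It assumes that Processor<T> is an abstract class that implements Filter<T,T>; the text only says that it is a base implementation of Filter, so the exact declaration is a guess, and ProcessorDemo and rename() are hypothetical names.

```java
interface Filter<FROM, TO> { TO apply(FROM element) throws Exception; }

// Assumed declaration: a filter whose output is its own input, after mutation.
abstract class Processor<T> implements Filter<T, T> {

    public final T apply(T element) throws Exception {
        process(element);   // mutate in place
        return element;     // hand the same instance to the output stream
    }

    public abstract void process(T element) throws Exception;
}

class ProcessorDemo {

    // Uses a StringBuilder as a stand-in for a mutable document.
    static String rename(StringBuilder name) {
        Processor<StringBuilder> processor = new Processor<StringBuilder>() {
            public void process(StringBuilder sb) { sb.append("-modified"); }
        };
        try {
            return processor.apply(name).toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(rename(new StringBuilder("doc"))); // doc-modified
    }
}
```

Since a Processor is a Filter, it can be used wherever a through() clause expects a filter, which is consistent with the sentences summarised below.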

In summary, the Stream DSL allows clients to formulate the following sentences for piped stream conversion:

  • pipe(Iterator|RemoteIterator).through(Filter|Processor).with(FaultPolicy): uses a given Filter or Processor to convert a standard Iterator or a RemoteIterator into a standard Iterator that encapsulates a given FaultPolicy;
  • pipe(Iterator|RemoteIterator).through(Filter|Processor).withDefaults(): uses a given Filter or Processor to convert a standard Iterator or a RemoteIterator into a standard Iterator that encapsulates the IGNORE_POLICY for faults;
  • pipe(Iterator|RemoteIterator).through(Filter|Processor).withRemote(RemoteFaultPolicy): uses a given Filter or Processor to convert a standard Iterator or a RemoteIterator into a RemoteIterator that encapsulates a given RemoteFaultPolicy;
  • pipe(Iterator|RemoteIterator).through(Filter|Processor).withRemoteDefaults(): uses a given Filter or Processor to convert a standard Iterator or a RemoteIterator into a RemoteIterator that encapsulates the RETHROW_POLICY for faults;

Finally, the Stream DSL offers a couple of methods that encapsulate pipe sentences to extract content URIs from GCubeElements:

  • <E extends GCubeElement> urisOf(Iterator<E>|RemoteIterator<E>): converts a standard Iterator or a RemoteIterator over GCubeElements into, respectively, a standard Iterator or a RemoteIterator over the content URIs of the elements.

Folding Conversions

With pipe-based sentences, clients can filter the elements of a stream into the elements of another stream. While the elements of the two streams can vary arbitrarily in type, the correspondence between elements of the two streams is fairly strict: for each element of the input stream there may be at most one element of the output stream (elements that raise iteration failures in the input stream may have no counterpart in the output stream, i.e. may be discarded). In this sense, the streams are always consumed in phase.

In some cases, however, clients may wish to:

  • fold a stream, i.e. produce another stream that has one List element for each N elements of the original stream;
  • unfold a stream, i.e. produce another stream that has N elements for each element in the original stream.

Conceptually, these requirements are still within the scope of filtering, but the fact that the consumption of the filtered stream is out of phase with respect to the unfiltered stream requires a rather different treatment. For this reason, the Stream DSL offers two dedicated classes of sentences:

  • group-based sentences for stream folding;
  • unfold-based sentences for stream unfolding.

To fold a stream, clients indicate how many elements of the stream should be grouped into elements of the folded stream, what filter should be applied to each of the elements of the stream and, as usual, what fault handling policy should be used for the folded stream. The following example illustrates usage in the common case in which a RemoteIterator is folded into a standard Iterator:

import static org.gcube.contentmanagement.gcubedocumentlibrary.streams.dsl.Streams.*;
...
RemoteIterator<GCubeDocument> rit = ...
 
Filter<GCubeDocument,SomeType> filter = ....;
 
Iterator<List<SomeType>> it = group(rit).in(10).pipingThrough(filter).withDefaults();

The RemoteIterator is here folded into Lists of 10 elements (or smaller, if the end of the input stream is reached before a List of 10 elements has been accumulated). The clause in() indicates the maximum size of the output Lists. Each of the GCubeDocuments in the original stream is then passed through filter, which produces one of the List elements for it. The clause pipingThrough() allows the configuration of the filter. Finally, the default IGNORE_POLICY is set on the folded stream with the clause withDefaults(), meaning that any fault raised by the RemoteIterator or filter will be tolerated and the element that caused the failure will simply not contribute to the accumulation of the next 10 elements of the folded stream.

note: the example shows the folding of a RemoteIterator into a standard Iterator but, as for all the sentences of the DSL, all combinations of input and output streams are possible, with the usual implications on the fault handling policies that can be set on the folded stream and with the optional choice of Processors over Filters in cases where folding simply groups updated elements of the stream.

It is a common requirement to fold a stream without transforming or otherwise altering its elements. In this case, the clause pipingThrough() can be omitted altogether from the sentence:

import static org.gcube.contentmanagement.gcubedocumentlibrary.streams.dsl.Streams.*;
...
RemoteIterator<GCubeDocument> rit = ...
 
Iterator<List<GCubeDocument>> it = group(rit).in(10).withDefaults();

Effectively, the stream is here being filtered with a pass-through filter that simply returns the elements of the unfolded stream. As we shall see, this kind of folding is particularly useful to 'slice' a stream into small in-memory collections that can be used with the write operations of the gDL that work in bulk and by-value.
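
The out-of-phase consumption that folding implies can be seen in a minimal sketch of the pass-through case. It works on standard Iterators only and leaves fault handling policies out altogether; GroupSketch, group(), and toList() are hypothetical names and do not reproduce the clauses of the DSL.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

class GroupSketch {

    // Folds a stream into Lists of at most 'size' elements, pulling from the
    // source only when the client asks for the next List.
    static <T> Iterator<List<T>> group(final Iterator<T> source, final int size) {
        if (size < 1) throw new IllegalArgumentException("size must be positive");
        return new Iterator<List<T>>() {
            public boolean hasNext() { return source.hasNext(); }
            public List<T> next() {
                if (!source.hasNext()) throw new NoSuchElementException();
                List<T> chunk = new ArrayList<T>(size);
                while (chunk.size() < size && source.hasNext()) {
                    chunk.add(source.next());
                }
                return chunk;   // the last chunk may be smaller than 'size'
            }
            public void remove() { throw new UnsupportedOperationException(); }
        };
    }

    // Drains an iterator into a List, for inspection.
    static <T> List<T> toList(Iterator<T> it) {
        List<T> out = new ArrayList<T>();
        while (it.hasNext()) out.add(it.next());
        return out;
    }

    public static void main(String[] args) {
        Iterator<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5).iterator();
        System.out.println(toList(group(numbers, 2))); // [[1, 2], [3, 4], [5]]
    }
}
```

At any time, at most one List of 'size' elements is held in memory, which is what makes the folded stream suitable for bulk operations that work by-value.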


In summary, the Stream DSL allows clients to formulate the following sentences for folding stream conversion:

  • group(Iterator|RemoteIterator).in(N).pipingThrough(Filter|Processor).with(FaultPolicy): uses a given Filter or Processor to N-fold a standard Iterator or a RemoteIterator into a standard Iterator that encapsulates a given FaultPolicy;
  • group(Iterator|RemoteIterator).in(N).pipingThrough(Filter|Processor).withDefaults(): uses a given Filter or Processor to N-fold a standard Iterator or a RemoteIterator into a standard Iterator that encapsulates the IGNORE_POLICY for faults;
  • group(Iterator|RemoteIterator).in(N).pipingThrough(Filter|Processor).withRemote(RemoteFaultPolicy): uses a given Filter or Processor to N-fold a standard Iterator or a RemoteIterator into a RemoteIterator that encapsulates a given RemoteFaultPolicy;
  • group(Iterator|RemoteIterator).in(N).pipingThrough(Filter|Processor).withRemoteDefaults(): uses a given Filter or Processor to N-fold a standard Iterator or a RemoteIterator into a RemoteIterator that encapsulates the RETHROW_POLICY for faults;
  • group(Iterator|RemoteIterator).in(N).with(FaultPolicy): uses a pass-through filter to N-fold a standard Iterator or a RemoteIterator into a standard Iterator that encapsulates a given FaultPolicy;
  • group(Iterator|RemoteIterator).in(N).withDefaults(): uses a pass-through filter to N-fold a standard Iterator or a RemoteIterator into a standard Iterator that encapsulates the IGNORE_POLICY for faults;
  • group(Iterator|RemoteIterator).in(N).withRemote(RemoteFaultPolicy): uses a pass-through filter to N-fold a standard Iterator or a RemoteIterator into a RemoteIterator that encapsulates a given RemoteFaultPolicy;
  • group(Iterator|RemoteIterator).in(N).withRemoteDefaults(): uses a pass-through filter to N-fold a standard Iterator or a RemoteIterator into a RemoteIterator that encapsulates the RETHROW_POLICY for faults.


Unfolding Conversions

Unfolding a stream follows a similar pattern, as shown in the following example:

import static org.gcube.contentmanagement.gcubedocumentlibrary.streams.dsl.Streams.*;
...
RemoteIterator<GCubeDocument> rit = ...
 
Filter<GCubeDocument,List<SomeType>> filter = ....;
 
Iterator<SomeType> it = unfold(rit).pipingThrough(filter).withDefaults();

This time we cannot dispense with using a Filter, which is necessary to map a single element of the stream into a List of elements that the unfolded stream, a standard Iterator in this example, will then yield one at a time at the client's demand. As usual, all combinations of standard Iterators, RemoteIterators, and fault handling policies are allowed. Using Processors is instead disallowed here, as it is in the nature of unfolding to convert an element into a number of different elements. Unfolding and updates, in other words, do not interact well.

The most common application of unfolding is for the extraction of inner elements from documents, e.g. unfold a stream of GCubeDocuments into a stream of GCubeMetadata elements, where each element in the unfolded stream belongs to some GCubeDocument in the document stream. Accordingly, the Stream DSL predefines a comprehensive number of these unfoldings. We have seen some of them already, where the document input stream was in the form of a result set (e.g. metadataIn(RSLocator)). Similar unfoldings are directly available on RemoteIterator<GCubeDocument>s.
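
The mechanics of unfolding can be sketched on standard Iterators as follows. The sketch hard-wires the behaviour of the IGNORE_POLICY instead of accepting a policy, and UnfoldSketch, unfold(), and words() are hypothetical names; the Filter interface is re-declared locally from its description.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

interface Filter<FROM, TO> { TO apply(FROM element) throws Exception; }

class UnfoldSketch {

    // Maps each input element onto a List and yields the List elements one at a time.
    static <FROM, TO> Iterator<TO> unfold(final Iterator<FROM> source,
                                          final Filter<FROM, List<TO>> filter) {
        return new Iterator<TO>() {
            private Iterator<TO> current = Collections.<TO>emptyList().iterator();

            public boolean hasNext() {
                // move to the next input element only when the current List is exhausted
                while (!current.hasNext() && source.hasNext()) {
                    FROM element = source.next();
                    try {
                        current = filter.apply(element).iterator();
                    } catch (Exception ignored) {
                        // as with the IGNORE_POLICY: the element contributes nothing
                    }
                }
                return current.hasNext();
            }

            public TO next() {
                if (!hasNext()) throw new NoSuchElementException();
                return current.next();
            }

            public void remove() { throw new UnsupportedOperationException(); }
        };
    }

    // Unfolds lines into words; a null line makes the filter fail and is skipped.
    static List<String> words(List<String> lines) {
        Filter<String, List<String>> split = new Filter<String, List<String>>() {
            public List<String> apply(String line) { return Arrays.asList(line.split(" ")); }
        };
        List<String> out = new ArrayList<String>();
        Iterator<String> it = unfold(lines.iterator(), split);
        while (it.hasNext()) out.add(it.next());
        return out;
    }

    public static void main(String[] args) {
        System.out.println(words(Arrays.asList("a b", null, "c"))); // [a, b, c]
    }
}
```

An unfolding such as metadataIn() can be read in the same way, with a filter that returns the metadata elements of each document.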


In summary, the Stream DSL allows clients to formulate the following sentences for unfolding stream conversion:

  • unfold(Iterator|RemoteIterator).pipingThrough(Filter).with(FaultPolicy): uses a given Filter to unfold a standard Iterator or a RemoteIterator into a standard Iterator that encapsulates a given FaultPolicy;
  • unfold(Iterator|RemoteIterator).pipingThrough(Filter).withDefaults(): uses a given Filter to unfold a standard Iterator or a RemoteIterator into a standard Iterator that encapsulates the IGNORE_POLICY for faults;
  • unfold(Iterator|RemoteIterator).pipingThrough(Filter).withRemote(RemoteFaultPolicy): uses a given Filter to unfold a standard Iterator or a RemoteIterator into a RemoteIterator that encapsulates a given RemoteFaultPolicy;
  • unfold(Iterator|RemoteIterator).pipingThrough(Filter).withRemoteDefaults(): uses a given Filter to unfold a standard Iterator or a RemoteIterator into a RemoteIterator that encapsulates the RETHROW_POLICY for faults;
  • metadataIn(Iterator<GCubeDocument>|RemoteIterator<GCubeDocument>): unfolds a standard Iterator<GCubeDocument> or a RemoteIterator<GCubeDocument> into, respectively, an Iterator<GCubeMetadata> or a RemoteIterator<GCubeMetadata> defined over the metadata elements of the GCubeDocuments in the original stream;
  • annotationsIn(Iterator<GCubeDocument>|RemoteIterator<GCubeDocument>): unfolds a standard Iterator<GCubeDocument> or a RemoteIterator<GCubeDocument> into, respectively, an Iterator<GCubeAnnotation> or a RemoteIterator<GCubeAnnotation> defined over the annotations of the GCubeDocuments in the original stream;
  • partsIn(Iterator<GCubeDocument>|RemoteIterator<GCubeDocument>): unfolds a standard Iterator<GCubeDocument> or a RemoteIterator<GCubeDocument> into, respectively, an Iterator<GCubePart> or a RemoteIterator<GCubePart> defined over the parts of the GCubeDocuments in the original stream;
  • alternativesIn(Iterator<GCubeDocument>|RemoteIterator<GCubeDocument>): unfolds a standard Iterator<GCubeDocument> or a RemoteIterator<GCubeDocument> into, respectively, an Iterator<GCubeAlternative> or a RemoteIterator<GCubeAlternative> defined over the alternatives of the GCubeDocuments in the original stream.