GDL Views (2.0)


Some clients interact with remote collections to work exclusively with subsets of document descriptions that share certain properties, e.g. are in a given language, have changed in the last month, have metadata in a given schema, have parts of a given type, and so on. Their queries and updates are always resolved within these subsets, rather than the whole collection. Essentially, such clients have their own view of the collection.

The gDL offers support for working with two types of view:

  • local views: these are views defined by individual clients as the context for a number of subsequent queries and updates. Local views may have arbitrarily long lifetimes, and may even outlive the client that created them, but they are never used by multiple clients. Local views are thus commonly transient and, if their definitions are persisted at all, they are persisted locally to the 'owning' client and remain under its direct responsibility.
  • remote views: these are views defined by some clients and used by many others within the system. Remote views outlive all such clients and persist in the infrastructure, typically for as long as the collection does. They are defined through the View Manager service (VM), which materialises them as WS-Resources. Each VM resource encapsulates the definition of the view as well as its descriptive properties, and it is responsible for managing its lifetime, e.g. keeping track of its cardinality and notifying interested clients of changes to its contents. However, VM resources are 'passive', i.e. they do not mediate access to the contents of the view.

Naturally, the gDL uses projections as view definitions. It then offers specialised Readers that encapsulate such projections to implicitly resolve all their operations in the scope of the view. This yields view-based access to collections and allows clients to work with local views. In addition, the gDL provides local proxies of VM resources with which clients can create, discover, and inspect remote views. As these proxies map remote view definitions onto projections, remote views can be accessed with the same reading mechanisms available for local views.


Local Views

To work with a local view of a remote collection, a gDL client first creates the projection that defines the view. The client then injects the projection into a ViewReader, along with another Reader already configured to access the target collection. The ViewReader implements the Reader interface, offering all the read operations discussed above. When any of its operations is called, however, the ViewReader merges the view definition with the input projection, combining their constraints. It then passes the merged projection to the inner Reader, which executes the operation. Effectively, this resolves the operation in the scope of the view.

The following example illustrates the approach:

import static java.util.Locale.*;
import static org.gcube.contentmanagement.gcubedocumentlibrary.projections.Projections.*;
 
//a reader configured to access a target collection.
Reader reader = ...
 
//define a local view
DocumentProjection view = document().whereValue(LANGUAGE,FRENCH);
 
//inject view and reader into a view reader
ViewReader viewReader = new ViewReader(view,reader);
 
GCubeDocument doc = viewReader.get("...some id...", document().with(NAME));

Here, the view includes only the document descriptions (with a bytestream) in a given language. The lookup operation retrieves the target description only if it has a name and is in the view. At runtime, the DocumentProjection that defines the view is merged with the projection passed to the lookup operation. This produces the same effect that the following projection would produce if it was executed by a plain Reader:

document().whereValue(LANGUAGE,FRENCH).with(NAME);
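
The merge-and-delegate behaviour described above can be sketched with a toy model. The sketch below is not the gDL API: Projection, Reader, MapReader and the in-memory documents are hypothetical stand-ins, and a projection is reduced to a simple filter over document properties. It only illustrates how a view reader might combine the view definition with the input projection before delegating to an inner reader.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

final class ViewReaderSketch {

    /** Toy projection: a filter over a document's properties (hypothetical, not the gDL type). */
    interface Projection {
        boolean accepts(Map<String, String> doc);

        /** Merging combines the constraints of both projections. */
        default Projection merge(Projection other) {
            return doc -> accepts(doc) && other.accepts(doc);
        }
    }

    /** Value constraint on a property, loosely modelled on whereValue(). */
    static Projection whereValue(String property, String value) {
        return doc -> value.equals(doc.get(property));
    }

    /** Existence constraint on a property, loosely modelled on with(). */
    static Projection with(String property) {
        return doc -> doc.containsKey(property);
    }

    interface Reader {
        Optional<Map<String, String>> get(String id, Projection projection);
    }

    /** Stand-in for a plain reader: looks documents up in an in-memory map. */
    static final class MapReader implements Reader {
        private final Map<String, Map<String, String>> docs;

        MapReader(Map<String, Map<String, String>> docs) {
            this.docs = docs;
        }

        @Override
        public Optional<Map<String, String>> get(String id, Projection projection) {
            return Optional.ofNullable(docs.get(id)).filter(projection::accepts);
        }
    }

    /** Decorator that merges the view into every input projection, then delegates. */
    static final class ViewReader implements Reader {
        private final Projection view;
        private final Reader inner;

        ViewReader(Projection view, Reader inner) {
            this.view = view;
            this.inner = inner;
        }

        @Override
        public Optional<Map<String, String>> get(String id, Projection projection) {
            return inner.get(id, view.merge(projection));
        }
    }

    /** Three sample documents: French with a name, English with a name, French without a name. */
    static Reader sampleReader() {
        Map<String, Map<String, String>> docs = new HashMap<>();
        docs.put("d1", Map.of("LANGUAGE", "fr", "NAME", "rapport"));
        docs.put("d2", Map.of("LANGUAGE", "en", "NAME", "report"));
        docs.put("d3", Map.of("LANGUAGE", "fr"));
        return new MapReader(docs);
    }

    public static void main(String[] args) {
        Reader viewReader = new ViewReader(whereValue("LANGUAGE", "fr"), sampleReader());
        System.out.println(viewReader.get("d1", with("NAME")).isPresent());
        System.out.println(viewReader.get("d2", with("NAME")).isPresent());
    }
}
```

Because the toy ViewReader is itself a Reader, it can wrap another ViewReader, which is one way to picture how views nest.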

The example above defines the view as a DocumentProjection but any projection can be used for the purpose (e.g. a MetadataProjection, an AnnotationProjection, etc). In general, clients have the same flexibility in defining views as they do in invoking the operations of Readers: any projection that can be used in one context can also be used in the other. Clients will choose DocumentProjections when the view needs to characterise properties of entire documents and/or inner elements of different types. They will instead prefer more specific projections when the view is predicated on properties of inner elements of the same type. For example, a view that characterises document descriptions based only on the schema of their metadata elements is more conveniently defined with a MetadataProjection:

...
//define a local view
MetadataProjection view = metadata().whereValue(SCHEMA_URI,"..some schema..");
...
GCubeDocument doc = viewReader.get("...some id...", document().with(NAME));
...

The operation above would look up the target document description only if it has a name and at least one metadata element in the given schema.

Similarly, clients are free to pass any projection with the operations of the ViewReader, including those that "diverge" arbitrarily from the view:

...
//define a local view
DocumentProjection view = document().where(PART);
...
GCubeDocument doc = viewReader.get("...some id...", metadata().with(BYTESTREAM));
...

The operation above would look up the target document description only if it has at least one part and one metadata element with an inlined bytestream.

note: The Reader injected into a ViewReader may be a "plain" DocumentReader, a CachingReader wrapped around a DocumentReader, or even another ViewReader. In the latter case, the views embedded in the two readers would effectively nest.

The freedom in merging view definitions with other projections is limited only by one obvious requirement: the merged projection must not retrieve documents that are outside the view. The ViewReader detects projections that break the view abstraction and refuses them as parameters of its operations. For example:

...
//define a local view
DocumentProjection view = document().whereValue(LANGUAGE,FRENCH);
...
try {
  GCubeDocument doc = viewReader.get("...some id...", document().with(NAME).whereValue(LANGUAGE,ENGLISH));
  assert(false);
}
catch(InvalidProjectionException e) {
  assert(true);
}
...

This attempt generates an InvalidProjectionException, as document descriptions in English are not part of the view.
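
One way to picture this check is as a merge over value constraints that fails when the input asks for a value the view excludes. The sketch below is a toy model under that assumption: projections are reduced to maps from property names to required values, and the InvalidProjectionException declared here is a local stand-in, not the gDL class. How the ViewReader actually detects invalid projections is not described in this page.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class ViewGuardSketch {

    /** Local stand-in for the exception raised on projections that leave the view. */
    static final class InvalidProjectionException extends RuntimeException {
        InvalidProjectionException(String message) {
            super(message);
        }
    }

    /**
     * Merges the value constraints of a view with those of an input projection.
     * Overlaps are accepted only when both sides require the same value.
     */
    static Map<String, String> merge(Map<String, String> view, Map<String, String> input) {
        Map<String, String> merged = new LinkedHashMap<>(view);
        for (Map.Entry<String, String> constraint : input.entrySet()) {
            String required = view.get(constraint.getKey());
            if (required != null && !required.equals(constraint.getValue())) {
                throw new InvalidProjectionException(
                    constraint.getKey() + "=" + constraint.getValue() + " is outside the view");
            }
            merged.put(constraint.getKey(), constraint.getValue());
        }
        return merged;
    }

    public static void main(String[] args) {
        Map<String, String> view = Map.of("LANGUAGE", "fr");
        System.out.println(merge(view, Map.of("NAME", "rapport")));
        try {
            merge(view, Map.of("LANGUAGE", "en"));
        } catch (InvalidProjectionException e) {
            System.out.println("refused: " + e.getMessage());
        }
    }
}
```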

The possibility of an invalid projection emerges as soon as there is an overlap between the properties specified in view definitions and those specified in the input projections (as for LANGUAGE above). A crude policy to enforce views would be to prevent overlaps altogether. This policy is too inflexible, for two main reasons:

  • like all projections, view definitions specify retrieval directives (i.e. with(), where(), opt()), which determine what is to be retrieved and what should not move over the network. While clients can exploit these directives to define defaults within view definitions, it is important that they may freely override them on a per-operation basis, e.g. replace a with() in the view definition with a where() in the input projection (to avoid retrieval of unnecessary data). For this reason, it is important to handle overlaps that signal overriding of defaults, provided that the operation can still be guaranteed to remain within the scope of the view. A variation of the example above illustrates the point:
...
//define a local view
DocumentProjection view = document().whereValue(LANGUAGE,FRENCH);
...
GCubeDocument doc = viewReader.get("...some id...", document().with(NAME).withValue(LANGUAGE,FRENCH));
...
Here the input projection is allowed to overlap with the view definition on LANGUAGE for overriding purposes. Notice that overriding is allowed here because the constraints imposed by the view are preserved.
  • note: in line with all the previous examples, it is good practice to use where() directives in view definitions, so as to limit the need for overriding and to allow clients to specify explicitly only what they wish to retrieve.
  • the other reason to deal with overlaps between view definitions and input projections is simply flexibility. In principle, input projections ought to be allowed to refine the view with further constraints. While it is not possible to recognise all refinements statically (and in some cases it is genuinely difficult even when it is possible), ViewReaders recognise and allow a number of common refinements, including:
  • refinements on existence constraints. If the view requires the mere existence of a property, then an input projection can specify a further constraint on it. For example, the view document().where(NAME) can be refined (and overridden) by the input projection document().withValue(NAME,"myname");
  • refinements on deep projections. If the view constrains properties of inner elements, then an input projection can constrain other properties of those elements, or even the same properties if the refinement is allowed in turn. For example, the view document().where(METADATA,metadata().with(NAME)) can be refined by the input projection document().with(METADATA, metadata().withValue(NAME,"myname")).

Flexibility is not only introduced by refinements:

  • a view constraint may also be overridden by a widening input projection. For example, the view document().where(NAME,"myname") can be overridden by the wider document().with(NAME). The two projections merge into the projection document().with(NAME,"myname"). Thus clients are not required to know the details of the view if they wish to retrieve the names of document descriptions.
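
The overriding, refinement and widening cases above can be summarised with a toy merge rule on a single property. The sketch below is an illustration under simplifying assumptions, not the gDL algorithm: a constraint is reduced to an optional required value plus a flag that says whether the property is retrieved (with-style) or only filtered on (where-style), and all names are hypothetical. The rule keeps the most specific value, lets the input projection decide the retrieval directive, and refuses conflicting values.

```java
import java.util.Objects;

final class ConstraintMergeSketch {

    /** Toy constraint on one property (hypothetical model, not a gDL type). */
    static final class Constraint {
        final String value;      // null means the property must merely exist
        final boolean retrieve;  // true for with-style, false for where-style

        Constraint(String value, boolean retrieve) {
            this.value = value;
            this.retrieve = retrieve;
        }
    }

    static Constraint where() {
        return new Constraint(null, false);
    }

    static Constraint whereValue(String value) {
        return new Constraint(Objects.requireNonNull(value), false);
    }

    static Constraint with() {
        return new Constraint(null, true);
    }

    static Constraint withValue(String value) {
        return new Constraint(Objects.requireNonNull(value), true);
    }

    /** Merges the view's constraint on a property with the input's constraint on the same property. */
    static Constraint merge(Constraint view, Constraint input) {
        if (view.value != null && input.value != null && !view.value.equals(input.value)) {
            throw new IllegalArgumentException("input constraint falls outside the view");
        }
        // keep whichever side names a value: the view's if present, else the input's refinement
        String value = view.value != null ? view.value : input.value;
        // the input projection overrides the retrieval directive of the view
        return new Constraint(value, input.retrieve);
    }

    public static void main(String[] args) {
        Constraint refined = merge(where(), withValue("myname"));
        Constraint widened = merge(whereValue("myname"), with());
        System.out.println(refined.value + " retrieved=" + refined.retrieve);
        System.out.println(widened.value + " retrieved=" + widened.retrieve);
    }
}
```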

To conclude on the possibilities for view-based access, notice that empty projections continue to retain their simplicity of use under views. For example:

...
//define a local view
PartProjection view = part().whereValue(LANGUAGE,FRENCH);
...
GCubeDocument doc = viewReader.get("...some id...", part());
...

continues to retrieve all parts of documents in the view. Here, the view definition and the input projection merge into a projection that has the single constraint of the view and optional include constraints for all the remaining document properties.

Similarly, catch-all constraints such as etc() and allexcept() retain their semantics under views, and can be used in input projections with the usual expectations (using them in view definitions as defaults for retrieval is also possible, though discouraged in the general case for the reasons discussed previously).

Remote Views

When working with remote views, accessing documents in the view is not the only client requirement. Publishing a remote view, i.e. sharing it with other remote clients, and discovering existing views with given properties are also common tasks for clients. In the gDL, support for these tasks is mostly found in CollectionViews.

A CollectionView is a local proxy of a remote view, much as GCubeDocuments are local proxies of remote document descriptions. More specifically, collection views are document-oriented abstractions over the tree-oriented View proxies of WS-Resources of the View Manager service, as offered by the client library of the service.

Most properties of View proxies carry over directly to CollectionViews, including:

  • the collection identifier (cf. collectionId());
  • the view identifier (cf. id());
  • the descriptive name of the view, e.g. "myview" (cf. name());
  • the free-form description of the view, e.g. "all document descriptions such that..." (cf. description());
  • the broad type of the view, e.g. a QName (cf. type());
  • the time of last update of the view (cf. lastUpdate());
  • the cardinality of the view (cf. cardinality());
  • the generic properties of the view (cf. properties()).

Most importantly, the CM Predicates used in View proxies to characterise the document descriptions in the view become Projections over GCubeElements, whether entire document descriptions (i.e. GCubeDocuments) or specific inner elements of such descriptions (e.g. GCubeMetadata).

CollectionView defines a read-only interface to the properties of the view, which supports generic programming tasks over views. Most clients however will work with concrete implementations of the interface. The current version of the gDL includes the following ones:

  • GenericView: a generic implementation for arbitrary remote views;
  • MetadataView: an implementation for remote views defined over a simple set of metadata element properties;
  • AnnotationView: an implementation for remote views defined over a simple set of annotation properties.

All CollectionView implementations inherit from the abstract BaseCollectionView, while MetadataView and AnnotationView inherit from the more specific SimpleView.

note: Custom implementations can be created as well, by specialising GenericView or, more commonly, BaseCollectionView which has been explicitly designed for open-ended extensibility.

Across all implementations, CollectionViews may be in either one of two states:

  • unbound: this is the state of instances that are not associated with remote views. In this state, CollectionViews can be used for view publication and view discovery, but not for view-based access to collections.
  • bound: this is the state of instances that are bound to remote views. In this state, CollectionViews can be used for view-based access but not for view publication or view discovery.

Typically, CollectionViews are created unbound and used for publication or discovery, depending on the general goals of the client. Both operations introduce bindings: a CollectionView becomes bound after publication, and all the CollectionViews returned from discovery operations are already bound; these can then be used for view-based access. In less common cases, clients may start with a VM proxy and inject it directly into a new CollectionView, which is thus instantiated in a bound state. All the implementations of CollectionView are responsible for enforcing state-based constraints.
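
The two states and their constraints can be pictured with a small state-checking class. The sketch below is a toy under stated assumptions: ToyView is hypothetical, publication and discovery are simulated locally with no service calls, and the reader is reduced to a string. It only shows the kind of guards an implementation might apply before each operation.

```java
import java.util.Collections;
import java.util.List;

final class ViewStateSketch {

    /** Toy proxy that tracks whether it is bound to a (simulated) remote view. */
    static final class ToyView {
        private final String id;
        private boolean bound;

        /** New proxies start unbound. */
        ToyView(String id) {
            this(id, false);
        }

        private ToyView(String id, boolean bound) {
            this.id = id;
            this.bound = bound;
        }

        boolean isBound() {
            return bound;
        }

        /** Publication is allowed only on unbound proxies and binds the proxy. */
        void publish() {
            requireUnbound("publish()");
            bound = true;
        }

        /** Discovery is allowed only on unbound proxies and returns bound results. */
        List<ToyView> findSimilar() {
            requireUnbound("findSimilar()");
            return Collections.singletonList(new ToyView(id, true));
        }

        /** View-based access is allowed only on bound proxies. */
        String reader() {
            if (!bound) {
                throw new IllegalStateException("reader() requires a bound view");
            }
            return "reader for view " + id;
        }

        private void requireUnbound(String operation) {
            if (bound) {
                throw new IllegalStateException(operation + " requires an unbound view");
            }
        }
    }

    public static void main(String[] args) {
        ToyView view = new ToyView("v1");
        System.out.println(view.isBound());
        view.publish();
        System.out.println(view.isBound());
        System.out.println(view.reader());
    }
}
```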

Publishing Views

We use GenericViews to illustrate how clients can publish, discover and use remote views. We then show how working with MetadataViews and AnnotationViews changes the basic usage patterns.

Publishing a remote view involves creating a proxy for it in a given scope, setting its properties, and invoking the method publish() on it:

import static org.gcube.contentmanagement.gcubedocumentlibrary.projections.Projections.*;
import static java.util.Locale.*;
 
GCubeScope scope = ...
 
//creates unbound view in given scope
GenericView view = new GenericView(scope); 
assert(view.isBound()==false);
 
//sets view properties
view.setId("...");
view.setCollectionId("...");
view.setName("...");
view.setDescription("...");
 
//sets view definition
view.setProjection(document().where(LANGUAGE,FRENCH)); 
 
//publish view
view.publish(); 
assert(view.isBound());

We create a GenericView in a given scope and set its properties, including the projection that defines it. We then publish the view in the scope in which we created it. The assertions show that the GenericView is unbound until its publication, after which it is bound.

A few points to notice about view publication:

  • in the example, we have used a DocumentProjection for documents (with a bytestream) in French. As usual, clients can use the type of projection that is most convenient to express their constraints (e.g. a MetadataProjection). Regardless of the type of the injected projection, a GenericView always returns a DocumentProjection from its projection() method, as in the general case the definition of the view may be acquired during discovery, when its gDL type is statically unknown.
  • views can also be published with the method publishAndBroadcast(). As the name suggests, this variant induces the View Manager service to broadcast a record of the view's creation. This is then used for autonomic state replication across the instances of the service, in line with its scalability mechanisms.
  • some properties of the view (e.g. time of last update) are set on the view by the View Manager at the point of creation. During publication these properties are automatically synchronised on the local proxy.
  • we did not set the type of the view before publication. The type of a GenericView is in fact constant (the name of the class itself, GenericView) and clients cannot alter it. This constancy is found in all other implementations of CollectionView, as it ensures that upon discovering views we can recognise the classes to which we should bind them.

note: View publication requires unbound proxies. Any attempt to invoke publish() or publishAndBroadcast() on a bound proxy will result in an IllegalStateException.

note: Since version 2.1.0, all remote views can be created without an explicit reference to a scope. In this case, the view will be created and henceforth published in the current scope, i.e. the scope set for the current thread on the default scope manager (GCUBEScopeManager.DEFAULT).

Discovering Views

CollectionViews can act as simple queries to discover remote views of the same type by similarity. The following view properties are all transformed into equality constraints for the query:

  • the view identifier;
  • the collection identifier;
  • the generic properties of the view.

This supports combinations of three base use cases for view discovery:

  • view-by-id: find view with a given identifier;
  • views-for-collection: find all views of this type for given collection;
  • views-with-properties: find all views of this type with given properties.

Discovering views involves creating an unbound proxy in a given scope, setting its properties as query conditions, and invoking the method findSimilar() on it:

GCubeScope scope = ...
 
//creates unbound view in given scope
GenericView view = new GenericView(scope);
 
assert(view.isBound()==false);
 
//sets query conditions
view.setCollectionId("...");
view.addProperty(new ViewProperty(new QName("myprop"), "...some property description...", "myval"));
 
//discover views
List<GenericView> similars = view.findSimilar();
 
for (GenericView similar: similars) {
  assert(similar.isBound());
  ...
}

Here we look for all GenericViews for a given collection and with a given property, and we obtain a List<GenericView> of results. Notice that the proxy is unbound but it returns bound results.

Looking for a given view is similar:

GCubeScope scope = ...
 
//creates unbound view in given scope
GenericView view = new GenericView(scope);
 
//sets query condition
view.setId("...");
 
//discover views
List<GenericView> similars = view.findSimilar();
 
GenericView target = similars.get(0);
 ...

note: View discovery requires unbound proxies. Any attempt to invoke findSimilar() on a bound proxy will result in an IllegalStateException.
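
Discovery by similarity can be read as a filter in which every property set on the query proxy becomes an equality constraint and every unset property matches anything. The sketch below illustrates that reading with an in-memory list of published views. It is a toy: ViewRecord, the sample data and the local findSimilar() helper are hypothetical, and real discovery is resolved remotely by the infrastructure rather than in client code.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

final class SimilaritySketch {

    /** Toy description of a view: identifier, collection identifier, generic properties. */
    static final class ViewRecord {
        final String id;
        final String collectionId;
        final Map<String, String> properties;

        ViewRecord(String id, String collectionId, Map<String, String> properties) {
            this.id = id;
            this.collectionId = collectionId;
            this.properties = properties;
        }
    }

    /** A null field in the query matches anything; every set field must be equal. */
    static boolean matches(ViewRecord query, ViewRecord candidate) {
        boolean sameId = query.id == null || query.id.equals(candidate.id);
        boolean sameCollection =
            query.collectionId == null || query.collectionId.equals(candidate.collectionId);
        boolean sameProperties = query.properties.entrySet().stream()
            .allMatch(p -> p.getValue().equals(candidate.properties.get(p.getKey())));
        return sameId && sameCollection && sameProperties;
    }

    static List<ViewRecord> findSimilar(ViewRecord query, List<ViewRecord> published) {
        return published.stream().filter(c -> matches(query, c)).collect(Collectors.toList());
    }

    static List<ViewRecord> samplePublished() {
        return Arrays.asList(
            new ViewRecord("v1", "c1", Map.of("myprop", "myval")),
            new ViewRecord("v2", "c1", Map.of()),
            new ViewRecord("v3", "c2", Map.of("myprop", "myval")));
    }

    public static void main(String[] args) {
        // views-for-collection combined with views-with-properties
        ViewRecord query = new ViewRecord(null, "c1", Map.of("myprop", "myval"));
        for (ViewRecord found : findSimilar(query, samplePublished())) {
            System.out.println(found.id);
        }
    }
}
```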

Queries for cross-type views for a given collection cannot be performed by similarity, i.e. using a proxy as a query. However, the method findSimilar() in the Views class is available for this purpose:

import static org.gcube.contentmanagement.gcubedocumentlibrary.views.Views.*;
 
GCubeScope scope = ...
 
//discover views
List<? extends CollectionView<?,?>> similars = findSimilar("...target collection id...", scope);
 
for (CollectionView<?,?> similar : similars) {
  //...process generically...
  if (similar instanceof GenericView) {
     GenericView generic = (GenericView) similar;
     //...or specifically...
  }
}

note: Views#findSimilar() is also available in an overloaded version that takes a GCUBESecurityManager for use in secure infrastructures.

More complex queries on remote views are of course possible using the generic query facilities of the system. In this case, individual results can be converted to View proxies of the View Manager service (see conversion utilities in the client library of the service). Such View proxies can then be converted into CollectionViews using dedicated constructors on implementation classes, e.g.

View untyped = ....
...
//wrap the untyped proxy into a bound GenericView
GenericView view = new GenericView(untyped);
...

note: Since version 2.1.0, view discovery does not require an explicit reference to a scope. In this case, views will be discovered in the current scope, i.e. the scope set for the current thread on the default scope manager (GCUBEScopeManager.DEFAULT).

Using Views

Bound CollectionView proxies can be used for view-based access simply by:

  • invoking the method reader() on the proxy, which returns a ViewReader already configured with the projection that defines the view and the scope in which the proxy was created;
  • using the ViewReader exactly as discussed for local views.

The following example illustrates usage in the typical pattern of read-after-discovery:

GCubeScope scope = ...
 
//creates unbound view in given scope
GenericView view = new GenericView(scope);
 
//sets query condition
view.setId("...");
 
//discover view
List<GenericView> similars = view.findSimilar();
 
GenericView target = similars.get(0);
 
ViewReader reader = target.reader();
...use reader...

Here, we first create the proxy to serve as a query by-view-id and then obtain a ViewReader from the single result of the query. Notice that view-discovery and view-based access can be performed in fully generic fashion.

note: View-based access requires bound proxies. Any attempt to obtain a ViewReader from an unbound proxy will result in an IllegalStateException.

Metadata and Annotation Views

While GenericViews cater for arbitrary view definitions, MetadataViews and AnnotationViews narrow the scope for publication, discovery, and access to views defined over three key properties of, respectively, metadata elements and annotation elements. The properties are:

  • the language of (the bytestream of) the element;
  • the name of the schema of (the bytestream of) the element;
  • the URI of the schema of (the bytestream of) the element.

These are properties shared by all the inner elements of document descriptions, but they are of particular significance for metadata and annotations where they serve as grouping criteria for many processes within the system and, ultimately, for end users. Effectively, any set of values for these properties identifies a "virtual" collection of metadata and annotations for a given document collection.

MetadataViews and AnnotationViews behave like other CollectionViews but differ in the following respects:

  • the proxies address specific plugins of the View Manager service. These plugins are specialised in the management of key view properties such as cardinality and time of last update.
  • clients define views by providing values for the three key properties above to the method setProjection(Locale, String, URI) of the proxies. The proxy is responsible for synthesising a projection based on these values;
  • the method projection() returns a MetadataProjection or an AnnotationProjection, depending on the proxy type;
  • the proxies carry three additional pieces of information about the "virtual" collection defined by the view:
  • whether it is visible to users or only to the system (cf. setUserCollection(boolean), isUserCollection());
  • whether it is editable (cf. setEditable(boolean), isEditable());
  • whether it is indexable (cf. setIndexable(boolean), isIndexable());
This information is mapped by the proxies onto three generic properties of the views. Clients that do not provide values for them inherit defaults at publication time (all properties are set to true).
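
The mapping of the three flags onto generic properties, with defaults applied when the client sets nothing, can be sketched as follows. This is a toy model: the Flags class and the property names used as map keys are hypothetical, since the actual names of the generic properties are not given here. Only the default value (true for all three) comes from the text above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class VirtualCollectionSketch {

    /** Toy holder for the three flags; null means the client provided no value. */
    static final class Flags {
        private Boolean userCollection;
        private Boolean editable;
        private Boolean indexable;

        void setUserCollection(boolean value) {
            userCollection = value;
        }

        void setEditable(boolean value) {
            editable = value;
        }

        void setIndexable(boolean value) {
            indexable = value;
        }

        /** Resolves defaults and maps the flags onto generic properties (key names are made up). */
        Map<String, String> toGenericProperties() {
            Map<String, String> properties = new LinkedHashMap<>();
            properties.put("userCollection", String.valueOf(orDefault(userCollection)));
            properties.put("editable", String.valueOf(orDefault(editable)));
            properties.put("indexable", String.valueOf(orDefault(indexable)));
            return properties;
        }
    }

    /** Flags that were never set default to true. */
    private static boolean orDefault(Boolean value) {
        return value == null || value;
    }

    public static void main(String[] args) {
        Flags flags = new Flags();
        flags.setEditable(false);
        System.out.println(flags.toGenericProperties());
    }
}
```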

We show an example of publication for MetadataViews, which carries over directly to AnnotationViews.

import static java.util.Locale.*;
 
GCubeScope scope = ...
 
//creates unbound view in given scope
MetadataView view = new MetadataView(scope);
 
//sets view properties
view.setId("...");
view.setCollectionId("...");
view.setName("...");
view.setDescription("...");
 
//define view and virtual collection properties (use defaults for illustration)
view.setProjection(FRENCH,"etc",new URI("http://someuri"));
view.setIndexable(true);
view.setUserCollection(true);
view.setEditable(true);
 
//publish view
view.publish();

View discovery and view-based access occur exactly as with GenericViews.