Difference between revisions of "Content Manager"
(→gDoc API) |
(→gDoc API) |
||
Line 136: | Line 136: | ||
==== Creating Trees ==== | ==== Creating Trees ==== | ||
+ | |||
+ | ==== Consuming Trees ==== | ||
+ | |||
+ | ==== Serialising and Deserialising Trees ==== | ||
+ | |||
+ | ==== Binding Trees ==== | ||
=== Tree Predicates === | === Tree Predicates === |
Revision as of 11:06, 6 September 2010
The Content Manager service provides its clients with uniform access to content served by a variety of back-ends, both inside and outside the system. It is the central component of the gCube subsystem that deals with the organisation of content and related data.
Contents
Service Design
The Content Manager is designed as an OCMA service. In OCMA terms, it classifies as a multi-type, 1-N adapter service:
- it is a multi-type service because it supports two front types for, respectively, reading and writing content modelled as labelled trees.
- Collectively, the front types and the tree content model form the
gDoc
access type of the service.
- it is an adapter service because it adapts the
gDoc
access type to multiple back types, where each back type corresponds to the access type of a whole class of remote repositories.
- For this, the service employes an open architecture of type-specific plugins to which it delegates the creation and operation of its collection managers.
- Plugins are dynamically deployed within single instances of the services, and different instances may host different plugins. In addition, some plugins may support both service front types, i.e. grant read and write access to the corresponding repository. Others may instead support read-only access or, less commonly, write-only access.
The figure below overviews the design and use of the service in the context of one its running instances. The instance exposes three stateful port-types:
- the
ReadManager
serves as the interface of collection managers that offer read-only operations over the content of the bound collection.
- The interface defines the
gDocRead
front type of the service. - The front type and the identifier of the bound collection are published as Resource Properties of the manager, in accordance with OCMA patterns for publication and discovery of service state. A third Resource Property is the name of the bound plugin, i.e. the plugin to which the manager delegates the resolution of its requests.
- the
WriteManager
serves as the interface of collection managers that offer write-only operations over the content of the bound collection.
- The interface defines the
gDocWrite
front type of the service. - Again, the type, the identifier of the bound collection, and the name of the bound plugin are published as Resource Properties of the manager.
- the
Factory
serves as the front-end of a single WS Resource that createsReadManager
andWriteManager
resources .
- The resource is created at the activation of the service instance in the gCube Hosting Node.
- During its lifetime, it publishes creation requests as activation records. Conversely, it subscribes for the activation records that are published by other instances of the service, in line with OCMA patterns for replication of service state.
- The resource also publishes as a Resource Property the list of summary descriptions of the plugins that are hosted at the service instance.
Service plugins logically extend factory and collection manager resources with corresponding resource delegates. In particular:
- the factory delegate extends the
Factory
resource at plugin deployment time in order to handle requests that are specifically addressed to the plugin; - at each such request, the factory delegate processes plugin-specific parameters to create one ore more read delegates and/or write delegates, which the service instance uses to create and extend corresponding collection managers;
- future requests to the managers are then handled by their delegates, which translate the requests against the back-end repository that exposes the collection bound to the managers.
Finally, note that factory and collection managers are persistent resources and may thus be re-activated across restarts of the gCube Hosting Node:
- the factory persists the history of its activations, i.e. the activation records that it published and/or processed.
- the collection managers persist the name and state of their delegates.
Content Model
Architectural considerations aside, the most distinguished element in the design of the Content Manager is its content model. Rather than settle for a fixed set of document structures, the service adopts a generic structure that can act as a 'carrier' for an arbitrary number of concrete document models. In particular, the service deals with edge-labelled and node-attributed trees, the gDoc
trees.
The expectation here is that producers (service plugins) and consumers (service clients) will convene on concrete document models and exchange gDoc
trees with an agreed shape. The agreement may be bilateral or involve any number of parties, and it may apply to the entire document or to distinguished parts of it (e.g. document metadata, annotations, raw content packaging, etc). For maximum decoupling between consumers and producers, the agreement may reflect system-wide conventions and result in canonical tree forms.
gDoc Trees
A gDoc
tree has the following properties:
- its nodes may have an identifier and a number of uniquely named attributes;
- its edges have a label;
- its leaf nodes may have a value;
- its root may identify the collection of the corresponding document.
In particular:
- identifiers, attributes, and leaves have text values;
- attribute names and labels may be qualified with a namespace.
Finally:
- nodes may have a state of
NEW
,MODIFIED
, orDELETED
.
- States denote changes with respect to persistent representations of documents. They are used in the write operations of the service.
The figure below uses a graphical representation to show an example of a gDoc
tree.
gDoc
trees serialise to XML documents for exchange over the network. In particular:
- nodes serialise to elements and attributes serialise to element attributes
- elements are named like the edges that enter the corresponding nodes
- the document element is arbitrarily named
- the elements of inner nodes contain the elements of their children
- the elements of leaves contain their value
- node identifiers serialise to attributes called
http://gcube-system.org/namespaces/contentmanagement/gdoc:id
- node states serialise to attributes called
http://gcube-system.org/namespaces/contentmanagement/gdoc:state
- collection identifiers serialise to attributes called
http://gcube-system.org/namespaces/contentmanagement/gdoc:collID
For example, the gDoc
tree above may serialise as:
<g:gdoc xmlns:g="http://gcube-system.org/namespaces/contentmanagement/gdoc" g:id="1" g:state="MODIFIED" x="..." y="..." g:collID="..."> <a g:id="2" g:state="MODIFIED"> <b g:id="5" g:state="MODIFIED" /> </a> <a g:id="3" g:state="MODIFIED"> <c g:state="NEW"> <d g:state="NEW">...</d> <d g:state="NEW" w="..">...</d> </c> </a> <b g:id="4" g:state="MODIFIED" w="..." /> </g:gdoc>
Note that gDoc
trees inherit constraints from their XML serialisation. In particular, the names of edges, the names of attributes, the values of attributes, and the values of leaves are regulated by the definition of the format.
gDoc API
The XML serialisation of gDoc
trees is 'natural', in that it does not employ dedicated element structures for the representations nodes, edges, attributes, etc. This streamlines its manipulation with standard XMl technologies (e.g. XPath, XSLT, XQuery, DOM, SAX, etc.) and does not inhibit object binding technologies (e.g. JAXB, XStream, etc). As a native option, however, the service defines a bespoke object model and API for gDoc
trees which offer:
- dedicated support for tree processing requirements associated with the use of the service;
- transparencies and optimisations for tree storage, construction, deconstruction, and input/output.
While the model is available to service clients, it also forms the basis of the interface between the service and its plugins. For this reason, its main features are overviewed here while its client-oriented features are discussed later on.
As the figure below illustrates, the model is defined in org.gcube.contentmanagement.contentmanager.stubs.model.trees
in terms of the following components:
-
Node
: an abstract base for nodes with an identifier, a state, and a map ofQName
-ed attributes. -
State
: an inner enumeration ofNode
for node states. -
Edge
: AQName
-ed edge to a targetNode
. -
InnerNode
: aNode
with a list of outgoingEdge
s. -
Leaf
: aNode
with a value. -
gDoc
: anInnerNode
with a collection identifier. -
Nodes
: a collection of static utilities to generateNode
s andEdge
s. -
Bindings
: a collection of static utilities to serialise and deserialiseNodes
s to and from DOM trees and/or character streams. -
NodeView
: a base class for JAXB bindings toNode
s. -
GDocView
: aNodeViewM
for JAXB bindings toGDoc
nodes.
The model API is illustrated by example in the rest of this Section. The full list of methods and their signatures can be found in the code documentation.