XML Indexer

From Gcube Wiki
Latest revision as of 09:33, 5 September 2008

The XMLIndexer Service is a generic indexer for homogeneous collections of XML data. The service supports creating and populating such collections and resolving queries against them. Two types of XMLIndexer have been designed, each of which manages a collection of XML documents:

  • WSDaix – a WSDaix is completely unaware of the collection and the data it handles. It therefore imposes no constraints on them and assumes that clients know the schema of the documents they query. It is suited to indexing and querying a (temporary) set of XML data, such as a result set;
  • GCUBEDaix – a GCUBEDaix is bound to a specific Metadata Collection and is used to index the elements of that collection. When a new Metadata Collection to be indexed is created, the Metadata Manager Service also creates a related GCUBEDaix. Each time a Metadata Object is added to or updated in that collection, the Metadata Manager Service also updates the GCUBEDaix by feeding it the new element.

Resources and Properties

The Service adopts the Factory pattern and creates one WS-Resource for each XML Indexer. Since there are two kinds of XML Indexer, the Factory service can create two kinds of WS-Resource: the GCUBEDaix resource and the WSDaix resource. The state of each Indexer is published in the DIS by means of its WS-ResourceProperties. These resource properties include the creation parameters and, if the Indexer is a WSDaix, the SetTerminationTime and CurrentTime WS-ResourceProperties.
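The factory arrangement described above can be pictured with a small in-memory sketch. It is not the service's actual code: the class names `IndexerResource` and `IndexerFactorySketch`, the `MetadataCollectionId` property and the use of a random UUID as resource id are all assumptions made for illustration. Only the `AccessType` values and the two lifetime property names are taken from this page.

```java
import java.time.Instant;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.UUID;

/** Hypothetical stand-in for a WS-Resource: an id plus its published properties. */
final class IndexerResource {
    final String id;
    final Map<String, String> properties = new LinkedHashMap<>();

    IndexerResource(String id) {
        this.id = id;
    }
}

/** Hypothetical factory that creates one resource per indexer, of either kind. */
final class IndexerFactorySketch {
    private final Map<String, IndexerResource> resources = new LinkedHashMap<>();

    /** A GCUBEDaix resource records the Metadata Collection it is bound to. */
    IndexerResource createGcubeDaix(String metadataCollectionId) {
        IndexerResource resource = register("GCUBEDaix");
        // "MetadataCollectionId" is an assumed name for a creation parameter.
        resource.properties.put("MetadataCollectionId", metadataCollectionId);
        return resource;
    }

    /** A WSDaix resource additionally publishes the two lifetime properties. */
    IndexerResource createWsDaix(Instant now, Instant terminationTime) {
        IndexerResource resource = register("WSDaix");
        resource.properties.put("CurrentTime", now.toString());
        resource.properties.put("SetTerminationTime", terminationTime.toString());
        return resource;
    }

    /** Creates the resource, tags it with its kind and remembers it by id. */
    private IndexerResource register(String accessType) {
        IndexerResource resource = new IndexerResource(UUID.randomUUID().toString());
        resource.properties.put("AccessType", accessType);
        resources.put(resource.id, resource);
        return resource;
    }

    int resourceCount() {
        return resources.size();
    }

    public static void main(String[] args) {
        IndexerFactorySketch factory = new IndexerFactorySketch();
        IndexerResource gcubeDaix = factory.createGcubeDaix("collection-1");
        Instant now = Instant.parse("2008-09-05T09:33:00Z");
        IndexerResource wsDaix = factory.createWsDaix(now, now.plusSeconds(3600));
        System.out.println(gcubeDaix.properties);
        System.out.println(wsDaix.properties);
    }
}
```

In the real service the properties would be published in the DIS rather than kept in a local map; the sketch only shows which properties each kind of resource carries.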

A GCUBEDaix operates on a collection of homogeneous XML documents bound to a specific Metadata Collection. Since the managed XML documents are wrapped in the Metadata envelope, each document is identified by a unique ID (the Metadata Object ID). This allows more advanced management of this type of Indexer than of the WSDaix one.
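One consequence of ID-based identification is that a document can be replaced in place when it is fed to the indexer again. The following minimal sketch models this with an in-memory map. The class `IdKeyedCollectionSketch` and its methods are hypothetical, and the replace-on-same-ID behaviour is an assumption inferred from the add/update description above, not a statement about the service's internals.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical in-memory model of a collection keyed by Metadata Object ID. */
final class IdKeyedCollectionSketch {
    private final Map<String, String> documentsById = new LinkedHashMap<>();

    /**
     * Stores the document under its ID, replacing any document already
     * stored under the same ID. Returns true when this was an update.
     */
    boolean addOrUpdate(String metadataObjectId, String xml) {
        return documentsById.put(metadataObjectId, xml) != null;
    }

    /** Returns the stored XML for the ID, or null if the ID is unknown. */
    String get(String metadataObjectId) {
        return documentsById.get(metadataObjectId);
    }

    int documentCount() {
        return documentsById.size();
    }

    public static void main(String[] args) {
        IdKeyedCollectionSketch collection = new IdKeyedCollectionSketch();
        collection.addOrUpdate("id-1", "<Document>first version</Document>");
        collection.addOrUpdate("id-2", "<Document>another object</Document>");
        // Feeding id-1 again replaces the stored document instead of duplicating it.
        collection.addOrUpdate("id-1", "<Document>second version</Document>");
        System.out.println(collection.documentCount()); // prints 2
        System.out.println(collection.get("id-1"));
    }
}
```

A WSDaix, which knows nothing about the documents it holds, has no such key to rely on.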

Functions

The main functions supported by the GCUBEDaix XMLIndexer are:

  • addElements() – takes a list of XML documents and adds them to the XML collection managed by this instance;
  • addElementsRS() – implements the previous operation by reference, i.e. the message passed as input parameter contains the EPR of a Result Set containing the list of documents;
  • executeXPath() – takes a valid XPath expression, executes it against the current XML collection and returns all the XML Documents that match the expression;
  • executeXPathRS() – takes a valid XPath expression, executes it against the current XML collection and returns a reference (the EPR of a Result Set) to the list of XML Documents that match the expression;
  • executeXQuery() – takes a valid XQuery, executes it against the current XML collection and returns the list of XML Documents resulting from the evaluation of the query on the managed collection;
  • executeXQueryRS() – takes a valid XQuery, executes it against the current XML collection and returns a reference (the EPR of a Result Set) to the list of XML Documents resulting from the evaluation of the query on the managed collection;
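To make the add-then-query flow of the functions listed above concrete, here is a minimal local sketch built on the standard `javax.xml.xpath` API. It is not the service implementation: the class `XPathCollectionSketch` is hypothetical, documents are held in memory, and for brevity `executeXPath` returns the IDs of the matching documents rather than the documents themselves.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

/** Hypothetical in-memory collection that can be populated and queried with XPath. */
final class XPathCollectionSketch {
    private final Map<String, Document> documents = new LinkedHashMap<>();
    private final DocumentBuilder builder;
    private final XPath xpath = XPathFactory.newInstance().newXPath();

    XPathCollectionSketch() {
        try {
            builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("no XML parser available", e);
        }
    }

    /** Parses each (id, XML string) pair and adds it to the collection. */
    void addElements(Map<String, String> xmlById) {
        for (Map.Entry<String, String> entry : xmlById.entrySet()) {
            try {
                InputSource source = new InputSource(new StringReader(entry.getValue()));
                documents.put(entry.getKey(), builder.parse(source));
            } catch (SAXException | IOException e) {
                throw new IllegalArgumentException("malformed XML for id " + entry.getKey(), e);
            }
        }
    }

    /**
     * Returns the IDs of the documents in which the expression selects at
     * least one node. The expression must evaluate to a node set.
     */
    List<String> executeXPath(String expression) {
        List<String> matchingIds = new ArrayList<>();
        try {
            XPathExpression compiled = xpath.compile(expression);
            for (Map.Entry<String, Document> entry : documents.entrySet()) {
                NodeList nodes = (NodeList) compiled.evaluate(entry.getValue(), XPathConstants.NODESET);
                if (nodes.getLength() > 0) {
                    matchingIds.add(entry.getKey());
                }
            }
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("invalid XPath: " + expression, e);
        }
        return matchingIds;
    }

    public static void main(String[] args) {
        XPathCollectionSketch collection = new XPathCollectionSketch();
        Map<String, String> input = new LinkedHashMap<>();
        input.put("m1", "<Document><title>a</title></Document>");
        input.put("m2", "<Record/>");
        collection.addElements(input);
        System.out.println(collection.executeXPath("//Document")); // prints [m1]
    }
}
```

The RS variants differ only in how data travels: instead of carrying the documents in the message, they exchange the EPR of a Result Set that holds them.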

The main functions supported by the WSDaix are those specified in the WS-DAIX specification.

Usage Examples

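This example shows how to obtain a proxy for the XMLIndexer factory, locate an existing GCUBEDaix resource through the ISClient, and query it with an XPath expression: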


...
public class XMLIndexerTest {

    public static void main(String[] args) throws Exception {
        try {
            GCUBEScope scope = GCUBEScope.getScope("/gcube/devsec");
            ISClient client = GHNContext.getImplementation(ISClient.class);

            // Look up the running instances of the XMLIndexer service in the scope.
            // They could supply the factory address; this example hard-codes it below.
            GCUBERIQuery riquery = client.getQuery(GCUBERIQuery.class);
            riquery.addAtomicConditions(
                    new AtomicCondition("//ServiceClass", "metadatamanagement"),
                    new AtomicCondition("//ServiceName", "XMLIndexer"));
            List<GCUBERunningInstance> ris = client.execute(riquery, scope);

            // Build a scoped proxy for the factory port type.
            XMLIndexerFactoryServiceAddressingLocator factoryLocator = new XMLIndexerFactoryServiceAddressingLocator();
            EndpointReferenceType epr = new EndpointReferenceType();
            epr.setAddress(new AttributedURI("http://node2.d.d4science.research-infrastructures.eu:8080/wsrf/services/gcube/metadatamanagement/xmlindexer/XMLIndexerFactory"));
            XMLIndexerFactoryPortType factoryPortType = factoryLocator.getXMLIndexerFactoryPortTypePort(epr);
            factoryPortType = GCUBERemotePortTypeContext.getProxy(factoryPortType, scope);

            // Find the WS-Resource of the GCUBEDaix with the given Id.
            WSResourceQuery query = client.getQuery(WSResourceQuery.class);
            query.addAtomicConditions(
                    new AtomicCondition("//gc:ServiceClass", "MetadataManagement"),
                    new AtomicCondition("//gc:ServiceName", "XMLIndexer"),
                    new AtomicCondition("/child::*[local-name()='Id']", "41413b20-4ea4-11dd-9089-fd7df1d0330e"),
                    new AtomicCondition("/child::*[local-name()='AccessType']", "GCUBEDaix"));
            List<RPDocument> result = client.execute(query, scope);
            System.out.println("query result: " + result.size());

            // Stop here if no resource matched, instead of failing on result.get(0).
            if (result.isEmpty()) {
                System.out.println("no matching GCUBEDaix resource found");
                return;
            }

            // Contact the indexer, count its documents and run an XPath query.
            GCUBEDaixServiceAddressingLocator servicelocator = new GCUBEDaixServiceAddressingLocator();
            GCUBEDaixPortType servicePortType = servicelocator.getGCUBEDaixPortTypePort(result.get(0).getEndpoint());
            System.out.println(" result : " + servicePortType.documentCount(new VOID()));
            ExecuteXPathMessage res = servicePortType.executeXPath("//Document");
            System.out.println(res.getXPathResult().length);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

....