OAITMPlugin

From Gcube Wiki
Revision as of 14:28, 18 April 2013

OAI TM Plugin is a plugin of the Tree Based Access Facilities that harvests the metadata descriptions of the records in an archive using the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH).

The OAI TM Plugin transforms each OAI record into an edge-labelled tree.

Plugin parameter fields

Plugins lead to the creation of one or more collections. Thus, in addition to the information below, the user should specify a collection name and description.

To instruct a plugin on how to perform the harvesting, the user must specify the following mandatory information:

  • repository name: the name of the repository to be harvested, e.g. "aquacomm";
  • base URL: the base URL of the repository, e.g. "http://aquacomm.fcla.edu/cgi/oai2";
  • default metadata format: the metadata format to be used for harvesting, e.g. "oai_dc";
  • title XPath: the expression for identifying the title of the harvested resource, e.g. "//*[local-name()='title']";
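
At the protocol level, the base URL and metadata format above are what an OAI-PMH harvester needs to issue a ListRecords request. The following is a minimal sketch of how such a request URL is formed; the helper name list_records_url is made up for illustration and is not part of the plugin, whose internal request handling is not described on this page.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix, set_spec=None):
    """Build an OAI-PMH ListRecords request URL (hypothetical helper)."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec is not None:
        # Optional OAI-PMH argument restricting the harvest to one set.
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


# Values taken from the parameter examples above.
url = list_records_url("http://aquacomm.fcla.edu/cgi/oai2", "oai_dc")
print(url)
```

Passing a set identifier as the third argument adds the OAI-PMH set argument, which is how selective harvesting by set is requested in the protocol.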

In addition, the user may specify the following optional information:

  • content XPath: the expression for identifying the content of the harvested resource, e.g. "//*[local-name()='identifier' and contains(.,'://')]";
  • alternatives XPath: the expression for identifying additional content of the harvested resource, e.g. "//*[local-name()='relation' and contains(.,'://')]";
  • set identifiers list (setIdentifierList): the identifiers of the sets to consider during the harvesting phase;
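
To make the effect of the three XPath expressions concrete, the sketch below applies equivalent selections to a trimmed copy of the oai_dc record from the Example section. It is an approximation under stated assumptions: Python's standard ElementTree does not evaluate local-name() or contains(), so the helper select (a made-up name) emulates those predicates in plain Python, and the second dc:identifier is invented to show what the '://' filter discards.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the example record; "hypothetical-local-id-377" is made up.
RECORD = """
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>Collective Intelligence based Framework
      for Load Balancing of Web Servers</dc:title>
  <dc:identifier>http://ijict.org/index.php/ijoat/article/view/load-balancing-of-web-servers
  </dc:identifier>
  <dc:identifier>hypothetical-local-id-377</dc:identifier>
  <dc:relation>http://ijict.org/index.php/ijoat/article/download/load-balancing-of-web-servers/1050
  </dc:relation>
</oai_dc:dc>
"""


def local_name(element):
    """Strip the '{namespace}' part that ElementTree puts in front of tags."""
    return element.tag.rsplit("}", 1)[-1]


def select(root, name, must_contain=None):
    """Emulate //*[local-name()='name' and contains(., 'must_contain')]."""
    values = []
    for element in root.iter():
        if local_name(element) != name:
            continue
        # String value of the element, with whitespace runs collapsed.
        value = " ".join("".join(element.itertext()).split())
        if must_contain is not None and must_contain not in value:
            continue
        values.append(value)
    return values


root = ET.fromstring(RECORD)
titles = select(root, "title")                    # title XPath
contents = select(root, "identifier", "://")      # content XPath
alternatives = select(root, "relation", "://")    # alternatives XPath
print(titles, contents, alternatives)
```

Matching on the local name rather than on dc:title keeps the expressions independent of whichever namespace prefix the repository happens to use.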

Two types of plugin have been defined:

  • WrapSetsRequest: to create a collection for each set of the external repository or for each set specified in the setIdentifierList;
  • WrapRepositoryRequest: to create a single collection containing the whole content of the repository or the content of the sets specified in the setIdentifierList;
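
The difference between the two request types can be sketched as a mapping from sets to collections. This is only an illustration of the rule stated above, not the plugin's code: the function planned_collections and the set names setA, setB and setC are hypothetical, and None is used here to stand for "the whole repository".

```python
def planned_collections(request_type, repository_sets, set_identifier_list=None):
    """Return one entry per collection; each entry lists the sets it draws from.

    An entry of None means the collection covers the whole repository.
    """
    if request_type == "WrapSetsRequest":
        # One collection per listed set, or per repository set if none listed.
        chosen = set_identifier_list if set_identifier_list else repository_sets
        return [[set_id] for set_id in chosen]
    if request_type == "WrapRepositoryRequest":
        # Always a single collection.
        if set_identifier_list:
            return [list(set_identifier_list)]
        return [None]
    raise ValueError("unknown request type: " + request_type)


repository_sets = ["setA", "setB", "setC"]  # made-up set identifiers
per_set = planned_collections("WrapSetsRequest", repository_sets)
single = planned_collections("WrapRepositoryRequest", repository_sets, ["setA", "setB"])
print(per_set, single)
```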

Tree model

Conceptual Schema

Each collection item produced by this plugin is characterised by the following information:

  • item metadata: global information on the item including
    • title: the title of the record;
    • collectionID: the collection this item belongs to;
    • creationTime: the time the item was created;
    • lastUpdateTime: the most recent time the item was updated;
    • provenance: characterised by the following information:
      • statement: "This item has been created by the gCube "+ pluginName +" via OAI-PMH metadata harvesting from the metadata provider "+repositoryName+" at "+baseURL;
      • setID: the repository set the object belongs to (optional and repeatable);
      • recordID: the identifier of the metadata record;
  • metadata (repeatable): the metadata record harvested. It is characterised by the following information:
    • schema: the metadata format of the metadata record;
    • schemaLocation: the metadata format schema URI;
    • record: the manifestation of the metadata record harvested;
  • content (repeatable): any potential payload shipped with the metadata record. It is characterised by the following information:
    • contentType: i.e. whether main or alternative content;
    • mimeType: MIME type of the actual content;
    • url: URL to the actual content;
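
The item metadata part of this conceptual schema can be sketched as a small tree builder. The code below is an illustrative assumption, not the plugin's implementation: build_item is a made-up function, the record and set identifiers are invented, and the metadata and content branches are left out for brevity. The provenance statement is assembled exactly as given in the list above.

```python
import xml.etree.ElementTree as ET

T_NS = "http://gcube-system.org/namespaces/data/trees"
ET.register_namespace("t", T_NS)  # serialise the namespace with the "t" prefix


def build_item(record_id, title, collection_id, timestamp,
               plugin_name, repository_name, base_url, set_ids=()):
    """Build the item-metadata part of a tree (hypothetical helper)."""
    root = ET.Element("{%s}root" % T_NS, {"{%s}id" % T_NS: record_id})
    ET.SubElement(root, "title").text = title
    ET.SubElement(root, "collectionID").text = collection_id
    ET.SubElement(root, "creationTime").text = timestamp
    ET.SubElement(root, "lastUpdateTime").text = timestamp

    provenance = ET.SubElement(root, "provenance")
    ET.SubElement(provenance, "statement").text = (
        "This item has been created by the gCube " + plugin_name
        + " via OAI-PMH metadata harvesting from the metadata provider "
        + repository_name + " at " + base_url
    )
    for set_id in set_ids:  # setID is optional and repeatable
        ET.SubElement(provenance, "setID").text = set_id
    ET.SubElement(provenance, "recordID").text = record_id
    return root


item = build_item(
    record_id="oai:example.org:1",           # made-up identifier
    title="Example title",                   # made-up title
    collection_id="setA",                    # made-up collection
    timestamp="2012-01-16T08:08:09.000+01:00",
    plugin_name="OAI-TM plugin",
    repository_name="aquacomm",
    base_url="http://aquacomm.fcla.edu/cgi/oai2",
    set_ids=["setA", "setB"],
)
print(ET.tostring(item, encoding="unicode"))
```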

XML Schema

The XML Schema of the "record" element depends on the metadata format of the harvested record. The following is an XML Schema for a tree that holds an oai_dc record.

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified" attributeFormDefault="unqualified" >
       <xs:element name="t:root" >
              <xs:complexType>
                     <xs:sequence>
                            <xs:element name="title" type="xs:string" />
                            <xs:element name="collectionID" type="xs:string" />
                            <xs:element name="creationTime" type="xs:string" />
                            <xs:element name="lastUpdateTime" type="xs:string" />
                            <xs:element name="provenance" >
                                   <xs:complexType>
                                          <xs:sequence>
                                                 <xs:element name="statement" type="xs:string" />
                                                 <xs:element name="setID" type="xs:string" minOccurs="0" maxOccurs="unbounded" />
                                                 <xs:element name="recordID" type="xs:string" />
                                             </xs:sequence>
                                      </xs:complexType>
                               </xs:element>
                            <xs:element name="metadata" maxOccurs="unbounded" >
                                   <xs:complexType>
                                          <xs:sequence>
                                                 <xs:element name="schema" type="xs:string" />
                                                 <xs:element name="schemaLocation" type="xs:string" />
                                                 <xs:element name="record" >
                                                        <xs:complexType>
                                                               <xs:sequence>
                                                                      <xs:element name="oai_dc:dc" >
                                                                             <xs:complexType>
                                                                                    <xs:sequence>
                                                                                           <xs:element name="dc:title" type="xs:string" >
                                                                                                  <xs:complexType>
                                                                                                         <xs:attribute name="xml:lang" type="xs:string" />
                                                                                                     </xs:complexType>
                                                                                              </xs:element>
                                                                                           <xs:element name="dc:creator" maxOccurs="unbounded" type="xs:string" />
                                                                                           <xs:element name="dc:subject" maxOccurs="unbounded" >
                                                                                                  <xs:complexType>
                                                                                                         <xs:attribute name="xml:lang" type="xs:string" />
                                                                                                     </xs:complexType>
                                                                                              </xs:element>
                                                                                           <xs:element name="dc:description" type="xs:string" >
                                                                                                  <xs:complexType>
                                                                                                         <xs:attribute name="xml:lang" type="xs:string" />
                                                                                                     </xs:complexType>
                                                                                              </xs:element>
                                                                                           <xs:element name="dc:publisher" type="xs:string" >
                                                                                                  <xs:complexType>
                                                                                                         <xs:attribute name="xml:lang" type="xs:string" />
                                                                                                     </xs:complexType>
                                                                                              </xs:element>
                                                                                           <xs:element name="dc:contributor" >
                                                                                                  <xs:complexType>
                                                                                                         <xs:attribute name="xml:lang" type="xs:string" />
                                                                                                     </xs:complexType>
                                                                                              </xs:element>
                                                                                           <xs:element name="dc:date" type="xs:date" />
                                                                                           <xs:element name="dc:type" maxOccurs="unbounded" >
                                                                                                  <xs:complexType>
                                                                                                         <xs:attribute name="xml:lang" type="xs:string" />
                                                                                                     </xs:complexType>
                                                                                              </xs:element>
                                                                                           <xs:element name="dc:format" type="xs:string" />
                                                                                           <xs:element name="dc:identifier" type="xs:string" />
                                                                                           <xs:element name="dc:source" type="xs:string" >
                                                                                                  <xs:complexType>
                                                                                                         <xs:attribute name="xml:lang" type="xs:string" />
                                                                                                     </xs:complexType>
                                                                                              </xs:element>
                                                                                           <xs:element name="dc:language" type="xs:string" />
                                                                                           <xs:element name="dc:relation" type="xs:string" />
                                                                                           <xs:element name="dc:coverage" maxOccurs="unbounded" >
                                                                                                  <xs:complexType>
                                                                                                         <xs:attribute name="xml:lang" type="xs:string" />
                                                                                                     </xs:complexType>
                                                                                              </xs:element>
                                                                                           <xs:element name="dc:rights" type="xs:string" />
                                                                                       </xs:sequence>
                                                                                    <xs:attribute name="xmlns:oai_dc" type="xs:string" />
                                                                                    <xs:attribute name="xmlns:xsi" type="xs:string" />
                                                                                    <xs:attribute name="xmlns:dc" type="xs:string" />
                                                                                    <xs:attribute name="xsi:schemaLocation" type="xs:string" />
                                                                                </xs:complexType>
                                                                         </xs:element>
                                                                  </xs:sequence>
                                                           </xs:complexType>
                                                    </xs:element>
                                             </xs:sequence>
                                      </xs:complexType>
                               </xs:element>
                            <xs:element name="content" maxOccurs="unbounded" minOccurs="0">
                                   <xs:complexType>
                                          <xs:sequence>
                                                 <xs:element name="contentType" type="xs:string" />
                                                 <xs:element name="mimeType" type="xs:string" />
                                                 <xs:element name="url" type="xs:string" />
                                             </xs:sequence>
                                      </xs:complexType>
                               </xs:element>
                        </xs:sequence>
                     <xs:attribute name="xmlns:t" type="xs:string" />
                     <xs:attribute name="t:id" type="xs:string" />
                 </xs:complexType>
          </xs:element>
   </xs:schema>

Example

A tree generated by the OAI TM Plugin looks like this:

<?xml version="1.0" ?>
<t:root xmlns:t="http://gcube-system.org/namespaces/data/trees" t:id="oai:ojs.ijict.org:article/377">
 
	<title></title>
	<collectionID>ijoat:OTHR</collectionID>
	<creationTime>2012-01-16T08:08:09.000+01:00</creationTime>
	<lastUpdateTime>2012-01-16T08:08:09.000+01:00</lastUpdateTime>
 
	<provenance>
		<statement>This item has been created by the gCube OAI-TM plugin via
			OAI-PMH metadata harvesting from the metadata provider null at
			http://ijict.org/index.php/ijoat/oai</statement>
		<recordID>oai:ojs.ijict.org:article/377</recordID>
	</provenance>
 
	<metadata>
		<schema>oai_dc</schema>
		<schemaLocation>http://www.openarchives.org/OAI/2.0/oai_dc/
		</schemaLocation>
		<record>
			<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
				xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
				xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/  http://www.openarchives.org/OAI/2.0/oai_dc.xsd"
				xmlns:dc="http://purl.org/dc/elements/1.1/">
				<dc:title xml:lang="en-US">Collective Intelligence based Framework
					for Load Balancing of Web Servers</dc:title>
				<dc:creator>Atul Garg</dc:creator>
				<dc:creator>Dimple Juneja</dc:creator>
				<dc:subject xml:lang="en-US"></dc:subject>
				<dc:subject xml:lang="en-US"></dc:subject>
				<dc:subject xml:lang="en-US"></dc:subject>
				<dc:description xml:lang="en-US">The paper exploits the
					collective intelligence referred to as ant intelligence in World
					Wide Web with the aim to improve the performance of online web
					servers by balancing the load. The central concept of this idea is
					that a collection of agents can individually perform relatively
					simple, self-centered actions, such as the selection or rejection
					of hyperlinks in a web page for navigation, computing the load of
					server and aggregate these individual actions into a common
					substrate. The common substrate can then be evaluated to find the
					best available server to perform the task. This work aims to
					address the challenge of distributing intelligence to World Wide
					Web by contributing a unique ant-based intelligent load balancing
					framework which is able to integrate and synthesize knowledge on a
					scale far beyond the capabilities of individual humans.
				</dc:description>
				<dc:publisher xml:lang="en-US">International Journal of
					Advancements in Technology</dc:publisher>
				<dc:contributor xml:lang="en-US"></dc:contributor>
				<dc:date>2012-01-18</dc:date>
				<dc:type xml:lang="en-US"></dc:type>
				<dc:type xml:lang="en-US"></dc:type>
				<dc:format>application/pdf</dc:format>
				<dc:identifier>http://ijict.org/index.php/ijoat/article/view/load-balancing-of-web-servers
				</dc:identifier>
				<dc:source xml:lang="en-US">International Journal of Advancements
					in Technology; Vol 3, No 1 (2012): International Journal of
					Advancements in Technology; 64-70</dc:source>
				<dc:language>en</dc:language>
				<dc:relation>http://ijict.org/index.php/ijoat/article/download/load-balancing-of-web-servers/1050
				</dc:relation>
				<dc:coverage xml:lang="en-US"></dc:coverage>
				<dc:coverage xml:lang="en-US"></dc:coverage>
				<dc:coverage xml:lang="en-US"></dc:coverage>
				<dc:rights>Authors who publish with this journal agree to the
					following terms:&lt;br /&gt; &lt;ol type="a"&gt;&lt;br
					/&gt;&lt;li&gt;Authors retain copyright and grant the journal right
					of first publication with the work simultaneously licensed under a
					&lt;a href="http://creativecommons.org/licenses/by/3.0/"
					target="_new"&gt;Creative Commons Attribution License&lt;/a&gt;
					that allows others to share the work with an acknowledgement of the
					work's authorship and initial publication in this
					journal.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;Authors are able to enter
					into separate, additional contractual arrangements for the
					non-exclusive distribution of the journal's published version of
					the work (e.g., post it to an institutional repository or publish
					it in a book), with an acknowledgement of its initial publication
					in this journal.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;Authors are
					permitted and encouraged to post their work online (e.g., in
					institutional repositories or on their website) prior to and during
					the submission process, as it can lead to productive exchanges, as
					well as earlier and greater citation of published work (See &lt;a
					href="http://opcit.eprints.org/oacitation-biblio.html"
					target="_new"&gt;The Effect of Open
					Access&lt;/a&gt;).&lt;/li&gt;&lt;/ol&gt;</dc:rights>
			</oai_dc:dc>
		</record>
	</metadata>
 
	<content>
		<contentType></contentType>
		<mimeType></mimeType>
		<url></url>
	</content>
 
</t:root>
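
A consumer of such a tree has to treat the root and its children differently: the root element and its id attribute live in the t namespace, while title, collectionID, provenance and the other children are unqualified. The sketch below reads a trimmed copy of the example with Python's standard ElementTree; it is an illustration only and does not use any gCube client library.

```python
import xml.etree.ElementTree as ET

T_NS = "http://gcube-system.org/namespaces/data/trees"

# Trimmed copy of the example above (metadata branch omitted).
TREE = """
<t:root xmlns:t="http://gcube-system.org/namespaces/data/trees"
        t:id="oai:ojs.ijict.org:article/377">
  <title></title>
  <collectionID>ijoat:OTHR</collectionID>
  <provenance>
    <recordID>oai:ojs.ijict.org:article/377</recordID>
  </provenance>
  <content>
    <contentType></contentType>
    <mimeType></mimeType>
    <url></url>
  </content>
</t:root>
"""

root = ET.fromstring(TREE)

# Namespace-qualified names use ElementTree's "{namespace}local" form.
tree_id = root.get("{%s}id" % T_NS)

# Child elements are unqualified, so plain names are enough.
collection_id = root.findtext("collectionID")
record_id = root.findtext("provenance/recordID")
title = root.findtext("title")  # empty element, so an empty string

# Collect the url of each content entry, skipping empty ones.
urls = [c.findtext("url") for c in root.findall("content") if c.findtext("url")]

print(tree_id, collection_id, record_id, repr(title), urls)
```

In this particular record the title and content fields are empty, so a consumer should not assume that they carry values.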

Maven coordinates

The Maven coordinates of the oai-tm-plugin development version are:

<dependency>
  <groupId>org.gcube.data.oai.tmplugin</groupId>
  <artifactId>oai-tm-plugin</artifactId>
  <version>1.1.0-2.13.0</version>
</dependency>