Kernel density
Revision as of 14:26, 24 April 2013
In statistics, Kernel Density Estimation (KDE) is a non-parametric way to estimate the probability density function of a random variable. In our case, the variable is two-dimensional and represents the occurrence points of a given species in a given date range.
The WPS service associated with this algorithm is called Species Occurrences Kernel Density Estimation (SOKDE).
Overview
The goal of this WPS service is to offer a functional, scalable and OGC-compliant tool that generates, for each species in a given set and within a given temporal range, a Shapefile representing the KDE of that species' occurrence points. To do this, the service first retrieves the occurrence points from the species' scientific name, then applies the KDE to those points. This is done for each species in the set.
The KDE algorithm is provided by the Institut de recherche pour le développement (IRD) and is written in R. It takes an occurrence file and a set of percentages, and generates a shapefile with polygons for each given percentage of the probability density.
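For reference, the textbook form of a two-dimensional kernel density estimate can be written as below. This is a generic sketch only: the source does not state which kernel K or bandwidth h the IRD R script uses, and the reading of a "percentage" as the region holding that share of the estimated density is an assumption.

```latex
% Generic 2D kernel density estimate over occurrence points x_1, ..., x_n
% (K and h are placeholders; the IRD script's actual choices are not documented here)
\hat{f}_h(\mathbf{x}) = \frac{1}{n h^{2}} \sum_{i=1}^{n}
    K\!\left(\frac{\mathbf{x} - \mathbf{x}_i}{h}\right),
\qquad \mathbf{x}, \mathbf{x}_i \in \mathbb{R}^{2}

% Assumed meaning of a contour percentage p: the highest-density region R_p
% whose estimated probability mass equals p/100
R_p = \left\{ \mathbf{x} : \hat{f}_h(\mathbf{x}) \ge c_p \right\},
\qquad \int_{R_p} \hat{f}_h(\mathbf{x}) \, d\mathbf{x} = \frac{p}{100}
```

Under that reading, a request with percentages 25, 50 and 75 would yield three nested polygons, with the 25% polygon enclosing the densest area.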
The occurrence points are generated from the species names and the date range by the Species Product Discovery Tools, a CLI application that produces an occurrence CSV file using the Species Product Discovery Service.
The whole process is parallelized across the given species, following the Wps-Hadoop specifications.
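As an illustration of the per-species split, the sketch below expands one request into one input record per species, using the record format of the inputData.txt contents shown in the use cases on this page. The function name and the DEFAULT_PERCENTAGES constant are hypothetical; how Wps-Hadoop actually distributes these records to tasks is not shown here.

```python
# Hypothetical helper: one input record per species, mirroring the
# "Input data contents" lines shown in the use cases of this page.
DEFAULT_PERCENTAGES = (25, 50, 75, 80, 90, 95, 98)


def per_species_inputs(species, from_date=None, to_date=None,
                       percentages=DEFAULT_PERCENTAGES):
    """Return one input line per species; empty dates are left blank."""
    pct = " ".join(str(p) for p in percentages)
    return [
        f'species="{name}", fromDate={from_date or ""}, '
        f'toDate={to_date or ""}, percentages={pct}'
        for name in species
    ]


if __name__ == "__main__":
    for line in per_species_inputs(
        ["Amphiprion percula", "Thunnus atlanticus", "Architeuthis"],
        from_date="1990-01-01T00:00Z",
        to_date="2005-01-01T00:00Z",
        percentages=(25, 50, 75, 90),
    ):
        print(line)
```

Each returned line corresponds to one independent execution, which is what makes the job straightforward to run in parallel.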
Wps Request Syntax
The process Identifier is com.terradue.wps_hadoop.processes.ird.KernelDensity.
The process description is obtained by the wps DescribeProcess request:
<wps_host>
  ?service=wps
  &version=1.0.0
  &request=DescribeProcess
  &identifier=com.terradue.wps_hadoop.processes.kernel_density.KernelDensity
The process execution is obtained by the wps Execute request:
<wps_host>
  ?Service=WPS
  &version=1.0.0
  &Request=execute
  &identifier=com.terradue.wps_hadoop.processes.ird.KernelDensity
  &DataInputs=<data_inputs>
Data Inputs Parameters
- species: list of species, given by scientific name (multiple, at least one required)
- fromDate: lower bound of the temporal filter on occurrences, in ISO 8601 date format (e.g. "1980-11-04T00:00Z") (single, optional)
- toDate: upper bound of the temporal filter on occurrences, in ISO 8601 date format (single, optional)
- percentages: percentages of the total density estimate at which contour lines are drawn (multiple, optional, default={25, 50, 75, 80, 90, 95, 98})
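The request syntax and parameters above can be assembled programmatically. The following is a minimal sketch using only the Python standard library; the function names are hypothetical, and the percentages parameter is deliberately left out because this page does not show how multiple percentage values are encoded in DataInputs.

```python
from urllib.parse import quote, urlencode

# Process identifier as given in the Execute request syntax on this page.
PROCESS_ID = "com.terradue.wps_hadoop.processes.ird.KernelDensity"


def build_data_inputs(species, from_date=None, to_date=None):
    """Join key=value pairs with ';', as in the request examples on this page."""
    if not species:
        raise ValueError("at least one species is required")
    pairs = [f"species={name}" for name in species]
    if from_date:
        pairs.append(f"fromDate={from_date}")
    if to_date:
        pairs.append(f"toDate={to_date}")
    return ";".join(pairs)


def build_execute_url(wps_host, species, from_date=None, to_date=None):
    """Return a percent-encoded WPS 1.0.0 Execute URL for the KDE process."""
    query = urlencode(
        {
            "Service": "WPS",
            "version": "1.0.0",
            "Request": "execute",
            "identifier": PROCESS_ID,
            "DataInputs": build_data_inputs(species, from_date, to_date),
        },
        quote_via=quote,  # encode spaces as %20 rather than '+'
    )
    return f"{wps_host}?{query}"


if __name__ == "__main__":
    print(build_execute_url(
        "http://wps01.i-marine.d4science.org/wps/WebProcessingService",
        ["Carcharodon carcharias"],
    ))
```

Percent-encoding the whole DataInputs value keeps the spaces in scientific names and the colons in ISO 8601 dates from breaking the query string.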
Wps Response
The KDE WPS process, like all wps-hadoop-streaming processes, returns a complex_data object, which is an XML document with this structure:
<streamingOutput>
  <algorithmName> algorithmName </algorithmName>
  <jobId> jobId </jobId>
  <executionResult>
    <inputData>
      <url> input data file url </url>
    </inputData>
    <outputData>
      <url> output data file url 1 </url>
      <url> output data file url 2 </url>
      ...
      <url> output data file url n </url>
    </outputData>
  </executionResult>
  <executionResult> ... </executionResult>
  <executionResult> ... </executionResult>
  ...
</streamingOutput>
In particular, the KDE process returns this output:
<streamingOutput>
  <algorithmName>kernelDensity</algorithmName>
  <jobId> jobId </jobId>
  <executionResult>
    <inputData>
      <url> input data file url (values of species name, fromDate, toDate, percentages) </url>
    </inputData>
    <outputData>
      <url> occurrence file url </url>
      <url> shapefile url (tar.gz) </url>
    </outputData>
  </executionResult>
  <executionResult> ... </executionResult>
  <executionResult> ... </executionResult>
  ...
</streamingOutput>
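A client can pull the per-species file URLs out of this structure with a few lines of code. The sketch below uses Python's standard xml.etree.ElementTree; the function name and the shape of the returned dictionary are illustrative choices, and it assumes the streamingOutput element carries no XML namespace, as in the examples on this page.

```python
import xml.etree.ElementTree as ET


def parse_streaming_output(xml_text):
    """Extract job metadata and per-execution URLs from a streamingOutput document.

    Accepts either a bare <streamingOutput> document or a larger document
    (such as an ExecuteResponse) that contains one.
    """
    root = ET.fromstring(xml_text)
    # iter() also yields the root itself, so both cases are covered.
    output = next(root.iter("streamingOutput"), None)
    if output is None:
        raise ValueError("no <streamingOutput> element found")

    executions = []
    for result in output.findall("executionResult"):
        executions.append({
            "input": [(u.text or "").strip()
                      for u in result.findall("inputData/url")],
            "outputs": [(u.text or "").strip()
                        for u in result.findall("outputData/url")],
        })
    return {
        "algorithm": output.findtext("algorithmName", default="").strip(),
        "job_id": output.findtext("jobId", default="").strip(),
        "executions": executions,
    }
```

For the KDE process, each entry in "executions" would then hold one input data URL and two output URLs: the occurrence file and the shapefile archive.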
Use cases
Base Example
Kernel Density of Carcharodon carcharias (White Shark), without date filter and with default set of contour percentages.
Request:
wps01.i-marine.d4science.org/wps/WebProcessingService
  ?Service=WPS
  &version=1.0.0
  &Request=execute
  &identifier=com.terradue.wps_hadoop.processes.ird.KernelDensity
  &dataInputs=
    species=Carcharodon carcharias;
Response:
<?xml version="1.0" encoding="UTF-8" ?>
<ns:ExecuteResponse xmlns:ns="http://www.opengis.net/wps/1.0.0"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://www.opengis.net/wps/1.0.0 http://schemas.opengis.net/wps/1.0.0/wpsExecute_response.xsd"
    serviceInstance="http://wps01.i-marine.d4science.org:80/wps/WebProcessingService?REQUEST=GetCapabilities&SERVICE=WPS"
    xml:lang="en-US" service="WPS" version="1.0.0">
  <ns:Process ns:processVersion="1.0.0">
    <ns1:Identifier xmlns:ns1="http://www.opengis.net/ows/1.1">com.terradue.wps_hadoop.processes.kernel_density.KernelDensity</ns1:Identifier>
    <ows:Title xmlns:wps="http://www.opengis.net/wps/1.0.0" xmlns:ows="http://www.opengis.net/ows/1.1" xmlns:xlink="http://www.w3.org/1999/xlink">Species Occurrences Kernel Density</ows:Title>
  </ns:Process>
  <ns:Status creationTime="2013-04-24T15:14:33.571+02:00">
    <ns:ProcessSucceeded>Process has succeeded</ns:ProcessSucceeded>
  </ns:Status>
  <ns:ProcessOutputs>
    <ns:Output>
      <ns1:Identifier xmlns:ns1="http://www.opengis.net/ows/1.1">result</ns1:Identifier>
      <ows:Title xmlns:wps="http://www.opengis.net/wps/1.0.0" xmlns:ows="http://www.opengis.net/ows/1.1" xmlns:xlink="http://www.w3.org/1999/xlink">result</ows:Title>
      <ns:Data>
        <ns:ComplexData mimeType="application/xml">
          <streamingOutput>
            <algorithmName>kernelDensity</algorithmName>
            <jobId>c95d212d-c7b5-490c-9ed0-e1696fe41c41</jobId>
            <executionResult>
              <inputData>
                <url>http://wps01.i-marine.d4science.org:80/wps/store/c95d212d-c7b5-490c-9ed0-e1696fe41c41/output/files/exec0/inputData.txt</url>
              </inputData>
              <outputData>
                <url>http://wps01.i-marine.d4science.org:80/wps/store/c95d212d-c7b5-490c-9ed0-e1696fe41c41/output/files/exec0/occ.csv</url>
                <url>http://wps01.i-marine.d4science.org:80/wps/store/c95d212d-c7b5-490c-9ed0-e1696fe41c41/output/files/exec0/output.tgz</url>
              </outputData>
            </executionResult>
          </streamingOutput>
        </ns:ComplexData>
      </ns:Data>
    </ns:Output>
  </ns:ProcessOutputs>
</ns:ExecuteResponse>
Input data contents:
species="Carcharodon carcharias", fromDate=, toDate=, percentages=25 50 75 80 90 95 98
Complex Example
Kernel Density of Amphiprion percula (Percula Clownfish), Thunnus atlanticus (Blackfin Tuna) and Architeuthis (Giant Squid), with a date range from 1990 to 2005 and the contour percentages 25%, 50%, 75% and 90%.
Request:
wps01.i-marine.d4science.org/wps/WebProcessingService
  ?Service=WPS
  &version=1.0.0
  &Request=execute
  &identifier=com.terradue.wps_hadoop.processes.ird.KernelDensity
  &dataInputs=
    species=Amphiprion percula;
    species=Thunnus atlanticus;
    species=Architeuthis;
    fromDate=1990-01-01T00:00Z;
    toDate=2005-01-01T00:00Z
Response:
<?xml version="1.0" encoding="UTF-8" ?>
<ns:ExecuteResponse xmlns:ns="http://www.opengis.net/wps/1.0.0"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://www.opengis.net/wps/1.0.0 http://schemas.opengis.net/wps/1.0.0/wpsExecute_response.xsd"
    serviceInstance="http://wps01.i-marine.d4science.org:80/wps/WebProcessingService?REQUEST=GetCapabilities&SERVICE=WPS"
    xml:lang="en-US" service="WPS" version="1.0.0">
  <ns:Process ns:processVersion="1.0.0">
    <ns1:Identifier xmlns:ns1="http://www.opengis.net/ows/1.1">com.terradue.wps_hadoop.processes.kernel_density.KernelDensity</ns1:Identifier>
    <ows:Title xmlns:wps="http://www.opengis.net/wps/1.0.0" xmlns:ows="http://www.opengis.net/ows/1.1" xmlns:xlink="http://www.w3.org/1999/xlink">Species Occurrences Kernel Density</ows:Title>
  </ns:Process>
  <ns:Status creationTime="2013-04-24T15:14:41.949+02:00">
    <ns:ProcessSucceeded>The service succesfully processed the request.</ns:ProcessSucceeded>
  </ns:Status>
  <ns:ProcessOutputs>
    <ns:Output>
      <ns1:Identifier xmlns:ns1="http://www.opengis.net/ows/1.1">result</ns1:Identifier>
      <ows:Title xmlns:wps="http://www.opengis.net/wps/1.0.0" xmlns:ows="http://www.opengis.net/ows/1.1" xmlns:xlink="http://www.w3.org/1999/xlink">result</ows:Title>
      <ns:Data>
        <ns:ComplexData mimeType="application/xml">
          <streamingOutput>
            <algorithmName>kernelDensity</algorithmName>
            <jobId>b5f1caa3-125b-4f0c-81a9-30dae0bb2873</jobId>
            <executionResult>
              <inputData>
                <url>http://wps01.i-marine.d4science.org:80/wps/store/b5f1caa3-125b-4f0c-81a9-30dae0bb2873/output/files/exec0/inputData.txt</url>
              </inputData>
              <outputData>
                <url>http://wps01.i-marine.d4science.org:80/wps/store/b5f1caa3-125b-4f0c-81a9-30dae0bb2873/output/files/exec0/occ.csv</url>
                <url>http://wps01.i-marine.d4science.org:80/wps/store/b5f1caa3-125b-4f0c-81a9-30dae0bb2873/output/files/exec0/output.tgz</url>
              </outputData>
            </executionResult>
            <executionResult>
              <inputData>
                <url>http://wps01.i-marine.d4science.org:80/wps/store/b5f1caa3-125b-4f0c-81a9-30dae0bb2873/output/files/exec1/inputData.txt</url>
              </inputData>
              <outputData>
                <url>http://wps01.i-marine.d4science.org:80/wps/store/b5f1caa3-125b-4f0c-81a9-30dae0bb2873/output/files/exec1/occ.csv</url>
                <url>http://wps01.i-marine.d4science.org:80/wps/store/b5f1caa3-125b-4f0c-81a9-30dae0bb2873/output/files/exec1/output.tgz</url>
              </outputData>
            </executionResult>
            <executionResult>
              <inputData>
                <url>http://wps01.i-marine.d4science.org:80/wps/store/b5f1caa3-125b-4f0c-81a9-30dae0bb2873/output/files/exec2/inputData.txt</url>
              </inputData>
              <outputData>
                <url>http://wps01.i-marine.d4science.org:80/wps/store/b5f1caa3-125b-4f0c-81a9-30dae0bb2873/output/files/exec2/occ.csv</url>
                <url>http://wps01.i-marine.d4science.org:80/wps/store/b5f1caa3-125b-4f0c-81a9-30dae0bb2873/output/files/exec2/output.tgz</url>
              </outputData>
            </executionResult>
          </streamingOutput>
        </ns:ComplexData>
      </ns:Data>
    </ns:Output>
  </ns:ProcessOutputs>
</ns:ExecuteResponse>
Input data contents (for the three executions, one per species):
species="Architeuthis", fromDate=1990-01-01T00:00Z, toDate=2005-01-01T00:00Z, percentages=25 50 75 90
species="Amphiprion percula", fromDate=1990-01-01T00:00Z, toDate=2005-01-01T00:00Z, percentages=25 50 75 90
species="Thunnus atlanticus", fromDate=1990-01-01T00:00Z, toDate=2005-01-01T00:00Z, percentages=25 50 75 90
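The input data lines shown in these use cases can be read back into structured values when checking what each execution actually received. The sketch below assumes the format is exactly as in the examples above (comma-separated key=value pairs, a double-quoted species name, space-separated percentages); the function name is hypothetical and the real file may contain variations not covered here.

```python
import re

# key="quoted value"  or  key=bare value up to the next comma
_FIELD = re.compile(r'(\w+)=(?:"([^"]*)"|([^,]*))')


def parse_input_data(line):
    """Parse one inputData line into a dict; empty values become None."""
    fields = {}
    for key, quoted, bare in _FIELD.findall(line):
        value = quoted if quoted else bare.strip()
        fields[key] = value or None
    if fields.get("percentages"):
        # Percentages are space-separated integers in the examples.
        fields["percentages"] = [int(p) for p in fields["percentages"].split()]
    return fields


if __name__ == "__main__":
    print(parse_input_data(
        'species="Architeuthis", fromDate=1990-01-01T00:00Z, '
        'toDate=2005-01-01T00:00Z, percentages=25 50 75 90'
    ))
```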