martes, 9 de abril de 2013

"Geoscripting" in SEXTANTE

My colleague Ilya Rosenfeld just published a few days ago a great post about Geoscript, a project that provides a wrapping of the main  GeoTools elements for handling spatial data in several scripting languages, making it easier to access the power of GeoTools.

I took the example data from that blog post (irradiance data provided by the National Renewable Energy Lab) and tried to replicate part of the work using the QGIS Python console and, of course, calling SEXTANTE algorithms from it. We will be using SEXTANTE algorithms from the console, and also writing a Python script and an R script, both of them to be wrapped as new SEXTANTE algorithms.

Let's start. The first thing is to open the dataset, that you can download here

You can use the usual menus in QGIS, but we can open it directly from the QGIS Python console, with something like this

>>>import sextante

Of course, you should replace the paths with the equivalent ones pointing to the folder were you have put the dataset shapefiles. Those are the paths in my system.

As you can see SEXTANTE provides commands to simplify some QGIS tasks (plus, others are already very simple in the QGIS console, so you have a very powerful environment for working with spatial data)

To select the Florida state, we can do the following


And the selection will appear in the QGIS canvas.

The help for the Select by attribute algorithm should explain the above sentence.


ALGORITHM: Select by attribute
LAYERNAME <ParameterVector>
 ATTRIBUTE <ParameterTableField from LAYERNAME>
    COMPARISON <ParameterSelection>
    COMPARISONVALUE <ParameterString>
    RESULT <OutputVector>

    0 - ==
    1 - !=
    2 - >
    3 - >=
    4 - <
    5 - <=
    6 - begins with
    7 - contains

Now that there is just one polygon selected, any operation performed by another SEXTANTE algorithm will use only that polygon instead of the whole layer. We can use that to cut the original layer with Direct Normal Irrdiance (DNI) values and get a layer with only those polygons within Florida. The Cut shapes algorithm will help us with this, and we can call it using the following code

>>>outCut = sextante.runalg("saga:cutshapeslayer", "solar_dni_polygons",1,"usa_l48",None,None)
Remember that, when run in the console, SEXTANTE algorithms do not load their results into the QGIS canvas, so you have to manually load them in case you need to. The resulting layer, if loaded, should look like this.

Let's work now on the data we have in our new layer. There is no algorithm to summarize all the values corresponding to those polygons into an average one that can be assigned to the whole Florida state, but we can create it rather easily. Open the SEXTANTE script editor and enter the following code.

##output=output vector

from sextante.core.SextanteVectorWriter import SextanteVectorWriter
from qgis.core import *
from PyQt4.QtCore import *

inputLayer = sextante.getobject(input)
features = sextante.getfeatures(inputLayer)
fields = inputLayer.pendingFields().toList()
outputLayer = SextanteVectorWriter(output, None, fields, QGis.WKBPoint,
count = 0
mean = [0 for field in fields]
x = 0
y = 0
for ft in features:
    c = ft.geometry().centroid().asPoint()   
    x += c.x()
    y += c.y()               
    attrs = ft.attributes()
    for f in range(len(fields)):
            mean[f] += float(attrs[f].toDouble()[0])
    count += 1   
if count != 0:
    mean = [value / count for value in mean]
    x /= count
    y /= count           
outFeat = QgsFeature()   
meanPoint = QgsPoint(x, y)
outFeat.setAttributes([QVariant(v) for v in mean])

This is a small script that creates a new layer with one point representing the features in the layer (placed in the 'centroid of centroids' of those input features), and attributes representing the average value.

Save it as in the scripts folder (remember that it is the defualt one when SEXTANTE show you the save file dialog). The name to refer to the algorithm is taken from the filename, in case you haven't provided one in the algorithm description (which we haven't done. We have just defined the group, not the algorihtm name), so this one will be named script:summarize

Now you can call it just double clicking on its name in the toolbox, and using the dialog that SEXTANTE will create based on your algorithm description.

However, you can also use the console to call it, and since we are mostly working on the console in this exercise, we will do it that way. Enter the following sentence to call the algorithm.

>>>outSummarize = sextante.runalg("script:summarize",outCut["CUT"],None)
The resulting layer has the following attributes table.

Notice how we are using the output of the previous algorithm as input for this one, with no need to load it in QGIS, and refering to it using the outSummarize dict returned when running the cutting algorithm
If we want to create a histogram, we have no tools in SEXTANTE to do so, at least not in our particular case in which the data is in a single row of the attributes table of a vector layer, and not in all its fields, but just a subset of them. However, it is not difficult to create an algorithm to do that, and we can rely on R to do it.

Open the SEXTANTE R editor and create a new algorithm with the following code.

barplot(as.matrix(layer@data[1, 4:15]))

Save it as monthly_barplot.rsx, and it will now be available to be called from all the SEXTANTE components and also from the console. Run it with the following code

>>>sextante.runalg("r:createmonthlybarplot", outSummarize["output"],None)
The output will be available at the SEXTANTE results viewer. The png file created by R is embedded in a simple HTML page, as usual.

Now let's automate things a bit and create a larger script that creates does the clipping, the summarize and the barplot creation, all in one shot.

Here you have it.

##barplot=output html
##dni barplot=name
sextante.runalg("qgis:selectbyattribute", states, "STATE", 0, state)
outCut = sextante.runalg("saga:cutshapeslayer", dni, 1, states, None, None)
outSummarize = sextante.runalg("script:summarize", outCut["CUT"], None)
sextante.runalg("r:createmonthlybarplot", outSummarize["output"], barplot)

Save it and you will see it in the toolbox, with the name dni barplot. This time we have added the name in the description header, so the name is not taken from the file. You can use any filename to save it

The algorithm has the name of the states attribute hardcoded,  but you can edit it to make it more flexible. Take this just as an example.

The interest of this script is not to use it once to replicate the same process we have already run, but to use it repeatedly to create similar barplots for other states easily.

I case you want to produce a barplot for each state, you can automate that with a few lines of Python code:

from improt getAttributeValues
layer = sextante.getobject("usa_l48")
states = sextante.getAttributeValues(layer, "STATE")
for state in states:
    sextante.runalg("script:barplot", "solar_dni_polygons","usa_l48", state, state + ".html")

We just get a list of all the states and call our new script algorithm repeatedly.

As you can see, using SEXTANTE on the Python console is a really powerfull tool. I will write more about this way of using SEXTANTE in some other posts that I'm currently preparing.