Friday, October 16, 2009

Using GRASS to fix topology

Started trying to use GRASS to solve some of the more difficult problems that I can't fix in PostGIS. Here's one of them. I've imported a shapefile of country borders and am trying to fix the topology. The screenshot shows a country border with a bunch of "x" marks that I *think* represent centroids. I'd like to take them out, and just have one, long, continuous line at the country border. Not sure how to do that yet.

Wednesday, October 14, 2009

My take on commercial vs. open source GIS

Was alerted to a posting on the LinkedIn OGC group that seriously chapped my hide. A person had asked for advice regarding whether they should consider Open Source as a viable alternative to commercial software. You can read the original post at this link.

Here's my response.

"GIS" is now a very broad topic, and means different things to different people. You didn't specify what type of work you needed to do, so I'll try to address a few major topics that I encounter routinely:
- data conversion and creation,
- spatial analysis,
- server based data storage and access,
- desktop visualization,
- paper map production,
- web-based static maps,
- web-based dynamic maps,
and finally a weird one,
- reprojection to and from local projections from around the world.

- Data conversion and creation:
Hands down, I prefer OS for this, although I admit that I have encountered difficulties with CAD data that was created in bleeding-edge commercial software. Usually this is because the OS development team has had to reverse engineer a new data format. I feel that there is far greater ability to work with data programmatically using OS tools, and in my experience these tools suffer fewer failures when working with very large data sets. Having said that, if all I ever did was convert data, all day, every day, I probably would buy FME by Safe Software. They support more formats than I'm even aware of, and make it (relatively) easy to set up automated conversion pipelines. I don't need that degree of interoperability, and I use GDAL for practically every task that falls in this category.

- Spatial Analysis:
I feel that OS can do most spatial analysis quite capably. I'm a huge fan of doing vector based spatial analysis using SQL in PostGIS. I do 90% of my analysis this way. For more complicated topological analysis, as well as raster analysis, GRASS works extremely well. Having said that, my hands down favorite tool for analysis remains ArcMap. I am still much, much, faster in ArcMap, and I find that it is much easier to see the results of a particular operation with it. However, I rarely use it, as I can't run it on my laptop, and we can't afford another license anyhow.

- Server-based data storage and access:
Unless you have already heavily invested in commercial systems, or require database storage of raster data, OS (PostGIS specifically) wins in my book. If your entire workflow is already based on using many ArcGIS-equipped workstations which all connect to an Oracle Enterprise DB, I would be hesitant about deciding to convert to OS. Not because OS can't do it (except for database raster storage), but because there will be so much work in converting both the workflow, and the storage system itself. I understand from others that this is getting better - ESRI now supports PostGIS via SDE for example - but it's not without its headaches. Database storage of rasters is essentially non-existent in OS. There are all sorts of "solutions" talked about, but none of them are great, and frankly it's a glaring hole in OS.

- Desktop visualization:
Depends. If you want to quickly see what a data set looks like, or style several layers together, OS is great. It meets all of my needs on a daily basis. Then again, I think the grayscale display of 32-bit data in OpenEV is fantastic. There definitely is a lack of refinement in this area, but I'm hesitant to say OS isn't as good, mostly because I wonder just how sophisticated it needs to be?

- Paper Map Production:
For the frequent production of "casual maps", I think commercial software is still much better. I say "casual" because if you're making professional quality maps, I doubt you're going to just use a commercial GIS package. More than likely, you are going to invest in a commercial graphics application as well. For "casual" paper maps, it's a lot easier to use ArcGIS. Most of the time the results look pretty good, although the PDF engine is terrible and anything can happen. Personally, I use MapServer and GIMP to create paper maps with OS tools. I think they look pretty good, but they take more work than they should.

- Static web maps:
You're absolutely silly if you use anything but one of the OS tools for this. They can create lovely images - better in some cases than commercial products - and the only expense is the setup and configuration time.

- Dynamic and "slippy" web maps:
And now we get to what has everyone up in arms these days - everyone wants a mashup. Everyone is getting into this market, and everyone touts the capabilities of their solution at the expense of all others. I think that again it depends on what your goals are. If you want to create a web-based tool for spatial analysis, then I think the commercial offerings provide more "all in one" capability. If on the other hand, your goal is to display single or multiple styled layers, to allow for feature Identification and Selection (ESRI definition intended) - either by map click or attribute search, then there is no reason to use commercial software. In fact, I actually think that for these situations OS can both look and perform better. There are even OS base maps now that rival any of the commercial ones. I think there is a HUGE (note use of CAPS) market for, "3 vector layers and one image" on a web map, and I'm ecstatic to hear that a major vendor of commercial software doesn't think so.

- Reprojection support:
This frankly is an edge category, but it's caused me problems numerous times. I suspect most people won't care though. OS software supports hundreds of common (and some not so common) projections via the PROJ library. Not only that, but it is easy to create your own custom projection as well, to display a specific area exactly how you'd like. However, it does not handle reprojection of certain ones very well, especially those which use old, grid-based datums. Commercial software definitely still does a far better job of this.

Finally, a couple general observations about OS vs. commercial software. One of the major differences between commercial and OS applications is that in general OS applications do not try to address every conceivable situation you might possibly encounter in your work. Instead, they tend to focus on doing a few things extremely well, then rely on the ability to access other tools when the need arises. For example, Quantum GIS does a good job of displaying raster and vector layers, but relies on the ability to access GRASS in order to do more complex spatial analysis. If you are looking for a single solution for everything, commercial software with its multitude of extensions might be for you.

Support is also a good topic worth mentioning. In my experience, the people who like and insist on support contracts (aside from the vendors, of course), are the ones who never actually use the software. It's the ones buying the software that want it, and they view it as insurance that if something goes wrong with their very expensive purchase, they will get support. And usually there is a Service Level Agreement (SLA) that specifies they will get support in a timely manner, the amount of time depending on whether they are "Gold" or "Platinum" contract holders. But an SLA doesn't guarantee the quality of support - it can't. It merely ensures that the customer's problem will be dealt with in whatever support process the vendor has in place. Some companies are great, some are not, and it has nothing to do with how much money you paid for your support contract. In general the support given to OS applications by the developers and users of the product is as good, OR BETTER (note use of CAPS for emphasis) than any I have received via contract. Support extends beyond just staffing a help desk and charging a monthly maintenance/support fee though. It includes documentation of the product, and rapid bug fixes for critical problems.

Documentation for OS applications can sometimes be tough to find, and the quality is variable. There definitely is documentation out there, but sometimes it takes more effort than it should to find, and then it's not always easily digestible. You won't find the Help system that's in ArcGIS, that's for sure. But you will find many well-written wikis that cover most topics, and very responsive user lists to answer specific questions. Bug fixes and feature enhancements are far more rapid in the OS world, and tend to be in response to problems and requests reported by users. I think it's a function of the development process that enables this. The bug tracking systems are transparent, and at any time you can go take a look at what's being done to resolve your problem, or to see what problems exist in a specific version. And let's not forget that "money talks" in the OS world as well. You absolutely need to be able to connect to that new/old/weird/unsupported database with your OS app, and you need it done NOW, and are willing to pay $5000 for the privilege? There's a really good chance someone in the development community will do it for you.

So, to wrap this thing up, Open Source GIS software is just like anything else in life: it has both good and bad points. The same is true of commercial software. I don't think you can generalize across the entire category of uses with simple blanket statements. Identify what specific uses you need to address, then ask the question again. You'll probably get different answers for each one.

Best of luck.

Sunday, June 28, 2009

Neanderthal Woodworking is a Success

Did a wee bit of yard work yesterday to get rid of some old logs and large branches that we had inherited from the previous owner. Gave me an opportunity to use some less modern tools and I was very happy with how well they worked.

First in use was the frame-saw made from some of Jim's left-over maple and a $4 saw blade from the Depot. Made short work of the half rotten birch branches we had lying around. Saw will need to be fine-tuned by making the arms shorter above the cross piece, giving them some nice curves for comfort and aesthetics, and slightly modifying the blade attachment so that the blade can't cock itself sideways in the arms.



Next up was the wooden wedge I used to help split some stumps. I had driven a couple cold chisels into this stump in an attempt to split it, and had only succeeded in getting a 1/4 inch crack. In one of my woodworking books, Roy Underhill talks about using wooden wedges to split trees, so I grabbed a piece of hard something, cut it into a wedge shape and rammed it into the crack. Worked like a charm.

Sunday, May 10, 2009

Why does Twitter suck so badly?

Kind of ironic that the reason I'm writing about Twitter's lameness is because I want to read some of my friends' tweets and post a few of my own. See, a few months ago I signed up for an account, decided I wasn't really all that thrilled with how Twitter was broadcasting all my private messages to specific individuals to every single person who had subscribed to my feed, and canceled it. After a while I kinda started to miss the connectivity it gave me to people that I don't see very often, and last week I decided to try and reactivate my account. Only it's not that simple.

I logged into the Twitter site with my old user-name and password and was told, "Hi Roger, click here if you want to reactivate your account." I did, received an email that said, "Click on this link to reactivate your account." and proceeded to be unimpressed. The link took me to the login page, with a banner that told me it couldn't verify that I had requested an account restoration. WTF? I mean, they sent me the confirmation email, right?

Frankly, I really have a hard time understanding how a site like Twitter can have such piss-poor account management. I mean, stuff like only being able to link an email address or mobile phone number to a single account is weak. Combined with the inability to delete and reactivate your account, it means that you essentially sign up with a single account once, for life.

Google, please buy Twitter out, or provide similar functionality via Gmail chat, so that we don't have to keep dealing with idiots who can't manage a database.

Friday, April 10, 2009

Subquery of subquery PostGIS SQL

Had to create some polygons that represented the dissolved and buffered outlines of a region, that were then clipped to an artificial extent. Came up with the following SQL, which still makes me scratch my head when I look at it.

SELECT ST_Intersection(c1.diss_buff_geom, m.the_geom) AS intersection_geom, 
        c1.name AS name
    FROM (SELECT geomunion(the_geom) AS diss_buff_geom, name
        FROM (SELECT name, ST_Buffer((ST_Dump(the_geom)).geom, .12) AS the_geom 
            FROM coastlines) c
        WHERE c.name LIKE 'India' GROUP BY c.name) c1, 
            global_land_mask_v5 m
        WHERE ST_Intersects(c1.diss_buff_geom, m.the_geom) 
            AND c1.name = 'India';

Honestly... still hurts to look at this now.

Monday, April 6, 2009

Working with GDAL in C

Baby steps. Find the "ogr_capi_test.c" test program and compile it. Comes with GDAL library and is found in $GDAL_ROOT/ogr.

Compile with these flags: "gcc -g ogr_capi_test.c `gdal-config --libs` `gdal-config --cflags` -o ogr_capi_test"

Wednesday, March 25, 2009

Problem with PostGIS ST_Buffer

Seems like I'm not getting back all of the geometry from a query.

select name, Buffer(the_geom, .12) from coastlines where iso_3_code = 'PNG';



After a bit of screwing around, the solution I found is to blow up the MULTIPOLYGON that is Papua New Guinea with ST_Dump, then ST_Buffer each of the individual polygons, and finally return those buffered polygons as a subquery result to the geomunion function. Like this:

SELECT c.name, geomunion(the_geom) FROM (SELECT name, ST_Buffer((ST_Dump(the_geom)).geom, .12) AS the_geom FROM coastlines) c WHERE c.name LIKE 'Papua%' GROUP BY c.name;

The results look like this:


Works, but it doesn't seem like this should be necessary.

Sunday, March 22, 2009

Mapserver does Dynamic Charting

Pie and bar charts from attribute data. Who knew?

http://mapserver.org/output/dynamic_charting.html

Wednesday, March 4, 2009

How to avoid partial labels with Mapserver and Tilecache

Found this in a Mapserver list message. Authored by Thomas Bonfort, so ought to be accurate.

To summarize, here is the way to *completely* avoid truncated labels
and edge artifacts:

* use a 10 pixel metabuffer in your tilecache config (the number of
metatiles is irrelevant, but as Chris points out, there'll be more
labels included if you use a 3x3 (or more) metatiling scheme than a
1x1 one):

metaTile=true
metaSize=3,3
metaBuffer=10

* set a 10 pixel edge buffer in mapserver (so no labels are rendered
in the 10 pixels on the edges of the image) :
WEB
METADATA
labelcache_map_edge_buffer "-10"
END
END

* use PARTIALS FALSE in all your label blocks
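
To see why the two 10 pixel numbers line up, here is a rough sketch of the metatile arithmetic in Python. It assumes 256x256 tiles and a buffer added on every side of the metatile; the function names are made up for illustration and are not part of TileCache, and rows are counted from the top of the image purely as a convention for this sketch.

```python
def metatile_request_size(tile_size=(256, 256), meta_size=(3, 3), meta_buffer=10):
    """Pixel size of the image requested from MapServer for one metatile."""
    width = tile_size[0] * meta_size[0] + 2 * meta_buffer
    height = tile_size[1] * meta_size[1] + 2 * meta_buffer
    return width, height


def tile_crop_box(col, row, tile_size=(256, 256), meta_buffer=10):
    """Pixel box (left, upper, right, lower) of one tile inside the metatile image."""
    left = meta_buffer + col * tile_size[0]
    upper = meta_buffer + row * tile_size[1]
    return left, upper, left + tile_size[0], upper + tile_size[1]


print(metatile_request_size())   # (788, 788)
print(tile_crop_box(0, 0))       # (10, 10, 266, 266)
print(tile_crop_box(2, 2))       # (522, 522, 778, 778)
```

Under those assumptions the outer 10 pixels of the rendered image are thrown away when the tiles are cut, and that is the same strip that labelcache_map_edge_buffer keeps labels out of.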

Monday, March 2, 2009

Continuous Build System head-bashing

Not really sure why I agreed to help set up our continuous build system, but here I am. Had initially started to look at CruiseControl, and got it to do 80% of what we needed. However, due to its Java-centric nature, getting the last 20% was proving to be a bear. And then Tomcat started to be flaky and that was the straw that broke the camel's back.

Took a look at Buildbot today. It seems like a much better option for us, as it is Python based and has very few dependencies. Unfortunately, the documentation isn't very good, and although I have the skeleton of a system built, I still haven't figured out how to do a basic SVN checkout and build with it.

Reminds me of a conversation I recently had with some GIS professionals who had tried to do something with open source applications. Every single one of them complained about how many of the training resources they had looked at contained "8 different ways to do the same thing". They said they would have preferred being shown one, solid way of doing something, and then building upon that example.

Seems like a good idea to me.

9 - 28 Feb 2009

Rode to work last week in a steady, driving rain for the first time. Had purchased some fenders and clipless pedals over the weekend, and this was a good test of the new gear. Everything worked well, with the exception of some slight rubbing from the front fender when turning and going over bumps. Should be easy to get that adjusted properly. Did find out why everyone seems to have add-on mud flaps though. The spray from the front tire comes off at the perfect height to get all over your shoes. Heard that the way to deal with that is to use the plastic from a cut up water bottle. Will have to try that this weekend.

Also deleted my Twitter account last week. This is the 2nd time I've done that, and this time I think it's final. While I think the idea of sending out frequent status messages is neat, it implies a certain level of ego that I'm not willing to maintain. I'm sure someone out there cares about what I have to say, but I'm not sure I want them to read about it on Twitter. So I'm going to try the "blog thing" and will see how that works instead.

Wednesday, February 25, 2009

Raster masking with gdal_rasterize and PostGIS

Clips an elevation file of Poland to the outlines of the country, plus .12 degrees. Gives a nicely buffered edge.

gdal_rasterize -b 1 -i -burn -9999 PG:"host=localhost dbname=unep" -sql "SELECT ST_Buffer(the_geom, .12) FROM coastlines WHERE name='Poland'" Poland_elev.tif
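
For anyone wondering what the -i and -burn flags are doing, here is a toy version of the idea in plain Python. It is only a sketch, not how GDAL implements it: the grid is a list of lists, a cell counts as inside when its centre falls in the polygon, and the names `point_in_polygon` and `inverse_burn` are invented for this example.

```python
NODATA = -9999


def point_in_polygon(x, y, ring):
    """Ray-casting test; ring is a list of (x, y) vertices, closed implicitly."""
    inside = False
    j = len(ring) - 1
    for i in range(len(ring)):
        xi, yi = ring[i]
        xj, yj = ring[j]
        # Count edges that a horizontal ray from (x, y) towards +x crosses.
        if (yi > y) != (yj > y) and x < (xj - xi) * (y - yi) / (yj - yi) + xi:
            inside = not inside
        j = i
    return inside


def inverse_burn(grid, ring, burn_value=NODATA):
    """Copy of grid with every cell whose centre lies outside ring set to burn_value.

    Cell (row, col) is assumed to have its centre at (col + 0.5, row + 0.5).
    """
    return [
        [value if point_in_polygon(col + 0.5, row + 0.5, ring) else burn_value
         for col, value in enumerate(line)]
        for row, line in enumerate(grid)
    ]


elevation = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
outline = [(1, 1), (3, 1), (3, 3), (1, 3)]
for line in inverse_burn(elevation, outline):
    print(line)
```

Only the four cells in the middle keep their values; everything outside the outline becomes -9999, which is the effect the command above has on the elevation file.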

Tuesday, February 17, 2009

Python Douglas-Peucker algorithm

http://mappinghacks.com/code/dp.py.txt
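
For quick reference, a minimal sketch of the algorithm in plain Python. This is not the code at the link above, just the standard recursive version as it is usually described; the function names are made up for illustration, and distance is measured to the infinite line through the two end points rather than to the segment.

```python
import math


def perpendicular_distance(point, start, end):
    """Distance from point to the line through start and end."""
    (x, y), (x1, y1), (x2, y2) = point, start, end
    dx, dy = x2 - x1, y2 - y1
    if dx == 0 and dy == 0:
        # Degenerate line: fall back to the distance to the single point.
        return math.hypot(x - x1, y - y1)
    return abs(dy * (x - x1) - dx * (y - y1)) / math.hypot(dx, dy)


def simplify(points, tolerance):
    """Return a simplified copy of points, always keeping the two end points."""
    if len(points) < 3:
        return list(points)
    # Find the interior vertex farthest from the line joining the end points.
    max_dist, index = 0.0, 0
    for i in range(1, len(points) - 1):
        dist = perpendicular_distance(points[i], points[0], points[-1])
        if dist > max_dist:
            max_dist, index = dist, i
    if max_dist <= tolerance:
        # Everything in between is close enough to the line to drop.
        return [points[0], points[-1]]
    # Otherwise keep that vertex and simplify each half separately.
    left = simplify(points[:index + 1], tolerance)
    right = simplify(points[index:], tolerance)
    return left[:-1] + right


line = [(0, 0), (1, 0.05), (2, 0), (3, 2), (4, 0)]
print(simplify(line, 0.1))   # [(0, 0), (2, 0), (3, 2), (4, 0)]
```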

Mapserver read NetCDF

OUTPUTFORMAT
NAME GEOTIFF_FLOAT
DRIVER "GDAL/GTiff"
MIMETYPE "image/tiff"
IMAGEMODE FLOAT32
EXTENSION "tif"
END

LAYER
NAME "mynetcdf"
STATUS OFF
TYPE RASTER
DUMP TRUE
DATA "mydata.nc"
METADATA
wcs_label "Test netCDF Server"
ows_extent '-0.5625 -89.69761276245117 359.4375 89.69761276245117'
wcs_resolution '1.125 -1.1212201595306397'
ows_srs "EPSG:4326"
wcs_formats "GEOTIFF_FLOAT"
wcs_nativeformat "netCDF"
wcs_bandcount "27"
wcs_rangeset_axes "bands"
wcs_rangeset_label "Atmospheric Levels"
wcs_rangeset_name "bands"
END
END

See also: http://mapserver.org/ogc/wcs_format.html#netcdf
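
A quick sanity check on the ows_extent and wcs_resolution values above: dividing the extent by the resolution should give a whole number of columns and rows. A small Python sketch (the helper name `grid_shape` is made up for this note):

```python
def grid_shape(extent, resolution):
    """Columns and rows implied by an extent (minx, miny, maxx, maxy) and a cell size."""
    minx, miny, maxx, maxy = extent
    xres, yres = resolution
    cols = round((maxx - minx) / abs(xres))
    rows = round((maxy - miny) / abs(yres))
    return cols, rows


extent = (-0.5625, -89.69761276245117, 359.4375, 89.69761276245117)
resolution = (1.125, -1.1212201595306397)
print(grid_shape(extent, resolution))   # (320, 160)
```

The half-cell offset on the west edge (-0.5625 is half of 1.125) suggests the extent describes cell edges around centres that start at 0 degrees, though that is an inference from the numbers rather than something checked against the file.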