attribute: Phillie Casablanca

Category: GIS (1)

Understanding raster, basic GIS concepts and the python gdal library

I've been learning about how to handle *raster* files and getting comfortable with the python libraries that are available to manipulate them.
(I'm still learning, but hopefully this will help someone like me who is just getting started)

First thing you need to know about common *decimal degree* latitude/longitude notation is that it is **NOT** a linear representation, but *angular* (defining a specific point on a sphere-like shape). So performing `cartisian coordinate <>`_ (where the distance from (1,1) to (1,2) and the distance from (1,1) to (1,0) is an equal distance) operations is going to give you very accurate results.

Next, the common *decimal degree* latitude/longitude values you see use a specific **geographic coordinate system** (Commonly *WGS84*). In addition to **geographic**, **projected coordinate systems** are also used.

This quote from the `OGR Projections Tutorial <>`_ helped me to understand what's going on here:

>> There are two primary kinds of coordinate systems. The first is **geographic** (positions are measured in *long/lat*) and the second is **projected** (such as UTM - positions are measured in *meters* or *feet*).

>> A **projected** coordinate system (such as UTM, Lambert Conformal Conic, etc) requires an underlying **geographic** coordinate system as well as a definition for the *projection transform* used to translate between linear positions (in meters or feet) and angular long/lat positions.

In my examples here I'll be using **UTM** for the *projected* coordinate system and **WGS84** for the *geographic*. I'll be calling the "projection transform", *geo_transform* in my code below.

.. note::

UTM is split into multiple *zones* and may be difficult to work with in places where you will be spanning zones. The "Google" projection, *900913*, may be better for simple manipulations.

First some simple coordinate conversion. When doing manipulation of data you're going to need to do some conversions, here's how to do it with the python `GDAL <>`_ package:

import osr

def transform_utm_to_wgs84(easting, northing, zone):
utm_coordinate_system = osr.SpatialReference()
utm_coordinate_system.SetWellKnownGeogCS("WGS84") # Set geographic coordinate system to handle lat/lon
is_northern = northing > 0
utm_coordinate_system.SetUTM(zone, is_northern)

wgs84_coordinate_system = utm_coordinate_system.CloneGeogCS() # Clone ONLY the geographic coordinate system

# create transform component
utm_to_wgs84_geo_transform = osr.CoordinateTransformation(utm_coordinate_system, wgs84_coordinate_system) # (, )
return utm_to_wgs84_geo_transform.TransformPoint(easting, northing, 0) # returns lon, lat, altitude

def transform_wgs84_to_utm(lon, lat):
def get_utm_zone(longitude):
return (int(1+(longitude+180.0)/6.0))

def is_northern(latitude):
Determines if given latitude is a northern for UTM
if (latitude < 0.0):
return 0
return 1

utm_coordinate_system = osr.SpatialReference()
utm_coordinate_system.SetWellKnownGeogCS("WGS84") # Set geographic coordinate system to handle lat/lon
utm_coordinate_system.SetUTM(get_utm_zone(lon), is_northern(lat))

wgs84_coordinate_system = utm_coordinate_system.CloneGeogCS() # Clone ONLY the geographic coordinate system

# create transform component
wgs84_to_utm_geo_transform = osr.CoordinateTransformation(wgs84_coordinate_system, utm_coordinate_system) # (, )
return wgs84_to_utm_geo_transform.TransformPoint(lon, lat, 0) # returns easting, northing, altitude

As you may have noticed, *easting* and *northing* are the *UTM* terms for *x*, and *y*.

Now, why would you want to use *rasters*? Well, rasters are used as a way to aggregate data in a given area. I've commonly called this "*binning*", but in raster terms the "*bin*" elements of a raster are refered to as a "**cell**" or "**pixel**". The *raster* structure/object model in gdal is like this:

- A *raster* consists of a *dataset*.

- A *dataset* consists of one or more *bands*.

- A *band* contains multiple *cells*/*pixels*/*bins*.

- A *cell*/*pixel*/*bin* is a location with a *value*.

Now for the code. I think there may be a good way to create a *RasterBuilder* class to handle the raster functions, but I'll leave that to you.


import numpy
from osgeo import gdal
from osgeo import osr
gdal.AllRegister() # register all drivers
from shapely.geometry import MultiPoint # used to get extend/bounds

def get_point_bounds(points):
mp_points = MultiPoint(points) #from shapely.geometry import MultiPoint
(min_lon, min_lat, max_lon, max_lat) = mp_points.bounds
return min_lon, min_lat, max_lon, max_lat

def get_raster_size(minx, miny, maxx, maxy, cell_width, cell_height):
Determine the number of rows/columns given the bounds of the point data and the desired cell size
cols = int((maxx - minx) / cell_width)
rows = int((maxy - miny) / abs(cell_height))
return cols, rows

def lonlat_to_pixel(lon, lat, inverse_geo_transform):
Translates the given lon, lat to the grid pixel coordinates in data array (zero start)
# transform to utm
utm_x, utm_y, utm_z = transform_wgs84_to_utm(lon, lat)

# apply inverse geo tranform
pixel_x, pixel_y = gdal.ApplyGeoTransform(inverse_geo_transform, utm_x, utm_y)
pixel_x = int(pixel_x) - 1 # adjust to 0 start for array
pixel_y = int(pixel_y) - 1 # adjust to 0 start for array

return pixel_x, abs(pixel_y) # y pixel is likly a negative value given geo_transform

def create_raster(points, filename="test.tiff"):
points == { (lon, lat): value, ...}
lon/lat values in WGS84
# create empty raster
driver = gdal.GetDriverByName(OUTPUT_FORMAT)
number_of_bands = 1
band_type = gdal.GDT_Float32

x_rotation = 0
y_rotation = 0
cell_width_meters = 50.0
cell_height_meters = 50.0

(min_lon, min_lat, max_lon, max_lat) = get_point_bounds(points) # retrieve bounds for point data

srs = osr.SpatialReference()
srs.SetWellKnownGeogCS("WGS84") # Set geographic coordinate system to handle lat/lon
srs.SetUTM( get_utm_zone(min_lon), is_northern(max_lat)) # Set projected coordinate system to handle meters

# create transforms for point conversion
wgs84_coordinate_system = srs.CloneGeogCS() # clone only the geographic coordinate system

wgs84_to_utm_transform = osr.CoordinateTransformation(wgs84_coordinate_system, srs)
utm_to_wgs84_transform = osr.CoordinateTransformation(srs, wgs84_coordinate_system)

# convert to UTM
top_left_x, top_left_y, z = transform_wgs84_to_utm(min_lon, max_lat)
lower_right_x, lower_right_y, z = transform_wgs84_to_utm(max_lon, min_lat)

cols, rows = get_raster_size(top_left_x,
dataset = driver.Create(filename, cols, rows, number_of_bands, band_type)

# GeoTransform parameters
# --> need to know the area that will be covered to define the geo tranform
# top left x, w-e pixel resolution, rotation, top left y, rotation, n-s pixel resolution
geo_transform = [ top_left_x, cell_width_meters, x_rotation, top_left_y, y_rotation, -cell_height_meters ] # cell height must be negative (-) to apply image space to map
inverse_geo_transform = gdal.InvGeoTransform(self.geo_transform)[1] # for mapping lat/lon to pixel

# get the empty raster data array
band = dataset.GetRasterBand(1) # 1 == band index value
data = band.ReadAsArray(0, 0, cols, rows).astype(numpy.cfloat)

# populate array values for output
for lon, lat in points.keys():
value = points[(lon, lat)]
# apply value to array
pixel_x, pixel_y = lonlat_to_pixel(lon, lat, inverse_geo_transform)
data[pixel_x][pixel_y] = value

# write the updated data array to file
band.WriteArray(data, 0, 0)

# set dataset to None to "close" file
dataset = None

monkut // May 2, 2012 // 6:15 p.m.