Python Libraries for Geospatial


I recommend learning Python for geospatial development. Python has a rich ecosystem of geospatial libraries and tools that provide robust functionality for working with geospatial data, including reading and writing different file formats, spatial analysis, geoprocessing, and visualization. It is also a versatile language whose use is not limited to geospatial analysis: you can integrate geospatial data with other data sources, perform statistical analysis, create interactive visualizations, build web applications, and automate tasks.

Python also has a vibrant and active open-source community. Many geospatial libraries are open source, so they are continually developed, maintained, and improved by a community of contributors, which keeps them up to date with bug fixes and new features. The learning curve is relatively gentle compared with languages such as Java and C++. The syntax is straightforward and readable, which makes it accessible for beginners, and there is a wealth of learning resources, including tutorials, documentation, and online communities, to help you get started and advance your skills.

Geospatial and GIS (Geographic Information System) skills are in demand across various industries, including urban planning, environmental management, transportation, and agriculture. Python is frequently used in these domains, so learning Python for geospatial applications can enhance your employability and open up job opportunities. This post highlights libraries that are worth learning for many geospatial development tasks, and it is recommended for readers who already have some understanding of Python.

GDAL/OGR

GDAL (Geospatial Data Abstraction Library) and OGR are two open-source libraries for working with geographic data; OGR is distributed as part of GDAL. Many software programs, including GIS applications, web mapping programs, and data processing tools, rely on them, as do open-source projects such as QGIS, PostGIS, and MapServer.


GDAL is a translator library for raster and vector geospatial data formats. It presents a single raster abstract data model and a single vector abstract data model to the calling program for all supported formats. It also includes a number of helpful command-line tools for processing and translating data.


OGR is used to read and write vector geographic data. Numerous vector file types, including ESRI Shapefiles, GeoJSON, and GML, are supported.

Some key points on GDAL:

  • Geospatial raster datasets can be read, written to, and modified using the functions provided by GDAL, which concentrates on raster data.
  • Numerous raster data formats, including GeoTIFF, ERDAS IMG, NetCDF, and others, are supported.
  • Raster data processing functions such as georeferencing, resampling, warping, mosaicking, and reprojecting are all provided by GDAL.
  • It has capabilities for retrieving details about raster datasets, such as band statistics, metadata, and georeferencing data.
  • For raster datasets, GDAL can also carry out tasks like rescaling, color mapping, and format conversions.

Some key points on OGR include:

  • OGR is primarily concerned with vector data and offers tools for reading, creating, and modifying geographic vector datasets.

  • It is compatible with a wide range of vector data formats, including ESRI Shapefile, GeoJSON, KML, GML, and many more.

  • Operations on vector data, including searching, filtering, editing, and spatial analysis, are made possible by OGR.

  • On vector geometries, it supports spatial operations including intersection, union, difference, buffer, and simplification.

  • OGR may run attribute queries, change attribute values, and retrieve attribute data related to vector features.

  • Vector datasets can be transformed and reprojected using OGR between several coordinate reference systems.

Code sample: Reading raster data

from osgeo import gdal

dataset = gdal.Open("mydata.tif")

Code sample: Writing raster data

gdal.Translate("out.tif", dataset)

Code sample: Reprojecting a raster to a different coordinate system (refer to coordinate system EPSG codes)

gdal.Warp("out.tif", dataset, dstSRS="EPSG:4326")

Code sample: Raster resampling

gdal.Warp("out.tif", dataset, xRes=10, yRes=10, resampleAlg="bilinear")

Code sample: Computing raster statistics

band = dataset.GetRasterBand(1)
min_val, max_val, mean, stddev = band.ComputeStatistics(False)  # False = exact, not approximate

Code sample: Using OGR to read a vector dataset and create a new one

from osgeo import ogr

# Open a vector file
data_path = 'path/to/dataset.shp'
dataset = ogr.Open(data_path)

# Get the layer in the vector file
layer = dataset.GetLayer(0)

# Get the layer's spatial reference
spatial_ref = layer.GetSpatialRef()

# Get the number of features in the layer
feature_count = layer.GetFeatureCount()

# Iterate over the features in the layer
for feature in layer:
    # Get geometry of the feature
    geometry = feature.GetGeometryRef()

    # Get attribute values of the feature
    attribute_value = feature.GetField('attribute_name')

    # Perform operations on the geometry

# Create a new vector file
out_data = 'path/to/out.shp'
driver = ogr.GetDriverByName('ESRI Shapefile')
new_dataset = driver.CreateDataSource(out_data)
new_layer = new_dataset.CreateLayer('new_layer', srs=spatial_ref, geom_type=ogr.wkbPoint)

# Define attributes for the new layer
field_defn = ogr.FieldDefn('attribute_name', ogr.OFTString)
new_layer.CreateField(field_defn)

# Create a new feature and set its geometry and attributes
new_feature = ogr.Feature(new_layer.GetLayerDefn())
new_feature.SetGeometry(ogr.CreateGeometryFromWkt('POINT(0 0)'))
new_feature.SetField('attribute_name', 'value')

# Add the feature to the new layer
new_layer.CreateFeature(new_feature)

# Close the datasets
dataset = None
new_dataset = None

The above examples are just a few of the many functionalities that GDAL/OGR provide. Generally, OGR concentrates on vector data and GDAL on raster data, and the two are designed to work together, so they can manage geographic data that includes both raster and vector components. The GDAL/OGR project is a popular choice in the geospatial world because it offers a consistent and comprehensive set of tools for processing geographic data. The GDAL website maintains a list of software that uses GDAL.

GeoPandas

GeoPandas is an open-source package that makes working with geospatial data in Python easier by extending pandas (a data manipulation toolkit) with support for spatial data. Geospatial data is data with a spatial component, such as points, lines, or polygons representing geographic features like cities, highways, or administrative boundaries.


GeoPandas extends the data manipulation and analysis functionalities of pandas by adding spatial operations and geometry handling capabilities. It leverages other powerful libraries, such as Shapely for geometry operations and Fiona for file input/output, to provide a comprehensive geospatial data processing toolkit. Some key features of GeoPandas include:

  • Geometry Operations: GeoPandas enables you to perform various geometric operations, such as calculating areas, lengths, centroids, buffer zones, spatial intersections, and unions, among others.
  • Coordinate Reference Systems (CRS): GeoPandas allows you to work with different coordinate reference systems, which define the spatial reference framework for the data. It supports CRS transformation, projection, and on-the-fly reprojection of geometries.
  • Visualization: GeoPandas integrates well with other Python visualization libraries, such as Matplotlib and Seaborn, allowing you to create maps and visualize geospatial data with ease.
  • Data Structures: GeoPandas introduces two new data structures, GeoSeries and GeoDataFrame, which are extensions of the pandas Series and DataFrame, respectively. These structures allow you to store and manipulate geospatial data along with associated attribute data.
  • File I/O: Shapefile, GeoJSON, GML, and many other geographic data file formats are all supported by GeoPandas for reading and writing. It offers a simple method for reading and writing geographic data to and from various formats using syntax that is similar to that of pandas.
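
The features listed above can be tried without any input files. The following is a minimal sketch, assuming two made-up rectangular parcels in EPSG:3857; the column names and coordinates are illustrative only.

```python
import geopandas as gpd
from shapely.geometry import Polygon

# Two made-up parcels in a projected CRS (EPSG:3857, coordinates in metres)
parcels = gpd.GeoDataFrame(
    {"name": ["A", "B"]},
    geometry=[
        Polygon([(0, 0), (100, 0), (100, 100), (0, 100)]),
        Polygon([(200, 0), (400, 0), (400, 100), (200, 100)]),
    ],
    crs="EPSG:3857",
)

# Geometry operations are exposed as vectorised attributes and methods
parcels["area_m2"] = parcels.geometry.area
centroids = parcels.geometry.centroid
buffered = parcels.geometry.buffer(10)

# Reproject to geographic coordinates (WGS84)
parcels_wgs84 = parcels.to_crs("EPSG:4326")

print(parcels[["name", "area_m2"]])
print(parcels_wgs84.crs)
```

The GeoDataFrame behaves like a regular pandas DataFrame, so the new area_m2 column can be filtered, sorted, or grouped in the usual way.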

Numerous geospatial analysis tasks, such as spatial joins, spatial queries, data aggregation, data cleaning, and exploratory analysis, can be carried out with GeoPandas. Because it builds on the wider Python data ecosystem, it is widely used to analyze and display geographic data in domains including geography, urban planning, environmental sciences, and data science.
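
As a rough illustration of a spatial join followed by aggregation, here is a small sketch with hypothetical districts and point observations built in memory. The names districts, observations, and value are assumptions made for the example.

```python
import geopandas as gpd
from shapely.geometry import Point, Polygon

# Hypothetical districts and point observations sharing one CRS
districts = gpd.GeoDataFrame(
    {"district": ["west", "east"]},
    geometry=[
        Polygon([(0, 0), (5, 0), (5, 5), (0, 5)]),
        Polygon([(5, 0), (10, 0), (10, 5), (5, 5)]),
    ],
    crs="EPSG:4326",
)
observations = gpd.GeoDataFrame(
    {"value": [10, 20, 30]},
    geometry=[Point(1, 1), Point(2, 3), Point(7, 2)],
    crs="EPSG:4326",
)

# Spatial join: attach the containing district to each point
joined = gpd.sjoin(observations, districts, how="left", predicate="within")

# Aggregate the joined attributes per district with plain pandas
totals = joined.groupby("district")["value"].sum()
print(totals)
```

The join itself is spatial, while the aggregation step is ordinary pandas, which is the main convenience GeoPandas offers over lower-level libraries.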

The sample code below uses GeoPandas to read, analyze, and plot data, join it with another dataset, and export the result to a new ESRI Shapefile. Remember to install GeoPandas with pip first.

import geopandas as gpd

# Read a Shapefile
shp_path = 'my/favorite/path/to/shapefile.shp'
gdf = gpd.read_file(shp_path)

# Display the first few rows of the GeoDataFrame
print(gdf.head())

# Check the geometry type
print(gdf.geometry.type)

# Plot the GeoDataFrame
gdf.plot()

# Perform a spatial query
query = gdf.cx[-4.9:1.41, -4.62:41.9]  # cx[xmin:xmax, ymin:ymax]: select features intersecting the bounding box
print(query)

# Perform a spatial join
points_path = 'path/to/points.shp'
points = gpd.read_file(points_path)
join_result = gpd.sjoin(points, gdf, how='left', predicate='within')  # 'op' was renamed to 'predicate'

# Save the result as a new Shapefile
output_path = 'path/to/output.shp'
join_result.to_file(output_path)

Shapely

Shapely is a widely used Python library for geometric operations on planar (2D) geometries. It offers tools for creating, examining, and working with geometric shapes like points, lines, and polygons. Shapely is frequently used in conjunction with other geospatial libraries, such as GeoPandas, to manage the geometric components of geospatial data.


Some key features of the Shapely library include:

  • Geometric Objects: The basic geometric objects that Shapely defines include Point, LineString, LinearRing, Polygon, and MultiPolygon. These objects can be created and manipulated programmatically.
  • Geometry Properties: Geometry properties, such as area, length, centroid, boundary, and bounding box, can be read from any Shapely geometry. These properties offer information that is helpful for analysis and visualization.
  • Input and Output: Both the well-known text (WKT) and well-known binary (WKB) formats for reading and writing geometries are supported by Shapely. For reading and writing geometries from different geographic file formats, it also interfaces well with other libraries like Geopandas and Fiona.

Sample code below demonstrates the use of shapely library which creates geometries (points, lines and polygons), performs geometric operations (intersection, spatial relationships), and transforms. Remember to install shapely library using Python's pip.

from shapely.geometry import LineString, Polygon, Point

# Create a LineString geometry
line = LineString([(0, 0), (1, 1), (3, 2)])
print(line)

# Create a Polygon geometry
polygon = Polygon([(0, 0), (0, 1), (1, 1), (1, 0), (0, 0)])
print(polygon)

# Create a Point geometry
point = Point(2.0, 3.0)
print(point)

# Perform geometric operations
intersection = line.intersection(polygon)
print(intersection)

buffer = point.buffer(1.5)
print(buffer)

# Check spatial relationships
print(polygon.contains(point))
print(line.touches(polygon))

# Access geometry properties
print(polygon.area)
print(line.length)
print(list(point.coords))

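# Coordinate transformations: Shapely itself is CRS-agnostic, so pyproj supplies the projection.
# The source CRS (EPSG:3857) is an assumption made for this example.
from pyproj import Transformer
from shapely.ops import transform
to_wgs84 = Transformer.from_crs("EPSG:3857", "EPSG:4326", always_xy=True).transform
transformed_point = transform(to_wgs84, point)  # Transform to WGS84 coordinates
print(transformed_point)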

# Check if a geometry is valid
print(polygon.is_valid)

# Simplify a geometry
simplified_polygon = polygon.simplify(0.1)
print(simplified_polygon)

Geospatial analysis, data visualization, and computational geometry activities all make extensive use of Shapely. It offers an effective and simple interface for manipulating geometric objects in Python, and it's frequently combined with other libraries to carry out sophisticated geospatial studies.
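
The WKT/WKB support and the set-theoretic operations mentioned in this section can be sketched as follows, using two made-up overlapping squares.

```python
from shapely import wkb, wkt
from shapely.geometry import Polygon

# Two overlapping 2x2 squares with made-up coordinates
square = Polygon([(0, 0), (2, 0), (2, 2), (0, 2)])
other = Polygon([(1, 1), (3, 1), (3, 3), (1, 3)])

# Serialise to WKT and WKB, then parse both back into geometries
from_text = wkt.loads(square.wkt)
from_bytes = wkb.loads(square.wkb)

# Set-theoretic operations return new geometries; the inputs are unchanged
overlap = square.intersection(other)
merged = square.union(other)
remainder = square.difference(other)

print(overlap.area, merged.area, remainder.area)
```

Round-tripping through WKT or WKB is a common way to move geometries between Shapely and databases or other libraries.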

PyroSAR

The pyroSAR library offers a comprehensive solution for the management and processing of SAR (Synthetic Aperture Radar) satellite data, for applications scaling from desktop PCs to massive server infrastructures. SAR technology has a wide range of uses: it gives sea state and ice hazard maps to navigators, terrain structural information to geologists for mineral development, oil spill boundaries on water to environmentalists, and reconnaissance and targeting data to military and intelligence activities. Some of these uses, particularly the civilian ones, have not yet received enough attention, because SAR technology is only now becoming affordable for smaller-scale applications thanks to the decreasing cost of electronics.

Its main functions include:

  • Metadata: Handling metadata about acquisition characteristics in a database
  • Read/Write: Reading various data formats from previous and current SAR satellite missions
  • Processing: Providing homogenized user-friendly access to SAR satellite processing utilities such as Sentinel Application Platform (SNAP) of the European Space Agency
  • Data Formatting: Formatting of the preprocessed data for further analysis
  • Export: Export to Data Cube solutions

Rasterio

Rasterio is used to read and write geographic raster data. It supports a broad variety of raster formats, including GeoTIFF and GDAL VRT, exposes pixel data as NumPy arrays, and offers a high-level API that makes working with raster data simple.

Rasterio can be used for a variety of tasks, including:

  • Data Visualization: Visualizing and resampling raster datasets
  • Statistics: Calculating raster statistics
  • Read/Write: Reading and writing of raster datasets
  • Conversion: Converting between different raster data formats
  • Applying spatial operations on raster data

Code sample: Reading and writing raster data

import rasterio

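dataset = rasterio.open("sample_data.tif")
band1 = dataset.read(1)  # first band as a NumPy array

# Write the band to a new file, reusing the source profile (driver, size, CRS, transform)
profile = dataset.profile
profile.update(count=1)
with rasterio.open("sample_output.tif", "w", **profile) as dst:
    dst.write(band1, 1)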

Code sample: Conversion

import rasterio

# Open GeoTIFF file
input_file = 'my/path/to/input_raster.tif'
dataset = rasterio.open(input_file)

# Prepare output file info
out_file = 'my/path/to/output_raster.tif'
out_format = 'GTiff'  # Set to any supported format

# Copy the GeoTIFF data to a new file with the selected format
with rasterio.open(out_file, 'w', driver=out_format, width=dataset.width,
                   height=dataset.height, count=dataset.count,
                   dtype=dataset.dtypes[0], crs=dataset.crs,
                   transform=dataset.transform) as dst_dataset:
    for i in range(1, dataset.count + 1):
        dst_dataset.write(dataset.read(i), i)

# Close the datasets
dataset.close()

Code sample: Visualize raster data

from rasterio.plot import show
show(rasterio.open("sample_data.tif"))

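Code sample: Resampling raster data to a resolution of 10 meters (assumes the dataset's CRS uses meters)

from rasterio.enums import Resampling

# Resample band 1 to 10-unit pixels by reading it into a differently sized array
with rasterio.open("sample_data.tif") as src:
    out_shape = (round(src.height * src.res[1] / 10), round(src.width * src.res[0] / 10))
    resampled = src.read(1, out_shape=out_shape, resampling=Resampling.bilinear)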

Code sample: Calculate raster statistics

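band = rasterio.open("sample_data.tif").read(1)  # statistics via NumPy; nodata values are not masked here
min_val, max_val, mean, stddev = band.min(), band.max(), band.mean(), band.std()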

Rasterio is a robust and flexible tool for working with geospatial raster data. It is well documented, simple to use, and supported by a sizable user base.

PyQGIS

QGIS can be used in Python using the PyQGIS package. QGIS is a free desktop GIS program that enables users to view, examine, and work with geospatial data. You can use PyQGIS to create Python scripts and plugins to automate processes, increase QGIS functionality, and work with geographical data via the QGIS API. It gives users access to several GIS features, including importing and presenting spatial data, querying and modifying vector data, performing spatial analysis, building maps and layouts, and more. 

You may carry out geoprocessing operations with PyQGIS, work with various data formats, access and alter attribute data, execute spatial queries, make your own UI components, and include QGIS capabilities into your Python workflows or applications. The PyQt package, which offers Python bindings for the Qt framework, serves as the foundation for PyQGIS. By integrating QGIS with Python using PyQt, you can create robust geospatial apps and use Python scripting to automate GIS tasks.

A QGIS Desktop plugin that I developed in 2022 as part of a GIS software engineer interview is available on my GitHub.

Sample code: Creating new point layer and adding data in QGIS Desktop environment

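from qgis.core import (QgsFeature, QgsGeometry, QgsPointXY,
                       QgsProject, QgsVectorLayer)

# Create a new in-memory point layer with one string field
layer = QgsVectorLayer("Point?crs=EPSG:4326&field=name:string", "new_layer", "memory")

# Add a point feature to the layer through its data provider
feature = QgsFeature(layer.fields())
feature.setGeometry(QgsGeometry.fromPointXY(QgsPointXY(10, 10)))
feature.setAttributes(["test"])
layer.dataProvider().addFeatures([feature])

# Recalculate the layer extent after adding features
layer.updateExtents()

# Add the layer to the current project
QgsProject.instance().addMapLayer(layer)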

WhiteBox Tools

WhiteBox Tools is an open-source geospatial analysis and data processing package used mainly for terrain analysis and geomatics. It offers a comprehensive range of tools and methods for working with digital elevation models (DEMs) and carrying out various geospatial analysis tasks. WhiteBox Tools is written in Rust and grew out of the earlier Java-based Whitebox GAT project; it can be driven from Python through the whitebox package.


For working with raster and vector data, it provides a variety of features, such as DEM analysis, hydrological modeling, terrain analysis, image processing, and spatial statistics. Key features of WhiteBox Tools include:

  • Image Processing: WhiteBox provides functions for working with remote sensing imagery, including image enhancement, filtering, classification, and image segmentation.
  • Hydrological Analysis: The library provides resources for stream ordering, watershed delineation, stream network extraction, flow direction and accumulation, and hydrological modeling and analysis.
  • Spatial Statistics/Analysis: Numerous spatial analysis and statistical operations are available through WhiteBox Tools, including proximity analysis, spatial interpolation, point pattern analysis, and spatial autocorrelation.
  • DEM: Digital elevation models (DEM) can be processed and analyzed using a range of tools from WhiteBox Tools. Slope, aspect, curvature, hillshading, viewshed analysis, and terrain categorization are among the functions included in this.

In the disciplines of geomatics, terrain analysis, environmental modeling, and geospatial research, WhiteBox Tools is frequently utilized. It offers a complete range of tools and algorithms that make it easier to analyze and process geospatial data, especially when it comes to examination of the terrain and elevation.

Geemap

Google Earth Engine (GEE) is a cloud computing platform with a multi-petabyte catalog of satellite imagery and geospatial datasets, and Geemap is a Python package for interactive mapping with GEE. The geospatial community has embraced GEE in recent years, and it has facilitated a variety of environmental applications at the local, regional, and global levels. GEE offers JavaScript and Python APIs for communicating with the Earth Engine servers in order to run computations.


Compared with the GEE JavaScript API, which also provides an interactive IDE (the GEE JavaScript Code Editor), the GEE Python API has comparatively little documentation and fewer capabilities for interactively viewing results. Geemap was created to fill this gap. Key features of Geemap include:

  • Code Conversion: Supports conversion of GEE JavaScript code to Python and Jupyter notebooks.
  • Data Export: Export of GEE FeatureCollection to other vector formats and ImageCollection to GeoTIFF.
  • Rendering: Supports displaying data layers on the interactive map.
  • Image Processing/Analysis: Supports pixel-level data extraction, zonal statistics calculations, image classification and accuracy assessment.
  • Drawing: Its interface contains drawing tools.

Fiona

Fiona is a Python package that provides a simple and consistent interface for reading and writing spatial data through the GDAL/OGR library. It is designed to make it easy to read and write vector data in a variety of formats, including GeoJSON, Shapefile, and GeoPackage. It is well documented and supported by a sizable user base. Some functionalities supported by Fiona, in some cases together with companion libraries such as Shapely and PyProj, include:

  • Read/Write: Reading and writing spatial data
  • Data Conversion: Converting between different spatial data formats
  • Projection: Reprojecting spatial data
  • Spatial Statistics: Calculating statistics on spatial data
  • Spatial Operations: Applying spatial operations to spatial data
  • Visualization: Visualizing spatial data
  • Filtering and Querying: Fiona allows you to filter and query vector data based on attribute values or spatial relationships. You can use conditional statements and spatial predicates to extract specific features from a dataset.
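
The filtering and querying point can be sketched as below. The example writes a tiny GeoJSON file with made-up cities (the names, populations, and coordinates are hypothetical) and then selects features by attribute and by bounding box.

```python
import json
import os
import tempfile

import fiona

# Write a tiny GeoJSON file with made-up cities so the example is self-contained
collection = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature",
         "properties": {"name": name, "population": population},
         "geometry": {"type": "Point", "coordinates": [lon, lat]}}
        for name, population, lon, lat in [
            ("Alpha", 1200, 36.8, -1.3),
            ("Beta", 56000, 39.7, -4.0),
            ("Gamma", 310000, 34.8, -0.1),
        ]
    ],
}
path = os.path.join(tempfile.mkdtemp(), "cities.geojson")
with open(path, "w") as f:
    json.dump(collection, f)

with fiona.open(path) as src:
    # Attribute filter: a plain Python condition on each feature's properties
    large = [feat["properties"]["name"] for feat in src
             if feat["properties"]["population"] > 50000]
    # Spatial filter: bbox is (minx, miny, maxx, maxy)
    bbox_hits = [feat["properties"]["name"]
                 for feat in src.filter(bbox=(39.0, -5.0, 40.0, -3.0))]

print(large, bbox_hits)
```

Attribute filtering here is ordinary Python iteration, while the bounding-box filter is pushed down to the underlying OGR driver.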

Code sample: Reading spatial data

import fiona

features = fiona.open("data.json")

Code sample: Writing spatial data

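# Copy every feature into a new GeoJSON file with the same schema and CRS
with fiona.open("output.json", "w", driver="GeoJSON", schema=features.schema, crs=features.crs) as dst:
    dst.writerecords(features)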

Code sample: Conversion

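# Fiona has no one-call converter; copy the features using the target driver
with fiona.open("data.json") as src, fiona.open("output.shp", "w", driver="ESRI Shapefile", schema=src.schema, crs=src.crs) as dst:
    dst.writerecords(src)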

Fiona is a widely used library in the geospatial domain, and it integrates well with other Python libraries such as Shapely, GeoPandas, and PyProj. It provides a lightweight and efficient solution for working with geospatial vector data in Python.

Folium

Folium is used to build customized, interactive maps. It is built on top of Leaflet.js, a widely used JavaScript library for interactive maps. Folium offers a straightforward and user-friendly interface for adding different map elements and viewing geospatial data, allowing you to make maps right within your Python environment.


Folium supports:

  • Routing
  • Markers and popups
  • Choropleth maps
  • Interactive maps
  • GeoJSON

Code sample: Creating a simple map

import folium

# Create a map centered on San Francisco with an initial zoom level
m = folium.Map(location=[37.775, -122.418], zoom_start=10)

# Display the map
m

Code sample: Adding markers

import folium

m = folium.Map()

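# Add a marker for each city; coordinates are (latitude, longitude)
cities = {"San Francisco": (37.775, -122.418), "New York": (40.7128, -74.0060), "London": (51.5074, -0.1278)}
for name, coords in cities.items():
    folium.Marker(location=coords, popup=name).add_to(m)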

# Display the map
m

Code sample: Adding polygons to map

import folium

m = folium.Map()

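# Add polygons to the map; the vertex lists are illustrative (lat, lon) boxes, not real borders
regions = {
    "Region A": [(37.70, -122.52), (37.70, -122.35), (37.83, -122.35), (37.83, -122.52)],
    "Region B": [(37.85, -122.30), (37.85, -122.20), (37.90, -122.20), (37.90, -122.30)],
}
for name, vertices in regions.items():
    folium.Polygon(locations=vertices, fill=True, color="red", popup=name).add_to(m)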

# Display the map
m

Folium can combine data analysis and visualization with map construction because it works well alongside other Python libraries like Pandas, NumPy, and Matplotlib.

Conclusion


In this first of two posts, I have summarized some of the commonly used Python geospatial libraries, which I highly recommend looking at whether you are a beginner or intermediate GIS developer, a GIS startup enthusiast, or anyone who is interested in GIS programming. Some of the libraries mentioned are widely used by communities in the geospatial field, while others provide the building blocks of many GIS software packages and tools. Please stay tuned for my second article, which will cover additional Python GIS libraries. If you like this article, do comment or share it with someone who may benefit from it.

 

ABOUT ME
JosephKariuki

Hi, thanks for visiting! My name is Joseph Kariuki, a software developer and this is my website. I am consistently seeking solutions to various problems using coding. This site does not contain ads, trackers, affiliates, and sponsored posts.

DETAILS

Published: Feb. 17, 2023

TAGS
python gis
CATEGORIES
Fundamentals GIS Python