Geoconda
Geoconda is a collection of Python packages that facilitate the development of Python scripts for geoinformatics applications. It includes the following Python packages:
- boto3 - for working with files in S3 storage, for example Allas. Allas S3 example in CSC geocomputing Github.
- cartopy - for map plotting.
- cfgrib - map GRIB files to the NetCDF Common Data Model
- copc-lib - reader and writer interface for Cloud Optimized Point Clouds (COPC) NEW 2023
- dask - provides advanced parallelism for analytics, enabling performance at scale, including dask-geopandas, Dask-ML and Dask JupyterLab extension.
- descartes - use Shapely or GeoJSON-like geometric objects as matplotlib paths and patches.
- earthengine-api - Google Earth Engine API. NEW 2022
- fiona - reads and writes spatial data files.
- geoalchemy2 - provides extensions to SQLAlchemy for working with spatial databases, primarily PostGIS.
- geopandas - extends the datatypes used by pandas to allow spatial operations on geometric types.
- igraph - for fast routing. Routing examples in CSC geocomputing Github
- geopy - client for several popular geocoding web services. NEW 2023
- jupyter - Jupyter Notebooks and JupyterLab, best to use with Puhti web interface and Jupyter
- laspy - for reading, modifying, and creating .LAS LIDAR files.
- landsatlinks - download Landsat Collection 2 Level 1. NEW 2023
- leafmap - for geospatial analysis and interactive mapping in a Jupyter environment. NEW 2023
- lidar - for delineating the nested hierarchy of surface depressions in digital elevation models (DEMs).
- metpy - reading, visualizing, and performing calculations with weather data. NEW 2022
- movingpandas - for trajectory data
- networkx - for the creation, manipulation, and study of the structure, dynamics, and functions of complex networks. Routing examples in CSC geocomputing Github
- pyproj - performs cartographic transformations and geodetic computations.
- pyogrio - vectorized spatial vector file format I/O using GDAL/OGR. NEW 2022
- open3d - for 3D data processing, NEW 2023
- osmnx - download spatial geometries and construct, project, visualize, and analyze street networks from OpenStreetMap's APIs. Routing examples in CSC geocomputing Github
- owslib - for retrieving data from Open Geospatial Consortium (OGC) web services
- python-pdal - PDAL Python extension for lidar data
- Py6S - Python interface to the Second Simulation of the Satellite Signal in the Solar Spectrum (6S) atmospheric Radiative Transfer Model
- pysal - spatial analysis functions.
- pdal - for lidar data
- pyntcloud - for working with 3D point clouds. NEW 2022
- pystac-client - for working with STAC Catalogs and APIs. STAC example in CSC geocomputing Github. NEW 2022
- python-cdo - scripting interface to CDO (Climate Data Operators).
- rasterio - access to geospatial raster data.
- rasterstats - for summarizing geospatial raster datasets based on vector geometries. It includes functions for zonal statistics and interpolated point queries. rasterstats example in CSC geocomputing Github
- rtree - spatial indexing and search.
- sentinelhub - for working with new Sentinel Hub services. NEW 2023
- sentinelsat - downloading Sentinel images. sentinelsat example in CSC geocomputing Github (https://github.com/csc-training/geocomputing/tree/master/python/sentinel)
- shapely - manipulation and analysis of geometric objects in the Cartesian plane.
- scipy - incl. pandas, numpy, matplotlib etc.
- scikit-learn - machine learning for Python. Spatial machine learning scikit-learn (shallow learning) exercises
- skimage - algorithms for image processing.
- stackstac - STAC data to xarray, STAC example in CSC geocomputing Github.
- swiftclient, keystoneclient - for working with SWIFT storage, for example Allas. Allas Swift example in CSC geocomputing Github.
- whiteboxtools - wide-scope processing of geospatial data; many tools operate in parallel. See the CSC whiteboxtools page for details. NEW 2023
- xarray - for multidimensional raster data, incl. rioxarray. STAC example in CSC geocomputing Github.
- xarray-spatial - efficient common raster analysis functions for xarray. NEW 2022
- xarray_leaflet - xarray extension for tiled map plotting. NEW 2023
- And many more. To retrieve the full list in Puhti, use:
list-packages
Additionally, geoconda includes:
- spyder - Scientific Python Development Environment with graphical interface (similar to RStudio for R).
- GDAL/OGR commandline tools
- GMT The Generic Mapping Tools
- landsatlinks - for creating download URLs for Landsat Collection 2 Level 1 product bundles using the USGS/EROS Machine-to-Machine API. Use python3.10 -m landsatlinks. NEW 2023
- PDAL Point Data Abstraction Library
- ncview for visualizing netcdf files
Python has multiple packages for parallel computing, for example multiprocessing, joblib and dask. Our Puhti Python examples show how to use these parallelisation libraries.
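As a minimal sketch of the multiprocessing approach, the block below applies one function to a list of tiles in a process pool. The function names (tile_mean, allocated_cpus, parallel_tile_means) and the toy tile data are hypothetical and only for illustration. Reading the core count from the SLURM_CPUS_PER_TASK environment variable is an assumption about how the batch job was submitted; outside such a job the sketch falls back to a single core.

```python
import os
from multiprocessing import Pool


def tile_mean(values):
    """Return the mean of one tile's cell values (stand-in for real raster work)."""
    return sum(values) / len(values)


def allocated_cpus(default=1):
    """Read the core count reserved for the job; use `default` if the variable is unset.

    Assumption: the job was submitted with --cpus-per-task, so Slurm sets
    SLURM_CPUS_PER_TASK.
    """
    return int(os.environ.get("SLURM_CPUS_PER_TASK", default))


def parallel_tile_means(tiles, processes=None):
    """Apply tile_mean to every tile, in a process pool when several cores are available."""
    if processes is None:
        processes = allocated_cpus()
    if processes <= 1:
        # Serial path: no pool overhead and easier to debug.
        return [tile_mean(tile) for tile in tiles]
    with Pool(processes=processes) as pool:
        # Pool.map returns results in the same order as the input.
        return pool.map(tile_mean, tiles)
```

A call such as parallel_tile_means(tiles, processes=4) would typically sit inside an if __name__ == "__main__": guard, which keeps the script portable to platforms that spawn rather than fork worker processes. joblib and dask follow the same pattern of mapping a function over independent work items.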
If you think that an important GIS package for Python is missing from this list, you can request its installation from servicedesk@csc.fi.
Available
The geoconda module is available:
- 3.10.9 (Python 3.10.9, PDAL 2.5.2, GDAL 3.6.2, created March 2023), in Puhti.
- 3.10.6 (Python 3.10.6, PDAL 2.4.1, GDAL 3.5.0, created September 2022), in Puhti and Mahti.
The version number is the same as the Python version.
Usage
To use the Python packages and other tools listed above, load the module with:
module load geoconda
By default the latest geoconda module is loaded. If you want a specific version you can specify the version number of geoconda:
module load geoconda/[VERSION]
To check the exact packages and versions included in the loaded module:
list-packages
You can add more Python packages to geoconda; see the instructions on the CSC Python page.
You can edit your Python code in Puhti with:
- Visual Studio Code in Puhti web interface,
- Visual Studio Code on your local laptop,
- Jupyter Notebook or Lab in Puhti web interface or
- Spyder in Puhti web interface with remote desktop.
To open Spyder in Puhti web interface with remote desktop:
- Log in to Puhti web interface.
- Open Remote desktop: Apps -> Desktop.
- After launching the remote desktop, open Terminal (Desktop icon) and start Spyder:
module load geoconda
spyder
Using Allas from Python
Two Python libraries installed in Geoconda can interact with Allas: swiftclient uses the Swift protocol and boto3 uses the S3 protocol. CSC examples of how to use both are available in the CSC geocomputing Github.
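The S3 route with boto3 could look roughly like the sketch below. It is not the CSC example itself: the endpoint URL https://a3s.fi is an assumption that does not come from this page, the helper names are hypothetical, and S3 credentials are assumed to be already configured where boto3 looks for them (for example in ~/.aws/credentials).

```python
ALLAS_ENDPOINT = "https://a3s.fi"  # assumption: Allas S3 endpoint, not stated on this page


def make_allas_client(endpoint_url=ALLAS_ENDPOINT):
    """Create a boto3 S3 client that talks to a non-AWS, S3-compatible endpoint."""
    import boto3  # imported here so the helpers below work with any S3-like client

    return boto3.client("s3", endpoint_url=endpoint_url)


def list_bucket_names(client):
    """Return the names of all buckets visible to the configured credentials."""
    return [bucket["Name"] for bucket in client.list_buckets()["Buckets"]]


def list_object_keys(client, bucket, prefix=""):
    """Return object keys in a bucket (first page only, at most 1000 keys)."""
    response = client.list_objects_v2(Bucket=bucket, Prefix=prefix)
    # "Contents" is absent from the response when nothing matches.
    return [obj["Key"] for obj in response.get("Contents", [])]
```

With a client from make_allas_client(), list_bucket_names(client) lists the buckets, and client.download_file(bucket, key, local_path) or client.upload_file(local_path, bucket, key) moves single files. The bucket, key and path names here are placeholders.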
It is also possible to read and write files from and to Allas or other cloud object storage directly with GDAL-based packages such as geopandas and rasterio. Please check our Using geospatial files directly from cloud, inc Allas tutorial for instructions and examples.
With large quantities of data in Allas, consider using virtual rasters.
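As a rough sketch of direct reading with a GDAL-based package, the block below builds a GDAL /vsis3/ path and opens it with rasterio. The host name a3s.fi is an assumption that does not come from this page, the helper names are hypothetical, and S3 credentials are assumed to be available to GDAL (for example in ~/.aws/credentials). The linked tutorial remains the authoritative source for the actual setup.

```python
import os

ALLAS_S3_HOST = "a3s.fi"  # assumption: Allas S3 host name, not stated on this page


def configure_gdal_for_allas(host=ALLAS_S3_HOST):
    """Point GDAL's /vsis3/ file system at a non-AWS endpoint via AWS_S3_ENDPOINT."""
    os.environ["AWS_S3_ENDPOINT"] = host


def vsis3_path(bucket, key):
    """Build the GDAL virtual file system path for an object in S3-compatible storage."""
    return f"/vsis3/{bucket}/{key.lstrip('/')}"


def raster_summary(path):
    """Open a raster through GDAL and return basic metadata without reading pixel data."""
    import rasterio  # imported here so the path helpers work without rasterio

    with rasterio.open(path) as src:
        return {
            "width": src.width,
            "height": src.height,
            "bands": src.count,
            "crs": src.crs,
        }
```

After configure_gdal_for_allas(), a call such as raster_summary(vsis3_path("my-bucket", "dem/tile.tif")) reads only the file header over the network. The bucket and key are placeholders. The same /vsis3/ path can be passed to geopandas.read_file for vector data.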
License
All packages are licensed under various free and open source licenses (FOSS), see the linked pages above for exact details.
Citation
Please see the above linked package pages for citation information per package.
Acknowledgement
Please acknowledge CSC and Geoportti in your publications; it is important for project continuation and funding reports. As an example, you can write "The authors wish to thank CSC - IT Center for Science, Finland (urn:nbn:fi:research-infras-2016072531) and the Open Geospatial Information Infrastructure for Research (Geoportti, urn:nbn:fi:research-infras-2016072513) for computational resources and support".
Installation
Geoconda was installed to Puhti and Mahti using Tykky's conda-containerize functionality. The WhiteboxTools conda package installs only the WhiteboxTools installer, so a proper installation of WhiteboxTools required an additional post-installation command and an additional folder to wrap its command-line tools.
conda-containerize new --mamba --prefix install_dir --post download_wbt -w miniconda/envs/env1/lib/python3.10/site-packages/whitebox/WBT/whitebox_tools geoconda_3.10.9.yml
Geoconda conda environment files and the download_wbt and start_wbt.py files needed for WhiteboxTools are available in CSC's geocomputing repository. Note that for reproducibility, you'll need to define the package versions in the environment file; they can be checked on Puhti and Mahti with the list-packages command after loading the geoconda module.
References
- CSC Python parallelisation examples
- Python spatial libraries
- Geoprocessing with Python using Open Source GIS
- GeoExamples, a lot of examples of using Python for spatial analysis
- Automating GIS processes course materials, where most of the exercises are done using Python (University of Helsinki)
- Geohack Week materials
- Multiprocessing Basics
- Geographic Data Science with Python
- Aalto Spatial Analytics course material