Improve WDPA data download / access #8

@sjpfenninger

Description

@irm-codebase in PR #1:

@sjpfenninger while digging around in the integrated module workflow, I found a way to avoid forcing users to download the WDPA dataset on their own.

The following curl command should always download the latest version of that dataset:

# Full world polygons to GeoJSON
#   where=1=1   -> ask for all records
#   outFields=* -> ask for all columns
#   outSR=4326  -> WGS84 (can be another EPSG)
#   f=geojson   -> GeoJSON, convert to GeoParquet later
curl -G "https://data-gis.unep-wcmc.org/server/rest/services/ProtectedSites/The_World_Database_of_Protected_Areas/FeatureServer/1/query" \
     --data-urlencode "where=1=1" \
     --data-urlencode "outFields=*" \
     --data-urlencode "outSR=4326" \
     --data-urlencode "f=geojson" \
     -o wdpa_poly_latest.geojson

This results in a ~180 MB download that completes rather quickly.
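If the download should happen inside the workflow rather than via shell, the same query can be issued with Python's standard library. A minimal sketch with the same parameters as the curl command above (the function names and constants are mine, not part of any existing module):

```python
import urllib.parse
import urllib.request

# Layer 1 = full world polygons (same endpoint as the curl command)
WDPA_LAYER_URL = (
    "https://data-gis.unep-wcmc.org/server/rest/services/ProtectedSites/"
    "The_World_Database_of_Protected_Areas/FeatureServer/1/query"
)


def build_query_url(out_sr: int = 4326, fmt: str = "geojson") -> str:
    """Build the same FeatureServer query the curl command issues."""
    params = {
        "where": "1=1",        # ask for all records
        "outFields": "*",      # ask for all columns
        "outSR": str(out_sr),  # WGS84 by default
        "f": fmt,              # GeoJSON, convert to GeoParquet later
    }
    return WDPA_LAYER_URL + "?" + urllib.parse.urlencode(params)


def download(path: str = "wdpa_poly_latest.geojson") -> None:
    """Fetch the layer and write it to disk (requires network access)."""
    urllib.request.urlretrieve(build_query_url(), path)


if __name__ == "__main__":
    download()
```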

An alternative is the ogr2ogr command from GDAL (not tested), which can write GeoParquet directly:

# Polygons (layer 1) -> GeoParquet
ogr2ogr -f Parquet wdpa_poly_latest.parquet \
  "https://data-gis.unep-wcmc.org/server/rest/services/ProtectedSites/The_World_Database_of_Protected_Areas/FeatureServer/1" \
  -t_srs EPSG:4326 \
  -lco COMPRESSION=SNAPPY -lco GEOMETRY_ENCODING=WKB

This may warrant further investigation: the API appears to cap each request at 2000 records. See https://data-gis.unep-wcmc.org/server/rest/services/ProtectedSites/The_World_Database_of_Protected_Areas/FeatureServer, which states "MaxRecordCount: 2000". That would explain why the download here is ~10x smaller than the manual download of the full dataset.
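If the 2000-record cap is the culprit, ArcGIS FeatureServer layers that support pagination accept `resultOffset` and `resultRecordCount`, so the full dataset could be fetched page by page. A hedged sketch, assuming this server supports pagination (untested against it; the helper names are mine):

```python
import json
import urllib.parse
import urllib.request

QUERY_URL = (
    "https://data-gis.unep-wcmc.org/server/rest/services/ProtectedSites/"
    "The_World_Database_of_Protected_Areas/FeatureServer/1/query"
)
PAGE_SIZE = 2000  # matches the advertised MaxRecordCount


def page_params(offset: int, page_size: int = PAGE_SIZE) -> dict:
    """Query parameters for one page of results."""
    return {
        "where": "1=1",
        "outFields": "*",
        "outSR": "4326",
        "f": "geojson",
        "resultOffset": str(offset),
        "resultRecordCount": str(page_size),
    }


def fetch_all_features() -> list:
    """Page through the layer until a short page signals the end."""
    features = []
    offset = 0
    while True:
        url = QUERY_URL + "?" + urllib.parse.urlencode(page_params(offset))
        with urllib.request.urlopen(url) as resp:
            page = json.load(resp)["features"]
        features.extend(page)
        if len(page) < PAGE_SIZE:  # last (possibly empty) page
            return features
        offset += PAGE_SIZE
```

The resulting feature list could then be written out and converted to GeoParquet in a later step.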
