ramp v1.0.0
DOCUMENTATION


IMAGERY

Overview

Acquisition of high-resolution satellite imagery can often be one of the most significant barriers to deployment of remote sensing machine learning models. This section is intended to assist users of the ramp model in the process of identifying and acquiring the required imagery for model training and deployment. In addition to providing a list of resources utilized during the development and testing of the ramp model and basic imagery requirements, the section includes information regarding potential imagery pipelines and methods for partnering with public and private sector entities to facilitate imagery access.

 

Imagery Requirements

The ramp model architecture is designed to work on RGB satellite imagery strips and mosaics with resolutions of 50cm or better. The model can also use aerial imagery that has been resampled to 30cm (see the Open Cities resource). Within these specifications there is often considerable variation in image quality related to atmospheric distortion, cloud cover, sun azimuth, and off-nadir angles. Users should review imagery before using it in training and deployment to ensure building rooftops are clearly visible to human analysts. When multiple images over a single area of interest (AOI) are available, users should choose those with the best overall combination of minimal cloud cover, low off-nadir angle, and least obstructive shadows. For some use cases, such as emergency response, it may be critical that the imagery used for deployment is as current as possible. Users should attempt to find an acceptable balance between image currency and quality to ensure the outputs are accurate both temporally and spatially.
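The selection criteria above can be sketched as a simple ranking over candidate image metadata. This is a minimal illustration, not part of the ramp tooling: the `Candidate` fields, the penalty weights, and the function name are all assumptions, and shadow obstruction is left out because it usually has to be judged visually.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Candidate:
    """Hypothetical metadata record for one candidate image."""
    name: str
    gsd_cm: float            # ground sample distance in centimetres
    cloud_cover_pct: float   # 0-100
    off_nadir_deg: float     # 0 = looking straight down
    acquired: date


def rank_candidates(candidates, today, max_gsd_cm=50.0,
                    w_cloud=1.0, w_nadir=1.0, w_age=0.1):
    """Drop images coarser than max_gsd_cm, then sort best-first.

    The penalty is a weighted sum of cloud cover, off-nadir angle and
    age in months. The weights are arbitrary placeholders and should be
    tuned per use case (e.g. raise w_age for emergency response).
    """
    usable = [c for c in candidates if c.gsd_cm <= max_gsd_cm]

    def penalty(c):
        age_months = (today - c.acquired).days / 30.0
        return (w_cloud * c.cloud_cover_pct
                + w_nadir * c.off_nadir_deg
                + w_age * age_months)

    return sorted(usable, key=penalty)


candidates = [
    Candidate("older_clear", 30, 2, 10, date(2023, 1, 1)),
    Candidate("recent_cloudy", 50, 20, 25, date(2024, 1, 1)),
    Candidate("too_coarse", 80, 0, 5, date(2024, 5, 1)),
]
ranked = rank_candidates(candidates, today=date(2024, 6, 1))
print([c.name for c in ranked])
```

A ranking like this only narrows the list; the visual review of rooftop visibility described above is still needed before an image is used.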

 

Imagery Workflow

To successfully deploy the ramp machine learning framework, satellite imagery is required in two critical stages:

  1. Creation of Localization Training Data: To improve performance of the baseline model within a specific region of interest, it is strongly recommended that users localize the provided baseline model with a labeled dataset created from imagery over the target region or over a geographically similar region. Localization of the baseline model requires labeled data pairs consisting of 256×256 pixel satellite imagery tiles and their respective building footprint vectors (geojson files).

  2. Footprint Generation: Production deployment of the model at scale for building footprint generation requires satellite imagery coverage for the region of interest. 

 

The imagery acquisition process for each of these stages is covered below, including information on imagery requirements, selection of imagery, and open imagery resources.
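As a rough illustration of the 256×256 tile size named in stage 1, the sketch below lists the pixel windows of the full tiles that fit in an image and the ground footprint of one tile at a given resolution. The function names are hypothetical and the code does not read imagery; it only shows the arithmetic.

```python
def tile_windows(width, height, tile=256):
    """Yield (col_off, row_off, tile, tile) for every full tile.

    Partial tiles at the right and bottom edges are skipped.
    """
    for row_off in range(0, height - tile + 1, tile):
        for col_off in range(0, width - tile + 1, tile):
            yield (col_off, row_off, tile, tile)


def tile_footprint_m(gsd_cm, tile=256):
    """Side length in metres of one square tile at the given resolution."""
    return tile * gsd_cm / 100.0


windows = list(tile_windows(1024, 600))
print(len(windows))           # number of full tiles in a 1024x600 image
print(tile_footprint_m(50))   # tile side length at 50cm resolution
print(tile_footprint_m(30))   # tile side length at 30cm resolution
```

At 50cm a tile spans 128 m on a side, so the number of training tiles a region can yield follows directly from its extent.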

Imagery Sources

Maxar Open Data Program

License: CC BY-NC 4.0

The Maxar Open Data Program (ODP) releases selections of high-res imagery to support response to crises and disasters globally. The ODP has released 1,834,152 sq km of imagery around the world since 2017. ODP imagery is mainly released to support response to sudden-onset disasters such as floods, earthquakes, and tornadoes. Because of this, each crisis event will typically have both “pre-event” and “post-event” imagery. The pre-event imagery is typically of good quality, at the resolution required for ramp, and largely free of cloud cover.

The ramp project relies primarily on the Maxar ODP imagery for creation of localization training datasets.

Machine learning applications need a large quantity of training data. The Maxar ODP has released one of the largest collections of high-res imagery openly available, and the imagery generally meets the requirements of the ramp model. There are no existing building labels associated with the ODP imagery, but it can be used for new label creation, and for inference runs if imagery over the region of interest has been released.

By necessity, temporal accuracy is prioritized for ODP imagery. This results in a wide range of image quality across the released imagery, so imagery selected for labeling should be reviewed carefully to ensure it meets the necessary requirements. The releases are also limited to their respective crisis regions; global coverage is not available, which limits their use for inference outside crisis areas.


  • Strengths
    • Large quantity of open-source imagery available
    • All imagery is high-res, 30-60cm
    • Global geographic diversity available
    • Events triggering release will likely benefit from ramp model inference
    • CC BY-NC 4.0 licensing supports the public release of labeled training data
  • Weaknesses
    • Temporal accuracy may be prioritized over image quality
    • End users have no control over what areas are released and in what quantity.

Mapbox

License: CC BY 3.0 US

Mapbox is an online mapping platform primarily designed for developers who wish to work with geospatial data. Mapbox imagery can be accessed through the Mapbox API, which offers global satellite imagery coverage, much of it at 30-50cm resolution. The imagery on Mapbox is mosaic imagery: individual image strips that have been stitched together to form a continuous basemap. Mosaics have a consistent pixel size and have been color balanced, atmospherically corrected, and cloud-patched. They are often used as inputs to satellite-based ML/AI workflows because of the static pixel size and minimal or no cloud cover. Mapbox recently completed a large acquisition of imagery from Maxar, further improving its global coverage, currency, and resolution.

The Mapbox imagery layer can be used in the OpenStreetMap (OSM) platform for labeling and delineating various features. This makes it an excellent candidate for training data, since labeling can be done within the OSM web platform and the labels then tiled with the Mapbox imagery to generate labeled pairs. Alternatively, provided the existing labels are of sufficient quality and were created using the Mapbox layer, labeled pairs can be tiled from OSM and Mapbox immediately. Tiling of the OSM labels with Mapbox imagery can be accomplished using Label Maker, an open source tool developed by Development Seed. This tool can be accessed on the Label Maker GitHub page.
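Web map imagery is served as tiles in the Web Mercator tiling scheme, so it can help to know which zoom level nominally reaches the 50cm requirement and which tile covers a given point. The sketch below uses the standard slippy-map formulas; the 256-pixel tile size is an assumption (some services serve 512-pixel tiles), and no Mapbox API call is made or implied.

```python
import math

EARTH_CIRCUMFERENCE_M = 40075016.686  # WGS84 equatorial circumference


def ground_resolution_m(lat_deg, zoom, tile_size=256):
    """Nominal metres per pixel of a Web Mercator tile at this latitude."""
    return (EARTH_CIRCUMFERENCE_M * math.cos(math.radians(lat_deg))
            / (tile_size * 2 ** zoom))


def min_zoom_for_resolution(lat_deg, target_m=0.5, max_zoom=22):
    """Lowest zoom whose nominal pixel size is at or below target_m."""
    for zoom in range(max_zoom + 1):
        if ground_resolution_m(lat_deg, zoom) <= target_m:
            return zoom
    return None


def lonlat_to_tile(lon_deg, lat_deg, zoom):
    """Return the (x, y) index of the tile containing the point."""
    n = 2 ** zoom
    x = int((lon_deg + 180.0) / 360.0 * n)
    lat = math.radians(lat_deg)
    y = int((1.0 - math.asinh(math.tan(lat)) / math.pi) / 2.0 * n)
    # Clamp so points on the eastern or southern edge stay in range.
    return min(x, n - 1), min(y, n - 1)


print(min_zoom_for_resolution(0.0))     # zoom needed at the equator
print(min_zoom_for_resolution(60.0))    # zoom needed at 60 degrees north
print(lonlat_to_tile(0.0, 0.0, 1))
```

The zoom level only fixes the pixel grid. The underlying source imagery may be coarser than the nominal pixel size, so tiles should still be checked visually.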

With its convenient integrations and the open-source nature of the related tools, Mapbox imagery is generally a good candidate for training data and large-scale footprint generation. However, there are several considerations that may impact usability.

  • The existing OSM building labels rarely align well with the Mapbox imagery layer. These buildings generally appear to have been collected using the higher-quality and more recent Maxar imagery layers. If the Mapbox imagery is to be used, new labeling will generally have to be performed using the Mapbox imagery layer.
  • The imagery quality varies widely, but over Low- and Middle-Income Countries (LMICs) it tends to be poorer, making labeling difficult. “Poor” image quality in this case refers mainly to lower resolution (50cm and above) and high off-nadir angles, which cause building occlusion, particularly in urban centers.
  • The Mapbox/OSM combination can be an extremely valuable resource in localization labeling and footprint generation for users where imagery and software resources are limited.
  • While the Mapbox imagery is free and open source, the Mapbox imagery license (CC BY 3.0 US) should be carefully reviewed to ensure compliance with usage and sharing limitations.
  • Strengths
    • Easily accessed and downloaded via API
    • Generally high-resolution
    • Good global coverage of imagery mosaics
    • Can be labeled using OSM platform
  • Weaknesses
    • Image quality suffers over LMICs
    • Existing OSM labels rarely align with Mapbox imagery layer

SpaceNet

License: CC BY-SA 4.0

SpaceNet consists of a series of open challenges applying machine learning to satellite imagery. SpaceNet challenges 1 and 2 focused on building detection and were published with a collection of labeled training imagery. The source imagery included is high resolution RGB from DigitalGlobe (now Maxar) and covers several distinct geographic regions. The pre-labeled imagery within the SpaceNet 2 challenge dataset includes imagery from Shanghai China, Las Vegas USA, Paris France, and Khartoum Sudan.

The pre-labeled SpaceNet image tiles are a good option for quickly localizing the ramp baseline model if the target AOI is geographically similar to one of the regions included in the SpaceNet dataset. However, several aspects of the dataset should be carefully considered, since they affect how it can be used in a ramp model deployment.

  • The SpaceNet image tiles are 650×650 pixels and will need to be re-tiled to 256×256 pixels to work with the current ramp training architecture.
  • It appears that, to increase the difficulty of the challenge, the imagery has been chopped and sliced into odd shapes. The resulting sporadic empty areas around image tiles must be dealt with prior to training.
  • While the resolution was generally 50cm, the quality of that imagery is often negatively impacted by atmospheric distortion, shadows, and off-nadir angles.
  • The label alignment did not always meet the horizontal accuracy required for the ramp model. To achieve high accuracy when using these labels for localization, some rework is required. (Note: The ramp team reworked a number of these labels and will be providing the updated labels as a resource.)
  • Labels were often aligned with the base of buildings rather than the rooftop. While this collection method is common practice in content management, the ramp model is designed to detect rooflines.
  • Strengths
    • Free and open-source collection of imagery with building labels
    • Generally high resolution
    • Offers geographic diversity which can support several localization types.
  • Weaknesses
    • Image quality varies, and in some regions rooftops are difficult to delineate.
    • Existing labels do not always meet accuracy or collection requirements for localization.
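The re-tiling and empty-area issues noted above can be sketched as follows. Since 650 is not a multiple of 256, two full tiles per axis would leave a 138-pixel strip unused, so this sketch uses three evenly spaced, overlapping tiles per axis and drops tiles that are mostly empty. It is an illustration under assumptions, not the ramp team's method: the image is a plain nested list of RGB tuples, empty areas are assumed to be pure black, and the 25% threshold is arbitrary.

```python
def cover_offsets(size, tile=256):
    """Offsets of the fewest tiles that cover `size` pixels, evenly spaced."""
    if size < tile:
        raise ValueError("image is smaller than one tile")
    count = -(-size // tile)          # ceiling division
    if count == 1:
        return [0]
    last = size - tile                # offset of the final tile
    return [round(i * last / (count - 1)) for i in range(count)]


def nodata_fraction(chip, nodata=(0, 0, 0)):
    """Share of pixels in the chip equal to the assumed nodata value."""
    pixels = [px for row in chip for px in row]
    return sum(px == nodata for px in pixels) / len(pixels)


def retile(image, tile=256, max_nodata=0.25):
    """Yield (col_off, row_off, chip) for tiles with enough valid pixels."""
    height, width = len(image), len(image[0])
    for row_off in cover_offsets(height, tile):
        for col_off in cover_offsets(width, tile):
            chip = [row[col_off:col_off + tile]
                    for row in image[row_off:row_off + tile]]
            if nodata_fraction(chip) <= max_nodata:
                yield (col_off, row_off, chip)


print(cover_offsets(650))   # three overlapping tiles per axis

# Synthetic 650x650 image whose left half is empty.
image = [[(0, 0, 0)] * 325 + [(10, 10, 10)] * 325 for _ in range(650)]
kept = [(c, r) for c, r, _ in retile(image)]
print(kept)
```

Because the tiles overlap, the same pixels appear in neighbouring chips; the building labels need to be clipped to the same windows, and overlapping chips are best kept on the same side of any train/validation split.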

Open Cities

License: CC-BY-4.0

The Open Cities dataset was developed as part of a challenge to segment building footprints from drone imagery. The goal of the challenge was to accelerate the development of more accurate, relevant, and usable open-source AI models to support mapping for disaster risk management in African cities. The data consists of ~3cm drone imagery from 10 different cities and regions across Africa.

Open Cities drone imagery is paired with accurate building labels, making it an excellent candidate for inclusion in a localization training dataset for deployment in African countries. Since the ramp project utilizes 30-50cm satellite imagery, the Open Cities drone imagery should be resampled to 30cm prior to inclusion in training datasets.

The Open Cities imagery does contain quite a few artifacts such as ghosted/double-exposed areas, no data “holes”, and some warping. Much of the double exposure and warping is limited to the outer bounds of the imagery and can be excluded from the final training set. Many of the artifacts were minimized after resampling the 3cm to 30cm. The labels are generally high quality but still contain some shifts and missed buildings, which necessitates review and update prior to inclusion in a training dataset.
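Resampling from 3cm to 30cm is a factor-of-10 reduction per axis, so each output pixel summarizes a 10×10 block of input pixels. The sketch below shows that idea as block averaging on a single band held in nested lists. The source does not state which resampling method the ramp team used, so averaging is an assumption here; in practice a geospatial library would do this while also updating the georeferencing.

```python
def block_average(band, factor=10):
    """Downsample one band by averaging factor x factor pixel blocks.

    Rows and columns that do not fill a whole block are discarded.
    """
    rows = len(band) // factor
    cols = len(band[0]) // factor
    out = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            block = [band[r * factor + i][c * factor + j]
                     for i in range(factor)
                     for j in range(factor)]
            out_row.append(sum(block) / len(block))
        out.append(out_row)
    return out


# 20x20 band made of four constant 10x10 blocks with values 0, 1, 2, 3.
band = [[(r // 10) * 2 + (c // 10) for c in range(20)] for r in range(20)]
small = block_average(band, factor=10)
print(small)
```

Averaging over blocks also smooths small artifacts, which is consistent with the observation above that many artifacts were reduced after resampling.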

  • Strengths
    • Very high resolution, clear imagery
    • High accuracy building labels
    • Good geographic diversity across urban centers in LMICs
  • Weaknesses
    • Some image artifacts
    • Regions are relatively small and the number of training tiles that can be generated from them is limited.

Imagery Acquisition Considerations

While the ramp project was able to simply select a region from the ODP that has already been released for training data creation and model deployment, often in real-world scenarios the region of interest is not available through ODP. Sometimes the imagery must be acquired from a different open source or purchased through a commercial provider. Even after the imagery has been acquired there may not be time or personnel resources available to create new labels. To address these issues, there are several different methods which can be utilized depending on the scenario.

  1. If imagery for the region of interest has been acquired and resources allow, new labels should be created to fine-tune the model over the region of interest.

  2. If imagery for the region of interest has been acquired or is available through ODP but there is no time and/or no resources for new labeling, existing labeled datasets should be reviewed to look for sets which are geographically similar to the region of interest. This review should include the released baseline labels, OSM labels, and alternative building datasets if they meet the accuracy requirements for the ramp model. The model can be fine-tuned using these similar labels and then deployed over imagery for the region of interest. If no geographically similar labels can be found, the baseline model can be deployed for inference, but decreased performance should be expected.

  3. If no imagery has been acquired for the region of interest, the imagery is not available through ODP, and commercial purchase is not an option, alternative sources should be reviewed. New building labels can be created over the region of interest using the OSM platform and the Mapbox imagery layer. These labels can then be tiled and downloaded for fine-tuning using the Mapbox tiling API and the open-source tool Label Maker. Unfortunately, no date metadata is available for this imagery, so currency may be an issue for many applications.
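The three scenarios can be read as a small decision procedure, sketched below. The function, its parameters, and the `Path` names are hypothetical. Two points are interpretations rather than statements from the text: imagery available through ODP is treated the same as imagery already acquired when labeling resources exist, and a separate branch is added for the case where commercial purchase is possible.

```python
from enum import Enum


class Path(Enum):
    NEW_LABELS = "create new labels and fine-tune"
    SIMILAR_LABELS = "fine-tune on geographically similar labels"
    BASELINE_ONLY = "deploy baseline model, expect decreased performance"
    PURCHASE = "purchase commercial imagery, then re-evaluate"
    OSM_MAPBOX = "label in OSM over the Mapbox layer and tile the result"


def choose_path(imagery_in_hand, in_odp, can_label,
                similar_labels, can_purchase):
    """Map the situation to one of the approaches described above."""
    if imagery_in_hand or in_odp:
        if can_label:
            return Path.NEW_LABELS            # scenario 1
        if similar_labels:
            return Path.SIMILAR_LABELS        # scenario 2
        return Path.BASELINE_ONLY             # scenario 2, fallback
    if can_purchase:
        return Path.PURCHASE                  # interpretation, see lead-in
    return Path.OSM_MAPBOX                    # scenario 3


choice = choose_path(imagery_in_hand=False, in_odp=True, can_label=False,
                     similar_labels=True, can_purchase=False)
print(choice.value)
```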

Partnerships

Sometimes the event that triggers an ODP imagery release is the same event to which the ramp model is to be applied. In these fortuitous circumstances, imagery can simply be downloaded from the ODP for inference. Occasionally, requests for an ODP release over a region of interest can also be made in partnership with larger governmental entities or NGOs.

Many larger governmental entities and NGOs already maintain subscriptions to commercial imagery services such as Maxar SecureWatch or Planet imagery. If the end user or organization is resource-constrained and cannot purchase commercial imagery, a partnership with one of these larger entities may provide the necessary imagery in exchange for access to the generated footprints.

Alternative Sources

The International Charter is composed of space agencies and space system operators from around the world who work together to provide satellite imagery for disaster monitoring purposes, much like the ODP.

End users of the ramp model can partner with Disaster Charter Authorised Users to gain access to imagery or register as Authorised Users to request imagery releases if they meet the necessary requirements.

Aerial imagery is also an option for smaller local deployments of the model. Local partners may be available to capture drone imagery for the region of interest, which can then be resampled to work with the existing ramp tools or left at full resolution if the ramp tools are extended to support it.