Acquisition of high-resolution satellite imagery can often be one of the most significant barriers to deployment of remote sensing machine learning models. This section is intended to assist users of the ramp model in the process of identifying and acquiring the required imagery for model training and deployment. In addition to providing a list of resources utilized during the development and testing of the ramp model and basic imagery requirements, the section includes information regarding potential imagery pipelines and methods for partnering with public and private sector entities to facilitate imagery access.
The ramp model architecture is designed to work on RGB satellite imagery strips and mosaics with resolutions of 50cm or better. The model can also use aerial imagery that has been resampled to 30cm (see the Open Cities resource). Within these specifications there is often considerable variation in image quality related to atmospheric distortion, cloud cover, sun azimuth, and off-nadir angles. Users should review imagery before using it for training and deployment to ensure building rooftops are clearly visible to human analysts. When multiple images over a single AOI are available, users should choose those with the best overall combination of minimal cloud cover, low off-nadir angle, and least obstructive shadows. For some use cases, such as emergency response, it may be critical that the imagery used for deployment is as current as possible. Users should attempt to find an acceptable balance between image currency and quality to ensure the outputs are accurate both temporally and spatially.
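The selection criteria above can be sketched as a simple filter-and-sort over candidate image metadata. This is only an illustration: the `Candidate` fields, the image identifiers, and the 10% cloud cover and 25 degree off-nadir thresholds are assumptions chosen for the example, not values defined by the ramp project. Shadows are not captured here, so visual review of the shortlisted images is still required.

```python
from dataclasses import dataclass

MAX_GSD_M = 0.50  # ramp expects 50cm resolution or better


@dataclass
class Candidate:
    image_id: str           # hypothetical identifier
    gsd_m: float            # ground sample distance, metres per pixel
    cloud_cover_pct: float  # 0-100
    off_nadir_deg: float    # 0 means looking straight down


def rank_candidates(candidates, max_cloud_pct=10.0, max_off_nadir_deg=25.0):
    """Drop images that miss the thresholds, then sort best-first.

    The default thresholds are illustrative assumptions.
    """
    usable = [
        c for c in candidates
        if c.gsd_m <= MAX_GSD_M
        and c.cloud_cover_pct <= max_cloud_pct
        and c.off_nadir_deg <= max_off_nadir_deg
    ]
    # Prefer the least cloud first, then the lowest off-nadir angle.
    return sorted(usable, key=lambda c: (c.cloud_cover_pct, c.off_nadir_deg))


candidates = [
    Candidate("img_a", 0.31, 4.0, 12.0),
    Candidate("img_b", 0.46, 0.5, 27.5),  # rejected: off-nadir too high
    Candidate("img_c", 0.70, 0.0, 5.0),   # rejected: coarser than 50cm
    Candidate("img_d", 0.50, 4.0, 8.0),
]
print([c.image_id for c in rank_candidates(candidates)])
```

When currency matters more than quality, an acquisition date field could be added to the sort key ahead of cloud cover.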
To successfully deploy the ramp machine learning framework, satellite imagery is required in two critical stages:
Creation of Localization Training Data: To improve performance of the baseline model within a specific region of interest, it is strongly recommended that users localize the provided baseline model with a labeled dataset created from imagery over the target region, or over a geographically similar region. Localization of the baseline model requires labeled data pairs consisting of 256×256 pixel satellite imagery tiles and their respective building footprint vectors (GeoJSON files).
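To give a sense of what 256×256 pixel tiles mean in practice, the following minimal sketch computes the pixel windows of every full tile in an image and the ground size of one tile. The function names are hypothetical and this is not the ramp tiling code; the windows would be passed to whatever raster library is used to read the image, and the matching GeoJSON footprints would be clipped to the same extents.

```python
TILE_SIZE = 256  # ramp training tiles are 256x256 pixels


def tile_windows(width, height, tile_size=TILE_SIZE):
    """Yield (col_off, row_off, tile_size, tile_size) for each full tile.

    Partial tiles at the right and bottom edges are skipped in this sketch.
    """
    for row_off in range(0, height - tile_size + 1, tile_size):
        for col_off in range(0, width - tile_size + 1, tile_size):
            yield (col_off, row_off, tile_size, tile_size)


def tile_ground_size_m(gsd_m, tile_size=TILE_SIZE):
    """Side length on the ground, in metres, of one square tile."""
    return gsd_m * tile_size


windows = list(tile_windows(1000, 600))
print(len(windows))             # 3 columns x 2 rows of full tiles
print(tile_ground_size_m(0.5))  # a 50cm tile covers 128 m per side
```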
The imagery acquisition process for each of these stages is covered below, including information on imagery requirements, selection of imagery, and open imagery resources.
License: CC BY-NC 4.0
The Maxar Open Data Program (ODP) releases selections of high-res imagery to support response to crises and disasters globally. The ODP has released 1,834,152 sq km of imagery around the world since 2017. The imagery available through the Maxar Open Data Program is mainly released to support response to sudden-onset disasters such as flood events, earthquakes, tornados, etc. Because of this, each crisis event will typically have both “pre-event” and “post-event” imagery. The pre-event imagery is typically of good quality, at the necessary resolution for ramp, and largely free of cloud cover.
The ramp project relies primarily on the Maxar ODP imagery for creation of localization training datasets.
For machine learning applications, a high quantity of training data is needed. The Maxar ODP has released one of the largest collections of high-res imagery openly available and the imagery generally meets the requirements of the ramp model. There are no existing building labels associated with the ODP imagery, but it can be used for new label creation as well as inference runs if the region of interest has been released.
By necessity, temporal accuracy is prioritized for ODP imagery. This results in a wide range of image quality across the released imagery. Therefore, the imagery selected for labeling should be reviewed carefully to ensure it meets the necessary requirements. These releases are also limited to their respective regions of crises and global coverage is not available, limiting their use for inference outside of crisis areas.
License: CC BY 3.0 US
Mapbox is an online mapping platform primarily designed for developers who wish to work with geospatial data. Mapbox imagery can be accessed through the Mapbox API, which offers global satellite imagery coverage, much of it at 30-50cm resolution. The imagery on Mapbox is known as mosaic imagery, which is composed of individual image strips that have been stitched together to form a continuous basemap. Mosaics have a consistent pixel size and have been color balanced, atmospherically corrected, and cloud-patched. They are often used as inputs to satellite-based ML/AI workflows because of the static pixel size and minimal or no clouds. Mapbox recently completed a large acquisition of imagery from Maxar, further improving their global coverage, currency, and resolution.
The Mapbox imagery layer can be used in the OpenStreetMap (OSM) platform for labeling and delineating various features. This makes it an excellent candidate for training data, since labeling can be done within the OSM web platform and the labels then tiled with the Mapbox imagery to generate labeled pairs. Alternatively, provided the existing labels are of sufficient quality and were created using the Mapbox layer, labeled pairs can be tiled from OSM and Mapbox immediately. Tiling of the OSM labels with Mapbox imagery can be easily accomplished using Label Maker, an open-source tool developed by Development Seed. This tool can be accessed on the Label Maker GitHub page.
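When tiling basemap imagery, it helps to know which zoom level corresponds to the 30-50cm range. The sketch below uses the standard Web Mercator XYZ tile arithmetic; it does not call the Mapbox API, and the function names are hypothetical. It assumes 256-pixel tiles by default (pass `tile_size=512` for larger tiles), and the computed value is the nominal pixel size of the tile grid, which may be finer than the native resolution of the underlying imagery in a given area.

```python
import math

# Equatorial circumference used by Web Mercator (2 * pi * 6378137 m).
EARTH_CIRCUMFERENCE_M = 2 * math.pi * 6378137.0


def ground_resolution_m(lat_deg, zoom, tile_size=256):
    """Approximate metres per pixel of an XYZ tile at a given latitude."""
    return (EARTH_CIRCUMFERENCE_M * math.cos(math.radians(lat_deg))
            / (tile_size * 2 ** zoom))


def lonlat_to_tile(lon_deg, lat_deg, zoom):
    """Return the (x, y) index of the XYZ tile containing a point."""
    n = 2 ** zoom
    x = int((lon_deg + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat_deg)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so points on the antimeridian or near the poles stay in range.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def min_zoom_for(target_gsd_m, lat_deg, tile_size=256, max_zoom=22):
    """Lowest zoom whose nominal pixel size is at or below target_gsd_m."""
    for zoom in range(max_zoom + 1):
        if ground_resolution_m(lat_deg, zoom, tile_size) <= target_gsd_m:
            return zoom
    return None


print(min_zoom_for(0.5, 0.0))        # zoom needed for 50cm at the equator
print(lonlat_to_tile(0.0, 0.0, 1))   # tile containing (0, 0) at zoom 1
```

Because pixel size shrinks with the cosine of latitude, a zoom level that is slightly too coarse at the equator can fall within the 50cm limit at higher latitudes.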
With its convenient integrations and the open-source nature of the related tools, Mapbox imagery is generally a good candidate for training data and large-scale footprint generation. However, there are several considerations that may impact usability.
License: CC BY-SA 4.0
SpaceNet consists of a series of open challenges applying machine learning to satellite imagery. SpaceNet challenges 1 and 2 focused on building detection and were published with a collection of labeled training imagery. The source imagery included is high resolution RGB from DigitalGlobe (now Maxar) and covers several distinct geographic regions. The pre-labeled imagery within the SpaceNet 2 challenge dataset includes imagery from Shanghai China, Las Vegas USA, Paris France, and Khartoum Sudan.
The pre-labeled SpaceNet image tiles are a good option for quickly localizing the ramp baseline model if the target AOI is geographically similar to one of the regions included in the SpaceNet dataset. However, there are several aspects of the dataset that should be carefully considered since they will impact how it can be used in a ramp model deployment.
The Open Cities dataset was developed as part of a challenge to segment building footprints from drone imagery. The goal of the challenge was to accelerate the development of more accurate, relevant, and usable open-source AI models to support mapping for disaster risk management in African cities. The data consists of ~3cm drone imagery from 10 different cities and regions across Africa.
Open Cities drone imagery is paired with accurate building labels, making it an excellent candidate for inclusion in a localization training dataset for deployment in African countries. Since the ramp project utilizes 30-50cm satellite imagery, the Open Cities drone imagery should be resampled to 30cm prior to inclusion in training datasets.
The Open Cities imagery does contain quite a few artifacts such as ghosted/double-exposed areas, no-data “holes”, and some warping. Much of the double exposure and warping is limited to the outer bounds of the imagery and can be excluded from the final training set. Many of the artifacts were minimized after resampling the 3cm imagery to 30cm. The labels are generally high quality but still contain some shifts and missed buildings, so they should be reviewed and updated prior to inclusion in a training dataset.
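Going from 3cm to 30cm is a tenfold reduction in each dimension, so every output pixel summarizes a 10×10 block of input pixels. The following minimal sketch shows block averaging on a single band held as a list of rows. The source does not state which resampling method the ramp project used, so averaging is an assumption here, and the function names are hypothetical; in practice a raster library would do this while also updating the georeferencing.

```python
def resample_factor(src_gsd_m, dst_gsd_m):
    """Integer downsampling factor between two pixel sizes."""
    factor = round(dst_gsd_m / src_gsd_m)
    if factor < 1 or abs(factor * src_gsd_m - dst_gsd_m) > 1e-9:
        raise ValueError("this sketch only handles integer downsampling factors")
    return factor


def block_mean(band, factor):
    """Downsample a 2D list of pixel values by averaging factor x factor blocks.

    Rows and columns that do not fill a whole block are dropped.
    """
    rows = len(band) // factor
    cols = len(band[0]) // factor
    out = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            block = [
                band[r * factor + i][c * factor + j]
                for i in range(factor)
                for j in range(factor)
            ]
            out_row.append(sum(block) / len(block))
        out.append(out_row)
    return out


factor = resample_factor(0.03, 0.30)  # 3cm drone imagery to 30cm
band = [[100] * 20 for _ in range(20)]  # toy 20x20 single-band image
small = block_mean(band, factor)
print(factor, len(small), len(small[0]))
```

Averaging over blocks also explains why small artifacts become less visible after resampling: isolated bad pixels are diluted by their neighbours.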
While the ramp project was able to simply select a region from the ODP that has already been released for training data creation and model deployment, often in real-world scenarios the region of interest is not available through ODP. Sometimes the imagery must be acquired from a different open source or purchased through a commercial provider. Even after the imagery has been acquired there may not be time or personnel resources available to create new labels. To address these issues, there are several different methods which can be utilized depending on the scenario.
If imagery for the region of interest has been acquired and resources allow, new labels should be created to fine tune the model over the region of interest.
If imagery for the region of interest has been acquired or is available through ODP but there is no time and/or no resources for new labeling, existing labeled datasets should be reviewed to look for sets which are geographically similar to the region of interest. This review should include the released baseline labels, OSM labels, and alternative building datasets if they meet the accuracy requirements for the ramp model. The model can be fine-tuned using these similar labels and then deployed over imagery for the region of interest. If no geographically similar labels can be found, the baseline model can be deployed for inference, but decreased performance should be expected.
If no imagery has been acquired for the region of interest, the imagery is not available through ODP, and commercial purchase is not an option, alternative sources should be reviewed. New building labels can easily be created over the region of interest using the OSM platform and the Mapbox imagery layer. These labels can then be tiled and downloaded for fine-tuning using the Mapbox tiling API and the open-source Label Maker tool. Unfortunately, no date metadata is available for this imagery, so currency may be an issue for many applications.
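The scenarios above amount to a short decision procedure, which can be summarized as the sketch below. The function, its parameter names, and the returned strings are purely illustrative and are not part of the ramp codebase; `imagery_available` means the imagery has either been acquired already or is available through ODP.

```python
def choose_workflow(imagery_available, can_create_labels,
                    similar_labels_exist, can_purchase=False):
    """Map the situation in the region of interest to a suggested approach."""
    if imagery_available:
        if can_create_labels:
            return "create new labels and fine-tune"
        if similar_labels_exist:
            return "fine-tune on geographically similar labels"
        # No labeling capacity and no similar labels: fall back to baseline.
        return "run baseline model (expect reduced performance)"
    if can_purchase:
        return "purchase imagery, then re-evaluate"
    # No imagery and no purchase option: use open basemap imagery instead.
    return "label in OSM over Mapbox imagery and tile with Label Maker"


print(choose_workflow(True, False, True))
print(choose_workflow(False, False, False))
```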
Sometimes, the event that triggers an ODP imagery release is the same event to which the ramp model is intended to be applied. In these fortuitous circumstances, imagery can simply be downloaded from the ODP for inference. Occasionally, requests can also be made in partnership with larger governmental entities or NGOs for an ODP release over a region of interest.
Many larger governmental entities and NGOs also already maintain a subscription to commercial imagery provider services such as Maxar SecureWatch or Planet imagery. If the end user/org is resource-bound and cannot purchase commercial imagery, partnership with one of these larger entities may be able to provide the necessary access in exchange for access to the generated footprints.
The International Charter is composed of space agencies and space system operators from around the world who work together to provide satellite imagery for disaster monitoring purposes much like ODP.
End users of the ramp model can partner with Disaster Charter Authorised Users to gain access to imagery or register as Authorised Users to request imagery releases if they meet the necessary requirements.
Aerial imagery is also an option for smaller local deployments of the model. Local partners may be available to capture drone imagery for the region of interest, which can then be resampled to leverage existing ramp tools or left at full resolution for extension of the ramp tools.