Global High Categorical Resolution Land Cover Mapping
via Weak Supervision

Xin-Yi Tong, Runmin Dong, Xiao Xiang Zhu

Introduction

Land cover information is indispensable for advancing the United Nations’ sustainable development goals, and land cover mapping under a more detailed category system would contribute significantly to tracking economic livelihoods and measuring environmental degradation. However, fine-grained training data are difficult to acquire, which makes this task particularly challenging. Here, we propose to combine a fully labeled source domain with a weakly labeled target domain for weakly supervised domain adaptation (WSDA). Using sparse and coarse weak labels considerably reduces the labor required for precise and detailed land cover annotation. Specifically, we introduce the Prototype-based pseudo-label Rectification and Expansion (PRE) approach, which uses prototypes (i.e., the class-wise feature centroids) as a bridge between sparse labels and global feature distributions. Based on PRE, we carry out high categorical resolution land cover mapping for 10 cities in different regions around the world, using PlanetScope (PS), Gaofen-1 (GF-1), and Sentinel-2 (ST-2) satellite images separately. In the study areas, we achieve cross-sensor, cross-category, and cross-continent WSDA, with the overall accuracy exceeding 80%. These results indicate that PRE can reduce the dependency of land cover classification on high-quality annotations, thereby improving label efficiency. We expect our work to enable global fine-grained land cover mapping, which in turn helps Earth observation provide more precise and thorough information for environmental monitoring.

Study Area and Data

We construct a weakly supervised land cover classification dataset comprising two parts, C-megacities and G-cities, which are globally distributed, information-rich, and annotated with fine-grained weak labels. The dataset encompasses satellite imagery with spatial resolutions ranging from 3 m to 10 m. Using it as the target domain and Five-Billion-Pixels as the source domain, we carry out land cover mapping for 10 cities around the world via WSDA.

The classes of C-megacities are identical to those of the source domain, while the classes of G-cities have been slightly adjusted according to the CORINE land cover project. Therefore, adapting from the source domain to G-cities involves cross-sensor, cross-category, and cross-continent challenges.

Examples of the densely annotated source domain and the sparsely annotated target domain. Fine delineation of a single 1000×1000-pixel image with a resolution of 3 m takes approximately 1 hour. In contrast, scribbling on an image of the same size and resolution takes only about 1 minute, because boundaries do not need to be outlined.

C-megacities and G-cities are released under an open-source license:

  • Link: Coming Soon

Weakly Supervised Domain Adaptation

    The objective of the WSDA task is to train a semantic segmentation model using fully annotated source data and weakly annotated target data, so that the model adapts effectively to the target domain. We propose PRE to link labeled and unlabeled regions in the target domain through the prototypes. Concretely, in addition to computing a joint segmentation loss over the labeled regions of the source and target domains, we introduce a dynamic pseudo-label self-training loss and a dynamic pseudo-label self-rectification loss. A subset of the most reliable pseudo-labels is used to compute the self-training loss. The remaining pseudo-labels are rectified based on feature distances, and the degree of modification is used to calculate the self-rectification loss. With each iteration, the pseudo-labels are dynamically expanded and the prototypes are dynamically updated.
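The prototype and rectification steps described above can be illustrated with a minimal sketch. It is not the released PRE code: the function names, the `UNLABELED` marker, the confidence-based selection with `keep_ratio`, the Euclidean distance, and the use of the changed fraction as a stand-in for the "degree of modification" are all assumptions made here for illustration.

```python
import math

UNLABELED = -1  # hypothetical marker for pixels without a weak label


def compute_prototypes(features, labels, num_classes):
    """Return one feature centroid per class (None if a class has no labeled pixel)."""
    dim = len(features[0])
    sums = [[0.0] * dim for _ in range(num_classes)]
    counts = [0] * num_classes
    for feat, lab in zip(features, labels):
        if lab == UNLABELED:
            continue  # only labeled pixels contribute to the centroids
        counts[lab] += 1
        for d in range(dim):
            sums[lab][d] += feat[d]
    return [
        [s / counts[c] for s in sums[c]] if counts[c] else None
        for c in range(num_classes)
    ]


def nearest_prototype(feat, prototypes):
    """Class index of the closest prototype (Euclidean distance is an assumption)."""
    best_class, best_dist = None, math.inf
    for c, proto in enumerate(prototypes):
        if proto is None:
            continue
        dist = math.dist(feat, proto)
        if dist < best_dist:
            best_class, best_dist = c, dist
    return best_class


def rectify_pseudo_labels(features, pseudo_labels, confidences, prototypes,
                          keep_ratio=0.5):
    """Keep the most confident pseudo-labels; move the rest to the nearest prototype.

    Returns (reliable_mask, rectified_labels, changed_fraction), where
    changed_fraction is the share of non-reliable pseudo-labels that changed.
    """
    n = len(pseudo_labels)
    num_keep = int(n * keep_ratio)
    # Rank pixels by confidence, highest first, and mark the top ones as reliable.
    order = sorted(range(n), key=lambda i: confidences[i], reverse=True)
    reliable = [False] * n
    for i in order[:num_keep]:
        reliable[i] = True

    rectified = list(pseudo_labels)
    changed = 0
    for i in range(n):
        if reliable[i]:
            continue
        new_label = nearest_prototype(features[i], prototypes)
        if new_label != pseudo_labels[i]:
            changed += 1
        rectified[i] = new_label

    num_rest = n - num_keep
    changed_fraction = changed / num_rest if num_rest else 0.0
    return reliable, rectified, changed_fraction
```

In a training loop along these lines, the reliable subset would feed the self-training loss, the rectified labels would feed the self-rectification term, and the prototypes would be recomputed after each iteration as the pseudo-labels expand.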

    The code for PRE is available here:

  • Link: Coming Soon

Land Cover Mapping

    We create land cover mapping results for G-cities, which are mosaicked from 101 PS images. Among them, 30 images have weak annotations used for model training, 10 images have annotations used for testing but not for training, and the remaining 71 images have no annotations at all. To keep the visual presentation complete, the 30 training images are not removed during mapping. Therefore, these results are presented for illustrative purposes only.



    Berlin

    Melbourne

    Nairobi

    Sao Paulo

    Washington DC

    Visual comparison with different land cover mapping projects in Berlin, including Google's Dynamic World, ESA's WorldCover, and EEA's CORINE Land Cover.

    The detailed land cover mapping results depict specific categories relevant to the natural environment, economic activities, and urban quality of life, with other categories hidden for visual clarity. (a) Urban industrial zones and urban green spaces in Melbourne; (b) Smallholder agriculture in Nairobi; (c) Mineral sites within forested areas in Sao Paulo; (d) Suburban croplands and wetlands in Washington DC.

    Contact

    E-mail: xinyi.tong@tum.de

    Personal page: Xin-Yi Tong