A Deep Multi-Attention Driven Approach for Multi-Label Remote Sensing Image Classification

Sumbul, Gencer; Demir, Begüm

FG Remote Sensing Image Analysis Group

Deep learning (DL) based methods have become popular for remote sensing (RS) image scene classification. Most existing DL based methods assume that each training image is annotated with a single label; however, RS images typically contain multiple classes and can thus be associated with multiple labels simultaneously. Despite the success of existing methods in describing the information content of very high resolution aerial images with RGB bands, any direct adaptation to high-dimensional, high-spatial-resolution RS images falls short of accurately modeling their spectral and spatial information content. To address this problem, this paper presents a novel approach to the multi-label classification of high-dimensional RS images. The proposed approach is based on three main steps. The first step describes the complex spatial and spectral content of local image areas with a novel K-Branch CNN that includes spatial-resolution-specific CNN branches. The second step first characterizes the importance scores of the different local areas of each image and then defines a global descriptor for each image based on these scores. This is achieved by a novel multi-attention strategy that utilizes bidirectional long short-term memory networks. The final step classifies RS image scenes with multi-labels. Experiments carried out on BigEarthNet (a large-scale Sentinel-2 benchmark archive) show the effectiveness of the proposed approach in terms of multi-label classification accuracy compared to state-of-the-art approaches. The code of the proposed approach is publicly available at https://gitlab.tubit.tuberlin.de/rsim/MAML-RSIC.
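The second and third steps can be illustrated with a minimal sketch, under strong simplifying assumptions. It is not the authors' implementation: a hypothetical linear scorer stands in for the bidirectional LSTM based multi-attention strategy, a single set of attention scores is used, and the local descriptors are assumed to have been produced already (in the paper, by the CNN branches). All function names, weights, and the 0.5 threshold are illustrative assumptions.

```python
import math


def softmax(scores):
    """Normalize raw scores into attention weights that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def attention_pool(local_descriptors, score_weights):
    """Score each local area, then build a global descriptor as the
    attention-weighted sum of the local descriptors.

    ASSUMPTION: the linear scorer below replaces the BiLSTM-based
    attention described in the abstract.
    """
    scores = [
        sum(w * x for w, x in zip(score_weights, descriptor))
        for descriptor in local_descriptors
    ]
    alphas = softmax(scores)
    dim = len(local_descriptors[0])
    global_descriptor = [
        sum(a * d[i] for a, d in zip(alphas, local_descriptors))
        for i in range(dim)
    ]
    return alphas, global_descriptor


def multi_label_predict(global_descriptor, class_weights, class_biases,
                        threshold=0.5):
    """One independent sigmoid per class, so several labels can be
    active for the same image (multi-label, not softmax over classes)."""
    probs = [
        sigmoid(sum(w * x for w, x in zip(weights, global_descriptor)) + bias)
        for weights, bias in zip(class_weights, class_biases)
    ]
    labels = [p >= threshold for p in probs]
    return probs, labels


# Hypothetical toy input: three local areas, two features each.
local_areas = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
alphas, descriptor = attention_pool(local_areas, score_weights=[0.5, -0.5])
probs, labels = multi_label_predict(
    descriptor,
    class_weights=[[1.0, 1.0], [-1.0, -1.0]],
    class_biases=[0.0, 0.0],
)
```

The per-class sigmoid is the design choice that distinguishes multi-label from single-label classification: each class probability is thresholded on its own, so an image can receive any number of labels.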