INVESTIGATIONS ON SKIP-CONNECTIONS WITH AN ADDITIONAL COSINE SIMILARITY LOSS FOR LAND COVER CLASSIFICATION
Pixel-based land cover classification of aerial images is a standard task in remote sensing whose goal is to identify the physical material of the Earth’s surface. Most well-performing recent methods rely on convolutional neural networks (CNNs) with an encoder-decoder structure. In the encoder, successive convolution and pooling operations produce features at a lower spatial resolution; in the decoder, these features are gradually up-sampled, layer by layer, to make predictions at the original spatial resolution. However, the loss of spatial resolution caused by pooling degrades the final classification performance, which is commonly compensated for by skip-connections between corresponding features in the encoder and the decoder. The most popular ways to combine these features are element-wise addition of feature maps and 1x1 convolution. In this work, we investigate skip-connections. We argue that not all skip-connections are equally important, and we therefore conducted experiments designed to find out which ones matter. Moreover, we propose a new cosine similarity loss function that exploits the relationship between the features of pixels belonging to the same category within one mini-batch, i.e. these features should be close in feature space. Our experiments show that the new cosine similarity loss improves classification performance. We evaluated our methods on the Vaihingen and Potsdam datasets of the ISPRS 2D semantic labelling challenge and achieved an overall accuracy of 91.1% for both test sites.
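
The idea behind the cosine similarity loss, that features of same-category pixels within a mini-batch should be close in feature space, can be illustrated with a minimal sketch. The abstract does not state the exact formulation, so the version below is an assumption: it averages (1 - cosine similarity) over all pairs of feature vectors that share a label. The function names `cosine_similarity` and `same_class_cosine_loss` are hypothetical, and pure Python is used only to keep the example self-contained.

```python
import math
from itertools import combinations


def cosine_similarity(u, v, eps=1e-8):
    """Cosine of the angle between two feature vectors (eps guards zero norms)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / max(norm_u * norm_v, eps)


def same_class_cosine_loss(features, labels):
    """Assumed formulation, not necessarily the paper's:
    mean of (1 - cos) over all same-label pairs in the mini-batch."""
    total, count = 0.0, 0
    for (f_i, y_i), (f_j, y_j) in combinations(zip(features, labels), 2):
        if y_i == y_j:
            total += 1.0 - cosine_similarity(f_i, f_j)
            count += 1
    # A batch without any same-label pair contributes no loss.
    return total / count if count else 0.0


# Two pixels of class 0 with parallel features, one pixel of class 1.
aligned = same_class_cosine_loss([[1.0, 0.0], [2.0, 0.0], [0.0, 1.0]], [0, 0, 1])
# Two pixels of class 0 with orthogonal features.
orthogonal = same_class_cosine_loss([[1.0, 0.0], [0.0, 1.0]], [0, 0])
```

Under this assumed definition the loss is zero when same-class features point in the same direction and grows as they diverge. In training, a term like this would be added to the usual classification loss with a weighting factor, which is likewise left unspecified here.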