DUAL PYRAMIDS ENCODER-DECODER NETWORK FOR SEMANTIC SEGMENTATION IN GROUND AND AERIAL VIEW IMAGES

Jiang, S. L.; Li, G.; Yao, W.; Hong, Z. H.; Kuc, T. Y.

Semantic segmentation is a fundamental research task in computer vision, which intends to assign a certain category to every pixel. Currently, most existing methods only utilize the deepest feature map for decoding, while high-level features get inevitably lost during the procedure of down-sampling. In the decoder section, transposed convolution or bilinear interpolation was widely used to restore the size of the encoded feature map; however, few optimizations are applied during up-sampling process which is detrimental to the performance for grouping and classification. In this work, we proposed a dual pyramids encoder-decoder deep neural network (DPEDNet) to tackle the above issues. The first pyramid integrated and encoded multi-resolution features through sequentially stacked merging, and the second pyramid decoded the features through dense atrous convolution with chained up-sampling. Without post-processing and multi-scale testing, the proposed network has achieved state-of-the-art performances on two challenging benchmark image datasets for both ground and aerial view scenes.

Zitieren

Zitierform:

Jiang, S. L. / Li, G. / Yao, W. / et al: DUAL PYRAMIDS ENCODER-DECODER NETWORK FOR SEMANTIC SEGMENTATION IN GROUND AND AERIAL VIEW IMAGES. 2020. Copernicus Publications.

Zugriffsstatistik

Gesamt:
Volltextzugriffe:
Metadatenansicht:
12 Monate:
Volltextzugriffe:
Metadatenansicht:

Grafik öffnen

Rechte

Rechteinhaber: S. L. Jiang et al.

Nutzung und Vervielfältigung:

Export