HYPERSPECTRAL IMAGE CLASSIFICATION USING RESIDUAL 2D AND 3D CONVOLUTIONAL NEURAL NETWORK JOINT ATTENTION MODEL
Hyperspectral image classification (HSIC) is a challenging task in remote sensing data analysis and has been applied in many domains to better identify and inspect the earth's surface by extracting spectral and spatial information. Combining abundant spectral features with accurate spatial information can improve classification accuracy. However, many traditional methods rely on handcrafted features, which makes multi-class classification difficult because of spectral intra-class heterogeneity and inter-class similarity. Deep learning algorithms, especially the convolutional neural network (CNN), are regarded as promising feature extractors and classifiers for hyperspectral remote sensing images. Although a 2D CNN can extract spatial features, it does not use the specific spectral properties effectively. A 3D CNN can capture these properties, but its computational burden increases as layers are stacked. To address these issues, we propose a novel HSIC framework based on a residual CNN that integrates the advantages of 2D and 3D CNNs. First, 3D convolutions extract spectral features, which are recalibrated and refined by a channel attention mechanism. The 2D depth-wise separable convolutions, with kernels of different sizes, extract multi-scale spatial features while reducing the number of model parameters. Furthermore, the residual structure facilitates back-propagation during network training. The results and analysis of extensive HSIC experiments show that the proposed residual 2D-3D CNN can effectively extract spectral and spatial features and improve classification accuracy.
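The parameter saving attributed to depth-wise separable convolution can be illustrated with a short count. The sketch below is not the proposed network: the channel width (64) and the kernel sizes (3, 5, 7) are assumed values chosen only for illustration, and the function names are hypothetical. It compares the weight count of a standard 2D convolution with that of a depth-wise convolution followed by a 1 x 1 point-wise convolution, with bias terms omitted.

```python
def conv2d_params(c_in, c_out, k):
    """Weights in a standard 2D convolution with a k x k kernel (no bias)."""
    return c_in * c_out * k * k


def depthwise_separable_params(c_in, c_out, k):
    """Weights in a depth-wise k x k convolution plus a 1 x 1 point-wise one (no bias)."""
    depthwise = c_in * k * k   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


if __name__ == "__main__":
    channels = 64  # assumed width, not taken from the paper
    for k in (3, 5, 7):  # assumed multi-scale kernel sizes
        standard = conv2d_params(channels, channels, k)
        separable = depthwise_separable_params(channels, channels, k)
        print(f"k={k}: standard={standard}, separable={separable}")
```

Under these assumed settings the standard convolution needs 36864, 102400, and 200704 weights for the three kernel sizes, against 4672, 5696, and 7232 for the separable form, so the saving grows with kernel size. This is why the larger kernels in a multi-scale branch remain affordable.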