A MACHINE LEARNING DATASET FOR LARGE-SCOPE HIGH RESOLUTION REMOTE SENSING IMAGE INTERPRETATION CONSIDERING LANDSCAPE SPATIAL HETEROGENEITY
The demand for timely information about the Earth's surface, such as land cover and land use (LC/LU), is steadily increasing. Machine learning methods are well suited to extracting such information from remotely sensed images, but they require sufficient training samples. For satellite remote sensing images, however, sample datasets covering a large scope are still limited. Most existing sample datasets for satellite remote sensing images are built from a few image frames located in a local area. For a large-scope (national-level) view, choosing a sufficient and unbiased sampling method is crucial for constructing a balanced training sample dataset. Dependable spatial sample locations that account for the spatial heterogeneity of land cover are needed for choosing sample images. This paper introduces ongoing work on establishing a national-scope sample dataset for high-spatial-resolution satellite remote sensing image processing. A sufficient set of sample sites has been chosen using a spatial sampling method, and the divided sample patches have been grouped using a clustering method for further use. A neural network model for road detection trained on a subset of our dataset shows improved completeness and accuracy compared with the same task trained on two widely used public datasets.
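As an illustration of choosing sample sites in a way that accounts for land-cover heterogeneity, the following is a minimal sketch of proportional stratified random sampling over grid cells. The abstract does not say which spatial sampling method is used, so this scheme is an assumption chosen for illustration. The function name `stratified_sample_sites`, the `(row, col, stratum)` cell layout, and the `min_per_stratum` floor are hypothetical, not the procedure used to build the dataset.

```python
import random
from collections import defaultdict


def stratified_sample_sites(cells, n_total, min_per_stratum=1, seed=0):
    """Pick sample sites from grid cells, stratified by land-cover class.

    cells: list of (row, col, stratum) tuples (hypothetical layout).
    n_total: approximate number of sites to draw overall.
    min_per_stratum: floor so that rare classes are still represented.
    """
    rng = random.Random(seed)

    # Group candidate cells by their land-cover stratum.
    by_stratum = defaultdict(list)
    for cell in cells:
        by_stratum[cell[2]].append(cell)

    total = len(cells)
    sampled = []
    for stratum in sorted(by_stratum):
        members = by_stratum[stratum]
        # Proportional allocation: each stratum gets a share of n_total
        # matching its share of the area, but never fewer than the floor
        # and never more than the cells it actually contains.
        quota = max(min_per_stratum, round(n_total * len(members) / total))
        quota = min(quota, len(members))
        # Draw cells without replacement within the stratum.
        sampled.extend(rng.sample(members, quota))
    return sampled


# Toy 10 x 10 grid: the first row is "water", everything else "cropland".
toy_cells = [
    (r, c, "water" if r == 0 else "cropland")
    for r in range(10)
    for c in range(10)
]
sites = stratified_sample_sites(toy_cells, n_total=10)
```

In this toy grid the rare "water" class still receives at least one site, which is the kind of balance the sampling step aims for. A real workflow would derive the strata from an existing land-cover product and could use a different allocation rule.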