Spatio-temporal filling of missing points in geophysical data sets
The majority of data sets in the geosciences are obtained from observations and measurements of natural systems, rather than in the laboratory. These data sets are often full of gaps, due to to the conditions under which the measurements are made. Missing data give rise to various problems, for example in spectral estimation or in specifying boundary conditions for numerical models. Here we use Singular Spectrum Analysis (SSA) to fill the gaps in several types of data sets. For a univariate record, our procedure uses only temporal correlations in the data to fill in the missing points. For a multivariate record, multi-channel SSA (M-SSA) takes advantage of both spatial and temporal correlations. We iteratively produce estimates of missing data points, which are then used to compute a self-consistent lag-covariance matrix; cross-validation allows us to optimize the window width and number of dominant SSA or M-SSA modes to fill the gaps. The optimal parameters of our procedure depend on the distribution in time (and space) of the missing data, as well as on the variance distribution between oscillatory modes and noise. The algorithm is demonstrated on synthetic examples, as well as on data sets from oceanography, hydrology, atmospheric sciences, and space physics: global sea-surface temperature, flood-water records of the Nile River, the Southern Oscillation Index (SOI), and satellite observations of relativistic electrons.