Application of high-dimensional fuzzy <i>k</i>-means cluster analysis to CALIOP/CALIPSO version 4.1 cloud–aerosol discrimination
This study applies fuzzy k-means (FKM) cluster analyses to a
subset of the parameters reported in the CALIPSO lidar level 2 data products
in order to classify the layers detected as either clouds or aerosols. The
results obtained are used to assess the reliability of the cloud–aerosol
discrimination (CAD) scores reported in the version 4.1 release of the
CALIPSO data products. FKM is an unsupervised learning algorithm, whereas
the CALIPSO operational CAD algorithm (COCA) takes a highly supervised
approach. Despite these substantial computational and architectural
differences, our statistical analyses show that the FKM classifications
agree with the COCA classifications for more than 94 % of the cases in
the troposphere. This high degree of similarity is achieved because the
lidar-measured signatures of the majority of the clouds and the aerosols are
naturally distinct, and hence objective methods can independently and
effectively separate the two classes in most cases. Classification
differences most often occur in complex scenes (e.g., evaporating water
cloud filaments embedded in dense aerosol) or when observing diffuse
features that occur only intermittently (e.g., volcanic ash in the tropical
tropopause layer). Because their algorithmic uncertainties differ, the two
methods examined in this study together bound the overall classification
correctness. In addition to comparing the outputs from the two algorithms,
we discuss sampling, data training, performance measurements, fuzzy linear
discriminants, defuzzification, error propagation, and the key parameters in
feature type discrimination with the FKM method in order to better
understand the utility and limits of applying clustering algorithms to
space lidar measurements. In general, we find that
both FKM and COCA classification uncertainties are only minimally affected
by noise in the CALIPSO measurements, though both algorithms can be
challenged by especially complex scenes containing mixtures of discrete
layer types. Our analysis shows that attenuated backscatter and
color ratio are the driving factors that separate water clouds from
aerosols; backscatter intensity, depolarization, and mid-layer altitude are
most useful in discriminating between aerosols and ice clouds; and the joint
distribution of backscatter intensity and depolarization ratio is critically
important for distinguishing ice clouds from water clouds.
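
For readers unfamiliar with the method, the following is a minimal sketch of a textbook fuzzy k-means (fuzzy c-means) iteration followed by a simple maximum-membership defuzzification. It is not the implementation used in this study: the Euclidean distance, the fuzziness exponent `m = 2`, the random initialization, and the function names `fuzzy_k_means` and `defuzzify` are all assumptions made for illustration, and the two-feature toy points are invented values rather than CALIPSO measurements.

```python
import math
import random


def fuzzy_k_means(points, k, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Textbook fuzzy k-means with Euclidean distance (illustrative only).

    points: list of equal-length feature tuples
    k: number of clusters; m: fuzziness exponent (must be > 1)
    Returns (centroids, memberships); each membership row sums to 1.
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Random initial memberships, normalized so each row sums to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(k)]
        total = sum(row)
        u.append([v / total for v in row])

    centroids = []
    for _ in range(n_iter):
        # Centroid update: mean of the points weighted by membership ** m.
        centroids = []
        for j in range(k):
            w = [u[i][j] ** m for i in range(n)]
            wsum = sum(w)
            centroids.append(tuple(
                sum(w[i] * points[i][d] for i in range(n)) / wsum
                for d in range(dim)))
        # Membership update: u_ij is proportional to d_ij ** (-2 / (m - 1)).
        max_change = 0.0
        for i, p in enumerate(points):
            dist = [math.dist(p, c) for c in centroids]
            if min(dist) == 0.0:
                # Point coincides with a centroid: full membership there.
                hit = dist.index(0.0)
                new_row = [1.0 if j == hit else 0.0 for j in range(k)]
            else:
                inv = [d ** (-2.0 / (m - 1.0)) for d in dist]
                total = sum(inv)
                new_row = [v / total for v in inv]
            change = max(abs(a - b) for a, b in zip(new_row, u[i]))
            max_change = max(max_change, change)
            u[i] = new_row
        if max_change < tol:
            break
    return centroids, u


def defuzzify(memberships):
    """Assign each sample to the cluster with the largest membership."""
    return [row.index(max(row)) for row in memberships]


# Invented two-feature toy data forming two well-separated groups.
toy_points = [
    (0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (0.15, 0.05),
    (5.0, 5.0), (5.1, 4.9), (4.9, 5.2), (5.2, 5.1),
]
centroids, memberships = fuzzy_k_means(toy_points, k=2)
labels = defuzzify(memberships)
```

The cluster indices returned by `defuzzify` are arbitrary, so comparing them with an independent classification (for example, a supervised cloud or aerosol label) would first require mapping each cluster to a class. Samples whose largest membership is only slightly above `1 / k` are the ambiguous cases in this sketch.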
Zeng