A comparative study of hard clustering algorithms for vegetation data

Pakgohar, N.; Rad, Javad Eshaghi; Gholami, Gholam Hossein; Alijanpour, Ahmad; Roberts, David W.

doi:10.1111/jvs.13042

Cited by 5 publications

(6 citation statements)

References 70 publications

“…The classifications were undertaken in three steps: (1) pre‐processing involved the selection of a distance measure and normalization of the data; (2) cluster analysis involved the selection and application of the clustering algorithm and its various parameters; (3) cluster validation involved the selection and application of appropriate internal validation techniques to evaluate the quality of the classification. Four clustering algorithms and four validation measures were explored based on demonstrated performance in recent literature (Aho et al, 2008; Handl et al, 2005; Lengyel et al, 2021; Pakgohar et al, 2021). We defined a vegetation classification as being comprised of a cluster of plots organized into units with discrete boundaries between them.…”

Section: Methods (mentioning)

confidence: 99%

“…Lötter et al (2013) referred to this as “the classification conundrum”. The amount of research available which advocates particular methods, ideologies and approaches to classify vegetation (Feilhauer et al, 2020; Lengyel et al, 2021; Lortie et al, 2004; Lötter et al, 2013; Pakgohar et al, 2021), reflects the impracticality of the use of one universal approach in all environments. Nevertheless, there is general agreement that expert opinion is needed to select vegetation units at some stage in the classification process (Brown et al, 2013; Lötter et al, 2013; Mucina, 1997) even if this adds subjectivity to the classification, possibly resulting in bias (Lötter et al, 2013; Wolda, 1981), with little objective validation of clustering results.…”

Section: Introduction (mentioning)

confidence: 99%

“…Nevertheless, there is general agreement that expert opinion is needed to select vegetation units at some stage in the classification process (Brown et al, 2013; Lötter et al, 2013; Mucina, 1997) even if this adds subjectivity to the classification, possibly resulting in bias (Lötter et al, 2013; Wolda, 1981), with little objective validation of clustering results. However, recent classification methods, especially those used in data science (Flynt & Dean, 2016), have made it possible to formally test the effectiveness of classifications, thereby reducing the number of subjective choices (Lötter et al, 2013; Pakgohar et al, 2021). The existence of discrete groups in the data can thus be tested objectively, before expert interpretation is needed.…”

Section: Introduction (mentioning)

confidence: 99%

Can vegetation be discretely classified in species‐poor environments? Testing plant community concepts for vegetation monitoring on sub‐Antarctic Marion Island

Merwe, Greve, Skowno et al. 2023, Ecology and Evolution

The updating and rethinking of vegetation classifications is important for ecosystem monitoring in a rapidly changing world, where the distribution of vegetation is changing. The general assumption that discrete and persistent plant communities exist that can be monitored efficiently, is rarely tested before undertaking a classification. Marion Island (MI) is comprised of species‐poor vegetation undergoing rapid environmental change. It presents a unique opportunity to test the ability to discretely classify species‐poor vegetation with recently developed objective classification techniques and relate it to previous classifications. We classified vascular species data of 476 plots sampled across MI, using Ward hierarchical clustering, divisive analysis clustering, non‐hierarchical kmeans and partitioning around medoids. Internal cluster validation was performed using silhouette widths, Dunn index, connectivity of clusters and gap statistic. Indicator species analyses were also conducted on the best performing clustering methods. We evaluated the outputs against previously classified units. Ward clustering performed the best, with the highest average silhouette width and Dunn index, as well as the lowest connectivity. The number of clusters differed amongst the clustering methods, but most validation measures, including for Ward clustering, indicated that two and three clusters are the best fit for the data. However, all classification methods produced weakly separated, highly connected clusters with low compactness and low fidelity and specificity to clusters. There was no particularly robust and effective classification outcome that could group plots into previously suggested vegetation units based on species composition alone. The relatively recent age (c. 450,000 years B.P.), glaciation history (last glacial maximum 34,500 years B.P.) and isolation of the sub‐Antarctic islands may have hindered the development of strong vascular plant species assemblages with discrete boundaries. Discrete classification at the community‐level using species composition may not be suitable in such species‐poor environments. Species‐level, rather than community‐level, monitoring may thus be more appropriate in species‐poor environments, aligning with continuum theory rather than community theory.
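
Two of the internal validation measures named in this abstract, average silhouette width and the Dunn index, can be illustrated with a minimal sketch. It assumes Euclidean distance between plot vectors, at least two clusters, and the common Dunn definition (smallest between-cluster distance divided by largest within-cluster diameter). The function names and the toy coordinates are hypothetical and are not taken from the cited study.

```python
from math import dist


def mean_silhouette(points, labels):
    """Average silhouette width over all points (Euclidean distance)."""
    clusters = {}
    for i, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(i)
    if len(clusters) < 2:
        raise ValueError("silhouette width needs at least two clusters")

    widths = []
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:
            widths.append(0.0)  # convention: a singleton cluster scores 0
            continue
        # a: mean distance to the other members of the point's own cluster
        a = sum(dist(points[i], points[j]) for j in own) / len(own)
        # b: mean distance to the nearest other cluster
        b = min(
            sum(dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        widths.append(0.0 if max(a, b) == 0 else (b - a) / max(a, b))
    return sum(widths) / len(widths)


def dunn_index(points, labels):
    """Smallest between-cluster distance over largest within-cluster distance."""
    min_between = float("inf")
    max_within = 0.0
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            d = dist(points[i], points[j])
            if labels[i] == labels[j]:
                max_within = max(max_within, d)
            else:
                min_between = min(min_between, d)
    # All-singleton clusterings have zero diameter; report infinity.
    return float("inf") if max_within == 0 else min_between / max_within


# Hypothetical toy data: two tight, well-separated groups of "plots".
plots = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good = [0, 0, 1, 1]
bad = [0, 1, 0, 1]
print(mean_silhouette(plots, good), dunn_index(plots, good))
print(mean_silhouette(plots, bad), dunn_index(plots, bad))
```

Higher values of both measures indicate more compact, better separated clusters, which is the sense in which the abstract reports Ward clustering as performing best while still describing the clusters as weakly separated.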

“…To obtain community assembly matrices, for VST transformed data, we used Jaccard dissimilarity index calculated from presence-absence data (Tian et al, 2019); while for Hellinger transformed data, we used the Euclidean dissimilarity index (Pakgohar et al, 2021).…”

Section: Statistical Analyses (mentioning)

confidence: 99%
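
The two dissimilarity choices in the statement above can be sketched as follows. This is a minimal illustration, assuming each sample is a list of non-negative abundances over the same ordered set of taxa. The VST step mentioned in the statement is not reproduced, and all function names and input values are hypothetical.

```python
from math import dist, sqrt


def hellinger_transform(abundances):
    """Square root of each relative abundance within one sample."""
    total = sum(abundances)
    if total == 0:
        return [0.0 for _ in abundances]
    return [sqrt(x / total) for x in abundances]


def hellinger_distance(sample_a, sample_b):
    """Euclidean distance between Hellinger-transformed samples."""
    return dist(hellinger_transform(sample_a), hellinger_transform(sample_b))


def jaccard_dissimilarity(sample_a, sample_b):
    """1 - |shared taxa| / |taxa present in either sample| (presence-absence)."""
    present_a = {i for i, x in enumerate(sample_a) if x > 0}
    present_b = {i for i, x in enumerate(sample_b) if x > 0}
    union = present_a | present_b
    if not union:
        return 0.0  # assumption: two empty samples are treated as identical
    return 1 - len(present_a & present_b) / len(union)


# Hypothetical abundance vectors over three taxa.
site_1 = [1, 1, 0]
site_2 = [0, 3, 2]
print(hellinger_distance(site_1, site_2))
print(jaccard_dissimilarity(site_1, site_2))
```

Because the Hellinger transformation works on relative abundances, two samples with the same proportions but different totals are at distance zero, and two samples sharing no taxa are at the maximum distance of the square root of 2.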

Boreal moss-microbe interactions are revealed through metagenome assembly of novel bacterial species

Ishak, Faticov, Rondeau-Leclaire et al. 2023, Preprint

- Moss microbial communities play important roles for ecosystem processes in boreal forests. Host identity and leaf litter presence can affect microbial community structures of mosses. However, the extent to which host-specific characteristics and land use type affect taxonomic and functional profiles of microbial communities of boreal mosses is still poorly understood.
- We used shotgun metagenomic sequencing to characterize microbial taxonomy and metabolic KEGG ortholog (KO) profiles between green and brown sections of five moss species across different natural sites, and two moss species between a natural site and a mine site in Eeyou Istchee, Quebec, Canada.
- Our results demonstrate that the abundance of nitrogen metabolic genes differ between moss sections and that mosses from natural and mine environment associate with different microbial taxa and KO profiles. Importantly, conditions at the mine site appear to favor microbial taxa that can tolerate perturbated environments, including taxa that can oxidize sulfur and participate in biofilm formation.
- Overall, our results highlight that moss section, moss species identity, and land-use type are strong drivers of diversity and community structure of moss-associated microbial taxa and metabolic genes. These findings could have major implications for boreal forests facing climate change and anthropogenic pressures.

“…Regression analysis is to model the spectral and structural information directly with the measured species diversity indices, which is a mature and straightforward algorithm, but the applicability in different regions is poor (Ceballos et al, 2015). The clustering algorithm can evaluate species diversity by grouping trees with similar characteristics based on the biochemical and structural variation of different tree species (Asner et al, 2015; Padilla-Martinez et al, 2020; Pakgohar et al, 2021). Clustering can be used to identify patterns or trends in the distribution and abundance of different species within a forest ecosystem.…”

Section: Introduction (mentioning)

confidence: 99%

Individual tree-based forest species diversity estimation by classification and clustering methods using UAV data

Zheng et al. 2023, Front. Ecol. Evol.

Monitoring forest species diversity is essential for biodiversity conservation and ecological management. Currently, unmanned aerial vehicle (UAV) remote sensing technology has been increasingly used in biodiversity monitoring due to its flexibility and low cost. In this study, we compared two methods for estimating forest species diversity indices, namely the spectral angle mapper (SAM) classification approach based on the established species-spectral library, and the self-adaptive Fuzzy C-Means (FCM) clustering algorithm by selected biochemical and structural features. We conducted this study in two complex subtropical forest areas, Mazongling (MZL) and Gonggashan (GGS) National Nature Forest Reserves using UAV-borne hyperspectral and LiDAR data. The results showed that the classification method performed better with higher values of R2 than the clustering algorithm for predicting both species richness (0.62 > 0.46 for MZL and 0.55 > 0.46 for GGS) and Shannon-Wiener index (0.64 > 0.58 for MZL, 0.52 > 0.47 for GGS). However, the Simpson index estimated by the classification method correlated less with the field measurements than the clustering algorithm (R2 = 0.44 and 0.83 for MZL and R2 = 0.44 and 0.62 for GGS). Our study demonstrated that the classification method could provide more accurate monitoring of forest diversity indices but requires spectral information of all dominant tree species at individual canopy scale. By comparison, the clustering method might introduce uncertainties due to the amounts of biochemical and structural inputs derived from the hyperspectral and LiDAR data, but it could acquire forest diversity patterns rapidly without distinguishing the specific tree species. Our findings underlined the advantages of UAV remote sensing for monitoring the species diversity in complex forest ecosystems and discussed the applicability of classification and clustering methods for estimating different individual tree-based species diversity indices.
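
The three diversity indices compared in this abstract can be computed from per-group tree counts, where a group is either a classified species or a cluster. The sketch below is illustrative only: it assumes the natural logarithm for the Shannon-Wiener index and the Gini-Simpson form (1 minus the sum of squared proportions) for the Simpson index, since the abstract does not say which variants were used. The function names and counts are hypothetical.

```python
from math import log


def species_richness(counts):
    """Number of groups with at least one individual."""
    return sum(1 for c in counts if c > 0)


def shannon_wiener(counts):
    """H' = -sum(p * ln p) over groups with non-zero counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * log(c / total) for c in counts if c > 0)


def gini_simpson(counts):
    """1 - sum(p^2); assumed variant of the Simpson index."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return 1 - sum((c / total) ** 2 for c in counts)


# Hypothetical counts of individual trees per species (or per cluster) in one plot.
tree_counts = [12, 7, 0, 3]
print(species_richness(tree_counts))
print(shannon_wiener(tree_counts))
print(gini_simpson(tree_counts))
```

Since all three functions need only group sizes, the same code applies whether the groups come from a supervised classification or from an unsupervised clustering, which is what makes the comparison in the abstract possible.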
