We propose a new and simple framework for dimension reduction in the large p, small n setting. The framework decomposes the data into pieces, thereby enabling existing approaches for n > p to be adapted to n < p problems. It avoids estimating a large covariance matrix, which is a very difficult task. We propose two separate paths to implement the framework. Our paths provide sufficient procedures for identifying informative variables via a sequential approach. We illustrate the paths by using sufficient dimension reduction approaches, but the paths are very general. Empirical evidence demonstrates the efficacy of our paths. Additional simulations and applications are given in an online supplementary file.
Switchgrass (Panicum virgatum L.) is a warm-season, perennial grass valued as a promising candidate species for bioenergy feedstock production. Biomass yield is the most important trait for any bioenergy feedstock. This study focused on understanding the genetics underlying biomass yield and feedstock quality traits in a “Kanlow” population. The objectives of this study were to (i) assess genetic variation, (ii) estimate the narrow-sense heritability, and (iii) predict genetic gain per cycle of selection for biomass yield and the components of lignocellulose. Fifty-four Kanlow half-sib (KHS) families, along with a Kanlow check, were planted in a randomized complete block design with three replications at two locations in Tennessee: Knoxville and Crossville. Data were recorded for two consecutive years: 2013 and 2014. The results showed significant genetic variation for biomass yield (p < 0.05), hemicellulose concentration (p < 0.05), and lignin concentration (p < 0.01). The narrow-sense heritability estimate for biomass yield was very low (0.10), indicating a possible challenge in improving this trait. A genetic gain of 16.5% is predicted for biomass yield in each cycle of selection by recombining parental clones of 10% of superior progenies.
The accuracy and quality of a landslide susceptibility map depend on the available landslide locations and the sampling strategy for absence data (non-landslide locations). In this study, we propose an objective method to determine the critical value for sampling absence data based on Mahalanobis distances (MD). We demonstrate this method on landslide susceptibility mapping of three subdistricts (Upazilas) of the Rangamati district, Bangladesh, and compare the results with the landslide susceptibility map produced using the slope-based absence data sampling method. Using 15 landslide causal factors, including slope, aspect, and plan curvature, we first determine the critical value of 23.69 based on the Chi-square distribution with 14 degrees of freedom. This critical value was then used to determine the sampling space for 261 random absence data points. For comparison, we chose another set of absence data based on a slope threshold of < 3°. The landslide susceptibility maps were then generated using the random forest model. The Receiver Operating Characteristic (ROC) curves and the Kappa index were used for accuracy assessment, while the Seed Cell Area Index (SCAI) was used for consistency assessment. The landslide susceptibility map produced using our proposed method has relatively high model fitting (0.87), prediction (0.85), and Kappa values (0.77). Even though the landslide susceptibility map produced by the slope-based sampling also has relatively high accuracy, the SCAI values suggest lower consistency. Furthermore, slope-based sampling is highly subjective; therefore, we recommend using MD-based absence data sampling for landslide susceptibility mapping.
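The MD-based sampling step might be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say on which side of the critical value absence data are drawn, so the sketch assumes eligible absence locations are those whose squared Mahalanobis distance from the landslide (presence) sample exceeds 23.69. The function names, the synthetic data, and the 14-column layout are hypothetical; only the critical value and the count of 261 come from the text.

```python
import numpy as np

# Critical value reported in the abstract (Chi-square, 14 degrees of freedom).
CRITICAL_VALUE = 23.69


def squared_mahalanobis(candidates, presence):
    """Squared Mahalanobis distance of each candidate row from the presence centroid."""
    mean = presence.mean(axis=0)
    cov = np.cov(presence, rowvar=False)
    # Pseudo-inverse tolerates a near-singular covariance matrix.
    cov_inv = np.linalg.pinv(cov)
    diff = candidates - mean
    # Row-wise quadratic form diff_i^T * cov_inv * diff_i.
    return np.einsum("ij,jk,ik->i", diff, cov_inv, diff)


def sample_absence(candidates, presence, n_samples, critical=CRITICAL_VALUE, seed=0):
    """Randomly pick candidate indices whose squared MD exceeds the critical value.

    Assumption (not stated in the abstract): absence data are drawn from
    locations that are dissimilar to landslide locations, i.e. MD^2 > critical.
    """
    d2 = squared_mahalanobis(candidates, presence)
    eligible = np.flatnonzero(d2 > critical)
    rng = np.random.default_rng(seed)
    n = min(n_samples, eligible.size)
    return rng.choice(eligible, size=n, replace=False)


# Synthetic stand-in data: rows are locations, columns are causal-factor values.
rng = np.random.default_rng(42)
presence = rng.normal(size=(200, 14))
candidates = rng.normal(scale=3.0, size=(1000, 14))

absence_idx = sample_absence(candidates, presence, n_samples=261)
print(absence_idx.size)
```

With real data, `presence` would hold the causal-factor values at mapped landslide locations and `candidates` the values at all other cells; the returned indices would then feed the random forest model as the absence class.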