Background: It is now possible to collect the expression levels of a set of genes from a set of biological samples over a series of time points. Such data have three dimensions, gene-sample-time (GST), and are therefore called 3D microarray gene expression data. To take advantage of the 3D data collected, and to fully understand the biological knowledge hidden in the GST data, novel subspace clustering algorithms have to be developed to effectively address the biological problem in the corresponding space.
Results: We developed a subspace clustering algorithm called Order Preserving Triclustering (OPTricluster) for mining 3D short time-series data. OPTricluster identifies 3D clusters with coherent evolution in a given 3D dataset using a combinatorial approach on the sample dimension and the order preserving (OP) concept on the time dimension. The fusion of the two methodologies allows one to study similarities and differences between samples in terms of their temporal expression profiles. OPTricluster has been successfully applied to four case studies: immune response in mice infected by malaria (Plasmodium chabaudi), systemic acquired resistance in Arabidopsis thaliana, similarities and differences between inner and outer cotyledon in Brassica napus during seed development, and Brassica napus whole seed development. These studies showed that OPTricluster is robust to noise and is able to detect the similarities and differences between biological samples.
Conclusions: Our analysis showed that OPTricluster generally outperforms other well-known clustering algorithms such as TRICLUSTER, gTRICLUSTER, and K-means; it is robust to noise and can effectively mine the biological knowledge hidden in 3D short time-series gene expression data.
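The order preserving idea on the time dimension, combined with grouping on the sample dimension, can be illustrated with a toy sketch. This is not the authors' implementation: the function names (`order_pattern`, `op_triclusters`), the nested-dictionary data layout, and the tie handling are assumptions made for illustration, and the published algorithm may rank time points and enumerate sample subsets differently.

```python
from collections import defaultdict


def order_pattern(profile):
    """Return the time-point indices sorted by ascending expression.

    Two profiles share a pattern when their values rise and fall in the
    same order over time, regardless of magnitude. Ties are broken by
    time index (an assumption of this sketch).
    """
    return tuple(sorted(range(len(profile)), key=lambda t: profile[t]))


def op_triclusters(data):
    """Group genes by (samples sharing a pattern, the pattern itself).

    data[gene][sample] is a list of expression values over time
    (hypothetical layout chosen for this example).
    """
    clusters = defaultdict(list)
    for gene, samples in data.items():
        # Collect the samples in which this gene follows each pattern.
        by_pattern = defaultdict(list)
        for sample, profile in samples.items():
            by_pattern[order_pattern(profile)].append(sample)
        # A gene joins one cluster per (sample set, pattern) pair.
        for pattern, group in by_pattern.items():
            clusters[(frozenset(group), pattern)].append(gene)
    return dict(clusters)


# Made-up data: three genes, two samples, three time points.
data = {
    "g1": {"s1": [1, 2, 3], "s2": [2, 4, 9]},
    "g2": {"s1": [0, 5, 7], "s2": [3, 2, 1]},
    "g3": {"s1": [1, 3, 8], "s2": [0, 1, 2]},
}
clusters = op_triclusters(data)
for (samples, pattern), genes in clusters.items():
    print(sorted(samples), pattern, genes)
```

Here `g1` and `g3` rise monotonically in both samples and so fall into one cluster spanning `s1` and `s2`, while `g2` behaves differently in the two samples and is split across two single-sample clusters. That split is the kind of between-sample difference the abstract refers to.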
This work reports on a comprehensive analysis of the predictive capacity and underlying physicochemical trends provided by d-band based electronic structure features as applied to single-atom alloys (SAAs). Taking CO adsorption energies at kink sites as a model framework, SAA adsorption trends are examined across a range of substrates with vastly differing intrinsic CO adsorption trends. Through this approach, it is demonstrated that SAA adsorption properties can be highly transferable, often displaying atom-like behavior independent of the host substrate, particularly in groups 6 through 12 of the periodic table. The predictability of such SAA behavior is found, however, to be only qualitative for single d-band based electronic structure features. Nevertheless, it is shown that predictive capacity can be greatly improved through the creation of a feature space composed of as few as 8 electronic structure features. Intriguingly, following the framework of Hammer and Nørskov, the machine learning accuracy of d-band based electronic structure features is shown to be sensitive to the atomic configuration diversity present in the training ensemble, with model accuracy systematically improving through restrictions in the configurational space. More directly, it is shown that elements to the far left of the transition metal block, such as Zr and Hf, may exhibit CO binding properties comparable to Cu in the CO2 reduction reaction. However, impurities from groups 6–10 are demonstrated to overbind in a highly transferable manner, in line with established pure substrate trends, and are likely to act as unwanted poisoning species concerning CO and the overall CO2 reduction reaction. The results of this work broadly lay out the predictive capabilities of d-band features as applied to SAAs, as well as their propensity for exhibiting transferable binding properties among d-band substrates.
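As an illustration of what a d-band based electronic structure feature looks like in practice, the sketch below computes three commonly used descriptors from a d-projected density of states: the d-band center (first moment), the d-band width (square root of the second central moment), and the filling (fraction of states below the Fermi level). The abstract does not list which 8 features the authors used, so this selection, the function name `d_band_moments`, the uniform energy grid, and the synthetic Gaussian density of states are all assumptions made for the example.

```python
import numpy as np


def d_band_moments(energies, dos):
    """Return (center, width, filling) of a d-projected DOS.

    energies: energies relative to the Fermi level in eV, on a uniform
              grid (assumed, so plain sums can stand in for integrals).
    dos:      d-projected density of states on the same grid.
    """
    energies = np.asarray(energies, dtype=float)
    dos = np.asarray(dos, dtype=float)
    weight = dos.sum()
    # First moment: the d-band center.
    center = (energies * dos).sum() / weight
    # Square root of the second central moment: the d-band width.
    width = np.sqrt((((energies - center) ** 2) * dos).sum() / weight)
    # Fraction of d states at or below the Fermi level (E <= 0).
    filling = dos[energies <= 0.0].sum() / weight
    return center, width, filling


# Synthetic example: a Gaussian d band centered 2 eV below the Fermi level.
energies = np.linspace(-12.0, 8.0, 2001)
dos = np.exp(-0.5 * ((energies + 2.0) / 1.0) ** 2)
center, width, filling = d_band_moments(energies, dos)
print(f"center = {center:.3f} eV, width = {width:.3f} eV, filling = {filling:.3f}")
```

Descriptors of this kind could then be stacked into a feature vector per adsorption site and passed to a regression model, which is the general setting the abstract describes.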