DNA nucleobase sequence controls the size of DNA-stabilized silver clusters, leading to their well-known yet little understood sequence-tuned colors. The enormous space of possible DNA sequences for templating clusters has challenged the understanding of how sequence selects cluster properties and has limited the design of applications that employ these clusters. We investigate the genomic role of DNA sequence for fluorescent silver clusters using a data-driven approach. Employing rapid parallel silver cluster synthesis and fluorimetry, we determine the fluorescence spectra of silver cluster products stabilized by 1432 distinct DNA oligomers. By applying pattern recognition algorithms to this large experimental data set, we discover certain DNA base patterns, or "motifs," that correlate with silver clusters having similar fluorescence spectra. These motifs are employed in machine learning classifiers to predictively design DNA template sequences for specific fluorescence color bands. Our method improves the selectivity of templates by 330% for silver clusters with peak emission wavelengths beyond 660 nm. The discovered base motifs also provide physical insights into how DNA sequence controls silver cluster size and color. This predictive design approach for the color of DNA-stabilized silver clusters demonstrates the potential of machine learning and data mining to increase the precision and efficiency of nanomaterials design, even for a soft-matter-inorganic hybrid system characterized by an extremely large parameter space.
Discriminative base motifs within DNA templates for fluorescent silver clusters are identified using methods that combine large experimental data sets with machine learning tools for pattern recognition. Combining the discovery of certain multibase motifs important for determining fluorescence brightness with a generative algorithm, the probability of selecting DNA templates that stabilize fluorescent silver clusters is increased by a factor of >3.
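To make the notion of a discriminative base motif concrete, the following is a minimal sketch, not the authors' algorithm. It assumes a simple enrichment score: the fraction of sequences in one class that contain a given motif minus the fraction in the other class. The function names, the motif length `k=2`, the toy sequences, and the labels are all hypothetical and chosen only for illustration; they are not data from the study.

```python
from itertools import product

BASES = "ACGT"


def all_motifs(k):
    """Enumerate every possible k-base motif over the four DNA bases."""
    return ["".join(p) for p in product(BASES, repeat=k)]


def contains_vector(seq, motifs):
    """Binary feature vector: 1 if the motif occurs anywhere in seq, else 0."""
    return [1 if m in seq else 0 for m in motifs]


def enrichment(sequences, labels, k=2):
    """Score each k-base motif by how much more often it appears in
    class-1 sequences than in class-0 sequences (range -1.0 to 1.0)."""
    pos = [s for s, y in zip(sequences, labels) if y == 1]
    neg = [s for s, y in zip(sequences, labels) if y == 0]
    scores = {}
    for m in all_motifs(k):
        f_pos = sum(m in s for s in pos) / len(pos)
        f_neg = sum(m in s for s in neg) / len(neg)
        scores[m] = f_pos - f_neg
    return scores


# Made-up toy data (assumption): label 1 stands for "in the target color
# band", label 0 for "not in the band". These are not measured sequences.
toy_sequences = ["CCGGTTAACC", "GGCCGGAATT", "ATATATATAT", "TTTTAAAATT"]
toy_labels = [1, 1, 0, 0]

scores = enrichment(toy_sequences, toy_labels)
# Motifs with the largest positive scores are the most enriched in class 1.
top_motifs = sorted(scores, key=scores.get, reverse=True)[:3]
```

In a sketch like this, the binary vectors from `contains_vector` could serve as inputs to any standard classifier, and highly scored motifs could seed candidate template sequences; which classifier and generative procedure the study actually used is not specified in the text above.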