The discovery of novel materials and functional molecules can help to solve some of society’s most urgent challenges, ranging from efficient energy harvesting and storage to uncovering novel pharmaceutical drug candidates. Traditionally, matter engineering (generally denoted as inverse design) was based heavily on human intuition and high-throughput virtual screening. The last few years have seen the emergence of significant interest in computer-inspired designs based on evolutionary or deep learning methods. The major challenge here is that the standard string-based molecular representation, SMILES, shows substantial weaknesses in that task because large fractions of strings do not correspond to valid molecules. Here, we solve this problem at a fundamental level and introduce SELFIES (SELF-referencIng Embedded Strings), a string-based representation of molecules which is 100% robust. Every SELFIES string corresponds to a valid molecule, and SELFIES can represent every molecule. SELFIES can be directly applied in arbitrary machine learning models without the adaptation of the models; each of the generated molecule candidates is valid. In our experiments, the model’s internal memory stores two orders of magnitude more diverse molecules than a similar test with SMILES. Furthermore, as all molecules are valid, it allows for explanation and interpretation of the internal working of the generative models.
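The robustness idea in the abstract above can be illustrated with a toy sketch. This is not the SELFIES grammar: the token format, the names `MAX_BONDS`, `decode`, and `is_valid_chain`, and the restriction to unbranched chains of C, N, O, and F are all assumptions made for illustration. The sketch only shows how a decoder that interprets each symbol in the context of the remaining free valence can map every token sequence to a chain that respects valence limits, whereas a naively written string may not.

```python
# Toy illustration only: NOT the actual SELFIES grammar.
# Assumed maximum bond counts for a few elements (hypothetical, simplified).
MAX_BONDS = {"C": 4, "N": 3, "O": 2, "F": 1}
BOND_CHARS = {1: "", 2: "=", 3: "#"}


def decode(tokens):
    """Map any sequence of (bond_order, element) tokens to a linear chain.

    The requested bond order is capped by the valence still free on the
    previous atom and by the valence of the new atom. Decoding stops as
    soon as the previous atom has no free valence left.
    """
    smiles = ""
    free = None  # free valence on the previous atom; None before the first atom
    for order, element in tokens:
        cap = MAX_BONDS[element]
        if free is None:
            smiles += element  # the first atom has no incoming bond
            free = cap
            continue
        if free == 0:
            break  # chain is saturated; remaining tokens are ignored
        order = min(order, free, cap)
        smiles += BOND_CHARS[order] + element
        free = cap - order
    return smiles


def is_valid_chain(smiles):
    """Check that no atom in a linear chain exceeds its maximum bond count."""
    order_of = {"=": 2, "#": 3}
    atoms, used = [], []
    pending = 1  # bond order of the next bond; single unless marked
    for ch in smiles:
        if ch in order_of:
            pending = order_of[ch]
            continue
        if atoms:
            used[-1] += pending  # bond counts against the previous atom
            used.append(pending)  # and against the new atom
        else:
            used.append(0)
        atoms.append(ch)
        pending = 1
    return all(u <= MAX_BONDS[a] for a, u in zip(atoms, used))
```

For example, `decode([(1, "F"), (3, "C"), (3, "N")])` downgrades the impossible triple bond to fluorine into a single bond, while a hand-written string such as `"F=C"` fails `is_valid_chain`.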
One of the recent proposals for the design of state-of-the-art emissive materials for organic light-emitting diodes (OLEDs) is the principle of thermally activated delayed fluorescence (TADF). The underlying idea is to enable facile thermal upconversion of excited state triplets, which are generated upon electron-hole recombination, to excited state singlets by minimizing the corresponding energy difference, resulting in devices with up to 100% internal quantum efficiencies (IQEs). Ideal emissive materials potentially surpassing TADF emitters should have both negative singlet-triplet gaps and appreciable fluorescence rates to maximize reverse intersystem crossing (rISC) rates from excited triplets to singlets while minimizing ISC rates and triplet state occupation, leading to long-term operational stability. However, molecules with negative singlet-triplet gaps are extremely rare and, to the best of our knowledge, not emissive. In this work, based on computational studies, we describe the first molecules with negative singlet-triplet gaps and considerable fluorescence rates and show that they are more common than hypothesized previously.
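To see why the singlet-triplet gap matters for thermal upconversion, a minimal sketch can evaluate the Boltzmann factor exp(-ΔE_ST / k_B T) as a function of the gap. This assumes a simple Arrhenius-type picture in which the gap acts as the activation barrier for rISC; it is not the computational method of the work described above. The function name `thermal_activation_factor`, the default temperature of 300 K, and the clamping of negative gaps to a zero barrier are illustrative assumptions.

```python
import math

# Boltzmann constant in eV/K.
K_B_EV = 8.617333262e-5


def thermal_activation_factor(gap_ev, temperature_k=300.0):
    """Return exp(-barrier / (k_B * T)) for a singlet-triplet gap in eV.

    Assumption: the barrier equals the gap when the gap is positive and is
    zero when the gap is negative (upconversion is then downhill).
    """
    barrier = max(gap_ev, 0.0)
    return math.exp(-barrier / (K_B_EV * temperature_k))


if __name__ == "__main__":
    # Compare a few illustrative gap values (not taken from the source).
    for gap in (0.2, 0.1, 0.05, 0.0, -0.05):
        print(f"gap = {gap:+.2f} eV -> factor = {thermal_activation_factor(gap):.3e}")
```

In this picture the factor rises steeply as the gap shrinks and reaches 1 once the gap is zero or negative, which is why small or inverted gaps favor fast rISC.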