It is important to test methods for simulating water, but small water clusters for which benchmarks are available are not very representative of the bulk. Here we present benchmark calculations, in particular CCSD(T) calculations at the complete basis set limit, for water 26-mers drawn from Monte Carlo simulations of bulk water. These clusters are large enough that each water molecule participates in 2.5 hydrogen bonds on average. The electrostatically embedded three-body approximation with CCSD(T) embedded dimers and trimers reproduces the binding energies of eight clusters with mean unsigned errors (MUEs, in kcal per mole of water molecules) of only 0.009 and 0.015 kcal for relative and absolute binding energies, respectively. Using only embedded dimers (the electrostatically embedded pairwise approximation) raises these MUEs to 0.038 and 0.070 kcal, and computing the energies with the M11 exchange-correlation functional, which is very economical, yields errors of only 0.029 and 0.042 kcal.
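The pairwise and three-body approximations mentioned above are truncations of a many-body expansion. The following is a minimal sketch of that bookkeeping, not the authors' implementation: the function names `mbe_energy` and `mean_unsigned_error` are hypothetical, the fragment energies are assumed to be supplied by the caller, and the electrostatic embedding (which would affect how each fragment energy is computed) is not modeled here.

```python
from itertools import combinations


def mbe_energy(monomer_e, dimer_e, trimer_e=None):
    """Total energy from a many-body expansion truncated at two or three bodies.

    monomer_e: dict {i: E_i}
    dimer_e:   dict {(i, j): E_ij} with i < j
    trimer_e:  optional dict {(i, j, k): E_ijk} with i < j < k;
               if omitted, only the pairwise approximation is evaluated.
    """
    ids = sorted(monomer_e)
    total = sum(monomer_e.values())

    # Two-body corrections: E_ij - E_i - E_j
    pair_corr = {}
    for i, j in combinations(ids, 2):
        corr = dimer_e[(i, j)] - monomer_e[i] - monomer_e[j]
        pair_corr[(i, j)] = corr
        total += corr

    # Three-body corrections: E_ijk minus all lower-order terms
    if trimer_e is not None:
        for i, j, k in combinations(ids, 3):
            total += (
                trimer_e[(i, j, k)]
                - pair_corr[(i, j)] - pair_corr[(i, k)] - pair_corr[(j, k)]
                - monomer_e[i] - monomer_e[j] - monomer_e[k]
            )
    return total


def mean_unsigned_error(approx, reference):
    """Mean of |approx - reference| over paired values."""
    return sum(abs(a - r) for a, r in zip(approx, reference)) / len(reference)
```

For a cluster of N molecules, dividing each energy by N before calling `mean_unsigned_error` would give an error per molecule, which is the normalization the abstract reports.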
There has been increased interest in discovering combinations of single-nucleotide polymorphisms (SNPs) that are strongly associated with a phenotype even if each SNP has little individual effect. Efficient approaches have been proposed for searching two-locus combinations from genome-wide datasets. However, for high-order combinations, existing methods either adopt a brute-force search, which handles only a small number of SNPs (up to a few hundred), or use a heuristic search that may miss informative combinations. In addition, existing approaches lack statistical power because they use statistics with high degrees of freedom and test a huge number of hypotheses during the combinatorial search. Due to these challenges, functional interactions in high-order combinations have not been systematically explored. We leverage discriminative-pattern-mining algorithms from the data-mining community to search for high-order combinations in case-control datasets. The substantially improved efficiency and scalability demonstrated on synthetic and real datasets with several thousand SNPs allow the study of several important mathematical and statistical properties of SNP combinations with order as high as eleven. We further explore functional interactions in high-order combinations and reveal a general connection between the increase in discriminative power of a combination over its subsets and the functional coherence among the genes comprising the combination, supported by multiple datasets. Finally, we study several significant high-order combinations discovered from a lung-cancer dataset and a kidney-transplant-rejection dataset in detail to provide novel insights into these complex diseases. Interestingly, many of these associations involve combinations of common variations that occur in small fractions of the population. Thus, our approach is an alternative methodology for exploring the genetics of rare diseases, for which the current focus is on individually rare variations.
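The idea of a combination gaining discriminative power over its subsets can be illustrated with a small sketch. It assumes, for illustration only, that each sample is a set of variant identifiers and that discriminative power is measured as the difference in support between cases and controls; the abstract does not name the statistic used, and the function names below are hypothetical.

```python
from itertools import combinations


def support(samples, combo):
    """Fraction of samples that carry every SNP in `combo`."""
    return sum(1 for s in samples if combo <= s) / len(samples)


def diff_sup(cases, controls, combo):
    """Support in cases minus support in controls (assumed measure)."""
    return support(cases, combo) - support(controls, combo)


def gain_over_subsets(cases, controls, combo):
    """Discriminative power of `combo` minus that of its best proper subset."""
    if len(combo) < 2:
        raise ValueError("combo needs at least two SNPs to have proper subsets")
    best_subset = max(
        diff_sup(cases, controls, frozenset(sub))
        for r in range(1, len(combo))
        for sub in combinations(combo, r)
    )
    return diff_sup(cases, controls, combo) - best_subset
```

A positive gain means the combination separates cases from controls better than any of its parts, which is the kind of pattern an exhaustive search over single SNPs would not surface.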
We report measurements of adsorption isotherms and the determination of the isosteric heats of adsorption of several small gases (H₂, D₂, Ne, N₂, CO, CH₄, C₂H₆, Ar, Kr, and Xe) on the metal-organic framework (MOF) NU-1000, which is one of the most thermally stable MOFs. It has transition-metal nodes of formula Zr₆(μ₃-OH)₄(μ₃-O)₄(OH)₄(OH₂)₄ that resemble hydrated ZrO₂ clusters and can serve as catalysts or catalyst supports. The linkers in this MOF are pyrenes linked to the nodes via the carboxylate groups of benzoates. The broad range of adsorbates studied here allows us to compare trends both with adsorption on other surfaces and with density functional calculations also presented here. The experimental isotherms indicate similar filling of the MOF surface by the different gases, starting with strong adsorption sites near the Zr atoms, a result corroborated by the density functional calculations. This adsorption is followed by the filling of other adsorption sites on the nodes and organic framework. Capillary condensation occurs in wide pores after completion of a monolayer. The total amount adsorbed for all the gases is the equivalent of two complete monolayers. The experimental isosteric heats of adsorption are nearly proportional to the atom-atom (or molecule-molecule) Lennard-Jones well-depth parameters of the adsorbates but ∼13-fold larger. The density functional calculations show a similar trend but with much more scatter and heats that are usually greater (by 30%, on average).
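An isosteric heat is commonly extracted from isotherms measured at two nearby temperatures by applying the Clausius-Clapeyron relation at fixed coverage. The sketch below shows that standard two-point estimate; whether the authors used exactly this procedure is an assumption, and the function name is hypothetical.

```python
import math

R = 8.314462618  # molar gas constant, J/(mol K)


def isosteric_heat(t1, p1, t2, p2):
    """Two-point Clausius-Clapeyron estimate of the isosteric heat (J/mol).

    p1 and p2 are the pressures (same units) that give the same adsorbed
    amount at temperatures t1 and t2 (in kelvin).
    """
    return R * t1 * t2 / (t2 - t1) * math.log(p2 / p1)
```

Because only the pressure ratio enters, the pressure units cancel; the estimate is most reliable when the two temperatures are close enough that the heat can be treated as constant between them.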
Insertional mutagenesis screens in mice are used to identify individual genes that drive tumor formation. In these screens, candidate cancer genes are identified if their genomic location is proximal to a common insertion site (CIS) defined by high rates of transposon or retroviral insertions in a given genomic window. In this article, we describe a new method for defining CISs based on a Poisson distribution, the Poisson Regression Insertion Model, and show that this new method is an improvement over previously described methods. We also describe a modification of the method that can identify pairs and higher orders of co-occurring common insertion sites. We apply these methods to two data sets, one generated in a transposon-based screen for gastrointestinal tract cancer genes and another based on the set of retroviral insertions in the Retroviral Tagged Cancer Gene Database. We show that the new methods identify more relevant candidate genes and candidate gene pairs than found using previous methods. Identification of the biologically relevant set of mutations that occur in a single cell and cause tumor progression will aid in the rational design of single and combinatorial therapies in the upcoming age of personalized cancer therapy.
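To make the idea of a Poisson-based common insertion site concrete, the sketch below computes the probability of seeing at least the observed number of insertions in a window under a uniform background rate. This is a deliberately simplified illustration, not the Poisson Regression Insertion Model itself, whose covariates are not described in the abstract; the function names and the uniform-rate assumption are hypothetical.

```python
import math


def poisson_sf(k, lam):
    """P(X >= k) for X ~ Poisson(lam), computed from the complementary CDF."""
    cdf = sum(math.exp(-lam) * lam**i / math.factorial(i) for i in range(k))
    return 1.0 - cdf


def window_pvalue(n_in_window, total_insertions, window_bp, genome_bp):
    """Tail probability for a window, assuming insertions land uniformly."""
    lam = total_insertions * window_bp / genome_bp  # expected count in window
    return poisson_sf(n_in_window, lam)
```

For example, `window_pvalue(5, 1000, 10_000, 10_000_000)` evaluates a window whose expected count is 1 but which holds 5 insertions. In a genome-wide scan, such p-values would still need a multiple-testing correction across windows.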
Although many transition metal complexes are known to have high multireference character, the multireference character of main-group closed-shell singlet diatomic molecules like BeF, CaO, and MgO has been less studied. However, many group-1 and group-2 diatomic molecules do have multireference character, and they provide informative systems for studying multireference character because they are simpler than transition metal compounds. The goal of the present work is to understand these multireference systems better so that, ultimately, we can apply what we learn to more complicated multireference systems and to the design of new exchange-correlation functionals for treating multireference systems more adequately. Fourteen main-group diatomic molecules and one triatomic molecule (including radicals, cations, and anions, as well as neutral closed-shell species) have been studied for this article. Eight of these molecules contain a group-1 element, and six contain a group-2 element. Seven of these molecules are multireference systems, and eight of them are single-reference systems. Fifty-three exchange-correlation functionals of 11 types [local spin-density approximation (LSDA), generalized gradient approximation (GGA), nonseparable gradient approximation (NGA), global-hybrid GGA, meta-GGA, meta-NGA, global-hybrid meta GGA, range-separated hybrid GGA, range-separated hybrid meta-GGA, range-separated hybrid meta-NGA, and DFT augmented with molecular mechanics damped dispersion (DFT-D)] and the Hartree-Fock method have been applied to calculate the bond distance, bond dissociation energy (BDE), and dipole moment of these molecules. All of the calculations are converged to a stable solution by allowing the symmetry of the Slater determinant to be broken. 
A reliable functional should not only predict an accurate BDE but also predict accurate components of the BDE, so each bond dissociation energy has been decomposed into ionization potential (IP) of the electropositive element, electron affinity of the electronegative bonding partner (EA), atomic excitation energy (EE) to prepare the valence states of the interacting partners, and interaction energy (IE) of the valence-prepared states. Adding Hartree-Fock exchange helps to obtain better results for atomic excitation energy, and this leads to improvements in getting the right answer for the right reason. The following functionals are singled out for reasonably good performance on all three of bond distance, BDE, and dipole moment: B97-1, B97-3, MPW1B95, M05, M06, M06-2X, M08-SO, N12-SX, O3LYP, TPSS, τ-HCTHhyb, and GAM; all but two (TPSS and GAM) of these functionals are hybrid functionals.
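The decomposition described above can be written as simple arithmetic. The sketch below assumes one plausible sign convention, BDE = −IP + EA − EE + IE, with IE counted as positive when the prepared states bind; the abstract does not state the convention, and the function names and all numbers in the usage note are hypothetical.

```python
def bde_from_components(ip, ea, ee, ie):
    """Bond dissociation energy under the assumed convention -IP + EA - EE + IE."""
    return -ip + ea - ee + ie


def component_errors(calc, ref):
    """Signed contribution of each component's error to the BDE error.

    `calc` and `ref` are dicts with keys "ip", "ea", "ee", "ie" in the same
    energy units. The contributions sum to the total BDE error.
    """
    signs = {"ip": -1, "ea": +1, "ee": -1, "ie": +1}
    return {k: signs[k] * (calc[k] - ref[k]) for k in signs}
```

With made-up values `ref = {"ip": 6.0, "ea": 1.5, "ee": 1.5, "ie": 10.0}` and `calc = {"ip": 6.0, "ea": 1.5, "ee": 1.0, "ie": 9.5}`, both give the same BDE even though two components are wrong, which is the kind of error cancellation that a component-by-component check is meant to expose.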
The production process of a smart factory is complex and dynamic. As the core of manufacturing management, research into the flexible job shop scheduling problem (FJSP) focuses on optimizing scheduling decisions in real time according to changes in the production environment. In this paper, deep reinforcement learning (DRL) is proposed to solve the dynamic FJSP (DFJSP) with random job arrival, with the goal of minimizing penalties for earliness and tardiness. A double deep Q-network (DDQN) architecture is proposed, and state features, actions, and rewards are designed. A soft ε-greedy behavior policy is designed according to the scale of the problem. The experimental results show that the proposed DRL method outperforms other reinforcement learning (RL) algorithms, heuristics, and metaheuristics in terms of solution quality and generalization. In addition, the soft ε-greedy strategy reasonably balances exploration and exploitation, thereby improving the learning efficiency of the scheduling agent. The DRL method adapts to dynamic changes of the production environment in a flexible job shop, which contributes to the establishment of a flexible scheduling system with self-learning, real-time optimization, and intelligent decision-making.
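Two ingredients named above, the double-DQN update target and an ε-greedy behavior policy, can be sketched in a few lines. This is a generic illustration using plain Python lists for Q-values: the paper's state features, rewards, and its specific "soft" ε schedule are not given in the abstract, so the linear decay below and all function names are assumptions.

```python
import random


def ddqn_target(reward, q_online_next, q_target_next, gamma=0.99, done=False):
    """Double-DQN target: the online net picks the action, the target net scores it."""
    if done:
        return reward
    best = max(range(len(q_online_next)), key=q_online_next.__getitem__)
    return reward + gamma * q_target_next[best]


def epsilon_greedy(q_values, epsilon, rng=random):
    """With probability epsilon pick a random action, otherwise the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=q_values.__getitem__)


def decayed_epsilon(step, total_steps, eps_start=1.0, eps_end=0.05):
    """Linear decay from eps_start to eps_end (an assumed schedule)."""
    frac = min(step / total_steps, 1.0)
    return eps_start + frac * (eps_end - eps_start)
```

Separating action selection (online network) from action evaluation (target network) is what distinguishes the double-DQN target from the standard DQN target, and it is generally used to reduce overestimation of Q-values.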
To understand what governs the accuracy of approximate exchange-correlation functionals for intrinsically multiconfigurational systems containing metal atoms, the properties of the ground electronic state of CaO have been studied in detail. First, we applied the T1, TAE(T), B1, and M diagnostics to CaO and confirmed that CaO is an intrinsically multiconfigurational system. Second, we compared the bond dissociation energies (BDEs) of CaO as calculated by 49 exchange-correlation functionals, three exchange-only functionals, and the HF method. To analyze the error in the BDEs for the various functionals, we decomposed each calculated BDE into four components, in particular the ionization potential, the electron affinity, the atomic excitation energy of the metal cation to prepare the valence state, and the interaction energy between prepared states. We found that the dominant error occurs in the calculated atomic excitation energy of the cation. Third, we compared dipole moments of CaO as calculated by the 53 methods, and we analyzed the dipole moments in terms of partial atomic charges to understand the contribution of ionic bonding and how it is affected by errors in the calculated ionization potential of the metal atom. We then analyzed the dipole moment in terms of the charge distribution among orbitals, and we found that the orbital charge distribution does not correlate well with the difference between the calculated ionization potential and electron affinity. Fourth, we examined the potential curves and internuclear distance dependence of the orbital energies of the lowest-energy CaO singlet and triplet states to analyze the near-degeneracy aspect of the correlation energy. The most important conclusion is that the error tends to be dominated by the error in the relative energies of s and d orbitals in Ca(+), and the most popular density functionals predict this excitation energy poorly.
Thus, even if they were to predict the BDE reasonably well, it would be due to cancellation of errors. The effect of the cation excitation energy can be understood in terms of an orbital picture, as follows. For most functionals the predicted cation excitation energy is too small, so it is too easy to delocalize charge from the oxygen 2p orbital to the Ca(+) d orbital; this overestimates the covalency and explains why most functionals overestimate the bond energy.