Several efficient correspondence-graph-based algorithms for determining the maximum common substructure (MCS) of a pair of molecules have been published in the literature. Extending the problem to three or more molecules is, however, nontrivial: heuristics that increase efficiency in the two-molecule case are either inapplicable to the many-molecule case or do not provide significant speedups. Our specific algorithmic contribution is twofold. First, we show how the correspondence graph approach for the two-molecule case can be generalized to obtain an algorithm that is guaranteed to find the optimum connected MCS of multiple molecules, and that runs fast on most families of molecules using a new divide-and-conquer strategy that has not previously been reported in this context. Second, we characterize the compound families for which the algorithm might run slowly and provide a heuristic for speeding up computations on these families. We also extend the algorithm to a heuristic algorithm for finding the disconnected MCS of multiple molecules and to an algorithm for clustering molecules into groups, with each group sharing a substantial MCS. Our methods are flexible in that they provide fine-grained control over the various matching criteria used to define a common substructure.
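As a rough illustration of the two-molecule correspondence graph idea mentioned above, the following minimal sketch pairs up atoms with matching labels and searches for a maximum clique among compatible pairs. It is not the authors' algorithm: it handles only two molecules, does not enforce connectivity of the result, and omits the divide-and-conquer strategy. The function names, the dict/set molecule encoding, and the label-equality matching rule are all assumptions made for this example.

```python
from itertools import combinations


def correspondence_graph(atoms_a, bonds_a, atoms_b, bonds_b):
    """Build a correspondence (modular product) graph.

    atoms_*: dict mapping atom index -> label (hypothetical encoding).
    bonds_*: set of frozenset({u, v}) pairs.
    A node is a pair (a, b) of atoms with equal labels; two nodes are
    adjacent when their bonded/non-bonded status agrees in both molecules.
    """
    nodes = [(a, b) for a in atoms_a for b in atoms_b
             if atoms_a[a] == atoms_b[b]]
    adj = {n: set() for n in nodes}
    for (a1, b1), (a2, b2) in combinations(nodes, 2):
        if a1 == a2 or b1 == b2:
            continue  # an atom may be matched at most once
        bonded_a = frozenset((a1, a2)) in bonds_a
        bonded_b = frozenset((b1, b2)) in bonds_b
        if bonded_a == bonded_b:
            adj[(a1, b1)].add((a2, b2))
            adj[(a2, b2)].add((a1, b1))
    return adj


def max_clique(adj):
    """Bron-Kerbosch with pivoting and a simple size bound."""
    best = []

    def expand(r, p, x):
        nonlocal best
        if not p and not x:
            if len(r) > len(best):
                best = list(r)
            return
        if len(r) + len(p) <= len(best):
            return  # cannot beat the best clique found so far
        pivot = max(p | x, key=lambda v: len(adj[v] & p))
        for v in list(p - adj[pivot]):
            expand(r + [v], p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand([], set(adj), set())
    return best


# Ethanol-like chain C-C-O versus propanol-like chain C-C-C-O.
atoms_a = {0: "C", 1: "C", 2: "O"}
bonds_a = {frozenset((0, 1)), frozenset((1, 2))}
atoms_b = {0: "C", 1: "C", 2: "C", 3: "O"}
bonds_b = {frozenset((0, 1)), frozenset((1, 2)), frozenset((2, 3))}

mapping = dict(max_clique(correspondence_graph(atoms_a, bonds_a,
                                               atoms_b, bonds_b)))
```

Each clique in the correspondence graph is a set of mutually consistent atom pairings, so the largest clique gives a largest common induced substructure under the chosen matching rule. Changing the label test or the bond-agreement test is one way to vary the matching criteria.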
Background: Multigene panel sequencing using next-generation sequencing (NGS) methods is a key tool for genomic medicine. However, with an estimated 140 000 genomic tests available, current system inefficiencies result in high genetic-testing costs. Reduced testing costs are needed to expand the availability of genomic medicine. One solution to improve efficiency and lower costs is to calculate the most cost-effective set of panels for a typical pattern of test requests. Methods: We compiled rare diseases, associated genes, point prevalence, and test-order frequencies from a representative laboratory. We then modeled the costs of the relevant steps in the NGS process in detail. Using a simulated annealing-based optimization procedure, we determined panel sets that were more cost-optimal than whole exome sequencing (WES) or clinical exome sequencing (CES). Finally, we repeated this methodology to cost-optimize pharmacogenomics (PGx) testing. Results: For rare disease testing, we show that an optimal choice of 4–6 panels, uniquely covering genes that comprise 95% of the total prevalence of monogenic diseases, saves $257–304 per sample compared with WES, and $66–135 per sample compared with CES. For PGx, we show that the optimal multipanel solution saves $6–7 (27%–40%) over a single panel covering all relevant gene–drug associations. Conclusions: Laboratories can reduce costs using the proposed method to obtain and run a cost-optimal set of panels for specific test requests. In addition, payers can use this method to inform reimbursement policy.
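To make the simulated annealing step more concrete, here is a minimal sketch of annealing over gene-to-panel assignments. The cost model is a deliberately simple placeholder (a fixed per-run cost plus a per-gene cost for every panel a request touches) and is not the detailed NGS cost model from the study. The names `panel_cost` and `anneal`, the parameters `fixed`, `per_gene`, `t0`, and the linear cooling schedule are all assumptions for illustration.

```python
import math
import random


def panel_cost(assign, requests, fixed=5.0, per_gene=1.0):
    """Hypothetical cost: each request pays for every panel it touches."""
    sizes = {}
    for panel in assign.values():
        sizes[panel] = sizes.get(panel, 0) + 1
    total = 0.0
    for genes, freq in requests:
        panels = {assign[g] for g in genes}
        total += freq * sum(fixed + per_gene * sizes[p] for p in panels)
    return total


def anneal(genes, requests, k, steps=5000, t0=10.0, seed=0):
    """Search for a low-cost split of genes into at most k panels."""
    rng = random.Random(seed)
    assign = {g: 0 for g in genes}  # start from one all-inclusive panel
    cost = panel_cost(assign, requests)
    best, best_cost = dict(assign), cost
    for step in range(steps):
        temp = t0 * (1 - step / steps) + 1e-9  # linear cooling (assumed)
        g = rng.choice(genes)
        old = assign[g]
        assign[g] = rng.randrange(k)  # propose moving one gene
        new_cost = panel_cost(assign, requests)
        accept = new_cost <= cost or \
            rng.random() < math.exp((cost - new_cost) / temp)
        if accept:
            cost = new_cost
            if cost < best_cost:
                best, best_cost = dict(assign), cost
        else:
            assign[g] = old  # reject: undo the move
    return best, best_cost


# Toy data: two diseases with disjoint gene sets, ordered equally often.
genes = ["A", "B", "C", "D"]
requests = [(["A", "B"], 10), (["C", "D"], 10)]
best, best_cost = anneal(genes, requests, k=2, steps=500)
single_cost = panel_cost({g: 0 for g in genes}, requests)
```

Because the search starts from the single-panel assignment and keeps the best assignment seen, the returned cost can never exceed the single-panel cost under this toy model. A realistic application would replace `panel_cost` with a model of the actual sequencing and handling costs.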
A comprehensive SARS-CoV-2 genomic surveillance programme that integrates logistics, laboratory work, bioinformatics, analytics, and timely reporting was deployed through a public-private partnership in the city of Bengaluru, Karnataka, India. As a result, 12,461 samples have been sequenced over the past year, reported to Karnataka State public health officials as time-sensitive decision support, and uploaded to global public databases in a timely manner. The programme has developed an analytics platform for studying SARS-CoV-2 sequences and their epidemiological context. The continuous sequencing effort enabled timely detection of the emergence of the Omicron variant in India and of the subsequent spread in Bengaluru of Omicron and its sub-lineages with higher logistic growth (BA.10, BA.12 and BA.5). Our data also provided timely information on which of the globally tracked Variants of Concern were observed in Bengaluru, ensuring targeted efforts and reducing unwarranted fear. This effort highlights the importance of, and the urgent need for, increased genomic surveillance to support states with limited sequencing and bioinformatics capacity. We describe the development and deployment of this end-to-end solution for genomic surveillance of SARS-CoV-2 in the city of Bengaluru.