2012
DOI: 10.1021/ci200552r

Speeding Up Chemical Searches Using the Inverted Index: The Convergence of Chemoinformatics and Text Search Methods

Abstract: In ligand-based screening, retrosynthesis, and other chemoinformatics applications, one often seeks to search large databases of molecules in order to retrieve molecules that are similar to a given query. With the expanding size of molecular databases, the efficiency and scalability of data structures and algorithms for chemical searches are becoming increasingly important. Remarkably, both the chemoinformatics and information retrieval communities have converged on similar solutions whereby molecules or docu…

Cited by 21 publications (26 citation statements)
References 33 publications
“…Similarity queries with a Tanimoto threshold on binary fingerprints of chemicals are widely used in chemoinformatics applications [10,7,6,19]. There exist many specialized solutions [23,2,18,22,20].…”
Section: Related Work (mentioning)
confidence: 99%
“…At query time, a depth first traversal on the tree is performed together with pruning based on the number of 1-bits. One of the latest methods is [20], where each fingerprint is transformed into a set (as in Section 2) and inverted index is built on the set elements. This essentially reduces the original problem into a set overlap search problem, where the DivideSkip method proposed in their earlier work [13] is employed for query processing.…”
Section: Related Work (mentioning)
confidence: 99%
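The reduction described in the statement above (fingerprint to set, inverted index on set elements, then set overlap search) can be sketched roughly as follows. This is only an illustration under stated assumptions: the function names `build_inverted_index` and `tanimoto_search` and the toy data are hypothetical, and candidates are found by simply counting postings hits, not by the DivideSkip method that the statement mentions.

```python
from collections import defaultdict


def build_inverted_index(fingerprints):
    """Map each set element (a 1-bit position) to the ids of fingerprints containing it."""
    index = defaultdict(list)
    for fp_id, bits in enumerate(fingerprints):
        for bit in bits:
            index[bit].append(fp_id)
    return index


def tanimoto_search(query, fingerprints, index, threshold):
    """Return (id, similarity) pairs with Tanimoto similarity >= threshold.

    Assumes threshold > 0: fingerprints sharing no element with the query
    are never visited, which is only safe for a positive threshold.
    """
    # Count, for each candidate, how many query elements it shares.
    overlap = defaultdict(int)
    for bit in query:
        for fp_id in index.get(bit, ()):
            overlap[fp_id] += 1

    hits = []
    for fp_id, common in overlap.items():
        # |A union B| = |A| + |B| - |A intersect B|
        union = len(query) + len(fingerprints[fp_id]) - common
        similarity = common / union
        if similarity >= threshold:
            hits.append((fp_id, similarity))
    return sorted(hits)


# Toy data (hypothetical): each fingerprint is the set of its 1-bit positions.
fingerprints = [frozenset({1, 4, 7, 9}), frozenset({1, 4, 8}), frozenset({2, 3, 5})]
index = build_inverted_index(fingerprints)
hits = tanimoto_search(frozenset({1, 4, 7}), fingerprints, index, 0.5)
```

Only fingerprints that appear in at least one postings list of the query are ever scored, which is the main saving over a linear scan when fingerprints are sparse.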
“…Similarity search is widely used in chemical informatics to predict and optimize properties of existing compounds [13,14]. A fundamental problem is to find all the molecules whose fingerprints have Tanimoto similarity no less than a given value.…”
Section: Introduction (mentioning)
confidence: 99%
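For reference, the threshold problem stated above can be written down as a small sketch. It assumes fingerprints are stored as Python integers used as bit vectors; the names `popcount`, `tanimoto`, and `threshold_search` are illustrative, and treating two all-zero fingerprints as non-matching is a convention chosen here, not something the quoted text specifies. The scan skips candidates using the standard bound T(A, B) <= min(|A|, |B|) / max(|A|, |B|), which depends only on the number of 1-bits.

```python
def popcount(x):
    """Number of 1-bits in a non-negative integer."""
    return bin(x).count("1")


def tanimoto(a, b):
    """Tanimoto similarity of two bit vectors: |A and B| / |A or B|."""
    union = popcount(a | b)
    return popcount(a & b) / union if union else 0.0


def threshold_search(query, database, threshold):
    """Linear scan returning (index, similarity) with similarity >= threshold."""
    q_bits = popcount(query)
    hits = []
    for idx, fp in enumerate(database):
        fp_bits = popcount(fp)
        lo, hi = min(q_bits, fp_bits), max(q_bits, fp_bits)
        # The intersection is at most lo and the union at least hi, so the
        # similarity cannot exceed lo / hi; skip without touching the bits.
        if hi == 0 or lo / hi < threshold:
            continue
        similarity = tanimoto(query, fp)
        if similarity >= threshold:
            hits.append((idx, similarity))
    return hits


# Toy 6-bit fingerprints (hypothetical data).
database = [0b101101, 0b101100, 0b000001, 0b111111]
results = threshold_search(0b101101, database, 0.7)
```

In this toy run the last two fingerprints are rejected by the bit-count bound alone, so the full similarity is computed only for the first two.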
“…As more HTS facilities are established and the number of available chemicals increases rapidly, performing large-scale searches and comparisons are becoming even more common and important. 3 Numerous methods have been developed to help speed up searches against chemical libraries [1][2][3][4][5][6][7][8]. Chemical fingerprinting, for example, is a common method of simplifying chemical representation by describing the structural properties of a chemical as a one-dimensional feature string.…”
Section: Introduction (mentioning)
confidence: 99%