2019
DOI: 10.1186/s13321-019-0367-2

Interoperable chemical structure search service

Abstract: Motivation: The existing connections between large databases of chemicals, proteins, metabolites and assays offer valuable resources for research in fields ranging from drug design to metabolomics. Transparent search across multiple databases provides a way to efficiently utilize these resources. To simplify such searches, many databases have adopted semantic technologies that allow interoperable querying of the datasets using SPARQL query language. However, the interoperable interfaces of the chemi…

Cited by 14 publications (11 citation statements)
References 15 publications (16 reference statements)

Citation statements
“…The query shown in Figure S7 will retrieve all enzymes annotated in UniProtKB/Swiss-Prot that metabolize compounds similar to patulin—but not necessarily identical to patulin. The query uses the SMILES representation (Simplified Molecular-Input Line-Entry System) (opensmiles.org) of patulin, as required for structure searches by IDSM, and uses the sachem:similaritySearch procedure call pattern developed by the IDSM team [ 33 ] with a similarity score threshold of 0.8 (the similarity score is based on Jaccard similarity of Morgan-style connectivity fingerprints). The query is designed with two nested services (“calls”) as illustrated in Figure 5 .…”
Section: Results (mentioning)
confidence: 99%
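
The similarity criterion mentioned in the statement above (a Jaccard score over fingerprint features, filtered at a 0.8 threshold) can be illustrated with a minimal sketch. It assumes each fingerprint is already available as a set of integer feature identifiers; the function names, the toy fingerprints, and the empty-set convention are hypothetical and are not taken from Sachem or IDSM, whose actual implementation may differ.

```python
from typing import Dict, List, Set, Tuple


def jaccard_similarity(fp_a: Set[int], fp_b: Set[int]) -> float:
    """Jaccard score: shared features divided by all features present in either set."""
    union = fp_a | fp_b
    if not union:
        # Assumption: two empty fingerprints are treated as dissimilar.
        return 0.0
    return len(fp_a & fp_b) / len(union)


def filter_similar(
    query_fp: Set[int],
    candidates: Dict[str, Set[int]],
    cutoff: float = 0.8,
) -> List[Tuple[str, float]]:
    """Return (name, score) pairs scoring at least `cutoff`, best match first."""
    scored = [(name, jaccard_similarity(query_fp, fp)) for name, fp in candidates.items()]
    hits = [(name, score) for name, score in scored if score >= cutoff]
    return sorted(hits, key=lambda item: item[1], reverse=True)


# Toy fingerprints (hypothetical feature IDs, not real Morgan-style features).
query = {1, 2, 3, 4, 5}
library = {
    "identical": {1, 2, 3, 4, 5},
    "close": {1, 2, 3, 4, 5, 6},
    "distant": {1, 2, 3, 9},
}
hits = filter_similar(query, library, cutoff=0.8)
```

With these toy sets, "identical" and "close" pass the 0.8 cutoff (scores 1.0 and 5/6), while "distant" (3/6) is excluded, which mirrors how a threshold admits compounds that are similar but not necessarily identical to the query.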
“…(Kratochvíl et al, 2018). The engine is used by the Integrated Database of Small Molecules (IDSM) that operates, among other things, several dedicated endpoints allowing structural search in selected small-molecule datasets via SPARQL (Kratochvíl et al, 2019). To allow substructure and similarity searches via SPARQL also on compounds from WD, we created a dedicated IDSM/Sachem endpoint for WD as well.…”
Section: Wikidata (mentioning)
confidence: 99%
“…To address this issue, Galgonek et al developed an in-house SPARQL engine that allows utilization of Sachem, a high-performance chemical DB cartridge for PostgreSQL for fingerprint-guided substructure and similarity search (Kratochvíl et al, 2018). The engine is used by the Integrated Database of Small Molecules (IDSM) that operates, among other things, several dedicated endpoints allowing structural search in selected small-molecule datasets via SPARQL (Kratochvíl et al, 2019). To allow substructure and similarity searches via SPARQL also on compounds from Wikidata, a dedicated IDSM/Sachem endpoint was created for the LOTUS project.…”
Section: User Interaction With Lotus Data (mentioning)
confidence: 99%
“…The query shown in Figure 3 makes use of the ChEBI ontology to find information relevant to cholesterol and other sterols. Users might also identify derivatives of cholesterol or cholesterol like molecules using SPARQL endpoints that support chemical similarity or chemical substructure searches over ChEBI, such as the Integrated Database of Small Molecules (IDSM) (Kratochvil et al, 2019). These advanced search capabilities could be further combined with those of a range of other SPARQL endpoints from resources such as Ensembl (Zerbino et al, 2018), OMA (Altenhoff et al, 2018), OrthoDB (Kriventseva et al, 2019) and Bgee (Bastian et al, 2008), in order to explore small molecule metabolism in the context of genomic organization, variation, evolution and anatomy.…”
Section: Rhea and The Uniprot Sparql Endpoint (mentioning)
confidence: 99%