Background: Glycans are complex sugar chains, crucial to many biological processes. By participating in binding interactions with proteins, glycans often play key roles in host–pathogen interactions. The specificities of glycan-binding proteins, such as lectins and antibodies, are governed by motifs within larger glycan structures, and improved characterisations of these determinants would aid research into human diseases. Identification of motifs has previously been approached as a frequent subtree mining problem, and we extend these approaches with a glycan notation that allows recognition of terminal motifs. Results: In this work, we customised a frequent subtree mining approach by altering the glycan notation to include information on terminal connections. This allows specific identification of terminal residues as potential motifs, better capturing the complexity of glycan-binding interactions. We achieved this by including additional nodes in a graph representation of the glycan structure to indicate the presence or absence of a linkage at particular backbone carbon positions. Combining this frequent subtree mining approach with a state-of-the-art feature selection algorithm termed minimum-redundancy, maximum-relevance (mRMR), we have generated a classification pipeline that is trained on data from a glycan microarray. When applied to a set of commonly used lectins, the identified motifs were consistent with known binding determinants. Furthermore, logistic regression classifiers trained using these motifs performed well across most lectins examined, with a median AUC value of 0.89. Conclusions: We present here a new subtree mining approach for the classification of glycan binding and identification of potential binding motifs. The Carbohydrate Classification Accounting for Restricted Linkages (CCARL) method will assist in the interpretation of glycan microarray experiments and will aid in the discovery of novel binding motifs for further experimental characterisation.
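The idea of adding nodes that mark the presence or absence of a linkage can be illustrated with a minimal sketch. This is not the CCARL implementation: the `Residue` class, the `CANDIDATE_POSITIONS` tuple, and both function names are hypothetical, and the set of carbon positions that can carry a linkage is assumed here to be the same for every residue, which is a simplification.

```python
from dataclasses import dataclass, field

# Assumption for illustration: carbon positions on a residue that may carry a linkage.
CANDIDATE_POSITIONS = (2, 3, 4, 6)


@dataclass
class Residue:
    name: str
    # Maps a carbon position on this residue to the child residue linked there.
    children: dict = field(default_factory=dict)


def add_restricted_linkage_nodes(residue, positions=CANDIDATE_POSITIONS):
    """Return a copy of the tree with a 'null' leaf at every unoccupied position."""
    new_children = {}
    for pos in positions:
        if pos in residue.children:
            new_children[pos] = add_restricted_linkage_nodes(residue.children[pos], positions)
        else:
            # Explicit marker: no linkage is present at this position.
            new_children[pos] = Residue("null")
    # Keep any child attached at a position outside the candidate list.
    for pos, child in residue.children.items():
        if pos not in new_children:
            new_children[pos] = add_restricted_linkage_nodes(child, positions)
    return Residue(residue.name, new_children)


def is_terminal(residue):
    """A residue is terminal if every child slot holds a 'null' marker."""
    return all(child.name == "null" for child in residue.children.values())


# Toy glycan: Gal linked to position 4 of Glc.
lactose = Residue("Glc", {4: Residue("Gal")})
marked = add_restricted_linkage_nodes(lactose)
```

With the markers in place, a subtree consisting of Gal plus its null children can only match a terminal Gal, whereas a bare Gal node would also match internal Gal residues.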
During COVID-19 lockdowns, online learning activities had to be developed for the Undergraduate and Masters by Coursework Bioinformatics students at RMIT University. Therefore, we designed an integrative, industry-based research assignment, which guided the students through a drug discovery project from target identification to lead optimization. The students were able to utilize this real-life scenario to apply multiple diverse but complementary bioinformatic principles to analyze biological and chemical data leading to meaningful predictions. This activity was utilized as a final assessment of the students' knowledge.
The blood fluke Cardicola forsteri (Trematoda: Aporocotylidae) is a pathogen of ranched bluefin tuna in Japan and Australia. Genomic studies of Cardicola spp. have thus far been limited to molecular phylogenetics of select gene sequences. In this study, sequencing of the C. forsteri genome was performed using Illumina short-read and Oxford Nanopore long-read technologies. The sequences were assembled de novo using a hybrid of short and long reads, which produced a high-quality contig-level assembly (N50 > 430 kb and L50 = 138). The assembly was also relatively complete and unfragmented, comprising 66% and 7.2% complete and fragmented metazoan Benchmarking Universal Single-Copy Orthologs (BUSCOs), respectively. A large portion (> 55%) of the genome was made up of intergenic repetitive elements, primarily long interspersed nuclear elements (LINEs), while protein-coding regions covered > 6%. Gene prediction identified 8,564 hypothetical polypeptides, > 77% of which are homologous to published sequences of other species. The identification of select putative proteins, including cathepsins, calpains, tetraspanins, and glycosyltransferases, is discussed. This is the first genome assembly of any aporocotylid and a major step toward understanding the biology of this family of fish blood flukes and their interactions within hosts.
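For readers unfamiliar with the assembly statistics quoted above, the sketch below shows one standard way to compute N50 and L50 from a list of contig lengths. The function name and the toy lengths are illustrative assumptions and are not taken from the C. forsteri assembly.

```python
def n50_l50(contig_lengths):
    """Return (N50, L50) for a collection of contig lengths.

    N50 is the length of the contig at which the running total, taken from
    the longest contig downward, first reaches half of the assembly size.
    L50 is the number of contigs needed to reach that point.
    """
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for count, length in enumerate(lengths, start=1):
        running += length
        if running >= half_total:
            return length, count
    raise ValueError("contig_lengths must not be empty")


# Toy example: total length 290, so half is 145; 80 + 70 = 150 reaches it.
toy_n50, toy_l50 = n50_l50([80, 70, 50, 40, 30, 20])
```

Read this way, an L50 of 138 means that the 138 longest contigs together cover at least half of the assembled sequence.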
Aporocotylid blood flukes Cardicola forsteri and C. orientalis are an ongoing health concern for Southern Bluefin Tuna (SBT), Thunnus maccoyii, ranched in Australia. Therapeutic application of praziquantel (PZQ) has reduced SBT mortalities; however, PZQ is not a residual treatment, so reinfection can occur after the single treatment application. This study documents the epidemiology of Cardicola spp. infection in ranched SBT post treatment over three ranching seasons (2018, 2019 and 2021). Infection prevalence (percentage of SBT affected) and intensity (parasite load) were determined by adult fluke counts from the heart, egg counts from gill filaments, and specific quantitative polymerase chain reaction (qPCR) for detection of C. forsteri and C. orientalis ITS-2 DNA in SBT hearts and gills. SBT Condition Index decreased as intensity of Cardicola spp. DNA in SBT gills increased, suggesting blood fluke infection had a negative effect on SBT growth (Spearman’s r = −0.2426, d.f. = 138, p = 0.0041). Prevalence and intensity of infection indicated that PZQ remained highly effective at controlling Cardicola spp. infection in ranched SBT, 10 years after PZQ administration began in this industry. Company A had the highest prevalence and intensity of Cardicola spp. infection in 2018, and Company G had the highest in 2019. No consistent pattern was seen in 2021. Overall, intensity of infection did not increase as ranching duration increased post treatment. Results from this study improve our knowledge of the biology of blood flukes and help the SBT industry to modify or design new blood fluke management strategies to reduce health risks and improve performance of SBT.
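The two infection measures named above can be sketched as follows. This is a hedged illustration only: it assumes one parasite count per fish and follows the common parasitological convention of averaging intensity over infected hosts, which may differ from the study's exact definitions (the study also used egg counts and qPCR). The function name and the counts are hypothetical.

```python
def prevalence_and_mean_intensity(parasite_counts):
    """Return (prevalence in percent, mean intensity among infected hosts).

    parasite_counts holds one non-negative count per host examined.
    """
    if not parasite_counts:
        raise ValueError("at least one host is required")
    infected = [count for count in parasite_counts if count > 0]
    prevalence = 100.0 * len(infected) / len(parasite_counts)
    # Mean intensity is averaged over infected hosts only (assumed convention).
    mean_intensity = sum(infected) / len(infected) if infected else 0.0
    return prevalence, mean_intensity


# Hypothetical adult fluke counts from eight fish; three are infected.
toy_prevalence, toy_intensity = prevalence_and_mean_intensity([0, 2, 0, 5, 2, 0, 0, 0])
```

Keeping the two measures separate matters because a treatment can lower parasite load in infected fish without changing how many fish are infected, and vice versa.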