Antimicrobial peptides (AMPs) are important components of the innate immune system that are effective against disease-causing pathogens. Identifying AMPs through wet-lab experiments is expensive, so efficient computational tools are needed to identify the best candidate AMPs prior to in vitro experimentation. In this study, we developed a support vector machine (SVM)-based computational approach for predicting AMPs with improved accuracy. Compositional, physico-chemical and structural features of the peptides were first generated and then used as input to the SVM for prediction of AMPs. The proposed approach achieved higher accuracy than several existing approaches when compared on a benchmark dataset. Based on the proposed approach, an online prediction server, iAMPpred, has also been developed to help the scientific community predict AMPs; it is freely accessible at http://cabgrid.res.in:8080/amppred/. The proposed approach is expected to supplement the tools and techniques previously developed for prediction of AMPs.
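As an illustration of what a compositional feature can look like, the following minimal Python sketch computes the amino acid composition of a peptide, i.e. the fraction of each of the 20 standard residues. The function name and the example peptide string are hypothetical and chosen only for illustration; the abstract does not specify the exact feature definitions used in iAMPpred, so this should be read as one plausible compositional descriptor, not the authors' implementation.

```python
from collections import Counter

# The 20 standard amino acids, in a fixed order so every peptide
# maps to a feature vector with the same layout.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(peptide: str) -> list:
    """Return the fraction of each standard amino acid in the peptide.

    Non-standard symbols (e.g. 'X') are ignored. This is an assumed,
    generic compositional feature, not the iAMPpred feature set.
    """
    counts = Counter(peptide.upper())
    total = sum(counts[aa] for aa in AMINO_ACIDS)
    if total == 0:
        raise ValueError("peptide contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


# Hypothetical example peptide used only to show the call.
features = amino_acid_composition("GIGKFLHSAKKFGKAFVGEIMNS")
print(len(features))
```

A vector of this kind, one per peptide, could then be passed together with other descriptors to any SVM implementation.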
Selection of informative genes is an important problem in gene expression studies. The small sample size and the large number of genes in gene expression data make the selection process complex. Further, the selected informative genes may act as a vital input for gene co-expression network analysis. Moreover, the identification of hub genes and module interactions in gene co-expression networks is yet to be fully explored. This paper presents a statistically sound gene selection technique based on the support vector machine algorithm for selecting informative genes from high-dimensional gene expression data. A statistical approach is also developed for identifying hub genes in the gene co-expression network. In addition, a differential hub gene analysis approach has been developed to classify the identified hub genes into groups based on their gene connectivity in a case vs. control study. Based on the proposed approach, an R package, dhga (https://cran.r-project.org/web/packages/dhga), has been developed. The comparative performance of the proposed gene selection technique and of the hub gene identification approach was evaluated on three different crop microarray datasets. The proposed gene selection technique outperformed most of the existing techniques in selecting a robust set of informative genes. The proposed hub gene identification approach identified fewer hub genes than the existing approach, which is in accordance with the scale-free property of real networks. In this study, some key genes along with their Arabidopsis orthologs have been reported, which can be used for engineering the aluminum toxic stress response in soybean. The functional analysis of the selected key genes revealed the underlying molecular mechanisms of the aluminum toxic stress response in soybean.
The ToxA–Tsn1 system is an example of an inverse gene-for-gene relationship. The gene ToxA encodes a host-selective toxin (HST) which functions as a necrotrophic effector and is often responsible for the virulence of the pathogen. The genomes of several fungal pathogens (e.g., Pyrenophora tritici-repentis, Parastagonospora nodorum, and Bipolaris sorokiniana) have been shown to carry the ToxA gene. Tsn1 is a sensitivity gene in the host, whose presence generally helps a ToxA-positive pathogen to cause spot blotch in wheat. Cultivars lacking Tsn1 are generally resistant to spot blotch; this resistance is attributed to a number of other known genes which impart resistance in the absence of Tsn1. In the present study, 110 isolates of B. sorokiniana strains, collected from the ME5A and ME4C megaenvironments of India, were screened for the presence of the ToxA gene; 77 (70%) were found to be ToxA positive. Similarly, 220 Indian wheat cultivars were screened for the presence of the Tsn1 gene; 81 (36.8%) were found to be Tsn1 positive. When 20 wheat cultivars (11 with Tsn1 and 9 with tsn1) were inoculated with ToxA-positive isolates, seedlings of only those carrying the Tsn1 allele (not tsn1) developed necrotic spots surrounded by a chlorotic halo. No such distinction between Tsn1 and tsn1 carriers was observed when adult plants were inoculated. This study suggests that the absence of Tsn1 facilitated resistance against spot blotch of wheat. Therefore, the selection of wheat genotypes for the absence of the Tsn1 allele can improve resistance to spot blotch.
A genome-wide association study (GWAS) was conducted for 14 agronomic traits in wheat following the widely used single locus single trait (SLST) approach and two recent approaches, viz. the multi-locus mixed model (MLMM) and the multi-trait mixed model (MTMM). The association panel consisted of 230 diverse Indian bread wheat cultivars (released during 1910–2006 for commercial cultivation in different agro-climatic regions in India). Three years of phenotypic data for 14 traits and genotyping data for 250 SSR markers (distributed across all the 21 wheat chromosomes) were utilized for GWAS. Using SLST, as many as 213 MTAs (p ≤ 0.05, 129 SSRs) were identified for 14 traits; however, only 10 MTAs (~9%; 10 out of 123 MTAs) met the FDR criteria; these MTAs did not show any linkage drag. Interestingly, these genomic regions coincided with genomic regions already known to harbor QTLs for the same or related agronomic traits. Using MLMM and MTMM, many more QTLs and markers were identified: 22 MTAs (19 QTLs, 21 markers) using MLMM, and 58 MTAs (29 QTLs, 40 markers) using MTMM. In addition, 63 epistatic QTLs were identified for 13 of the 14 traits, flag leaf length (FLL) being the only exception. Clearly, the power of association mapping improved with the MLMM and MTMM analyses. The epistatic interactions detected also provided better insight into the genetic architecture of the 14 traits examined in the present study. The following eight wheat genotypes carried desirable alleles of QTLs for one or more traits: WH542, NI345, NI170, Sharbati Sonora, A90, HW1085, HYB11, and DWR39 (Pragati). These genotypes and the markers associated with important QTLs for major traits can be used in wheat improvement programs through either marker-assisted recurrent selection (MARS) or the pseudo-backcrossing method.
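The abstract above does not say which FDR procedure was applied to the SLST results. As a hedged illustration only, the Python sketch below assumes the commonly used Benjamini-Hochberg step-up procedure; the function name, the `alpha` default, and the example p-values are all hypothetical and are not taken from the study.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Flag which tests pass a Benjamini-Hochberg FDR threshold.

    Assumption: BH is used here as a generic FDR procedure; the
    source abstract does not state which FDR criterion was applied.
    Returns a list of booleans aligned with the input order.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Find the largest rank k with p_(k) <= (k / m) * alpha.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            cutoff_rank = rank

    # Every test ranked at or below the cutoff is declared significant.
    significant = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            significant[idx] = True
    return significant


# Made-up p-values for four marker-trait tests.
flags = benjamini_hochberg([0.001, 0.04, 0.03, 0.5])
print(flags)
```

With these made-up values only the first test survives the correction, which mirrors the general pattern reported above: many nominally significant associations, few that remain after FDR control.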
Finger millet (Eleusine coracana L.) is an important dry-land cereal in Asia and Africa because of its ability to provide an assured harvest under extremely dry conditions and its excellent nutritional properties. However, genetic improvement of the crop has lagged in the absence of suitable genomic resources for reliable genotype-phenotype associations. With this in view, a diverse global finger millet germplasm collection of 113 accessions was evaluated for 14 agro-morphological characters in two environments in India, viz. ICAR-Vivekananda Institute of Hill Agriculture, Almora (E1) and Crop Research Centre (CRC), GBPUA&T, Pantnagar (E2). Principal component analysis and cluster analysis of the phenotypic data separated the Indian and exotic accessions into two groups. SNPs previously generated through genotyping by sequencing (GBS) were used for association mapping to identify reliable markers linked to grain yield and its component traits. The marker-trait associations (MTAs) were determined using single locus single trait (SLST), multi-locus mixed model (MLMM) and multi-trait mixed model (MTMM) approaches. SLST led to the identification of 20 MTAs (p value < 0.01 and < 0.001) for 5 traits, while the advanced models, MLMM and MTMM, yielded an additional 36 and 53 MTAs, respectively. Of the total 109 associations, nine MTAs were common to all three mapping approaches (SLST, MLMM and MTMM). Among these nine SNPs, five SNP sequences showed homology to candidate genes of Oryza sativa (rice) and Setaria italica (foxtail millet) that play an important role in flowering, maturity and grain yield. In addition, 67 and 14 epistatic interactions were identified for 10 and 7 traits at the E1 and E2 locations, respectively. Hence, the 109 novel SNPs associated with important agro-morphological traits, reported for the first time in this study, could be utilized in finger millet genetic improvement after validation.
Background: Identification of unknown fungal species aids the conservation of fungal diversity. As many fungal species cannot be cultured, morphological identification of those species is almost impossible. However, DNA barcoding can be employed to identify such species. For fungal taxonomy prediction, the ITS (internal transcribed spacer) region of rDNA (ribosomal DNA) is used as the barcode. Although computational prediction of fungal species has become feasible with the availability of a large volume of barcode sequences in the public domain, it remains challenging because of the high degree of variability among ITS regions within species. Results: A Random Forest (RF)-based predictor was built for identification of unknown fungal species. The reference and query sequences were mapped onto numeric features based on gapped base pair compositions, and then used as training and test sets, respectively, for prediction of fungal species using RF. More than 85% accuracy was obtained when 4 sequences per species in the reference set were utilized, and accuracy stabilized at ~88% when ≥7 sequences per species in the reference set were used to train the model. The proposed model achieved comparable accuracy when evaluated against existing methods through cross-validation. The proposed model also outperformed several existing models used for identification of species other than fungi. Conclusions: An online prediction server, "funbarRF", is available at http://cabgrid.res.in:8080/funbarrf/ for fungal species identification. In addition, an R package, funbarRF (https://cran.r-project.org/web/packages/funbarRF/), is available for prediction using high-throughput sequence data. This work should supplement future efforts in fungal taxonomy assignment based on DNA barcodes.
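To make the idea of a gapped base pair composition concrete, here is a minimal Python sketch. It assumes the feature is the relative frequency of each ordered nucleotide pair whose two bases are separated by a fixed number of intervening positions; the function name, the `gap` parameter, and the toy sequence are hypothetical, and the exact definition used in funbarRF may differ.

```python
from itertools import product

BASES = "ACGT"
# All 16 ordered nucleotide pairs, in a fixed order.
PAIRS = ["".join(p) for p in product(BASES, repeat=2)]


def gapped_pair_composition(seq: str, gap: int) -> list:
    """Relative frequency of each base pair separated by `gap` positions.

    gap=0 gives ordinary adjacent dinucleotide composition. Pairs that
    contain a non-ACGT symbol (e.g. 'N') are skipped. This is an assumed
    reading of "gapped base pair composition", not the package's code.
    """
    seq = seq.upper()
    counts = dict.fromkeys(PAIRS, 0)
    for i in range(len(seq) - gap - 1):
        pair = seq[i] + seq[i + gap + 1]
        if pair in counts:
            counts[pair] += 1
    total = sum(counts.values())
    return [counts[p] / total if total else 0.0 for p in PAIRS]


# Toy sequence: concatenate the features for gaps 0, 1 and 2.
toy = "ACGTACGTTAGC"
feature_vector = []
for g in (0, 1, 2):
    feature_vector.extend(gapped_pair_composition(toy, g))
print(len(feature_vector))
```

Concatenating the 16 frequencies for several gap sizes gives a fixed-length numeric vector per sequence, which is the kind of input a Random Forest classifier expects.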
Background: Detection of splice sites plays a key role in predicting gene structure, and thus the development of efficient analytical methods for splice site prediction is vital. This paper presents a novel sequence encoding approach based on adjacent di-nucleotide dependencies, in which the donor splice site motifs are encoded into numeric vectors. The encoded vectors are then used as input to Random Forest (RF), Support Vector Machine (SVM), Artificial Neural Network (ANN), Bagging, Boosting, Logistic regression, kNN and Naïve Bayes classifiers for prediction of donor splice sites. Results: The performance of the proposed approach was evaluated on donor splice site sequence data of Homo sapiens, collected from the Homo Sapiens Splice Sites Dataset (HS3D). The results showed that RF outperformed all the other classifiers considered. In addition, RF achieved higher prediction accuracy than the existing methods, viz. MEM, MDD, WMM, MM1, NNSplice and SpliceView, when compared on an independent test dataset. Conclusion: Based on the proposed approach, we have developed an online prediction server (MaLDoSS) to help the biological community predict donor splice sites. The server is freely available at http://cabgrid.res.in:8080/maldoss. Owing to its computational feasibility and high prediction accuracy, the proposed approach is expected to help in predicting eukaryotic gene structure. Electronic supplementary material: The online version of this article (doi:10.1186/s13040-016-0086-4) contains supplementary material, which is available to authorized users.
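The abstract does not give the formula behind the di-nucleotide dependency encoding, so the Python sketch below uses a deliberately simpler stand-in: a one-hot encoding of each pair of adjacent positions in a motif. It shows how a motif becomes a fixed-length numeric vector but does not model the dependency weighting of the proposed approach. The function name and the example motif are hypothetical.

```python
from itertools import product

# The 16 possible dinucleotides and their slot index within one block.
DINUCLEOTIDES = ["".join(p) for p in product("ACGT", repeat=2)]
INDEX = {d: i for i, d in enumerate(DINUCLEOTIDES)}


def encode_adjacent_dinucleotides(motif: str) -> list:
    """One-hot encode every adjacent position pair of a motif.

    A motif of length n yields (n - 1) blocks of 16 values each.
    Pairs containing a non-ACGT symbol produce an all-zero block.
    Stand-in encoding only; not the MaLDoSS encoding itself.
    """
    motif = motif.upper()
    vector = []
    for i in range(len(motif) - 1):
        block = [0] * 16
        dinuc = motif[i:i + 2]
        if dinuc in INDEX:
            block[INDEX[dinuc]] = 1
        vector.extend(block)
    return vector


# Hypothetical 9-base window around a donor site (GT at the intron start).
encoded = encode_adjacent_dinucleotides("CAGGTAAGT")
print(len(encoded))
```

Each motif of the same length maps to a vector of the same size, so the encoded motifs can be stacked into a matrix and passed to any of the classifiers listed above.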