Recently, Question Answering (QA) has been one of the main focuses of natural language processing research. However, Arabic Question Answering is still not in the mainstream. The challenges of the Arabic language and the lack of resources have made it difficult to build Arabic QA systems with high accuracy. While low accuracy may be acceptable for general-purpose systems, high accuracy is critical in some fields such as religious affairs. Therefore, there is a need for specialized, accurate systems that target these critical fields. In this paper, we propose Al-Bayan, a new Arabic QA system specialized for the Holy Quran. The system accepts an Arabic question about the Quran, retrieves the most relevant Quran verses, then extracts the passage that contains the answer from the Quran and its interpretation books (Tafseer). Evaluation results on a collected dataset show that the overall system achieves 85% accuracy when the top-3 results are considered.
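The abstract does not spell out how top-3 accuracy is computed. One common reading, assumed here, is that a question counts as correctly answered if any of the first three returned results is a correct answer. The following minimal sketch illustrates that reading only; the function name `top_k_accuracy` and the verse identifiers are hypothetical and are not taken from Al-Bayan.

```python
from typing import Hashable, Sequence, Set


def top_k_accuracy(
    ranked_results: Sequence[Sequence[Hashable]],
    gold_answers: Sequence[Set[Hashable]],
    k: int = 3,
) -> float:
    """Fraction of questions with at least one correct item in the first k results.

    Assumption: a question is a "hit" if any of its top-k ranked results
    appears in that question's set of acceptable answers.
    """
    if len(ranked_results) != len(gold_answers):
        raise ValueError("ranked_results and gold_answers must have the same length")
    if not ranked_results:
        return 0.0
    hits = sum(
        1
        for ranked, gold in zip(ranked_results, gold_answers)
        if any(item in gold for item in ranked[:k])
    )
    return hits / len(ranked_results)


# Hypothetical example: two questions, results given as made-up verse IDs.
example_ranked = [
    ["2:255", "3:18", "59:23", "112:1"],  # correct answer is at rank 4
    ["96:1", "68:1", "55:4"],             # correct answer is at rank 2
]
example_gold = [{"112:1"}, {"68:1"}]
example_score = top_k_accuracy(example_ranked, example_gold, k=3)
```

Under this definition, raising `k` can only keep the score the same or increase it, which is why a top-3 figure should not be compared directly with a top-1 figure.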
Selecting the most discriminative genes/miRNAs has emerged as an important task in bioinformatics, both to enhance disease classifiers and to mitigate the curse of dimensionality. Classical feature selection methods choose genes/miRNAs based on their individual features, regardless of how they perform together. Considering group features instead of individual ones provides a better view for selecting the most informative genes/miRNAs. Recently, deep learning has proven its ability to represent data at multiple levels of abstraction, allowing for better discrimination between different classes. However, deep learning has not yet been widely used for feature selection in the bioinformatics field. In this paper, a novel multi-level feature selection approach named MLFS is proposed for selecting genes/miRNAs based on expression profiles. The approach is based on both deep and active learning. Moreover, an extension of the technique to miRNAs is presented, which considers the biological relation between miRNAs and genes. Experimental results show that the approach outperforms classical feature selection methods in F1-measure by 9% in hepatocellular carcinoma (HCC), by 6% in lung cancer, and by around 10% in breast cancer. The results also show that our approach improves F1-measure over the recent related work in [1] and [2].
Integrative approaches that combine multiple forms of data can more accurately capture pathway associations and so provide a comprehensive understanding of the molecular mechanisms that cause complex diseases. Association analyses based on single nucleotide polymorphism (SNP) genotypes, copy number variant (CNV) genotypes, and gene expression profiles are the 3 most common paradigms used for gene set/pathway enrichment analyses. Much work has been done to leverage information from 2 of these 3 types of data. However, to the best of our knowledge, no previous work has integrated all 3 paradigms together. In this article, we present an integrated analysis that combines SNP, CNV, and gene expression data to generate a single gene list. We present different methods to compare this gene list with the 3 other possible lists that result from combining the following pairs of data: SNP genotype with gene expression, CNV genotype with gene expression, and SNP genotype with CNV genotype. The comparison is done using 3 different cancer datasets and 2 different methods of comparison. Our results show that integrating SNP, CNV, and gene expression data gives better association results than integrating any pair of the 3 data types.