Feature selection for chemical compound extraction using wrapper approach with Naive Bayes classifier

Alshaikhdeeb, Basel; Ahmad, Kamsuriah

doi:10.1109/iceei.2017.8312421

Cited by 7 publications

(4 citation statements)

References 8 publications


“…A variety of interpretable machine learning models have seen use in cheminformatics, with two being particularly appealing: Naïve Bayes (NB) classifiers and decision tree classifiers, in particular Random Forests (RF). Both of these methods have been widely deployed in QSAR and drug design and have multiple studies that demonstrate their capacity to select meaningful features. The former provides feature importance per class through the likelihoods of the feature given the class (p(f_i | M_i)). However, constructing an NB classifier with the desired degree of flexibility for this work presents significant engineering challenges.…”

Section: Methods (mentioning)

confidence: 99%

Explainable Molecular Sets: Using Information Theory to Generate Meaningful Descriptions of Groups of Molecules

Mater, Coote (2021), J. Chem. Inf. Model.


Algorithmically identifying the meaningful similarities between an assortment of molecules is a critical chemical problem, and one which is only gaining in relevance as data-driven chemistry continues to progress. Effectively addressing this challenge can be achieved through a reformulation of the problem into information theory, cluster-based supervised classification, and the implementation of key concepts, particularly information entropy and mutual information. These concepts are combined with unsupervised learning atop learned chemical spaces to generate meaningful labels for arbitrary collections of molecules. An open-source and highly extensible codebase is provided to undertake these experiments, demonstrate the viability of the approach on known clusters, and glean insights into the learned representations of chemical space within message-passing neural networks, an architecture not readily permitting interpretability. This approach facilitates the interoperability between human chemical knowledge and the algorithmically derived insights, which will continue to become more prevalent in the coming years.
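The citation statement quoted in this entry notes that a Naïve Bayes classifier exposes per-class feature importance through the likelihoods p(f_i | M_i). A minimal sketch of that idea follows, assuming binary (present/absent) features and Laplace smoothing. The function name `feature_likelihoods`, the smoothing parameter `alpha`, and the toy feature names and labels are illustrative assumptions, not taken from any of the cited papers.

```python
from collections import defaultdict


def feature_likelihoods(samples, labels, alpha=1.0):
    """Estimate p(feature present | class) for a Bernoulli Naive Bayes model.

    samples: list of sets, each holding the features present in one item.
    labels:  list of class labels, one per sample.
    alpha:   Laplace smoothing constant (assumed value, not from the source).
    """
    features = sorted({f for s in samples for f in s})
    class_counts = defaultdict(int)
    feature_counts = defaultdict(lambda: defaultdict(int))
    for present, label in zip(samples, labels):
        class_counts[label] += 1
        for f in present:
            feature_counts[label][f] += 1
    # Smoothed estimate: (count + alpha) / (class size + 2 * alpha)
    return {
        label: {
            f: (feature_counts[label][f] + alpha) / (n + 2 * alpha)
            for f in features
        }
        for label, n in class_counts.items()
    }


# Hypothetical toy data: each set lists the features observed in one item.
samples = [{"ring", "halogen"}, {"ring"}, {"halogen", "amine"}, {"amine"}]
labels = ["A", "A", "B", "B"]
likelihoods = feature_likelihoods(samples, labels)

# Rank the features of class "A" by likelihood, highest first.
ranked_a = sorted(likelihoods["A"], key=likelihoods["A"].get, reverse=True)
```

Features with a high likelihood in one class and a low likelihood in the others are the ones such a model treats as most characteristic of that class.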


“…The analysis of such medical text has been divided into two main tasks. The first task is relatively similar to the Named Entity Recognition (NER) where the medical-related concepts are being identified [2][3][4]. In particular, it concentrates on specific medical entity which is the drug implications or side-effects, this task is known as Adverse Drug Reaction (ADR) extraction [5][6][7][8].…”

Section: Introduction (mentioning)

confidence: 99%

Enhance Medical Sentiment Vectors through Document Embedding using Recurrent Neural Network

Yousef, Tiun, Omar et al. (2020), IJACSA


Adverse Drug Reaction (ADR) extraction is the process of identifying drug implications mentioned in social posts. Handling medical text for the identification of ADR is vital to research in terms of configuring the side effect and other medical-related entities within any medical text. However, investigating the role of such effect in the context of positive and negative is the responsibility of sentiment classification task where every medical review document would be categorized into its polarity, this is known as Medical Sentiment Analysis (MSA). Several studies have presented various techniques for MSA. Most of the recent studies have concentrated on architectures such as the Convolutional Neural Network (CNN) to get the document embedding. Yet, such architecture focuses only on the input without considering the previous or latter input. This might lead to weaker embedding for the document where some terms would not be considered. Hence, this paper proposes a new document embedding approach based on the Recurrent Neural Network (RNN) to improve the sentiment classification. Using a benchmark dataset of medical sentiments, the proposed method showed greater performance of sentiment classification accuracy. Such finding proves the effectiveness of RNN in producing document embedding.


“…However, the most significant factor of these techniques is a feature space that can be generated during model establishment. Features are descriptive characteristics that describe the occurrence of specific entities (Alshaikhdeeb and Ahmad, 2017;2018). Discussing the feature space within the context of extracting ADRs requires mentioning trigger terms, which are specific keywords that come before or after ADRs.…”

Section: Introduction (mentioning)

confidence: 99%

Extended Trigger Terms for Extracting Adverse Drug Reactions in Social Media Texts

Yousef, Tiun, Omar (2019), Journal of Computer Science


Adverse Drug Reaction (ADR) is a disorder caused by taking medications. Studies have addressed extracting ADRs from social networks where users express their opinion regarding a specific medication. Extracting entities mainly depends on specific terms called trigger terms that may occur before or after ADRs. However, these terms should be extended, especially when examining multiple representation of N-gram. This study aims to propose an extension of trigger terms based on the multiple representation of N-gram. Two benchmark datasets are used in the experiments and three classifiers, namely, support vector machine, Naïve Bayes and linear regression, are trained on the proposed extension. Furthermore, two document representations have been utilized including Term Frequency Inverse Document Frequency (TFIDF) and Count Vector (CV). Results show that the proposed extended trigger terms outperform the baseline by achieving 88% and 69% of F1-scores for the first and second datasets, respectively. This finding implies the effectiveness of the proposed extended trigger terms in terms of detecting new ADRs.
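The abstract above mentions two document representations, Count Vector (CV) and TFIDF, built over multiple N-gram sizes. The sketch below shows one plausible way to compute both over word unigrams and bigrams. The function names, the unsmoothed `log(N / df)` IDF weighting, and the two example sentences are assumptions made for illustration; the abstract does not specify the exact variant the authors used.

```python
import math
from collections import Counter


def word_ngrams(text, n_min=1, n_max=2):
    """Return all word n-grams of the text for n in [n_min, n_max]."""
    tokens = text.lower().split()
    return [
        " ".join(tokens[i:i + n])
        for n in range(n_min, n_max + 1)
        for i in range(len(tokens) - n + 1)
    ]


def count_vectors(docs, n_min=1, n_max=2):
    """Build a sorted n-gram vocabulary and one raw count vector per document."""
    counts = [Counter(word_ngrams(d, n_min, n_max)) for d in docs]
    vocab = sorted(set().union(*counts))
    return vocab, [[c[t] for t in vocab] for c in counts]


def tfidf_vectors(docs, n_min=1, n_max=2):
    """Weight raw counts by idf = log(N / df); this plain variant is an assumption."""
    vocab, cv = count_vectors(docs, n_min, n_max)
    n_docs = len(docs)
    df = [sum(1 for row in cv if row[j] > 0) for j in range(len(vocab))]
    idf = [math.log(n_docs / d) for d in df]
    return vocab, [[row[j] * idf[j] for j in range(len(vocab))] for row in cv]


# Hypothetical review sentences, invented for this example.
docs = ["this drug caused severe headache", "the drug caused nausea"]
vocab, cv = count_vectors(docs)
_, tfidf = tfidf_vectors(docs)
```

Under this weighting an n-gram that appears in every document, such as "drug caused" here, receives a TFIDF weight of zero, while its count-vector entry stays at the raw frequency.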
