2019
DOI: 10.1201/9780429469275
Text Mining with Machine Learning

Cited by 37 publications (27 citation statements) | References 0 publications
“…The numerical channel is used to extract the pre-defined dense features (i.e., the protein Position-specific scoring matrix (PSSM) and the intrinsic disorder tendency of each residue in both the protein and peptide sequences). Each categorical channel contains a self-learning word embedding layer 30, which takes one of the categorical features of the input peptide or protein (i.e., the raw amino acids, secondary structures, polarity, and hydropathy properties). We designed this multi-channel architecture because the input profiles contain multifaceted features of different scales, which may introduce inconsistency if only a simple encoder is used.…”
Section: Results
confidence: 99%
“…1b). Each categorical channel consists of three self-learning word embedding layers 30, taking amino acids, secondary structures, and physicochemical representations as input, respectively. Each numerical channel consists of a fully connected layer that takes dense features as input, i.e., the intrinsic disorder tendency features (ranging between 0 and 1) of peptides and proteins as well as the normalized evolutionary matrices (PSSM) of proteins.…”
Section: Methods
confidence: 99%
“…The second part contains the remaining 20% of the data (1,346 samples) for the validation phase. The hyperparameters for the proposed model were set as follows: (i) learning rate = 0.001, (ii) mini-batch size = 64, (iii) number of iterations = 30, (iv) early stopping = 3 epochs, and (v) optimizer: AdaBoost [46].…”
Section: Results
confidence: 99%
“…Text mining analysis generally includes three main steps after collecting the textual data: preprocessing, text mining, and post-processing [38].…”
Section: Text Mining and Topic Modeling
confidence: 99%
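The three-step pipeline in the quote can be sketched minimally as follows; the tiny corpus, stop-word list, and keyword cutoff are invented for illustration:

```python
from collections import Counter
import re

STOP_WORDS = {"the", "a", "of", "and", "is", "in"}   # assumed tiny list

def preprocess(doc):
    """Step 1: lowercase, tokenize, and drop stop words."""
    tokens = re.findall(r"[a-z]+", doc.lower())
    return [t for t in tokens if t not in STOP_WORDS]

def mine(docs):
    """Step 2: the mining itself -- here, corpus-wide term counts."""
    counts = Counter()
    for doc in docs:
        counts.update(preprocess(doc))
    return counts

def postprocess(counts, top_k=3):
    """Step 3: reduce raw output to a readable keyword list."""
    return [term for term, _ in counts.most_common(top_k)]

corpus = [
    "Text mining is the mining of patterns in text.",
    "Machine learning supports text mining.",
]
print(postprocess(mine(corpus)))   # ['text', 'mining', 'patterns']
```

Each step has a single responsibility, so any stage can be swapped out (e.g., replacing term counting with topic modeling) without touching the other two.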
“…Guidance was taken from medical-science specialists to set the desired number of topics in each category. It is important to note that the number of topics should be chosen proportionally, because too large a number of topics leads to many small and highly similar topics [38,39]. Interpretation of the topics also becomes more challenging because the keywords become dispersed across topics [40].…”
Section: Text Mining and Topic Modeling
confidence: 99%
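The risk the quote describes, too many topics yielding small near-duplicate topics, can be checked directly by comparing topic-word distributions pairwise and flagging highly similar pairs. The toy distributions and the 0.9 similarity threshold below are illustrative assumptions:

```python
import numpy as np

def similar_topic_pairs(topic_word, threshold=0.9):
    """Return index pairs of topics whose word distributions are
    nearly identical (cosine similarity above `threshold`)."""
    normed = topic_word / np.linalg.norm(topic_word, axis=1, keepdims=True)
    sim = normed @ normed.T          # pairwise cosine similarities
    k = topic_word.shape[0]
    return [(i, j) for i in range(k) for j in range(i + 1, k)
            if sim[i, j] > threshold]

# Three toy topic-word distributions over a 4-word vocabulary:
# topics 0 and 1 are essentially one topic split in two.
topics = np.array([
    [0.70, 0.20, 0.05, 0.05],
    [0.65, 0.25, 0.05, 0.05],
    [0.05, 0.05, 0.70, 0.20],
])
print(similar_topic_pairs(topics))   # [(0, 1)]
```

Many flagged pairs after fitting a model is a practical signal that the topic count was set too high and should be reduced.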