Wiesław Paja scite author profile

Active enhancer positions can be accurately predicted from chromatin marks and collective sequence motif data

Background: Transcriptional regulation in multi-cellular organisms is a complex process involving multiple modular regulatory elements for each gene. Building whole-genome models of transcriptional networks requires mapping all relevant enhancers and then linking them to target genes. Previous methods of enhancer identification, based either on sequence information or on epigenetic marks, have different limitations stemming from the incompleteness of each of these datasets taken separately.

Results: In this work we present a new approach for discovery of regulatory elements based on the combination of sequence motifs and epigenetic marks measured with ChIP-Seq. Our method uses supervised learning approaches to train a model describing the dependence of enhancer activity on sequence features and histone marks. Our results indicate that using a combination of features provides superior results to previous approaches based on either one of the datasets. While histone modifications remain the dominant feature for accurate predictions, the models based on sequence motifs have advantages in their general applicability to different tissues. Additionally, we assess the relevance of different sequence motifs in prediction accuracy, showing that even tissue-specific enhancer activity depends on multiple motifs.

Conclusions: Based on our results, we conclude that it is worthwhile to include sequence motif data in computational approaches to active enhancer prediction, and also that classifiers trained on a specific set of enhancers can generalize with significant accuracy beyond the training set.

All Relevant Feature Selection Methods and Applications

Rudnicki, Wrzesień, Paja 2014

All-relevant feature selection is a relatively new sub-field in the domain of feature selection. The chapter is devoted to a short review of the field and a presentation of a representative algorithm. The problem of all-relevant feature selection is first defined, then key algorithms are described. Finally, the Boruta algorithm, under development at ICM, University of Warsaw, is explained in greater detail and applied to a collection of both synthetic and real-world data sets. It is shown that the algorithm is both sensitive and selective. The level of falsely discovered relevant variables is low: on average, less than one falsely relevant variable is discovered for each set. The sensitivity of the algorithm is nearly 100% for data sets for which classification is easy, but may be smaller for data sets for which classification is difficult. Nevertheless, it is possible to increase the sensitivity of the algorithm at the cost of increased computational effort without adversely affecting the false discovery level. This is achieved by increasing the number of trees in the random forest algorithm that delivers the importance estimate in Boruta.

Characterization of Covid-19 infected pregnant women sera using laboratory indexes, vibrational spectroscopy, and machine learning classifications

Güleken, Jakubczyk, Paja et al. 2022, Talanta

Herein, we show differences in the blood serum of asymptomatic and symptomatic pregnant women infected with COVID-19 and correlate them with laboratory indexes, ATR FTIR and multivariate machine learning methods. We collected the sera of COVID-19 diagnosed pregnant women in the second trimester (n = 12), the third trimester (n = 7), and the second trimester with severe symptoms (n = 7), compared to healthy pregnant women (n = 11), which makes a total of 37 participants. To assign the accuracy of FTIR spectra regions where peak shifts occurred, the Random Forest algorithm, the traditional C5.0 single decision tree algorithm and a deep neural network approach were used. We verified the correspondence between the FTIR results and laboratory indexes such as the count of peripheral blood cells, biochemical parameters, and coagulation indicators of pregnant women. CH2 scissoring, amide II, and amide I vibrations could be used to differentiate the groups. The accuracy calculated by machine learning methods was higher than 90%. We also developed a method based on the dynamics of the absorbance spectra that allows the differences between the spectra of healthy and COVID-19 patients to be determined. Laboratory indexes of biochemical parameters associated with COVID-19 validate changes in the total amount of proteins, albumin and lipase.

Deep architectures for long-term stock price prediction with a heuristic-based strategy for trading simulations

et al. 2019

Stock price prediction is a popular yet challenging task, and deep learning provides the means to mine the different patterns that trigger its dynamic movement. In this paper, the task is to predict the close price for 25 companies listed on the Bucharest Stock Exchange, from a novel data set introduced herein. To this end, two traditional deep learning architectures are designed and compared: a long short-term memory network and a temporal convolutional neural model. Based on their predictions, a trading strategy is proposed whose decision to buy or sell depends on two different thresholds. A hill climbing approach selects the optimal values for these parameters. The predictions of the two deep learning representatives, used in the subsequent trading strategy, lead to distinct facets of gain.
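
The abstract above mentions a buy/sell rule governed by two thresholds that are tuned by hill climbing. A minimal sketch of that general idea follows; it is not the authors' implementation. The function names (`simulate_profit`, `hill_climb`), the one-share trading rule, the relative-change signal, and the step size are all assumptions made for illustration.

```python
from typing import Callable, Sequence, Tuple


def simulate_profit(actual: Sequence[float], predicted: Sequence[float],
                    buy_thr: float, sell_thr: float) -> float:
    """Hypothetical one-share strategy (an assumption, not the paper's rule).

    actual[t] is the close price on day t; predicted[t] is a forecast of the
    close price on day t + 1, made on day t.
    """
    cash, holding = 0.0, False
    for t in range(len(actual) - 1):
        expected_change = (predicted[t] - actual[t]) / actual[t]
        if not holding and expected_change > buy_thr:
            cash -= actual[t]          # buy one share at today's close
            holding = True
        elif holding and expected_change < -sell_thr:
            cash += actual[t]          # sell the share at today's close
            holding = False
    if holding:
        cash += actual[-1]             # liquidate on the last day
    return cash


def hill_climb(objective: Callable[[float, float], float],
               start: Tuple[float, float], step: float = 0.005,
               max_iter: int = 1000) -> Tuple[Tuple[float, float], float]:
    """Greedy hill climbing over two non-negative thresholds."""
    current, best = start, objective(*start)
    for _ in range(max_iter):
        # Axis-aligned neighbours, one step away, kept non-negative.
        neighbours = [(current[0] + dx, current[1] + dy)
                      for dx, dy in ((step, 0), (-step, 0), (0, step), (0, -step))]
        neighbours = [(b, s) for b, s in neighbours if b >= 0 and s >= 0]
        candidate = max(neighbours, key=lambda p: objective(*p))
        score = objective(*candidate)
        if score <= best:              # no neighbour improves: local optimum
            break
        current, best = candidate, score
    return current, best


# Toy data (made up for illustration only).
actual = [10, 11, 12, 11, 10, 11]
predicted = [11, 12, 11, 10, 11, 12]
params, profit = hill_climb(
    lambda b, s: simulate_profit(actual, predicted, b, s), start=(0.0, 0.0))
```

Plain hill climbing stops at the first local optimum, so the result depends on the starting point and step size; restarts from several starting points are a common remedy.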

Development of novel spectroscopic and machine learning methods for the measurement of periodic changes in COVID-19 antibody level

et al. 2022

Wiesław Paja

Active enhancer positions can be accurately predicted from chromatin marks and collective sequence motif data

All Relevant Feature Selection Methods and Applications

Characterization of Covid-19 infected pregnant women sera using laboratory indexes, vibrational spectroscopy, and machine learning classifications

Deep architectures for long-term stock price prediction with a heuristic-based strategy for trading simulations

Development of novel spectroscopic and machine learning methods for the measurement of periodic changes in COVID-19 antibody level
