Alona Kryshchenko scite author profile

In this paper we present NPEST, a novel tool for the analysis of expressed sequence tags (EST) distributions and transcription start site (TSS) prediction. This method estimates an unknown probability distribution of ESTs using a maximum likelihood (ML) approach, which is then used to predict positions of TSS. Accurate identification of TSS is an important genomics task, since the position of regulatory elements with respect to the TSS can have large effects on gene regulation, and performance of promoter motif-finding methods depends on correct identification of TSSs. Our probabilistic approach expands recognition capabilities to multiple TSS per locus that may be a useful tool to enhance the understanding of alternative splicing mechanisms. This paper presents analysis of simulated data as well as statistical analysis of promoter regions of a model dicot plant Arabidopsis thaliana. Using our statistical tool we analyzed 16520 loci and developed a database of TSS, which is now publicly available at www.glacombio.net/NPEST.

show abstract

An Algorithm for Nonparametric Estimation of A Multivariate Mixing Distribution with Applications to Population Pharmacokinetics

Yamada¹,

Neely²,

Bartroff³

et al. 2019

Preprint

View full text Add to dashboard Cite

In this paper we describe a nonparametric maximum likelihood (NPML) method for estimating multivariate mixing distributions. Given $N$ independent observations, convexity theory shows that the NPML estimator is discrete with at most $N$ support points. The original infinite NPML problem then becomes the finite dimensional problem of finding the location and probability of the support points. The probability of the support points is found by a Primal-Dual Interior-Point method; the location of the support points is found by an Adaptive Grid method. Our method is able to handle high-dimensional and complex multivariate mixture models.An important application is discussed for the problem of population pharmacokinetics and a non-trivial example is treated.Our algorithm has been successfully applied in hundreds of published pharmacometric studies. In addition to population pharmacokinetics, this research also applies to empirical Bayes estimation and many other areas of applied mathematics.

show abstract

An Algorithm for Nonparametric Estimation of a Multivariate Mixing Distribution with Applications to Population Pharmacokinetics

et al. 2020

View full text Add to dashboard Cite

Population pharmacokinetic (PK) modeling has become a cornerstone of drug development and optimal patient dosing. This approach offers great benefits for datasets with sparse sampling, such as in pediatric patients, and can describe between-patient variability. While most current algorithms assume normal or log-normal distributions for PK parameters, we present a mathematically consistent nonparametric maximum likelihood (NPML) method for estimating multivariate mixing distributions without any assumption about the shape of the distribution. This approach can handle distributions with any shape for all PK parameters. It is shown in convexity theory that the NPML estimator is discrete, meaning that it has finite number of points with nonzero probability. In fact, there are at most N points where N is the number of observed subjects. The original infinite NPML problem then becomes the finite dimensional problem of finding the location and probability of the support points. In the simplest case, each point essentially represents the set of PK parameters for one patient. The probability of the points is found by a primal-dual interior-point method; the location of the support points is found by an adaptive grid method. Our method is able to handle high-dimensional and complex multivariate mixture models. An important application is discussed for the problem of population pharmacokinetics and a nontrivial example is treated. Our algorithm has been successfully applied in hundreds of published pharmacometric studies. In addition to population pharmacokinetics, this research also applies to empirical Bayes estimation and many other areas of applied mathematics. Thereby, this approach presents an important addition to the pharmacometric toolbox for drug development and optimal patient dosing.

show abstract

Semi-supervised Nonnegative Matrix Factorization for Document Classification

Haddock

Kassab

et al. 2021

View full text Add to dashboard Cite

We propose new semi-supervised nonnegative matrix factorization (SSNMF) models for document classification and provide motivation for these models as maximum likelihood estimators. The proposed SSNMF models simultaneously provide both a topic model and a model for classification, thereby offering highly interpretable classification results. We derive training methods using multiplicative updates for each new model, and demonstrate the application of these models to single-label and multi-label document classification, although the models are flexible to other supervised learning tasks such as regression. We illustrate the promise of these models and training methods on document classification datasets (e.g., 20 Newsgroups, Reuters).

show abstract

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

Contact Info

hi@scite.ai

10624 S. Eastern Ave., Ste. A-614

Henderson, NV 89052, USA

Blog Terms and Conditions API Terms Privacy Policy Contact Cookie Preferences Do Not Sell or Share My Personal Information

Made with 💙 for researchers

Part of the Research Solutions Family.