As computational biologists are inundated with ever-increasing volumes of metagenomic data, developing analysis approaches that keep pace with the growth of sequence archives remains a challenge. In recent years, the accelerating availability of genomic data has been accompanied by the application of a wide array of highly efficient approaches from other fields to metagenomics. For instance, sketching algorithms such as MinHash have seen rapid and widespread adoption. These techniques handle increasingly large datasets with minimal sacrifices in quality for tasks such as sequence similarity calculation. Here, we briefly review the fundamentals of the most impactful probabilistic and signal processing algorithms. We also highlight more recent advances to complement previous, broader reviews of these areas. We then explore the application of these techniques to metagenomics, discuss their pros and cons, and speculate on their future directions.
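To make the sketching idea concrete, the following is a minimal, illustrative MinHash sketch in Python. The k-mer size, number of hash functions, and the seeded-SHA-1 hashing scheme are assumptions chosen for clarity, not a description of any particular tool's implementation; the estimated Jaccard similarity is only approximate and tightens as the signature grows.

```python
import hashlib

def kmers(seq, k=4):
    """Decompose a sequence into its set of overlapping k-mers."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def minhash_signature(kmer_set, num_hashes=64):
    """For each of num_hashes seeded hash functions, keep the minimum
    hash value observed over all k-mers (the MinHash signature)."""
    sig = []
    for seed in range(num_hashes):
        sig.append(min(
            int(hashlib.sha1(f"{seed}:{kmer}".encode()).hexdigest(), 16)
            for kmer in kmer_set
        ))
    return sig

def estimate_jaccard(sig_a, sig_b):
    """The fraction of signature slots where the minima agree is an
    unbiased estimator of the Jaccard similarity of the k-mer sets."""
    matches = sum(a == b for a, b in zip(sig_a, sig_b))
    return matches / len(sig_a)

a = kmers("ACGTACGTACGTGGTT")
b = kmers("ACGTACGTACGTGGAA")
true_j = len(a & b) / len(a | b)  # exact Jaccard for comparison
est_j = estimate_jaccard(minhash_signature(a), minhash_signature(b))
```

The appeal in metagenomics is that two fixed-size signatures can be compared without revisiting the (potentially enormous) underlying k-mer sets.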
The scalable design of safe guide RNA sequences for CRISPR gene editing depends on the computational "scoring" of DNA locations that may be edited. As there is no widely accepted benchmark dataset for comparing scoring models, we present a curated "TrueOT" dataset containing thoroughly validated datapoints that best reflect the properties of in vivo editing. Many existing models are trained on data from high-throughput assays. We hypothesize that such models may transfer suboptimally to the low-throughput data in TrueOT due to fundamental biological differences between proxy assays and in vivo behavior. We developed new Siamese convolutional neural networks, trained them on a proxy dataset, and compared their performance against existing models on TrueOT. Our simplest model, with a single convolutional and pooling layer, surprisingly exhibits state-of-the-art performance on TrueOT. Adding subsequent layers improves performance on the proxy dataset while compromising performance on TrueOT. We demonstrate that model complexity can only improve performance on TrueOT if transfer learning techniques are employed. These results suggest an urgent need for the CRISPR community to agree upon a benchmark dataset such as TrueOT and highlight that various sources of CRISPR data cannot be assumed to be equivalent. Our codebase and datasets are available on GitHub at github.com/baolab-rice/CRISPR_OT_scoring.
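The Siamese arrangement described above can be sketched with NumPy. This is an illustrative toy, not the paper's model: the weights are random and untrained, and the filter count, filter width, and Euclidean-distance head are assumptions. It shows only the structural idea of a single shared convolution-plus-pooling branch applied to both the guide and the candidate off-target site.

```python
import numpy as np

rng = np.random.default_rng(0)

def one_hot(seq):
    """One-hot encode a DNA sequence as a (length, 4) array."""
    idx = {"A": 0, "C": 1, "G": 2, "T": 3}
    out = np.zeros((len(seq), 4))
    for i, base in enumerate(seq):
        out[i, idx[base]] = 1.0
    return out

# Shared branch weights: 8 convolutional filters of width 5 over 4 channels.
W = rng.normal(scale=0.1, size=(8, 5, 4))

def embed(seq):
    """One conv layer + ReLU + global max pooling -> an 8-dim embedding."""
    x = one_hot(seq)
    n_windows = x.shape[0] - W.shape[1] + 1
    conv = np.array([[np.sum(W[f] * x[i:i + 5]) for i in range(n_windows)]
                     for f in range(W.shape[0])])
    conv = np.maximum(conv, 0.0)  # ReLU
    return conv.max(axis=1)       # global max pool over positions

def siamese_distance(guide, off_target):
    """Both inputs pass through the SAME branch (shared weights); the
    distance between embeddings serves as the similarity score."""
    return float(np.linalg.norm(embed(guide) - embed(off_target)))

d_same = siamese_distance("ACGTACGTACGTACGTACGT", "ACGTACGTACGTACGTACGT")
d_diff = siamese_distance("ACGTACGTACGTACGTACGT", "TTTTGGGGCCCCAAAATTTT")
```

Weight sharing is the key property: identical inputs are guaranteed identical embeddings, so the network learns a sequence-pair geometry rather than two independent representations.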
Microfluidics can split samples into thousands or millions of partitions such as droplets or nanowells. Partitions capture analytes according to a Poisson distribution, and in diagnostics, the analyte concentration is commonly calculated with a closed-form solution via maximum likelihood estimation (MLE). Here, we present a generalization of MLE with microfluidics, an extension of our previously developed Sparse Poisson Recovery (SPoRe) algorithm, and an in vitro demonstration with droplet digital PCR (ddPCR) of the new capabilities that SPoRe enables. Many applications such as infection diagnostics require sensitive detection and broad-range multiplexing. Digital PCR coupled with conventional target-specific sensors yields the former but is constrained in multiplexing by the number of available measurement channels (e.g., fluorescence). In our demonstration, we circumvent these limitations by broadly amplifying bacteria with 16S ddPCR and assigning barcodes to nine pathogen genera using only five nonspecific probes. Moreover, we measure only two probes at a time in multiple groups of droplets given our two-channel ddPCR system. Although individual droplets are ambiguous in their bacterial content, our results show that the concentrations of bacteria in the sample can be uniquely recovered given the pooled distribution of partition measurements from all groups. We ultimately achieve stable quantification down to approximately 200 total copies of the 16S gene per sample, enabling a suite of clinical applications given a robust upstream microbial DNA extraction procedure. We develop new theory that generalizes the application of this framework to a broad class of realistic sensors and applications, and we prove scaling rules for system design to achieve further expanded multiplexing. This flexibility means that the core principles and capabilities demonstrated here can generalize to most biosensing applications with microfluidic partitioning.
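The closed-form MLE mentioned above follows from Poisson statistics: if partitions capture copies at rate λ, the probability of an empty partition is e^(-λ), so λ̂ = -ln(fraction negative). A minimal sketch, with illustrative droplet counts and volume (the ~0.85 nL droplet volume and counts below are assumptions, not values from the paper):

```python
import math

def poisson_mle_concentration(n_total, n_negative, partition_volume_ul):
    """Closed-form MLE for concentration from digital PCR counts.

    The fraction of negative (empty) partitions estimates e^(-lambda),
    giving lambda = -ln(n_negative / n_total) copies per partition,
    which is then converted to copies per microliter.
    """
    if n_negative == 0 or n_negative == n_total:
        raise ValueError("MLE undefined when all partitions are positive or all are empty")
    lam = -math.log(n_negative / n_total)  # mean copies per partition
    return lam / partition_volume_ul       # copies per microliter

# Illustrative run: 20,000 droplets of ~0.85 nL each, 5,000 of them negative.
conc = poisson_mle_concentration(20_000, 5_000, 0.85e-3)
```

SPoRe generalizes beyond this single-analyte closed form: rather than thresholding each partition as positive or negative for one target, it recovers a sparse vector of analyte concentrations from the pooled distribution of ambiguous partition measurements.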