A full description of the human proteome relies on the challenging task of detecting mature and changing forms of protein molecules in the body. Large-scale proteome analysis1 has routinely involved digesting intact proteins followed by inferred protein identification using mass spectrometry (MS)2. This “bottom-up” process affords a high number of identifications (not always unique to a single gene). However, complications arise from incomplete or ambiguous2 characterization of alternative splice forms, diverse modifications (e.g., acetylation and methylation), and endogenous protein cleavages, especially when combinations of these create complex patterns of intact protein isoforms and species3. “Top-down” interrogation of whole proteins can overcome these problems for individual proteins4,5, but has not been achieved on a proteome scale due to the lack of intact protein fractionation methods that are well integrated with tandem MS. Here we show, using a new four-dimensional (4D) separation system, identification of 1,043 gene products from human cells that are dispersed into >3,000 protein species created by post-translational modification, RNA splicing, and proteolysis. The overall system produced >20-fold increases in both separation power and proteome coverage, enabling the identification of proteins up to 105 kilodaltons and those with up to 11 transmembrane helices. Many previously undetected isoforms of endogenous human proteins were mapped, including changes in multiply-modified species in response to accelerated cellular aging (senescence) induced by DNA damage. Integrated with the latest version of the Swiss-Prot database6, the data provide precise correlations to individual genes and proof-of-concept for large-scale interrogation of whole protein molecules. The technology promises to improve the link between proteomics data and complex phenotypes in basic biology and disease research7.
We present a library of efficient implementations of deep learning primitives. Deep learning workloads are computationally intensive, and optimizing their kernels is difficult and time-consuming. As parallel architectures evolve, kernels must be reoptimized, which makes maintaining codebases difficult over time. Similar issues have long been addressed in the HPC community by libraries such as the Basic Linear Algebra Subroutines (BLAS) [2]. However, there is no analogous library for deep learning. Without such a library, researchers implementing deep learning workloads on parallel processors must create and optimize their own implementations of the main computational kernels, and this work must be repeated as new parallel processors emerge. To address this problem, we have created a library similar in intent to BLAS, with optimized routines for deep learning workloads. Our implementation contains routines for GPUs, although similarly to the BLAS library, these routines could be implemented for other platforms. The library is easy to integrate into existing frameworks, and provides optimized performance and memory usage. For example, integrating cuDNN into Caffe, a popular framework for convolutional networks, improves performance by 36% on a standard model while also reducing memory consumption.
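The abstract frames the library as a BLAS analogue for deep learning, and a standard way such libraries lower convolution onto existing GEMM routines is the im2col transformation. The sketch below is an illustrative NumPy implementation of that idea, not the library's actual API; the names `im2col` and `conv2d`, and the stride-1/valid-padding restriction, are simplifying assumptions.

```python
import numpy as np

def im2col(x, kh, kw):
    """Unroll (C, H, W) input into a (C*kh*kw, out_h*out_w) matrix of
    receptive-field columns (stride 1, no padding)."""
    C, H, W = x.shape
    out_h, out_w = H - kh + 1, W - kw + 1
    cols = np.empty((C * kh * kw, out_h * out_w))
    row = 0
    for c in range(C):
        for i in range(kh):
            for j in range(kw):
                # Each (c, i, j) offset contributes one row of the matrix.
                cols[row] = x[c, i:i + out_h, j:j + out_w].reshape(-1)
                row += 1
    return cols

def conv2d(x, w):
    """Convolve filters w of shape (K, C, kh, kw) with input x of shape
    (C, H, W) by reducing the problem to a single matrix multiply."""
    K, C, kh, kw = w.shape
    cols = im2col(x, kh, kw)
    out = w.reshape(K, -1) @ cols          # the GEMM step
    out_h, out_w = x.shape[1] - kh + 1, x.shape[2] - kw + 1
    return out.reshape(K, out_h, out_w)
```

Lowering onto GEMM is attractive precisely because highly tuned matrix-multiply kernels already exist for each architecture, which is the reuse argument the abstract makes.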
Although well established as a technique for protein purification, the application of continuous elution tube gel electrophoresis to proteome fractionation remains problematic. Difficulties associated with sample collection, particularly at the high mass range or at low sample loadings, continue to plague the technique. Furthermore, an upper mass limit is imposed as slow-moving higher molecular weight proteins are progressively diluted during the collection phase. In short, with current technology, effective separation over a broad mass range has not been achieved. In this work, we present improved techniques for continuous elution tube gel electrophoresis to accommodate broad mass range separation of proteins. Our device enables rapid partitioning of a proteome into discrete mass range fractions in the solution phase. High recovery is achieved at submicrogram to milligram sample loadings. We demonstrate comprehensive, reproducible separations of protein mixtures, as well as separation of a proteome in as little as 1 h, over mass ranges from below 10 to 250 kDa. Finally, we identified proteins from a prefractionated standard protein mixture using liquid chromatography tandem mass spectrometry (LC-MS/MS) analysis.
Neural networks are both computationally intensive and memory intensive, making them difficult to deploy on embedded systems. Also, conventional networks fix the architecture before training starts; as a result, training cannot improve the architecture. To address these limitations, we describe a method to reduce the storage and computation required by neural networks by an order of magnitude, without affecting their accuracy, by learning only the important connections. Our method prunes redundant connections using a three-step procedure. First, we train the network to learn which connections are important. Next, we prune the unimportant connections. Finally, we retrain the network to fine-tune the weights of the remaining connections. On the ImageNet dataset, our method reduced the number of parameters of AlexNet by 9×, from 61 million to 6.7 million, without incurring accuracy loss. Similar experiments with VGG-16 found that the total number of parameters can be reduced by 13×, from 138 million to 10.3 million, again with no loss of accuracy.
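The train-prune-retrain loop above hinges on identifying unimportant connections; a common proxy, consistent with the description, is weight magnitude. Below is a minimal NumPy sketch of the pruning step only (the function name `prune_by_magnitude` and the tie-breaking behavior at the threshold are illustrative assumptions, not the paper's exact procedure):

```python
import numpy as np

def prune_by_magnitude(weights, sparsity):
    """Zero out roughly the smallest-magnitude `sparsity` fraction of
    weights. Returns (pruned_weights, mask); during the retraining step,
    gradients would be multiplied by `mask` so pruned connections stay
    at zero while the surviving weights are fine-tuned."""
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)        # number of weights to prune
    if k == 0:
        return weights.copy(), np.ones_like(weights, dtype=bool)
    # k-th smallest absolute value becomes the pruning threshold.
    threshold = np.partition(flat, k - 1)[k - 1]
    mask = np.abs(weights) > threshold   # keep strictly above threshold
    return weights * mask, mask
```

In practice this is applied layer by layer, and the sparsity level per layer is tuned so that accuracy on a validation set is preserved after retraining.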