Machine learning could enable an unprecedented level of control in protein engineering for therapeutic and industrial applications. To be useful for designing proteins with desired properties, machine learning models must capture the protein sequence-function relationship, often termed the fitness landscape. Existing benchmarks such as CASP and CAFA assess structure and function predictions of proteins, respectively, yet they do not target metrics relevant for protein engineering. In this work, we introduce Fitness Landscape Inference for Proteins (FLIP), a benchmark for function prediction to encourage rapid scoring of representation learning for protein engineering. Our curated tasks, baselines, and metrics probe model generalization in settings relevant for protein engineering, e.g. low-resource and extrapolative. Currently, FLIP encompasses experimental data across adeno-associated virus stability for gene therapy, protein domain B1 stability and immunoglobulin binding, and thermostability from multiple protein families. To enable ease of use and future expansion to new tasks, all data are presented in a standard format. FLIP scripts and data are freely accessible at https://benchmark.protein.properties.
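As an illustration of the low-resource and extrapolative settings named above, the following is a minimal sketch of how such train/test splits might be built. It is not FLIP's actual splitting code: the `Variant` type, the function names, the toy wild-type sequence, and the fitness values are all hypothetical, and the sketch assumes substitution-only variants of equal length.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """Hypothetical record: one protein variant and its measured fitness."""
    sequence: str
    fitness: float


def count_mutations(sequence: str, wild_type: str) -> int:
    """Number of substituted positions relative to the wild type."""
    if len(sequence) != len(wild_type):
        raise ValueError("this sketch handles equal-length variants only")
    return sum(a != b for a, b in zip(sequence, wild_type))


def extrapolative_split(variants, wild_type, max_train_mutations):
    """Train on variants close to wild type, test on more distant ones."""
    train, test = [], []
    for v in variants:
        if count_mutations(v.sequence, wild_type) <= max_train_mutations:
            train.append(v)
        else:
            test.append(v)
    return train, test


def low_resource_split(variants, n_train, seed=0):
    """Keep only n_train randomly chosen variants for training."""
    rng = random.Random(seed)
    shuffled = list(variants)
    rng.shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]


# Toy data, invented for illustration only.
WILD_TYPE = "MKTA"
VARIANTS = [
    Variant("MKTA", 1.0),  # wild type, 0 mutations
    Variant("MKTG", 0.8),  # 1 mutation
    Variant("AKTG", 0.5),  # 2 mutations
    Variant("AKCG", 0.1),  # 3 mutations
]

train, test = extrapolative_split(VARIANTS, WILD_TYPE, max_train_mutations=1)
small_train, held_out = low_resource_split(VARIANTS, n_train=1)
```

Here the extrapolative split trains on the wild type and single mutants and evaluates on variants with two or more mutations, while the low-resource split simply restricts the amount of training data. A model that scores well on a random split can still fail on either of these.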
Widespread availability of protein sequence-fitness data would revolutionize both our biochemical understanding of proteins and our ability to engineer them. Unfortunately, even though thousands of protein variants are generated and evaluated for fitness during a typical protein engineering campaign, most are never sequenced, leaving a wealth of potential sequence-fitness information untapped. This largely stems from the fact that sequencing is unnecessary for many protein engineering strategies; the added cost and effort of sequencing are thus unjustified. Here, we present every variant sequencing (evSeq), an efficient protocol for sequencing a variable region within every variant gene produced during a protein engineering campaign at a cost of cents per variant. Execution of evSeq is simple, requires no sequencing experience, relies only on resources and services typically available to biology labs, and slots neatly into existing protein engineering workflows. Analysis of evSeq data is likewise made simple by its accompanying software (found at github.com/fhalab/evSeq, documentation at fhalab.github.io/evSeq), which can be run on a personal laptop and was designed to be accessible to users with no computational experience. Low-cost and easy to use, evSeq makes collection of extensive protein variant sequence-fitness data practical.
Polymers are becoming more important in all economic sectors, and as environmental concerns grow, biopolymers are replacing metal- or oil-based polymers. Two of these polymers are polyhydroxybutyrate (PHB), which is known for good mechanical characteristics and its production from renewable resources, and polylactic acid (PLA), which is known for fast degradation rates and great benefits for the packaging industry. They exhibit properties that make them competitive alternatives to less eco-friendly polymers, but their processing techniques are not as well researched. In this study, we performed in vitro degradation and drug release studies with pure PHB and PLA and PHB/PLA blends (1:3). For this purpose, the polymers were stored at 65°C in PBS buffer under largely static conditions to simulate intraosseous localization. The mass loss of all samples indicates degradation of all polymers, which was confirmed by decreasing molecular weight, decreasing pH, increasing crystallinity, and decreasing water contact angle. Following these measurements, a 60-day drug release study was performed, which revealed a four-phase drug release mechanism, including a diffusion-controlled initial burst release, especially elevated for the investigated blends due to eased medium interpenetration, and a secondary burst release after 20 days for both blends and the pure PLLA-Biomer with lower molecular weight. The intensity of the secondary burst release corresponded to the observed degradation characteristics, suggesting a degradation-controlled drug release.