Protein cysteine thiols can be divided into four groups based on their reactivities: those that form permanent structural disulfide bonds, those that coordinate with metals, those that remain in the reduced state, and those that are susceptible to reversible oxidation. Physicochemical parameters of oxidation-susceptible protein thiols were organized into a database named the Balanced Oxidation Susceptible Cysteine Thiol Database (BALOSCTdb). BALOSCTdb contains 161 cysteine thiols that undergo reversible oxidation and 161 cysteine thiols that are not susceptible to oxidation. Each cysteine was represented by a set of 12 parameters, one of which was a label (1/0) to indicate whether its thiol moiety is susceptible to oxidation. A computer program (the C4.5 decision tree classifier re-implemented as the J48 classifier) segregated cysteines into oxidation-susceptible and oxidation-non-susceptible classes. The classifier selected three parameters critical for prediction of thiol oxidation susceptibility: (1) distance to the nearest cysteine sulfur atom, (2) solvent accessibility, and (3) pKa. The classifier was optimized to correctly predict 136 of the 161 cysteine thiols susceptible to oxidation. Leave-one-out cross-validation analysis showed that the percent of correctly classified cysteines was 80.1% and that 16.1% of the oxidation-susceptible cysteine thiols were incorrectly classified. The algorithm developed from these parameters, named the Cysteine Oxidation Prediction Algorithm (COPA), is presented here.
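To make the idea of a three-parameter decision tree concrete, the following is a minimal sketch of how such a classifier could be applied to a single cysteine. The order of the tests, the direction of each comparison, and all three cut-off values are hypothetical placeholders chosen for illustration; they are not the values learned by the J48 classifier and should not be read as COPA itself.

```python
from dataclasses import dataclass

# HYPOTHETICAL cut-offs, for illustration only. The abstract names the three
# parameters but does not give thresholds, so these numbers are assumptions.
MAX_SULFUR_DISTANCE = 6.0   # assumed units: angstroms
MIN_ACCESSIBILITY = 1.0     # assumed units: square angstroms
MAX_PKA = 9.0


@dataclass(frozen=True)
class Cysteine:
    sulfur_distance: float  # distance to the nearest cysteine sulfur atom
    accessibility: float    # solvent accessibility of the thiol
    pka: float              # pKa of the thiol


def predict_susceptible(cys: Cysteine) -> int:
    """Return 1 (oxidation-susceptible) or 0, mirroring the 1/0 label.

    The tree shape below is an assumed example, not the published tree.
    """
    # Assumed first split: a nearby cysteine sulfur suggests a possible disulfide.
    if cys.sulfur_distance <= MAX_SULFUR_DISTANCE:
        return 1
    # Assumed second split: a buried thiol is treated as non-susceptible.
    if cys.accessibility < MIN_ACCESSIBILITY:
        return 0
    # Assumed third split: a lower pKa is treated as more reactive.
    return 1 if cys.pka <= MAX_PKA else 0


examples = [
    Cysteine(sulfur_distance=4.0, accessibility=0.0, pka=12.0),
    Cysteine(sulfur_distance=15.0, accessibility=0.2, pka=8.0),
    Cysteine(sulfur_distance=15.0, accessibility=20.0, pka=8.0),
    Cysteine(sulfur_distance=15.0, accessibility=20.0, pka=11.0),
]
predictions = [predict_susceptible(c) for c in examples]
```

A learned tree would take its split order and thresholds from the training data in BALOSCTdb; the sketch only shows how three numeric parameters can be turned into a 1/0 prediction.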
COPA prediction of oxidation-susceptible sites can be utilized to locate protein cysteines susceptible to redox-mediated regulation and identify possible enzyme catalytic sites with reactive cysteine thiols.
Keywords: cysteine; thiol; oxidation; prediction; C4.5; J48; redox; classifier; decision tree
Protein cysteine thiols can be divided into four broad categories: those that form permanent structural disulfide bonds, those that coordinate metals, those that are permanently in the reduced state, and those that are reversibly oxidized. Permanent structural disulfide bonds are formed during the folding process by oxidizing enzymes (for example, DsbA in bacteria and protein disulfide isomerase in eukaryotes) (Kadokura et al. 2003; Maattanen et al. 2006). Permanent structural disulfide bonds are typically observed in oxidizing environments such as extracellular spaces and the endoplasmic reticulum. Protein cysteine thiols can be coordinated to metal ions, typically iron, copper, or zinc. Metal-coordinated thiols are found in oxidizing environments and in the cytosolic compartment of the cell. The remaining protein cysteine thiols in the cytosol are either permanently reduced or are susceptible to reversible oxidation (Thomas et al. 1995). The reversibly oxidized protein thiols (ROPTs) in the cytosol are often required for enzyme catalysis or for regulation of protein activity (Finkel 2003; Linke and Jakob 2003).
Reprint requests to: Jamil Momand, Department of Chemistry and Biochemistry, California State Univ...
The MapReduce approach has been popular for large-scale data processing since Google implemented its platform on the Google File System (GFS), followed by Amazon Web Services (AWS) offering the Apache Hadoop platform on inexpensive computing nodes. MapReduce motivates redesigning and converting existing sequential algorithms into a restricted form of parallel programming; accordingly, this paper proposes a Market Basket Analysis algorithm for MapReduce that also makes use of the Apriori property. Two algorithms are proposed: one adapts the existing Apriori algorithm, and the other is a simple algorithm that sorts the data sets and converts them to (key, value) pairs to fit MapReduce. The code is executed on the Amazon EC2 MapReduce platform. The experimental results show that the Apriori algorithm does not perform as well as the simple algorithm. With the simple algorithm, the MapReduce code gains performance as nodes are added, but beyond a certain point a bottleneck prevents further gains. The operations of distributing, aggregating, and reducing data in MapReduce are believed to cause the bottleneck.
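The idea of sorting each transaction and converting it to (key, value) pairs can be illustrated with a small single-process sketch. It is not the paper's implementation and does not use Hadoop: the function names, the restriction to item pairs, and the sample baskets are assumptions made for illustration, and the shuffle step merely imitates what a MapReduce framework would do between the map and reduce phases.

```python
from collections import defaultdict
from itertools import combinations


def map_transaction(transaction):
    """Map step: sort the items, then emit ((item_a, item_b), 1) per pair.

    Sorting ensures that (milk, bread) and (bread, milk) share one key.
    """
    items = sorted(set(transaction))
    for pair in combinations(items, 2):
        yield pair, 1


def shuffle(mapped):
    """Group emitted values by key, as the framework would between phases."""
    groups = defaultdict(list)
    for key, value in mapped:
        groups[key].append(value)
    return groups


def reduce_counts(groups):
    """Reduce step: sum the values collected for each key."""
    return {key: sum(values) for key, values in groups.items()}


def count_pairs(transactions):
    """Run map, shuffle, and reduce in sequence over all transactions."""
    mapped = (kv for t in transactions for kv in map_transaction(t))
    return reduce_counts(shuffle(mapped))


# Hypothetical sample baskets, used only to exercise the sketch.
baskets = [
    ["milk", "bread", "eggs"],
    ["bread", "milk"],
    ["eggs", "bread"],
]
pair_counts = count_pairs(baskets)
```

On a real cluster the map and reduce functions would run on separate nodes, and the grouping performed here by `shuffle` is the kind of distribution and aggregation work that the abstract suspects of causing the bottleneck.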