Identifying peptides, which are short polymeric chains of amino acid residues in a protein sequence, is of fundamental importance in systems biology research. The most popular approach to peptide identification is database search. In this approach, an experimental spectrum (the "query"), generated from fragments of a target peptide using mass spectrometry, is computationally compared with a database of known protein sequences. The goal is to detect the database peptides that are most likely to have generated the target peptide. The exponential growth rates and overwhelming sizes of biomolecular databases make this an ideal application for parallel computing. However, the present generation of software tools is not expected to scale to the magnitudes and complexities of data that will be generated in the next few years, because they are all either serial algorithms or parallel strategies designed over inherently serial methods, and therefore have high space and time requirements. In this paper, we present an efficient parallel approach for peptide identification through database search. Three key factors distinguish our approach from existing solutions: i) (space) given p processors and a database with N residues, we provide the first space-optimal algorithm (O(N/p)) under the distributed-memory machine model; ii) (time) our algorithm uses a combination of parallel techniques, such as one-sided communication and masking of communication with computation, to ensure that the overhead introduced by parallelism is minimal; and iii) (quality) the run-time savings achieved using parallel processing have allowed us to incorporate highly accurate statistical models that have previously been demonstrated to ensure high-quality prediction, albeit on smaller-scale data. We present the design and evaluation of two different algorithms that implement our approach.
Experimental results using 2.65 million microbial proteins show linear scaling up to 128 processors of a Linux commodity cluster, with parallel efficiency at ∼50%. We expect that this new approach will be critical to meet the data-intensive and qualitative demands stemming from this important application domain.
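The O(N/p) space bound mentioned above can be illustrated with a minimal sketch of how a protein database might be split across p processors. This is not the paper's algorithm; the function names `partition_by_residues` and `residues_per_rank` are hypothetical, and the sketch assumes whole proteins are assigned in database order, so each rank holds at most about N/p residues plus the length of one protein.

```python
from typing import List, Sequence


def partition_by_residues(proteins: Sequence[str], p: int) -> List[List[str]]:
    """Assign proteins, in order, to p ranks so each holds roughly N/p residues.

    Hypothetical helper for illustration only; a protein goes to the rank
    that owns the residue offset at which the protein starts.
    """
    if p <= 0:
        raise ValueError("p must be positive")
    total = sum(len(seq) for seq in proteins)  # N, the total residue count
    parts: List[List[str]] = [[] for _ in range(p)]
    seen = 0  # residues placed so far
    for seq in proteins:
        # Map the starting offset to a rank; clamp to the last rank.
        rank = min(p - 1, seen * p // total) if total else 0
        parts[rank].append(seq)
        seen += len(seq)
    return parts


def residues_per_rank(parts: List[List[str]]) -> List[int]:
    """Return the number of residues stored on each rank."""
    return [sum(len(seq) for seq in part) for part in parts]
```

With this layout, no rank stores the whole database, which is the property the space bound refers to; how queries are then matched against remote partitions (for example with one-sided communication) is outside the scope of this sketch.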
A Mobile Ad hoc Network (MANET) is a wireless network in which nodes communicate through other nodes without the aid of a base station. Security is a major challenge in MANETs because packets are vulnerable to attacks and eavesdropping in a wireless environment. Generally, the MAC layer provides security in such wireless networks through encryption and authentication, using the WEP protocol. Many authentication and encryption techniques have been proposed to increase the security of MANETs, but stronger security leads to greater energy loss, as mobile nodes have limited energy and processing capability. In this work, a cross-layer, timestamp-based network security technique is developed. The technique reduces the encryption packet overhead caused by public key exchange (PKE) by deriving the public key directly from the neighbor table, which is transmitted during routing information exchange. The simulation is performed with the OMNeT++ simulator. Performance results demonstrate that the energy overhead and performance compromise due to encryption are very low in the proposed system. Furthermore, because the protocol is embedded in the network layer, it can easily be adopted in any existing architecture without modifying the MAC or physical layer standards or protocols.
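As a rough illustration of the idea of reading a neighbor's public key from a table populated by routing updates, rather than running a separate key exchange, the following sketch may help. It is not the protocol from the abstract: the class names, the `max_age` freshness window, and the byte-string key format are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class NeighborEntry:
    public_key: bytes  # key carried in a routing update (hypothetical format)
    last_seen: float   # timestamp of that update, in seconds


class NeighborTable:
    """Hypothetical neighbor table keyed by node identifier."""

    def __init__(self, max_age: float = 5.0) -> None:
        self.max_age = max_age  # assumed freshness window, in seconds
        self._entries: Dict[str, NeighborEntry] = {}

    def on_routing_update(self, node_id: str, public_key: bytes,
                          timestamp: float) -> None:
        """Store the key from a routing update unless it is older than what we hold."""
        current = self._entries.get(node_id)
        if current is None or timestamp > current.last_seen:
            self._entries[node_id] = NeighborEntry(public_key, timestamp)

    def key_for(self, node_id: str, now: float) -> Optional[bytes]:
        """Return the neighbor's key if its entry is still fresh, else None."""
        entry = self._entries.get(node_id)
        if entry is None or now - entry.last_seen > self.max_age:
            return None
        return entry.public_key
```

The timestamp check here only rejects stale or out-of-order updates; a real design would also need to authenticate the routing updates themselves, which this sketch does not attempt.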
We report the development of a novel high-performance computing method for the identification of proteins from unknown (environmental) samples. The method uses computational optimization to provide an effective way to control the false discovery rate for environmental samples and complements de novo peptide sequencing. Furthermore, the method provides information based on the proteins expressed in a microbial community, and thus complements DNA-based identification methods. Testing on blind samples demonstrates that the method provides 79-95% overlap with analogous results from searches involving only the correct genomes. We provide scaling and performance evaluations for the software that demonstrate the ability to carry out large-scale optimizations on 1258 genomes containing 4.2M proteins.
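For readers unfamiliar with the false discovery rate in this setting, the sketch below shows one common way to estimate it, the target-decoy approach, in which matches to deliberately incorrect "decoy" sequences are used to estimate how many accepted matches are wrong. The abstract does not say which estimator the method uses, so this is a stand-in technique, and the function name `fdr_at_threshold` is hypothetical.

```python
from typing import Sequence


def fdr_at_threshold(scores: Sequence[float], is_decoy: Sequence[bool],
                     threshold: float) -> float:
    """Estimate the FDR among matches scoring at or above `threshold`.

    Uses the simple target-decoy ratio (decoy hits / target hits), capped at 1.
    """
    targets = sum(1 for s, d in zip(scores, is_decoy) if s >= threshold and not d)
    decoys = sum(1 for s, d in zip(scores, is_decoy) if s >= threshold and d)
    if targets == 0:
        # No accepted target matches: report 0 if nothing passed, else 1.
        return 1.0 if decoys else 0.0
    return min(1.0, decoys / targets)
```

Raising the threshold generally lowers the estimated FDR at the cost of accepting fewer matches, which is the trade-off any FDR control procedure has to manage.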