Abstract: We study the following problem: how can we efficiently find, in a collection of strings, those similar to a given query string? Various similarity functions can be used, such as edit distance, Jaccard similarity, and cosine similarity. This problem is of great interest to a variety of applications that need high real-time performance, such as data cleaning, query relaxation, and spellchecking. Several algorithms have been proposed based on the idea of merging inverted lists of grams generated from the strings. In this paper we make two contributions. First, we develop several algorithms that can greatly improve the performance of existing algorithms. Second, we study how to integrate existing filtering techniques with these algorithms, and show that they should be used together judiciously, since the way the integration is done can greatly affect performance. We have conducted experiments on several real data sets to evaluate the proposed techniques.
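The general idea of merging inverted lists of grams can be illustrated with a minimal sketch. It is not one of the algorithms developed in the paper; it is the basic count-filter baseline, assuming edit distance as the similarity function and unpadded 2-grams. All function names (`tagged_grams`, `build_index`, `search`) are hypothetical. The sketch relies on the fact that one edit operation destroys at most q grams, so a string within edit distance k of the query must share at least (number of query grams) - k*q grams with it.

```python
from collections import defaultdict


def tagged_grams(s, q=2):
    """Overlapping q-grams of s, each tagged with its occurrence number
    so that repeated grams are counted as a multiset."""
    seen = defaultdict(int)
    out = []
    for i in range(len(s) - q + 1):
        g = s[i:i + q]
        out.append((g, seen[g]))
        seen[g] += 1
    return out


def build_index(strings, q=2):
    """Inverted index: tagged gram -> list of ids of strings containing it."""
    index = defaultdict(list)
    for sid, s in enumerate(strings):
        for g in tagged_grams(s, q):
            index[g].append(sid)
    return index


def edit_distance(a, b):
    """Classic dynamic-programming Levenshtein distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def search(strings, index, query, k, q=2):
    """Return all strings within edit distance k of query, sorted."""
    qgrams = tagged_grams(query, q)
    threshold = len(qgrams) - k * q  # minimum number of shared grams
    if threshold <= 0:
        # The count filter gives no pruning power; verify every string.
        candidates = range(len(strings))
    else:
        # Merge the inverted lists of the query grams by counting how
        # often each string id occurs across them.
        counts = defaultdict(int)
        for g in qgrams:
            for sid in index.get(g, ()):
                counts[sid] += 1
        candidates = [sid for sid, c in counts.items() if c >= threshold]
    # Candidates may include false positives, so verify each one.
    return sorted(strings[sid] for sid in candidates
                  if edit_distance(strings[sid], query) <= k)


strings = ["spelling", "spilling", "selling", "smelling", "telling", "database"]
index = build_index(strings)
matches = search(strings, index, "spelling", k=1)
```

Here "telling" passes the count filter but is rejected during verification, which is why the filter step and the verification step are kept separate.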
We report on the layer-by-layer design principles of poly(methacrylic acid) (PMAA) ultrathin hydrogel coatings that release antimicrobial agents (AmAs) in response to pH variations. The studied AmAs include gentamicin and an antibacterial cationic peptide L5. Adipic acid dihydrazide (AADH) is a cross-linker which, relative to ethylenediamine (EDA), increases the hydrogel hydrophobicity and introduces centers for hydrogen bonding to AmAs. AmA retention in AADH-cross-linked hydrogels in high-salt solutions was enhanced while AmA release at low pH was suppressed. L5 retains its antibacterial activity toward planktonic Staphylococcus epidermidis after release from PMAA hydrogels in response to pH decreases in the surrounding medium due to bacterial growth. Staphylococcus epidermidis adhesion and colonization were almost completely inhibited by L5 loading of hydrogels. The AmA-releasing and AmA-retaining properties of these hydrogel coatings provide new opportunities to study the fundamental mechanisms of AmA-coating-bacteria interactions and to develop a new class of clinically relevant antibacterial coatings for medical devices.
Neddylation, the covalent attachment of the ubiquitin-like protein Nedd8, regulates the ubiquitylation activity of Cullin-RING E3 ligase family members. However, regulation of HECT ligases by neddylation has not been reported to date. Here we show that the C2-WW-HECT ligase Smurf1 is activated by neddylation. Smurf1 physically interacts with Nedd8 and Ubc12, forms a Nedd8-thioester intermediate, and then catalyses its own neddylation on multiple lysine residues. Intriguingly, this autoneddylation needs an active site at C426 in the HECT N-lobe. Neddylation of Smurf1 potently enhances ubiquitin E2 recruitment and augments the ubiquitin ligase activity of Smurf1. The regulatory role of neddylation is conserved in human Smurf1 and yeast Rsp5. Furthermore, in human colorectal cancers, the elevated expression of Smurf1, Nedd8, NAE1 and Ubc12 correlates with cancer progression and poor prognosis. These findings provide evidence that neddylation is important in HECT ubiquitin ligase activation and shed new light on the tumour-promoting role of Smurf1.
Background: Multiplex PCR, defined as the simultaneous amplification of multiple regions of a DNA template or multiple DNA templates using more than one primer set (comprising a forward primer and a reverse primer) in one tube, has been widely used in diagnostic applications of clinical and environmental microbiology studies. However, primer design for multiplex PCR is still a challenging problem, and several factors need to be considered. These include mis-priming due to nonspecific binding to non-target DNA templates, primer dimerization, and the inability to separate and purify DNA amplicons with similar electrophoretic mobility. Results: A program named MPprimer was developed to help users design reliable multiplex PCR primers. It employs the widely used primer design program Primer3 and the primer specificity evaluation program MFEprimer to design and evaluate candidate primers based on a genomic or transcript DNA database, followed by careful examination to avoid primer dimerization. A graph-expanding algorithm derived from the greedy algorithm was used to determine the optimal primer set combinations (PSCs) for a multiplex PCR assay. In addition, MPprimer provides a virtual electrophotogram to help users choose the best PSC. Experimental validation from 2× to 5× plex PCR demonstrates the reliability of MPprimer. As another example, MPprimer is able to design multiplex PCR primers for DMD (the dystrophin gene, which is responsible for Duchenne muscular dystrophy), which has 79 exons, for 20×, 20×, 20×, 14×, and 5× plex PCR reactions in five tubes to detect underlying exon deletions. Conclusions: MPprimer is a valuable tool for designing specific, non-dimerizing primer set combinations with constrained amplicon size for multiplex PCR assays.
The amount of genomic sequence data being generated and made available through public databases continues to increase at an ever-expanding rate. Downloading, copying, sharing and manipulating these large datasets are becoming difficult and time consuming for researchers. We need to consider using advanced compression techniques as part of a standard data format for genomic data. The inherent structure of genome data allows for more efficient lossless compression than can be obtained through the use of generic compression programs. We apply a series of techniques to James Watson's genome that in combination reduce it to a mere 4 MB, small enough to be sent as an email attachment.
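The abstract does not list the specific techniques used, so the following is only a generic illustration of how the structure of genome data can be exploited, not the authors' method. It assumes a sequence over the four-letter alphabet A, C, G, T and packs each base into 2 bits instead of one byte per character. The names `pack` and `unpack` are hypothetical.

```python
CODES = {"A": 0, "C": 1, "G": 2, "T": 3}
BASES = "ACGT"


def pack(seq):
    """Pack a DNA string into bytes, four bases per byte.
    Returns (number of bases, packed bytes); the count is needed to
    drop padding when the length is not a multiple of four."""
    out = bytearray()
    for i in range(0, len(seq), 4):
        chunk = seq[i:i + 4]
        byte = 0
        for base in chunk:
            byte = (byte << 2) | CODES[base]
        byte <<= 2 * (4 - len(chunk))  # left-align a short final chunk
        out.append(byte)
    return len(seq), bytes(out)


def unpack(n, data):
    """Inverse of pack: recover the original n-base string."""
    bases = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            bases.append(BASES[(byte >> shift) & 3])
    return "".join(bases[:n])


seq = "GATTACA"
n, data = pack(seq)
restored = unpack(n, data)
```

The round trip is lossless, and the packed form uses a quarter of the space of a one-byte-per-base text encoding before any further compression is applied.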