The assembly of multiple genomes from mixed sequence reads is a bottleneck in metagenomic analysis. A single-genome assembly program (assembler) is not capable of resolving metagenome sequences, so assemblers designed specifically for metagenomics have been developed. MetaVelvet is an extension of the single-genome assembler Velvet. It has been shown to generate assemblies with higher N50 scores and higher quality than single-genome assemblers such as Velvet and SOAPdenovo when applied to metagenomic sequence reads, and it is frequently used in this research community. One important open problem for MetaVelvet is its low accuracy and sensitivity in detecting chimeric nodes in the assembly (de Bruijn) graph, which prevents the generation of longer contigs and scaffolds. We have tackled this problem by classifying chimeric nodes with supervised machine learning, significantly improving the performance of MetaVelvet, and have developed a new tool called MetaVelvet-SL. A Support Vector Machine is used to learn the classification model from 94 features extracted from candidate nodes. In extensive experiments, MetaVelvet-SL outperformed the original MetaVelvet and other state-of-the-art metagenomic assemblers (IDBA-UD, Ray Meta and Omega), reconstructing longer, accurate assemblies with higher N50 scores for both simulated data sets and real data sets of human gut microbial sequences.
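The N50 score used above to compare assemblers can be illustrated with a minimal sketch. It assumes the common definition of N50 (the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the total assembly length); the function name `n50` and the example lengths are hypothetical and are not taken from MetaVelvet-SL.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths.

    Assumed definition: sort contigs from longest to shortest and return
    the length of the contig at which the running total first reaches at
    least half of the summed length of all contigs.
    """
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length
    # Only reached when no contigs were supplied.
    raise ValueError("contig_lengths must not be empty")


# Hypothetical contig lengths (in base pairs), for illustration only.
example_lengths = [80, 70, 50, 40, 30, 20]
example_n50 = n50(example_lengths)
```

A higher N50 indicates that more of the assembly sits in long contigs, but it says nothing about correctness, which is why the abstract reports accuracy alongside N50.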
Background: Recently, SARS-CoV-2 virus with the D614G mutation has become a public concern due to the rapid dissemination of this variant across many countries. Our study aims were (1) to report full-length genome sequences of SARS-CoV-2 collected from four COVID-19 patients in the Special Region of Yogyakarta and Central Java provinces, Indonesia; (2) to compare the clade distribution of full-length genome sequences from Indonesia (n = 60) from March to September 2020; and (3) to perform phylogenetic analysis of SARS-CoV-2 complete genomes from different countries, including Indonesia. Methods: Whole genome sequencing (WGS) was performed using next-generation sequencing (NGS) on the Illumina MiSeq instrument. Full-length virus genomes were annotated using the reference genome of hCoV-19/Wuhan/Hu-1/2019 (NC_045512.2) and then visualized in UGENE v. 1.30. For phylogenetic analysis, a dataset of 88 available SARS-CoV-2 complete genomes from different countries, including Indonesia, was retrieved from GISAID. Results: All patients were hospitalized with various severities of COVID-19. Phylogenetic analysis revealed that one virus sample belongs to clade L and three belong to clade GH. The three clade GH virus samples (EPI_ISL_525492, EPI_ISL_516800 and EPI_ISL_516829) clustered not only with SARS-CoV-2 genomes from Asia but also with those from Europe, whereas the clade L virus sample (EPI_ISL_516806) was located amongst SARS-CoV-2 genomes from Asia. Using full-length sequences available in the GISAID EpiCoV Database, 39 of 60 SARS-CoV-2 genomes (65%) from Indonesia harbor the D614G mutation. Conclusion: These findings indicate that SARS-CoV-2 with the D614G mutation appears to have become the major circulating virus in Indonesia, concurrent with the COVID-19 situation worldwide.
Background: Severe acute respiratory syndrome Coronavirus 2 (SARS-CoV-2) Delta variant (B.1.617.2) has been responsible for the current increase in Coronavirus disease 2019 (COVID-19) infectivity rate worldwide. We compared the impact of the Delta variant and non-Delta variant on the COVID-19 outcomes in patients from Yogyakarta and Central Java provinces, Indonesia. Methods: In this cross-sectional study, we ascertained 161 patients, 69 with the Delta variant and 92 with the non-Delta variant. The Illumina MiSeq next-generation sequencer was used to perform whole-genome sequencing of SARS-CoV-2. Results: The mean age of patients with the Delta variant and the non-Delta variant was 27.3 ± 20.0 and 43.0 ± 20.9 years, respectively (p = 3 × 10⁻⁶). The patients with the Delta variant consisted of 23 males and 46 females, while the patients with the non-Delta variant consisted of 56 males and 36 females (p = 0.001). The Ct value of the Delta variant (18.4 ± 2.9) was significantly lower than that of the non-Delta variant (19.5 ± 3.8) (p = 0.043). There was no significant difference in the hospitalization and mortality of patients with Delta and non-Delta variants (p = 0.80 and 0.29, respectively). None of the prognostic factors was associated with hospitalization except diabetes, with an OR of 3.6 (95% CI = 1.02–12.5; p = 0.036). Moreover, the following factors were associated with a higher mortality rate: age ≥65 years, obesity, diabetes, hypertension, and cardiovascular disease, with ORs of 11 (95% CI = 3.4–36; p = 8 × 10⁻⁵), 27 (95% CI = 6.1–118; p = 1 × 10⁻⁵), 15.6 (95% CI = 5.3–46; p = 6 × 10⁻⁷), 12 (95% CI = 4–35.3; p = 1.2 × 10⁻⁵), and 6.8 (95% CI = 2.1–22.1; p = 0.003), respectively.
Multivariate analysis showed that age ≥65 years, obesity, diabetes, and hypertension were strong prognostic factors for the mortality of COVID-19 patients, with ORs of 3.6 (95% CI = 0.58–21.9; p = 0.028), 16.6 (95% CI = 2.5–107.1; p = 0.003), 5.5 (95% CI = 1.3–23.7; p = 0.021), and 5.8 (95% CI = 1.02–32.8; p = 0.047), respectively. Conclusions: We show that the patients infected by the SARS-CoV-2 Delta variant have a lower Ct value than the patients infected by the non-Delta variant, implying that the Delta variant has a higher viral load, which might make the virus more transmissible among humans. However, the Delta variant does not affect the COVID-19 outcomes in our patients. Our study also confirms that older age and comorbidity increase the mortality rate of patients with COVID-19.
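The sex distribution reported in the Results (23 males and 46 females with the Delta variant; 56 males and 36 females with the non-Delta variant) can be explored with a minimal sketch of a chi-square test on a 2×2 table. The abstract does not state which test the authors used, so the choice of test, the optional Yates continuity correction, and the function name `chi_square_2x2` are all assumptions made for illustration.

```python
import math


def chi_square_2x2(a, b, c, d, yates=True):
    """Chi-square test of independence for the 2x2 table [[a, b], [c, d]].

    Returns (chi2, p_value) with one degree of freedom. When yates=True,
    the Yates continuity correction is applied (an assumption; the source
    abstract does not say which test was used).
    """
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if yates:
        # Continuity correction, clamped so the statistic cannot go negative.
        diff = max(0.0, diff - n / 2)
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    chi2 = n * diff ** 2 / denom
    # For 1 degree of freedom, the chi-square survival function
    # equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Counts taken from the abstract: rows are Delta / non-Delta,
# columns are males / females.
chi2_corrected, p_corrected = chi_square_2x2(23, 46, 56, 36, yates=True)
chi2_plain, p_plain = chi_square_2x2(23, 46, 56, 36, yates=False)
```

Under these assumptions, both variants give a p-value on the order of 0.001 or smaller, which is in line with the reported p = 0.001, though this does not establish which procedure the authors actually applied.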
Background: Transmission within families and multiple spike protein mutations have been associated with the rapid transmission of SARS-CoV-2. We aimed to: (1) describe full genome characterization of SARS-CoV-2 and correlate the sequences with epidemiological data within family clusters, and (2) conduct phylogenetic analysis of all samples from Yogyakarta and Central Java, Indonesia and other countries. Methods: The study involved 17 patients with COVID-19, including two family clusters. We determined the full-genome sequences of SARS-CoV-2 using the Illumina MiSeq next-generation sequencer. Phylogenetic analysis was performed using a dataset of 142 full genomes of SARS-CoV-2 from different regions. Results: Ninety-four SNPs were detected throughout the open reading frame (ORF) of SARS-CoV-2 samples, with 58% (54/94) of the nucleic acid changes resulting in amino acid mutations. About 94% (16/17) of the virus samples showed D614G on the spike protein, and 56% of these (9/16) showed various other amino acid mutations on this protein, including L5F, V83L, V213A, W258R, Q677H, and N811I. The virus samples from family cluster-1 (n = 3) belong to the same clade, GH; two were collected from deceased patients and one from a patient who survived. All samples from this family cluster revealed a combination of the spike protein mutations D614G and V213A. Virus samples from family cluster-2 (n = 3) also belonged to clade GH and showed the spike protein mutation L5F alongside the D614G mutation. Conclusions: Our study is the first comprehensive report associating the full-genome sequences of SARS-CoV-2 with the epidemiological data within family clusters. Phylogenetic analysis revealed that the three viruses from family cluster-1 formed a monophyletic group, whereas viruses from family cluster-2 formed a polyphyletic group, indicating possibly different sources of infection.
This study highlights how the same spike protein mutations among members of the same family might show different disease outcomes.
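The distinction drawn in the Results between nucleic acid changes that do and do not result in amino acid mutations can be sketched with the standard genetic code. The example assumes the commonly reported codon change for spike D614G (GAT to GGT), which is not stated in the abstract; the names `CODON_TABLE` and `classify_snp` are hypothetical, and this is not the authors' annotation pipeline.

```python
# Standard genetic code (NCBI translation table 1), with codons ordered
# by base in the sequence T, C, A, G at each of the three positions.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def classify_snp(codon, position, new_base):
    """Classify a single-base change within one codon.

    codon:    reference codon as three DNA letters, e.g. "GAT"
    position: 0-based index of the changed base within the codon
    new_base: the substituted DNA base
    Returns (reference_aa, mutated_aa, kind), where kind is
    "synonymous" or "nonsynonymous".
    """
    codon = codon.upper()
    mutated = codon[:position] + new_base.upper() + codon[position + 1:]
    ref_aa = CODON_TABLE[codon]
    alt_aa = CODON_TABLE[mutated]
    kind = "synonymous" if ref_aa == alt_aa else "nonsynonymous"
    return ref_aa, alt_aa, kind


# Assumed codon change for spike D614G: GAT (Asp, D) -> GGT (Gly, G).
d614g = classify_snp("GAT", 1, "G")
# A third-position change in the same codon leaves the amino acid unchanged.
silent = classify_snp("GAT", 2, "C")
```

A SNP counted among the 54 amino-acid-changing variants would fall in the "nonsynonymous" class here, while the remaining SNPs would be "synonymous" under this simplified, single-codon view.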