Hi-C is a genome-wide sequencing technique to investigate the 3D chromatin conformation inside the nucleus. The most studied structures that can be identified from Hi-C, chromatin interactions and topologically associating domains (TADs), require computational methods to analyze genome-wide contact probability maps. We quantitatively compared the performance of 13 algorithms for the analysis of Hi-C data from 6 landmark studies and simulations. The comparison revealed clear differences in the performance of methods to identify chromatin interactions and more comparable results among algorithms for TAD detection.
The advent of high-throughput genome-scale technologies has enabled us to unravel a large number of previously unknown transcriptionally active regions of the genome. Recent genome-wide studies have provided annotations of a large repertoire of various classes of noncoding transcripts. Long noncoding RNAs (lncRNAs) form a major proportion of these novel annotated noncoding transcripts, and are presently known to be involved in a number of functionally distinct biological processes. Over 18 000 transcripts are presently annotated as lncRNA, and encompass previously annotated classes of noncoding transcripts including large intergenic noncoding RNA, antisense RNA and processed pseudogenes. There is a significant gap in the resources providing a stable annotation, cross-referencing and biologically relevant information. lncRNome has been envisioned with the aim of filling this gap by integrating annotations on a wide variety of biologically significant information into a comprehensive knowledgebase. To the best of our knowledge, lncRNome is one of the largest and most comprehensive resources for lncRNAs. Database URL: http://genome.igib.res.in/lncRNome
In the epigenetics field, large-scale functional genomics datasets of ever-increasing size and complexity have been produced using experimental techniques based on high-throughput sequencing. In particular, the study of the 3D organization of chromatin has raised increasing interest, thanks to the development of advanced experimental techniques. In this context, Hi-C has been widely adopted as a high-throughput method to measure pairwise contacts between virtually any pair of genomic loci, thus yielding unprecedented challenges for analyzing and handling the resulting complex datasets. In this review, we focus on the increasing complexity of available Hi-C datasets, which parallels the adoption of novel protocol variants. We also review the complexity of the multiple data analysis steps required to preprocess Hi-C sequencing reads and extract biologically meaningful information. Finally, we discuss solutions for handling and visualizing such large genomics datasets.
In Drosophila melanogaster the single male chromosome X undergoes an average twofold transcriptional upregulation for balancing the transcriptional output between sexes. Previous literature hypothesised that a global change in chromosome structure may accompany this process. However, recent studies based on Hi-C failed to detect these differences. Here we show that global conformational differences are specifically present in the male chromosome X and detectable using Hi-C data on sex-sorted embryos, as well as male and female cell lines, by leveraging custom data analysis solutions. We find the male chromosome X has more mid-/long-range interactions. We also identify differences at structural domain boundaries containing BEAF-32 in conjunction with CP190 or Chromator. Weakening of these domain boundaries in male chromosome X co-localizes with the binding of the dosage compensation complex and its co-factor CLAMP, reported to enhance chromatin accessibility. Together, our data strongly indicate that chromosome X dosage compensation affects global chromosome structure.
A growing amount of evidence in literature suggests that germline sequence variants and somatic mutations in non-coding distal regulatory elements may be crucial for defining disease risk and prognostic stratification of patients, in genetic disorders as well as in cancer. Their functional interpretation is challenging because genome-wide enhancer–target gene (ETG) pairing is an open problem in genomics. The solutions proposed so far do not account for the hierarchy of structural domains which define chromatin three-dimensional (3D) architecture. Here we introduce a change of perspective based on the definition of multi-scale structural chromatin domains, integrated in a statistical framework to define ETG pairs. In this work (i) we develop a computational and statistical framework to reconstruct a comprehensive map of ETG pairs leveraging functional genomics data; (ii) we demonstrate that the incorporation of chromatin 3D architecture information improves ETG pairing accuracy and (iii) we use multiple experimental datasets to extensively benchmark our method against previous solutions for the genome-wide reconstruction of ETG pairs. This solution will facilitate the annotation and interpretation of sequence variants in distal non-coding regulatory elements. We expect this to be especially helpful in clinically oriented applications of whole genome sequencing in cancer and undiagnosed genetic diseases research.
A large repertoire of gene-centric data has been generated in the field of zebrafish biology. Although the bulk of these data are available in the public domain, most of them are not readily accessible or are available only in nonstandard formats. One major challenge is to unify and integrate these widely scattered data sources. We tested the hypothesis that active community participation could be a viable option to address this challenge. We present here our approach to create standards for assimilation and sharing of information and a system of open standards for database intercommunication. We have attempted to address this challenge by creating a community-centric solution for zebrafish gene annotation. The Zebrafish GenomeWiki is a ‘wiki’-based resource, which aims to provide an altruistic shared environment for collective annotation of the zebrafish genes. The Zebrafish GenomeWiki has features that enable users to comment, annotate, edit and rate this gene-centric information. The credits for contributions can be tracked through a transparent microattribution system. In contrast to other wikis, the Zebrafish GenomeWiki is a ‘structured wiki’ or rather a ‘semantic wiki’. The Zebrafish GenomeWiki implements a semantically linked data structure, which in the future would be amenable to semantic search. Database URL: http://genome.igib.res.in/twiki
Summary: Genome-wide chromosome conformation capture based on high-throughput sequencing (Hi-C) has been widely adopted to study chromatin architecture by generating datasets of ever-increasing complexity and size. HiCBricks offers user-friendly and efficient solutions for handling large high-resolution Hi-C datasets. The package provides an R/Bioconductor framework with the bricks to build more complex data analysis pipelines and algorithms. HiCBricks already incorporates functions for calling domain boundaries and functions for high-quality data visualization. Availability and implementation: http://bioconductor.org/packages/devel/bioc/html/HiCBricks.html. Contact: francesco.ferrari@ifom.eu. Supplementary information: Supplementary data are available at Bioinformatics online.
A metal-free approach for C(sp³)–H activation followed by intramolecular Giese reaction to construct a wide range of cyclic-ether scaffolds of different ring sizes is reported using environmentally benign and straightforward conditions. An easy-to-prepare pyrylium salt is employed as an organo-photocatalyst for this visible-light-driven, highly atom-economical (PMI = 64.34 g/g for 0.2 mmol scale), cost-effective and chemoselective transformation. The reported methodology has high functional group tolerance, resulting in good-quality products. Further, downstream functionalization of the products and a gram-scale synthesis (PMI = 17.41 g/g for 10 mmol scale) are demonstrated, highlighting our methodology's advancement.