The human body contains an estimated 10× more bacterial cells from the microbiome than human cells. Viral DNA is known to integrate into the human genome, but the integration of bacterial DNA has not been described. Using publicly available sequence data from the Human Genome Project, the 1000 Genomes Project, and The Cancer Genome Atlas (TCGA), we examined bacterial DNA integration into the human somatic genome. Here we present evidence that bacterial DNA integrates into the human somatic genome through an RNA intermediate, and that such integrations are detected more frequently in (a) tumors than normal samples, (b) RNA than DNA samples, and (c) the mitochondrial genome than the nuclear genome. Hundreds of thousands of paired reads support random integration of Acinetobacter-like DNA into the human mitochondrial genome in acute myeloid leukemia samples. Numerous read pairs across multiple stomach adenocarcinoma samples support specific integration of Pseudomonas-like DNA into the 5′-UTR and 3′-UTR of four proto-oncogenes whose transcription is up-regulated, consistent with conversion to an oncogene. These data support our hypothesis that bacterial integrations occur in the human somatic genome and may play a role in carcinogenesis. We anticipate that applying our approach to additional cancer genome projects will lead to more frequent detection of bacterial DNA integrations in tumors that are in close proximity to the human microbiome.
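The paired-read evidence described above rests on identifying "chimeric" pairs in which one mate aligns to the human genome and the other to a bacterial genome. A minimal sketch of that filtering step, assuming reads have already been aligned to both references separately (the record layout, field names, and references here are illustrative, not the authors' actual pipeline):

```python
# Hedged sketch: flagging candidate bacterial-integration junctions from
# paired-end reads. Assumes each pair has already been aligned against a
# human reference and a bacterial reference; the dict layout below is a
# hypothetical stand-in for real alignment records (e.g. from a BAM file).

def find_chimeric_pairs(pairs):
    """Return read pairs in which one mate maps to human and the other
    to a bacterial genome: candidate integration junctions."""
    candidates = []
    for pair in pairs:
        refs = {pair["mate1"]["ref"], pair["mate2"]["ref"]}
        # A chimeric pair straddles the human/bacterial boundary.
        if refs == {"human", "bacteria"}:
            candidates.append(pair)
    return candidates

# Toy input: one chimeric pair, one ordinary human-human pair.
pairs = [
    {"mate1": {"ref": "human", "pos": 1_200_345},
     "mate2": {"ref": "bacteria", "pos": 8_912}},
    {"mate1": {"ref": "human", "pos": 55_001},
     "mate2": {"ref": "human", "pos": 55_310}},
]
print(len(find_chimeric_pairs(pairs)))  # 1
```

In practice such candidates would be further filtered (mapping quality, duplicate removal, recurrence across samples) before being counted as support for an integration.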
Traditional approaches to building a large-scale knowledge graph have usually relied on extracting information (entities, their properties, and relations between them) from unstructured text (e.g., DBpedia). Recent advances in Convolutional Neural Networks (CNNs) allow us to shift our focus to learning entities and relations from images, as they build robust models that require little or no pre-processing of the images. In this paper, we present an approach to identify and extract spatial relations (e.g., "The girl is standing behind the table") from images using CNNs. Our research addresses two specific challenges: providing insight into how spatial relations are learned by the network and which parts of the image are used to predict these relations. We use the pre-trained network VGGNet to extract features from an image and train a Multi-Layer Perceptron (MLP) on a set of synthetic images and the SUN09 dataset to extract spatial relations. The MLP predicts spatial relations without a bounding box around the objects or the space in the image depicting the relation. To understand how the spatial relations are represented in the network, a heatmap is overlaid on the image to show the regions that are deemed important by the network. Also, we analyze the MLP to show the relationship between the activation of consistent groups of nodes and the prediction of a spatial relation. We show how the loss of these groups affects the network's ability to identify relations.
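The pipeline above (CNN features in, relation label out) can be sketched as a single MLP forward pass. In this hedged sketch the VGGNet feature extractor is stubbed with a random vector, and the layer sizes and relation labels are illustrative assumptions, not the paper's configuration:

```python
import numpy as np

# Hedged sketch of the described pipeline: a CNN (e.g. VGGNet) produces a
# feature vector per image, and a small MLP maps it to a spatial-relation
# label. The feature extractor is stubbed with random numbers; hidden size,
# label set, and (untrained) weights are illustrative only.

RELATIONS = ["above", "below", "behind", "in_front_of"]

rng = np.random.default_rng(0)
W1 = rng.standard_normal((4096, 64)) * 0.01   # CNN features -> hidden layer
b1 = np.zeros(64)
W2 = rng.standard_normal((64, len(RELATIONS))) * 0.01
b2 = np.zeros(len(RELATIONS))

def predict_relation(features):
    """One MLP forward pass: ReLU hidden layer, softmax output."""
    h = np.maximum(0.0, features @ W1 + b1)
    logits = h @ W2 + b2
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    return RELATIONS[int(np.argmax(probs))]

features = rng.standard_normal(4096)  # stand-in for a VGGNet feature vector
print(predict_relation(features))     # one of RELATIONS
```

Note that, as in the abstract, the classifier sees only a global feature vector: no bounding boxes are supplied, which is why post-hoc heatmaps are needed to locate the image regions driving each prediction.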
We propose Bi-Directional Manifold Alignment (BDMA), which learns a non-linear mapping between two manifolds by explicitly training it to be bijective. We demonstrate BDMA by training a single model for a pair of languages rather than for individual, directed source-target combinations, reducing the number of models by 50%. We show that models trained with BDMA in the "forward" (source to target) direction can successfully map words in the "reverse" (target to source) direction, yielding performance equivalent (or superior) to standard unidirectional translation models in which the source and target languages are flipped. We also show how BDMA reduces the overall size of the model.
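The core idea, one invertible map serving both translation directions, can be illustrated with a linear special case. BDMA itself learns a non-linear bijection; the sketch below instead uses an orthogonal Procrustes map on toy data, where bijectivity is exact because the inverse is simply the transpose. All data and dimensions here are illustrative assumptions:

```python
import numpy as np

# Hedged sketch: a single orthogonal map W between two embedding spaces can
# be applied in both directions (x -> xW forward, y -> yW^T reverse), which
# illustrates the one-model-per-language-pair idea. BDMA is non-linear; the
# closed-form Procrustes solution below is only a linear toy analogue.

rng = np.random.default_rng(1)
X = rng.standard_normal((100, 5))              # "source" embeddings
R, _ = np.linalg.qr(rng.standard_normal((5, 5)))
Y = X @ R                                      # "target" = rotated source

# Orthogonal Procrustes: W = U V^T from the SVD of X^T Y minimizes ||XW - Y||
# over orthogonal W, so here it recovers the rotation R exactly.
U, _, Vt = np.linalg.svd(X.T @ Y)
W = U @ Vt

forward_err = np.linalg.norm(X @ W - Y)        # source -> target
reverse_err = np.linalg.norm(Y @ W.T - X)      # target -> source, same model
print(forward_err < 1e-8, reverse_err < 1e-8)  # True True
```

Because W is orthogonal, the reverse direction costs nothing extra: no second model is trained, mirroring the 50% reduction in model count claimed above.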
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations: citations that display the context of the citation and indicate whether the citing article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.