Family trees have vast applications in fields as diverse as genetics, anthropology, and economics. However, the collection of extended family trees is tedious and usually relies on resources with limited geographical scope and complex data usage restrictions. We collected 86 million profiles from publicly available online data shared by genealogy enthusiasts. After extensive cleaning and validation, we obtained population-scale family trees, including a single pedigree of 13 million individuals. We leveraged the data to partition the genetic architecture of human longevity and to provide insights into the geographical dispersion of families. We also report a simple digital procedure to overlay other data sets with our resource.
Compressive sensing microarrays (CSMs) are DNA-based sensors that operate using group testing and compressive sensing (CS) principles. In contrast to conventional DNA microarrays, in which each genetic sensor is designed to respond to a single target, in a CSM, each sensor responds to a set of targets. We study the problem of designing CSMs that simultaneously account for both the constraints from CS theory and the biochemistry of probe-target DNA hybridization. An appropriate cross-hybridization model is proposed for CSMs, and several methods are developed for probe design and CS signal recovery based on the new model. Lab experiments suggest that in order to achieve accurate hybridization profiling, consensus probe sequences are required to have sequence homology of at least 80% with all targets to be detected. Furthermore, out-of-equilibrium datasets are usually as accurate as those obtained from equilibrium conditions. Consequently, one can use CSMs in applications in which only short hybridization times are allowed.
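As a rough illustration of the group-testing idea, in which each sensor responds to a set of targets and the target concentrations are recovered from fewer sensors than targets, the sketch below runs a generic greedy sparse-recovery routine (orthogonal matching pursuit) on a toy problem. It is not the recovery method or cross-hybridization model developed in the paper. The binary affinity matrix `A`, the function name `omp`, and the concentration value are all hypothetical, and the toy signal has only one target present so that recovery from three sensors is unambiguous.

```python
import numpy as np

def omp(A, y, sparsity):
    """Orthogonal matching pursuit: greedily estimate a sparse x with y ~= A @ x."""
    norms = np.linalg.norm(A, axis=0)          # column norms, used to normalize correlations
    residual = y.astype(float).copy()
    support = []                                # indices of targets selected so far
    coef = np.zeros(0)
    for _ in range(sparsity):
        # Score each target by how strongly its response pattern matches the residual.
        scores = np.abs(A.T @ residual) / norms
        if support:
            scores[support] = -np.inf           # never pick the same target twice
        support.append(int(np.argmax(scores)))
        # Refit all selected targets jointly by least squares.
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef
    x_hat = np.zeros(A.shape[1])
    x_hat[support] = coef
    return x_hat

# Hypothetical design: 3 sensors (rows), 7 targets (columns).
# Column j holds the binary digits of j + 1, so every target
# triggers a distinct subset of sensors.
A = np.array([[(j >> b) & 1 for j in range(1, 8)] for b in range(3)], dtype=float)

x_true = np.zeros(7)
x_true[4] = 2.5                                 # assumed: a single target is present
y = A @ x_true                                  # idealized, noise-free sensor readout

x_hat = omp(A, y, sparsity=1)
```

In this idealized setting `x_hat` matches `x_true`. A real CSM readout would be noisy and the affinities would be real-valued hybridization strengths rather than 0/1 entries, so this is only a sketch of the measurement-and-recovery structure.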
Hemifacial microsomia (HFM) is the second most common facial anomaly after cleft lip and palate. The phenotype is highly variable and most cases are sporadic. We investigated the disorder in a large pedigree with five affected individuals spanning eight meioses. Whole-exome sequencing results indicated the absence of a pathogenic coding point mutation. A genome-wide survey of segmental variations identified a 1.3 Mb duplication of chromosome 14q22.3 in all affected individuals that was absent in more than 1000 chromosomes of ethnically matched controls. The duplication was absent in seven additional sporadic HFM cases, which is consistent with the known heterogeneity of the disorder. To find the critical gene in the duplicated region, we analyzed signatures of human craniofacial disease networks, mouse expression data, and predictions of dosage sensitivity. All of these approaches implicated OTX2 as the most likely causal gene. Moreover, OTX2 is a known oncogenic driver in medulloblastoma, a condition that was diagnosed in the proband during the course of the study. Our findings suggest a role for OTX2 dosage sensitivity in human craniofacial development and raise the possibility of a shared etiology between a subtype of hemifacial microsomia and medulloblastoma.