Reconstruction of sibling relationships from genetic data is an important component of many biological applications. In particular, the growing application of molecular markers (microsatellites) to study wild populations of plants and animals has created the need for new computational methods of establishing pedigree relationships, such as sibgroups, among individuals in these populations. Most current methods for sibship reconstruction from microsatellite data use statistical and heuristic techniques that rely on a priori knowledge about various parameter distributions. Moreover, these methods are designed for data with a large number of sampled loci and small family groups, neither of which is typical of wild populations. We present a deterministic technique that parsimoniously reconstructs sibling groups using only Mendelian laws of inheritance. We validate our approach using both simulated and real biological data and compare it to other methods. Our method is highly accurate on real data and compares favorably with other methods on simulated data with few loci and large family groups. It is the only method that does not rely on a priori knowledge about the population under study. Thus, our method is particularly appropriate for reconstructing sibling groups in wild populations.
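The Mendelian constraint that this abstract relies on can be illustrated with a small brute-force check. The sketch below is not the authors' implementation; it assumes diploid individuals, error-free genotypes with no missing data, and hypothetical function names. At each locus it asks whether some pair of parental genotypes could have produced every individual in a proposed group.

```python
from itertools import combinations_with_replacement

def locus_compatible(genotypes):
    """Return True if one father/mother pair could produce all genotypes.

    genotypes: list of (allele, allele) pairs observed at a single locus.
    Parental alleles are drawn only from the observed alleles, which is
    sufficient because an unobserved parental allele constrains nothing.
    """
    if not genotypes:
        return True
    alleles = sorted({a for g in genotypes for a in g})
    parent_options = list(combinations_with_replacement(alleles, 2))
    for father in parent_options:
        for mother in parent_options:
            # Each offspring takes one allele from each parent.
            if all((x in father and y in mother) or (y in father and x in mother)
                   for x, y in genotypes):
                return True
    return False

def could_be_full_siblings(individuals):
    """individuals: list of multi-locus genotypes, one (a, b) pair per locus."""
    n_loci = len(individuals[0])
    return all(locus_compatible([ind[locus] for ind in individuals])
               for locus in range(n_loci))

# Hypothetical one-locus examples.
print(locus_compatible([(1, 2), (3, 4), (1, 4)]))  # parents (1,3) x (2,4) fit
print(locus_compatible([(1, 1), (2, 2), (3, 3)]))  # three homozygotes cannot fit
```

A group passes only if every locus passes independently, so adding loci can only shrink the set of feasible groups.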
With improved tools for collecting genetic data from natural and experimental populations, new opportunities arise to study fundamental biological processes, including behavior, mating systems, adaptive trait evolution, and dispersal patterns. Full use of the newly available genetic data often depends upon reconstructing genealogical relationships of individual organisms, such as sibling reconstruction. This paper presents a new optimization framework for sibling reconstruction from single-generation microsatellite genetic data. Our framework is based on assumptions of parsimony and combinatorial concepts of Mendel's inheritance rules. Here, we develop a novel optimization model for sibling reconstruction as a large-scale mixed-integer program (MIP), shown to be a generalization of the set covering problem. We propose a new heuristic approach to efficiently solve this large-scale optimization problem. We test our approach on real biological data as presented in other studies as well as simulated data, and compare our results with other state-of-the-art sibling reconstruction methods. The empirical results show that our approaches are very efficient and outperform other methods while providing the most accurate solutions for two benchmark data sets. The results suggest that our framework can be used as an analytical and computational tool for biologists to better study ecological and evolutionary processes involving knowledge of familial relationships in a wide variety of biological systems.
The software suite KINALYZER reconstructs full-sibling groups without parental information using data from codominant marker loci such as microsatellites. KINALYZER utilizes a new algorithm for sibling reconstruction in diploid organisms based on combinatorial optimization. KINALYZER makes use of a Minimum 2-Allele Set Cover approach based on Mendelian inheritance rules and finds the smallest number of sibling groups that contain all the individuals in the sample. Also available is a 'Greedy Consensus' approach that reconstructs sibgroups using subsets of loci and finds the consensus of the partial solutions. Unlike likelihood methods for sibling reconstruction, KINALYZER does not require information about population allele frequencies and it makes no assumptions regarding the mating system of the species. KINALYZER is freely available as a web-based service.
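The covering step described above (choose as few feasible sibling groups as possible so that every individual is included) can be sketched with the textbook greedy set-cover heuristic. This is an illustration only, not KINALYZER's solver: the candidate groups are assumed to have been generated and checked for Mendelian feasibility beforehand, the names are hypothetical, and the greedy rule gives an approximation rather than a guaranteed minimum.

```python
def greedy_set_cover(individuals, candidate_groups):
    """Pick candidate groups until every individual is covered.

    individuals: iterable of hashable individual identifiers.
    candidate_groups: list of frozensets, each assumed to be a feasible
        sibling group (feasibility is NOT checked here).
    Returns the chosen groups in the order they were selected.
    """
    uncovered = set(individuals)
    chosen = []
    while uncovered:
        # Greedy rule: take the group covering the most uncovered individuals.
        best = max(candidate_groups,
                   key=lambda group: len(uncovered & group),
                   default=frozenset())
        if not uncovered & best:
            raise ValueError("candidate groups do not cover every individual")
        chosen.append(best)
        uncovered -= best
    return chosen

# Hypothetical sample of five individuals and four feasible candidate groups.
sample = {"a", "b", "c", "d", "e"}
candidates = [frozenset("abc"), frozenset("cd"), frozenset("de"), frozenset("e")]
cover = greedy_set_cover(sample, candidates)
print([sorted(group) for group in cover])
```

Groups in the returned cover may overlap; turning the cover into a partition requires a further rule for assigning shared individuals.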
A 10.6 m female whale shark Rhincodon typus caught off the coast of eastern Taiwan in 1995 carried 304 embryos that ranged in developmental stage from individuals still in egg cases to hatched and free-swimming near-term animals. This litter established that whale sharks develop by aplacental yolk-sac viviparity, with embryos hatching from eggs within the female. The range of developmental stages in this litter suggested ongoing fertilization over an extended period of time, with embryos of different ages possibly sired by different males. A series of 9 microsatellite markers for R. typus have now been used to investigate paternity in a subset of these embryos. We determined the paternity of 29 embryos representing 10% of the original litter, and spanning most of the range of size and developmental stage of the 304 embryos. All were full siblings sired by the same male, suggesting that this male may have sired the entire litter. Probability analysis indicates that a second male could go undetected if it sired less than 10% of the litter. The range of developmental stages of embryos from this single sire further suggests that female whale sharks may have the ability to store sperm for later fertilization. In the absence of any tissue to determine parental genotypes, maternal mitochondrial sequence was obtained from the embryos, identifying a novel haplotype linked to those from the western Indian Ocean. This finding adds further support for the global population structure emerging for R. typus.
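The detection limit quoted above can be made concrete with a short calculation. The abstract does not state which probability model was used, so the sketch below assumes the 29 genotyped embryos are a simple random sample, drawn without replacement, from the litter of 304; the function name is hypothetical.

```python
from math import comb

def prob_second_sire_missed(litter_size, sired_by_second, sample_size):
    """Probability that the sample contains no embryo of a second male.

    Hypergeometric zero class: all sampled embryos must come from the
    embryos NOT sired by the second male.
    """
    others = litter_size - sired_by_second
    if sample_size > others:
        return 0.0
    return comb(others, sample_size) / comb(litter_size, sample_size)

# Litter of 304 embryos with 29 genotyped (figures from the abstract).
for sired in (5, 15, 30, 60):
    p = prob_second_sire_missed(304, sired, 29)
    print(f"second male sired {sired:3d} embryos: P(missed) = {p:.3f}")
```

Under these assumptions, a second male that sired 30 embryos (about 10% of the litter) would be missed with probability below 0.05, and the chance of missing him rises as his share falls.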
Using complex roots of unity and the Fast Fourier Transform, we design a new thermodynamics-based algorithm, FFTbor, that computes the Boltzmann probability that secondary structures differ by k base pairs from an arbitrary initial structure of a given RNA sequence. The algorithm, which runs in quartic time O(n^4) and quadratic space O(n^2), is used to determine the correlation between kinetic folding speed and the ruggedness of the energy landscape, and to predict the location of riboswitch expression platform candidates. A web server is available at http://bioinformatics.bc.edu/clotelab/FFTbor/.
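The role of the roots of unity can be shown on a toy polynomial. The sketch below does not implement FFTbor's recursions: it assumes a made-up list of Boltzmann weights, treats them as coefficients of a polynomial, evaluates that polynomial at the complex roots of unity, and recovers the coefficients with a naive inverse discrete Fourier transform (an FFT would compute the same result faster). All names and numbers are hypothetical.

```python
import cmath

def evaluate_at_roots_of_unity(poly, n):
    """Evaluate the callable poly at the n complex n-th roots of unity."""
    return [poly(cmath.exp(2j * cmath.pi * r / n)) for r in range(n)]

def recover_coefficients(values):
    """Naive inverse DFT: c_k = (1/n) * sum_r y_r * exp(-2*pi*i*r*k/n)."""
    n = len(values)
    coeffs = []
    for k in range(n):
        total = sum(values[r] * cmath.exp(-2j * cmath.pi * r * k / n)
                    for r in range(n))
        coeffs.append((total / n).real)  # coefficients are real here
    return coeffs

# Hypothetical weights: toy_weights[k] stands for the total Boltzmann weight
# of structures at base-pair distance k from the initial structure.
toy_weights = [1.0, 3.0, 0.5, 2.5]

def toy_polynomial(x):
    return sum(z * x ** k for k, z in enumerate(toy_weights))

values = evaluate_at_roots_of_unity(toy_polynomial, len(toy_weights))
recovered = recover_coefficients(values)
total_weight = sum(recovered)
probabilities = [z / total_weight for z in recovered]
print([round(p, 4) for p in probabilities])
```

The point of the construction is that the polynomial only needs to be evaluated, never expanded symbolically; the individual coefficients then follow from the transform.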
While full-sibling group reconstruction from microsatellite data is a well-studied problem, reconstruction of half-sibling groups is much less studied, theoretically challenging, and computationally demanding. In this paper, we present a formulation of the half-sibling reconstruction problem and prove its APX-hardness. We also present exact solutions for this formulation and develop heuristics. Using biological and synthetic datasets, we present experimental results and compare them with the leading alternative software, COLONY. We show that our results are competitive and allow half-sibling group reconstruction in the presence of polygamy, which is prevalent in nature.
Predicting the folding of an RNA sequence, while allowing general pseudoknots (PK), consists in finding a minimal free-energy matching of its n positions. Assuming independently contributing base pairs, the problem can be solved in Θ(n^3) time using a variant of the maximal weighted matching. By contrast, the problem was previously proven NP-hard in the more realistic nearest-neighbor energy model. In this work, we consider an intermediate model, called the stacking-pairs energy model. We extend a result by Lyngsø, showing that RNA folding with PK is NP-hard within a large class of parametrizations for the model. We also show the approximability of the problem, by giving a practical Θ(n^3) algorithm that achieves at least a 5-approximation for any parametrization of the stacking model. This contrasts nicely with the nearest-neighbor version of the problem, which we prove cannot be approximated within any positive ratio, unless P = NP.
Kinship analysis using genetic data is important for many biological applications, including many in conservation biology. Wide availability of microsatellites has boosted studies in wild populations that rely on the knowledge of kinship, particularly sibship. While there exist many methods for reconstructing sibling relationships, almost none account for errors and mutations in microsatellite data, which are prevalent and affect quality of reconstruction. We present an error-tolerant method for reconstructing sibling relationships based on the ideas of consensus methods. We test our approach on both real and simulated data, with both pre-existing and introduced errors. Our method is highly accurate on almost all simulations, giving over 90% accuracy in most cases. Ours is the first method designed to tolerate errors while making no assumptions about the population or the sampling.
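One simple way to picture a consensus of several candidate reconstructions is the strict consensus of partitions: two individuals stay together only if every partial solution groups them together. The sketch below shows that generic idea only. It is not the error-tolerant method of the abstract, which is not specified here, and it assumes each individual appears exactly once in every input partition; the names are hypothetical.

```python
def strict_consensus(partitions):
    """Group individuals that share a group in every input partition.

    partitions: list of partitions, each a list of sets over the same
        individuals (for example, one partition per subset of loci).
    """
    # Record, for each individual, the group index it has in each partition.
    signature = {}
    for partition in partitions:
        for group_id, group in enumerate(partition):
            for individual in group:
                signature.setdefault(individual, []).append(group_id)
    # Individuals with identical signatures were never separated.
    consensus = {}
    for individual, sig in signature.items():
        consensus.setdefault(tuple(sig), set()).add(individual)
    return list(consensus.values())

# Hypothetical partial solutions obtained from two different subsets of loci.
solution_1 = [{"a", "b", "c"}, {"d", "e"}]
solution_2 = [{"a", "b"}, {"c", "d", "e"}]
result = strict_consensus([solution_1, solution_2])
print(sorted(sorted(group) for group in result))
```

A strict consensus tends to fragment groups whenever a single locus disagrees, which is why a practical error-tolerant method would need an additional step for merging or reassigning the fragments.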