DNA is now routinely used in criminal investigations and court cases, although DNA samples taken at crime scenes are of varying quality and therefore present challenging problems for their interpretation. We present a statistical model for the quantitative peak information obtained from an electropherogram of a forensic DNA sample and illustrate its potential use for the analysis of criminal cases. In contrast with most previously used methods, we directly model the peak height information and incorporate important artefacts that are associated with the production of the electropherogram. Our model has a number of unknown parameters, and we show that these can be estimated by the method of maximum likelihood in the presence of multiple unknown individuals contributing to the sample, and their approximate standard errors calculated; the computations exploit a Bayesian network representation of the model. A case example from a UK trial, as reported in the literature, is used to illustrate the efficacy and use of the model, both in finding likelihood ratios to quantify the strength of evidence, and in the deconvolution of mixtures for finding likely profiles of the individuals contributing to the sample. Our model is readily extended to simultaneous analysis of more than one mixture as illustrated in a case example. We show that the combination of evidence from several samples may give an evidential strength which is close to that of a single-source trace and thus modelling of peak height information provides a potentially very efficient mixture analysis.
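For readers unfamiliar with the term, the likelihood ratio mentioned above is conventionally written as follows. The symbols $E$ (the observed peak heights), $H_p$ (prosecution hypothesis) and $H_d$ (defence hypothesis) are notation assumed here for illustration and are not taken from the abstract.

```latex
\[
  \mathrm{LR} \;=\; \frac{P(E \mid H_p)}{P(E \mid H_d)}
\]
```

Values above 1 favour $H_p$ over $H_d$, and values below 1 favour $H_d$.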
Bayesian networks are an emerging tool for a wide range of risk management applications, one of which is the modeling of operational risk. This comes at a time when changes in the supervision of financial institutions have resulted in increased scrutiny of the risk management of banks and insurance companies, thus giving the industry an impetus to measure and manage operational risk. The more established methods for risk quantification are linear models such as time series models, econometric models, empirical actuarial models, and extreme value theory. Due to data limitations and complex interactions between operational risk variables, various nonlinear methods have been proposed, one of which is the focus of this article: Bayesian networks. Using an idealized example of a fictitious online business, we construct a Bayesian network that models various risk factors and their combination into an overall loss distribution. Using this model, we show how established Bayesian network methodology can be applied to: (1) form posterior marginal distributions of variables based on evidence, (2) simulate scenarios, (3) update the parameters of the model using data, and (4) quantify in real time how well the model predictions compare to actual data. A specific example of a Bayesian network application to operational risk in an insurance setting is then suggested. Copyright The Journal of Risk and Insurance, 2007.
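Step (1) above, forming a posterior marginal from evidence, can be sketched on a toy network. The three-node structure, the variable names and every probability below are hypothetical values chosen for illustration; they are not taken from the article, and exact enumeration stands in for whatever inference engine the authors used.

```python
from itertools import product

# Hypothetical network: ServerFailure -> Loss <- FraudAttempt.
# All probabilities are made up for illustration.
p_server = {True: 0.10, False: 0.90}   # P(ServerFailure)
p_fraud = {True: 0.05, False: 0.95}    # P(FraudAttempt)
p_loss = {                             # P(Loss=True | server, fraud)
    (True, True): 0.90,
    (True, False): 0.60,
    (False, True): 0.50,
    (False, False): 0.01,
}


def joint(server: bool, fraud: bool, loss: bool) -> float:
    """Joint probability of one full configuration of the network."""
    pl = p_loss[(server, fraud)]
    return p_server[server] * p_fraud[fraud] * (pl if loss else 1.0 - pl)


def posterior_server_given_loss(loss: bool = True) -> float:
    """P(ServerFailure=True | Loss=loss), by summing out FraudAttempt."""
    numerator = sum(joint(True, f, loss) for f in (True, False))
    evidence = sum(
        joint(s, f, loss) for s, f in product((True, False), repeat=2)
    )
    return numerator / evidence


if __name__ == "__main__":
    print(f"prior     P(server failure)        = {p_server[True]:.3f}")
    print(f"posterior P(server failure | loss) = "
          f"{posterior_server_given_loss(True):.3f}")
```

Observing a loss raises the probability of a server failure well above its prior, which is the kind of evidence-driven update the abstract describes. Enumeration is only practical for very small networks.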
We describe an expert system, Maies, under development for analysing forensic identification problems involving DNA mixture traces using quantitative peak area information. Peak area information is represented by conditional Gaussian distributions, and inference based on exact junction tree propagation ascertains whether individuals, whose profiles have been measured, have contributed to the mixture. The system can also be used to predict DNA profiles of unknown contributors by separating the mixture into its individual components. The use of the system is illustrated with an application to a real world example. The system implements a novel MAP (maximum a posteriori) search algorithm that is briefly described.
A simple and efficient algorithm is presented for finding a maximum likelihood pedigree using microsatellite (STR) genotype information on a complete sample of related individuals. The computational complexity of the algorithm is at worst O(n^3 2^n), where n is the number of individuals. Thus it is possible to exhaustively search the space of all pedigrees of up to thirty individuals for one that maximizes the likelihood. A priori age and sex information can be used if available, but is not essential. The algorithm is applied in a simulation study, and to some real data on humans.
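To give a feel for the stated worst-case bound, the short sketch below evaluates n^3 * 2^n for a few values of n. It only illustrates the growth of the bound; the function name is hypothetical, constant factors are ignored, and nothing here reproduces the algorithm itself.

```python
def worst_case_steps(n: int) -> int:
    """Evaluate the bound n^3 * 2^n for n individuals (constants ignored)."""
    return n ** 3 * 2 ** n


if __name__ == "__main__":
    for n in (10, 20, 30):
        print(f"n = {n:2d}: n^3 * 2^n = {worst_case_steps(n):.3e}")
```

At n = 30 the bound is about 2.9e13 elementary steps, which is consistent with the abstract's statement that exhaustive search is feasible up to roughly thirty individuals.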
We show how probabilistic expert systems can be used to analyse forensic identification problems involving DNA mixture traces using quantitative peak area information. Peak area is modelled with conditional Gaussian distributions. The expert system can be used for ascertaining whether individuals, whose profiles have been measured, have contributed to the mixture, but also to predict DNA profiles of unknown contributors by separating the mixture into its individual components. The potential of our methodology is illustrated on case data examples and compared with alternative approaches. The advantages are that identification and separation issues can be handled in a unified way within a single network model and the uncertainty associated with the analysis is quantified. Some key words and phrases: Bayesian network, conditional Gaussian distributions, DNA mixture, DNA profile, forensic identification, mixture separation, probabilistic expert system, peak weight.
We introduce a subclass of chain event graphs that we call stratified chain event graphs, and present a dynamic programming algorithm for the optimal selection of such chain event graphs that maximizes a decomposable score derived from a complete independent sample. We apply the algorithm to such a dataset, with a view to deducing the causal structure of the variables under the hypothesis that there are no unobserved confounders. We show that the algorithm is suitable for small problems. Similarities with and differences to a dynamic programming algorithm for MAP learning of Bayesian networks are highlighted, as are the relations to causal discovery using Bayesian networks.
This paper presents a coherent probabilistic framework for taking account of allelic dropout, stutter bands and silent alleles when interpreting STR DNA profiles from a mixture sample using peak size information arising from a PCR analysis. This information can be exploited for evaluating the evidential strength for a hypothesis that DNA from a particular person is present in the mixture. It extends an earlier Bayesian network approach that ignored such artifacts. We illustrate the use of the extended network on a published casework example.