Summary: The need for new methods to deal with big data is a common theme in most scientific fields, although its definition tends to vary with the context. Statistical ideas are an essential part of this, and as a partial response, a thematic program on statistical inference, learning and models in big data was held in 2015 in Canada, under the general direction of the Canadian Statistical Sciences Institute, with major funding from, and most activities located at, the Fields Institute for Research in Mathematical Sciences. This paper gives an overview of the topics covered, describing challenges and strategies that seem common to many different areas of application and including some examples of applications to make these challenges and strategies more concrete.
Blest (2000) proposed a new nonparametric measure of correlation between two random variables. His coefficient, which is asymmetric in its arguments, emphasizes discrepancies observed among the first ranks in the orderings induced by the variables. The authors derive the limiting distribution of Blest's index and suggest symmetric variants whose merits as statistics for testing independence are explored using asymptotic relative efficiency calculations and Monte Carlo simulations. Title (translated from the French): On Blest's measure of rank correlation. C(u,v) = P{F(X) < u, G(Y)
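To make the idea of a rank correlation that weights the first ranks more heavily concrete, here is a minimal Python sketch. The formula is an assumption: it is one form of Blest's index as commonly stated in the literature, nu_n = (2n+1)/(n-1) - 12/(n^2-n) * sum_i (1 - R_i/(n+1))^2 * S_i, where R_i and S_i are the ranks of the two variables. It is not given in the abstract above and should be checked against the original paper. The function names `ranks` and `blest_index` are hypothetical, and ties are not handled.

```python
def ranks(values):
    """Return 1-based ranks of the values (assumes no ties)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result


def blest_index(x, y):
    """Assumed form of Blest's rank correlation index (see lead-in).

    The weight (1 - R_i/(n+1))**2 is largest for small ranks R_i of x,
    so disagreements among the first ranks matter most. The index is
    not symmetric: blest_index(x, y) can differ from blest_index(y, x).
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    r, s = ranks(x), ranks(y)
    total = sum((1 - ri / (n + 1)) ** 2 * si for ri, si in zip(r, s))
    return (2 * n + 1) / (n - 1) - 12.0 / (n * n - n) * total


if __name__ == "__main__":
    # Hypothetical data, for illustration only.
    x = [0.3, 1.2, 2.5, 4.1]
    y = [1.0, 0.5, 3.0, 7.0]
    print(blest_index(x, y), blest_index(y, x))
```

Under this assumed formula the index equals 1 when the two rankings agree exactly and -1 when one is the reverse of the other, which is a quick sanity check on the normalizing constants.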
Introduction: In marine studies, the secondary production of macrobenthic populations can be estimated in a number of ways. Classic techniques can be broadly classified as cohort- and size-based. Prominent in the first group are the Allen-curve, increment- and removal-summation, and instantaneous-growth methods. The second group consists mainly of the size-frequency, mass-specific growth rate, and mass-specific mortality rate methods.
These estimation techniques, and numerous variants thereof, have been described, e.g., by Benke (1984), Crisp (1984), and Rigler and Downing (1984). They have been extensively compared through simulations, most notably by Cushman et al. (1978), Lapchin and Neveu (1980), and Morin et al. (1987). These and other authors examined how various estimates were affected by different assumptions on the growth and mortality curves, as well as by sampling effort. Their hypothetical populations, which were alternately assumed to be synchronous or not, were modeled on freshwater species having a lifespan of 1 year or less.
This article revisits the issue and expands on the comparison of the above production estimates, and several of their variants introduced in the literature, using a hypothetical population of mussels characterized by a number of realistic factors. Most prominent among them are the simultaneous presence of different cohorts, quadrat-dependent population density, seasonal growth oscillation, gradual recruit arrival, and random individual variation both in survival and in weight gain.
The simulations presented herein are also novel in that several aspects of the sampling design are considered, namely the number of sampling occasions (either regularly spaced or concentrated in the growing season), quadrats sampled per occasion, and use of two different sieves. Following Cushman et al. (1978), the bias and variance of the various estimators are compared to the exact production obtained by summing all individual body-mass gains, rather than to a theoretical Allen curve which may or may not reflect reality.
Title: Effect of different sampling designs and methods on the estimation of secondary production: A simulation
Abstract: This article reports the results of a simulation study designed to investigate the effect of several sampling design factors on the accuracy and precision of various estimates of secondary production. Whereas most previous studies of this sort were concerned with freshwater fauna (e.g., insects), the hypothetical population used here reflects the characteristics of marine mussels from cold-temperate and subarctic regions. It features the simultaneous presence of different cohorts, gradual recruit arrival, seasonal growth oscillation, and quadrat-dependent population density, as well as random individual variation both in survival and in weight gain. For this population, the percentage relative bias (PRB) and relative root mean squared error (RRMSE) of 4 classic cohort-based methods, 3 size-based methods, and several variants thereof were computed as a function of sampling frequency...
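The two summary criteria named in this abstract, PRB and RRMSE, can be sketched as follows. This is a minimal Python illustration under stated assumptions: the abstract does not give the formulas, so the usual definitions are assumed (PRB as 100 times the mean estimation error divided by the true value, RRMSE as the root mean squared error divided by the true value). The function names and the numbers in the usage example are hypothetical, not taken from the study.

```python
import math


def prb(estimates, true_value):
    """Percentage relative bias: 100 * (mean estimate - truth) / truth.

    Assumed definition; the source abstract does not state the formula.
    """
    mean_estimate = sum(estimates) / len(estimates)
    return 100.0 * (mean_estimate - true_value) / true_value


def rrmse(estimates, true_value):
    """Relative root mean squared error: sqrt(MSE) / truth.

    Assumed definition; some authors also multiply by 100.
    """
    mse = sum((e - true_value) ** 2 for e in estimates) / len(estimates)
    return math.sqrt(mse) / true_value


if __name__ == "__main__":
    # Hypothetical production estimates from repeated simulated samplings,
    # compared with the exact production of the simulated population.
    simulated_estimates = [9.0, 11.0, 10.0, 12.0]
    exact_production = 10.0
    print(prb(simulated_estimates, exact_production))
    print(rrmse(simulated_estimates, exact_production))
```

In a simulation of the kind described, `true_value` would play the role of the exact production obtained by summing all individual body-mass gains, and `estimates` would hold one production estimate per simulated sampling run.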
Abstract: The weighted likelihood can be used to make inference about one population when data from similar populations are available. The author shows heuristically that the weighted likelihood can be seen as a special case of the entropy maximization principle. This leads him to propose the minimum averaged mean squared error (MAMSE) weights. He describes an algorithm for calculating these weights and shows its convergence using the Kuhn-Tucker conditions. He explores the performance and properties of the weighted likelihood based on MAMSE weights through simulations. Title (translated from the French): Nonparametric empirical weights for the likelihood.
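As a rough illustration of how a weighted likelihood borrows information from similar populations, the sketch below maximizes a weighted log-likelihood for a normal mean with unit variance. Everything specific here is an assumption made for illustration: the model, the form of the objective (each sample's log-likelihood multiplied by a fixed weight), the function names, and the weights themselves, which are arbitrary hypothetical values and not the MAMSE weights of the abstract. The paper's exact weighting convention may differ, for instance by scaling weights by sample size.

```python
import math


def weighted_log_likelihood(mu, samples, weights):
    """Sum over samples of weight * log-likelihood under N(mu, 1).

    samples: list of lists of observations, one list per population.
    weights: one fixed, non-negative weight per population (hypothetical).
    """
    total = 0.0
    for xs, w in zip(samples, weights):
        for x in xs:
            total += w * (-0.5 * math.log(2.0 * math.pi) - 0.5 * (x - mu) ** 2)
    return total


def weighted_mle_normal_mean(samples, weights):
    """Closed-form maximizer of the objective above: a weighted mean."""
    numerator = sum(w * sum(xs) for xs, w in zip(samples, weights))
    denominator = sum(w * len(xs) for xs, w in zip(samples, weights))
    if denominator <= 0:
        raise ValueError("weights must give positive total mass")
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical data: the first list is the population of interest,
    # the second comes from a similar population and is down-weighted.
    samples = [[1.0, 2.0, 3.0], [4.0, 6.0]]
    weights = [1.0, 0.5]
    print(weighted_mle_normal_mean(samples, weights))
```

Setting the second weight to zero recovers the ordinary maximum likelihood estimate based on the first sample alone, which shows the trade-off the weights control: more borrowed information in exchange for possible bias.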