Associations between the 'Meat' and 'Nuts & Seeds' protein factors and cardiovascular outcomes were strong and could not be ascribed to other associated nutrients considered to be important for cardiovascular health. Healthy diets can be advocated based on protein sources, preferring low contributions of protein from meat and higher intakes of plant protein from nuts and seeds.
Modelling relationships between individuals is a classical question in the social sciences, and clustering individuals according to the observed patterns of interactions allows us to uncover a latent structure in the data. The stochastic block model is a popular approach for grouping individuals with respect to their social behaviour. When several relationships of various types can occur jointly between individuals, the data are represented by multiplex networks, where more than one edge can exist between a pair of nodes. We extend stochastic block models to multiplex networks to obtain a clustering based on more than one kind of relationship. We propose to estimate the parameters, such as the marginal probabilities of assignment to groups (blocks) and the matrix of probabilities of connections between groups, through a variational expectation-maximization procedure. Consistency of the estimates is studied. The number of groups is chosen by using the integrated completed likelihood criterion, which is a penalized likelihood criterion. Multiplex stochastic block models arise in many situations, but our applied example is motivated by a network of French cancer researchers. The two possible links (edges) between researchers are a direct connection or a connection through their laboratories. Our results show strong interactions between these two kinds of connection, and the groups that are obtained are discussed to emphasize the common features of researchers grouped together.
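To make the generative side of this model concrete, the following is a minimal simulation sketch of a two-layer multiplex stochastic block model. It is an illustration under stated assumptions, not the authors' implementation: the function name `simulate_multiplex_sbm`, the encoding of the four joint edge states, and the parameter values are all hypothetical, and the variational EM and ICL steps are not shown.

```python
import numpy as np


def simulate_multiplex_sbm(n, alpha, pi, seed=0):
    """Simulate an undirected two-layer multiplex SBM (illustrative sketch).

    alpha : (Q,) block proportions, summing to 1.
    pi    : (Q, Q, 4) array; pi[q, l] is a distribution over the joint edge
            states (0,0), (1,0), (0,1), (1,1) of a dyad between blocks q and l.
            For an undirected network, pi[q, l] is assumed equal to pi[l, q].
    """
    rng = np.random.default_rng(seed)
    alpha = np.asarray(alpha, dtype=float)
    pi = np.asarray(pi, dtype=float)

    # Latent block membership of each node.
    z = rng.choice(len(alpha), size=n, p=alpha)

    # layers[0] and layers[1] hold the two kinds of relationship.
    layers = np.zeros((2, n, n), dtype=int)
    for i in range(n):
        for j in range(i + 1, n):
            # Draw both edges of the dyad jointly, so the layers can be dependent.
            state = rng.choice(4, p=pi[z[i], z[j]])
            layers[0, i, j] = layers[0, j, i] = state % 2
            layers[1, i, j] = layers[1, j, i] = state // 2
    return z, layers


if __name__ == "__main__":
    # Hypothetical parameters: two blocks, denser and more strongly
    # co-occurring edges within blocks than between them.
    within = [0.5, 0.1, 0.1, 0.3]
    between = [0.9, 0.04, 0.04, 0.02]
    pi = np.array([[within, between], [between, within]])
    z, layers = simulate_multiplex_sbm(50, [0.5, 0.5], pi, seed=1)
    print(layers.shape, np.bincount(z, minlength=2))
```

Drawing the two edges of a dyad from a single four-state distribution, rather than from two independent Bernoulli variables, is what lets the sketch represent interactions between the two kinds of connection.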
This paper deals with non-observed dyads during the sampling of a network and the resulting issues in the inference of the Stochastic Block Model (SBM). We review sampling designs and recover Missing At Random (MAR) and Not Missing At Random (NMAR) conditions for the SBM. We introduce variants of the variational EM algorithm for inferring the SBM under various sampling designs (MAR and NMAR), all available as an R package. Model selection criteria based on Integrated Classification Likelihood are derived for selecting both the number of blocks and the sampling design. We investigate the accuracy and the range of applicability of these algorithms with simulations. We explore two real-world networks from ethnology (seed circulation network) and biology (protein-protein interaction network), where the interpretations depend considerably on the sampling designs considered.
Keywords: Stochastic Block Model · Variational inference · Missing data · Sampled network
arXiv:1707.04141v6 [stat.ME] 9 Jan 2019
Footnote 1: More complex sampling schemes, for instance adversarial strategies, are thus not handled.
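As a rough illustration of the simplest MAR design, the sketch below observes each dyad independently with a fixed probability and then estimates block connectivity from the observed dyads only. It is a simplified sketch and not the R package mentioned above: the function names `sample_dyads_mar` and `block_connectivity` are hypothetical, and the block labels are assumed known, whereas the paper infers them with variational EM.

```python
import numpy as np


def sample_dyads_mar(adj, rho, seed=0):
    """Observe each dyad independently with probability rho (a MAR design).

    Returns the sampled matrix, with NaN for unobserved dyads and for the
    diagonal, together with the boolean mask of observed dyads.
    """
    rng = np.random.default_rng(seed)
    n = adj.shape[0]
    # Decide observation once per unordered pair, then symmetrize.
    upper = np.triu(rng.random((n, n)) < rho, k=1)
    observed = upper | upper.T
    sampled = np.where(observed, adj.astype(float), np.nan)
    return sampled, observed


def block_connectivity(sampled, observed, z, n_blocks):
    """Edge frequency between blocks, computed on observed dyads only."""
    z = np.asarray(z)
    est = np.full((n_blocks, n_blocks), np.nan)
    for q in range(n_blocks):
        for l in range(n_blocks):
            mask = observed & np.outer(z == q, z == l)
            if mask.any():
                est[q, l] = sampled[mask].mean()
    return est


if __name__ == "__main__":
    # Toy network: two blocks, edges within blocks only (hypothetical data).
    z = np.repeat([0, 1], 10)
    adj = (z[:, None] == z[None, :]).astype(int)
    np.fill_diagonal(adj, 0)
    sampled, observed = sample_dyads_mar(adj, rho=0.5, seed=1)
    print(block_connectivity(sampled, observed, z, 2))
```

Because observation here does not depend on whether an edge is present, ignoring the unobserved dyads leaves the frequency estimates unbiased; under an NMAR design this shortcut would no longer be justified.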