Exploring statistics of locally connected subgraph patterns (also known as network motifs) has helped researchers better understand the structure and function of biological networks and Online Social Networks (OSNs). Nowadays, the massive size of some critical networks, often stored in already overloaded relational databases, effectively limits the rate at which nodes and edges can be explored, making it a challenge to accurately discover subgraph statistics. In this work, we propose sampling methods to accurately estimate subgraph statistics from as few queried nodes as possible. We present sampling algorithms that efficiently and accurately estimate subgraph properties of massive networks. Our algorithms require no precomputation or complete network topology information. At the same time, we provide theoretical guarantees of convergence. We perform experiments using widely known datasets and show that, for the same accuracy, our algorithms require an order of magnitude fewer queries (samples) than the current state-of-the-art algorithms.
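As an illustration of estimating a subgraph statistic from neighbor queries alone, the following is a minimal sketch of a random-walk wedge-sampling estimator for the fraction of closed wedges (transitivity). It is a simple stand-in chosen for clarity, not the algorithms proposed in the work above. The function name `estimate_transitivity` and the query callback `neighbors` are hypothetical, and the graph is assumed to be undirected, connected, and non-bipartite.

```python
import random


def estimate_transitivity(neighbors, start, steps, seed=0):
    """Estimate (closed wedges) / (all wedges) with a simple random walk.

    `neighbors(v)` is a hypothetical query returning the list of v's
    neighbors; it is the only access to the graph that is assumed.
    """
    rng = random.Random(seed)
    v = start
    weighted_closed = 0.0
    weight_total = 0.0
    for _ in range(steps):
        nbrs = neighbors(v)
        d = len(nbrs)
        if d >= 2:
            # Sample one wedge centered at v: two distinct neighbors.
            a, b = rng.sample(nbrs, 2)
            # The walk visits v with probability proportional to d, and v
            # has d*(d-1)/2 wedges, so weighting each sample by (d-1)
            # makes every wedge count equally in the long run.
            w = d - 1
            weight_total += w
            if b in neighbors(a):
                weighted_closed += w
        # Move to a uniformly random neighbor (assumes d >= 1).
        v = rng.choice(nbrs)
    return weighted_closed / weight_total if weight_total else 0.0


# Usage: a triangle 0-1-2 with a pendant node 3 attached to node 2.
# It has 5 wedges, 3 of them closed, so the true value is 0.6.
paw = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
print(estimate_transitivity(lambda v: paw[v], start=0, steps=20000, seed=1))
```

The result is a ratio estimator, so it is only asymptotically unbiased, and this sketch applies no burn-in period to the walk.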
Counting 3-, 4-, and 5-node graphlets in graphs is important for graph mining applications such as discovering abnormal/evolution patterns in social and biological networks. In addition, it has recently been widely used for computing similarities between graphs and for graph classification applications such as protein function prediction and malware detection. However, it is challenging to compute these metrics for a large graph or a large set of graphs due to the combinatorial nature of the problem. Despite recent efforts in counting triangles (a 3-node graphlet) and 4-node graphlets, little attention has been paid to characterizing 5-node graphlets. In this paper, we develop a computationally efficient sampling method to estimate 5-node graphlet counts. We not only provide fast sampling methods and unbiased estimators of graphlet counts, but also derive simple yet exact formulas for the variances of the estimators, which are of great value in practice: the variances can be used to bound the estimates' errors and determine the smallest necessary sampling budget for a desired accuracy. We conduct experiments on a variety of real-world datasets, and the results show that our method is several orders of magnitude faster than the state-of-the-art methods with the same accuracy.
Index Terms: graphlet kernel, subgraph sampling, graph mining.
1 INTRODUCTION
For complex networks such as online social networks (OSNs), computer networks, and biological networks, designing tools for estimating the counts (or frequencies) of 3-, 4-, and 5-node connected subgraph patterns (i.e., graphlets) shown in Fig. 1 is fundamental for detecting evolution and anomaly patterns in a large graph and computing graph similarities for graph classification, which have been widely used for a variety of graph mining and learning tasks. To explore patterns in a large graph, Milo et al. [1] defined network motifs as graphlets occurring in networks at numbers that are significantly larger than those found in random networks. Network motifs have been used for pattern recognition in gene expression profiling [2], evolution patterns in OSNs [3]-[6], and Internet traffic classification and anomaly detection [7,8]. In addition to mining a single large graph, graphlet counts have also been used to classify a large number of graphs. The graphlet kernel [9] (the dot product of two vectors of normalized graphlet counts) and RGF-distance [10]
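The graphlet kernel mentioned in the introduction above is the dot product of two vectors of normalized graphlet counts. The following is a minimal sketch restricted to the two connected 3-node graphlets (the open path and the triangle), counted exactly by brute force on small graphs. The function names are hypothetical, and normalizing by the sum of the counts is an assumption made here, since the text does not specify the normalization.

```python
from itertools import combinations


def three_node_graphlet_counts(edges):
    """Return [open paths, triangles], counted as induced subgraphs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    paths = triangles = 0
    for a, b, c in combinations(sorted(adj), 3):
        # Number of edges among the three nodes.
        k = (b in adj[a]) + (c in adj[a]) + (c in adj[b])
        if k == 2:
            paths += 1
        elif k == 3:
            triangles += 1
    return [paths, triangles]


def normalize(counts):
    """Assumed normalization: divide by the total count."""
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(counts)


def graphlet_kernel(edges_g, edges_h):
    """Dot product of the two normalized graphlet count vectors."""
    f = normalize(three_node_graphlet_counts(edges_g))
    g = normalize(three_node_graphlet_counts(edges_h))
    return sum(x * y for x, y in zip(f, g))


triangle = [(0, 1), (1, 2), (0, 2)]
path = [(0, 1), (1, 2)]
print(graphlet_kernel(triangle, triangle))  # 1.0
print(graphlet_kernel(triangle, path))      # 0.0
```

Brute-force enumeration over all node triples is cubic in the number of nodes, which is exactly the cost that sampling-based estimators of graphlet counts aim to avoid on large graphs.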
Legal Judgment Prediction (LJP) is the task of automatically predicting a law case's judgment results given a text describing its facts, which has excellent prospects in judicial assistance systems and convenient services for the public. In practice, confusing charges are frequent, because law cases applicable to similar law articles are easily misjudged. To address this issue, the existing method relies heavily on domain experts, which hinders its application in different legal systems. In this paper, we present an end-to-end model, LADAN, to solve the task of LJP. To distinguish confusing charges, we propose a novel graph neural network to automatically learn subtle differences between confusing law articles, and we design a novel attention mechanism that fully exploits the learned differences to attentively extract compelling discriminative features from fact descriptions. Experiments conducted on real-world datasets demonstrate the superiority of our LADAN.
Characterizing motif (i.e., locally connected subgraph pattern) statistics is important for understanding complex networks such as online social networks and communication networks. Previous work made the strong assumption that the graph topology of interest is known in advance. In practice, researchers sometimes have to deal with situations where the graph topology is unknown because it is expensive to collect and store all topological and meta information. Hence, typically what is available to researchers is only a snapshot of the graph, i.e., a subgraph of the graph. Crawling methods such as breadth-first sampling can be used to generate the snapshot. However, these methods fail to sample a streaming graph represented as a high-speed stream of edges. Therefore, graph mining applications such as network traffic monitoring use random edge sampling (i.e., sample each edge with a fixed probability) to collect edges and generate a sampled graph, which we call a "RESampled graph". Clearly, a RESampled graph's motif statistics may be quite different from those of the underlying original graph. To resolve this, we propose a framework and implement a system called Minfer, which takes the given RESampled graph and accurately infers the underlying graph's motif statistics. We also apply Fisher information to bound the errors of our estimates. Experiments using large-scale datasets show the accuracy and efficiency of our method.
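To make the gap between a RESampled graph and the original graph concrete, the following is a minimal sketch for triangles only. When each edge is kept independently with probability p, a triangle survives with probability p^3, so dividing the sampled triangle count by p^3 gives an unbiased estimate. This inverse-probability scaling is an illustrative baseline, not the Minfer framework itself, and all function names are hypothetical.

```python
import random
from itertools import combinations


def count_triangles(edges):
    """Exact triangle count; assumes unique undirected edges, no self-loops."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    # Each triangle is found once per edge, i.e. three times in total.
    return sum(len(adj[u] & adj[v]) for u, v in edges) // 3


def random_edge_sample(edges, p, rng):
    """Keep each edge independently with fixed probability p."""
    return [e for e in edges if rng.random() < p]


def estimate_triangles(edges, p, seed=0):
    """Scale the sampled count by 1 / p**3 (a triangle needs all 3 edges)."""
    rng = random.Random(seed)
    sampled = random_edge_sample(edges, p, rng)
    return count_triangles(sampled) / p ** 3


# Usage: the complete graph on 12 nodes has C(12, 3) = 220 triangles.
k12 = list(combinations(range(12), 2))
estimates = [estimate_triangles(k12, p=0.5, seed=s) for s in range(500)]
print(sum(estimates) / len(estimates))  # close to 220 on average
```

A single estimate can be far from the true count because triangles that share an edge are kept or lost together, which is one reason error bounds on the estimates matter.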
BACKGROUND Cirrhosis is a major risk factor for the development of hepatocellular carcinoma (HCC). Portal vein thrombosis is not uncommon after splenectomy in cirrhotic patients, and many such patients take oral anticoagulants including aspirin. However, the long-term impact of postoperative aspirin on cirrhotic patients after splenectomy remains unknown.
AIM The main purpose of this study was to investigate the effect of postoperative long-term low-dose aspirin administration on the development of HCC and long-term survival of cirrhotic patients after splenectomy.
METHODS The clinical data of 264 adult patients with viral hepatitis-related cirrhosis who underwent splenectomy at the First Affiliated Hospital of Xi’an Jiaotong University from January 2000 to December 2014 were analyzed retrospectively. Among these patients, 59 who started taking 100 mg/d aspirin within seven days were enrolled in the aspirin group. The incidence of HCC and overall survival were analyzed.
RESULTS During follow-up, 41 (15.53%) patients developed HCC and 37 (14.02%) died due to end-stage liver disease or other serious complications. Postoperative long-term low-dose aspirin therapy reduced the incidence of HCC from 19.02% to 3.40% after splenectomy (log-rank test, P = 0.028). Univariate and multivariate analyses showed that not undertaking postoperative long-term low-dose aspirin therapy [odds ratio (OR) = 6.211, 95% confidence interval (CI): 1.142-27.324, P = 0.016] was the only independent risk factor for the development of HCC. Similarly, patients in the aspirin group survived longer than those in the control group (log-rank test, P = 0.041). Univariate and multivariate analyses showed that the only factor independently associated with improved overall survival was postoperative long-term low-dose aspirin therapy [OR = 0.218, 95% CI: 0.049-0.960, P = 0.044].
CONCLUSION In patients with viral hepatitis-related cirrhosis, long-term post-splenectomy administration of low-dose aspirin reduces the incidence of HCC and improves the long-term overall survival.
Predicting interactions between structured entities lies at the core of numerous tasks such as drug regimen and new material design. In recent years, graph neural networks have become attractive. They represent structured entities as graphs, and then extract features from each individual graph using graph convolution operations. However, these methods have some limitations: i) their networks only extract features from a fixed-size subgraph structure (i.e., a fixed-size receptive field) of each node, and ignore features in substructures of different sizes, and ii) features are extracted by considering each entity independently, which may not effectively reflect the interaction between two entities. To resolve these problems, we present MR-GNN, an end-to-end graph neural network with the following features: i) it uses a multi-resolution based architecture to extract node features from different neighborhoods of each node, and ii) it uses dual graph-state long short-term memory networks (LSTMs) to summarize local features of each graph and extract the interaction features between pairwise graphs. Experiments conducted on real-world datasets show that MR-GNN improves on the prediction performance of state-of-the-art methods.
* Corresponding Authors † Nuo Xu and Pinghui Wang contributed equally to this work.
[Figure: interaction prediction for a combined medication scheme. Allopurinol and Amoxicillin are represented as graphs to predict whether Allopurinol would increase the risk of a hypersensitivity reaction to Amoxicillin (bad scheme or not).]
arXiv:1905.09558v1 [cs.LG]
Background: Pyogenic liver abscess (PLA) is an inflammatory disease with increasing incidence. When it occurs with diabetes mellitus (DM), the risk of recurrence and mortality may increase. However, the effect of DM on the short-term prognosis of PLA patients after hospitalization remains unknown. Methods: Two hundred twenty-seven PLA patients who received treatment at the First were retrospectively enrolled. They were divided into two groups: the DM group (n = 61) and the Non-DM group (n = 166). In the DM group, an HbA1C level < 7% was considered good control of glycaemia (n = 23). The clinical characteristics and overall short-term survival were analyzed. Results: The proportion of PLA patients with DM was 26.87%. In the DM group, there was a higher incidence of hypertension and Candida spp. infection. Conservative administration and percutaneous drainage were mainly used in patients with good (60.87%) and poor (60.53%) control of glycaemia, respectively. During follow-up, 24 (10.57%) patients died due to uncontrolled systemic infections and other serious complications. Compared with PLA patients without DM, patients in the DM group had a significantly increased 6-month mortality rate after discharge (Log-Rank test, P = 0.021). Poor control of glycaemia did not reduce the six-month survival, while the recurrence rate of PLA within 3 months showed an almost 3-fold increase (13.16% vs. 4.35%). Further multivariate analyses found that DM was the only independent risk factor for the PLA six-month survival (odds ratio [OR]: 3.019, 95% confidence interval [CI]: 1.138-8.010, P = 0.026). However, the blood glucose level had no significant effect on the short-term survival of PLA patients with DM (Log-Rank test, P = 0.218). Conclusions: In PLA patients, DM increased short-term mortality, and blood glucose levels should be well controlled.
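The proportions reported above can be re-derived from the stated counts. The short check below is a sketch that uses only figures given in the abstract; the helper name `pct` is hypothetical.

```python
def pct(part, whole):
    """Percentage of `part` in `whole`, rounded to two decimals."""
    return round(100 * part / whole, 2)


total_patients = 227
dm_group = 61

print(total_patients - dm_group)   # 166 patients in the Non-DM group
print(pct(dm_group, total_patients))  # 26.87 (% of PLA patients with DM)
print(pct(24, total_patients))        # 10.57 (% who died during follow-up)
# Ratio of 3-month recurrence rates, poor vs. good glycaemic control.
print(round(13.16 / 4.35, 2))         # 3.03
```

The recurrence ratio works out to about 3.03, which matches the roughly 3-fold increase described in the results.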