With the advent of IoT, storing, indexing and querying XML data efficiently is critical. To minimize the cost of querying XML data, researchers have proposed many indexing techniques. Nearly all of these techniques partition the XML data into a number of data-streams. To evaluate a query, existing twig pattern matching algorithms process a subset of the data-streams simultaneously. Processing many data-streams simultaneously results in some or all of the following four problems: accessing many data nodes that do not appear in the final solution of a given query, generating duplicate results, generating a huge number of intermediate results, and incurring the cost of merging those intermediate results. To the best of our knowledge, all existing twig pattern matching algorithms suffer from some or all of these problems. This paper proposes a new twig pattern matching algorithm called MatchQTP, which processes one data-stream at a time and avoids all four problems. It also proposes a new indexing technique called RLP-Index and a new XML node labeling scheme called RLP-Scheme, both of which are used by MatchQTP. Unlike existing indexing techniques, RLP-Index stores only a subset of the data nodes; the rest can be generated efficiently. This minimizes storage space utilization and query processing time and makes RLP-Index the first of its kind. Many experiments were conducted to study the performance of MatchQTP. The results show that MatchQTP is very efficient and highly scalable. It was also compared with four algorithms: three that are frequently used in the literature as baselines for new algorithms, and the current state-of-the-art algorithm. MatchQTP significantly and consistently outperformed all of them.
INDEX TERMS IoT, XML indexing, twig queries, XML query processing, tree-pattern matching, node labeling.
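For readers unfamiliar with twig queries, the following minimal Python sketch illustrates what matching a twig pattern against an XML tree means. It is a naive recursive baseline written for illustration only. It is not MatchQTP, and it uses neither RLP-Index nor RLP-Scheme. The names `QNode`, `matches`, and `twig_roots`, as well as the sample document, are hypothetical and do not come from the paper.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field


@dataclass
class QNode:
    """One node of a twig query (hypothetical helper type)."""
    tag: str
    axis: str = "//"  # edge from the parent: "/" = child, "//" = descendant
    children: list = field(default_factory=list)


def matches(elem, q):
    """Return True if the twig rooted at q can be embedded at elem."""
    if elem.tag != q.tag:
        return False
    for qc in q.children:
        if qc.axis == "/":
            candidates = list(elem)  # direct children only
        else:
            # all proper descendants (iter() also yields elem itself)
            candidates = [d for d in elem.iter() if d is not elem]
        if not any(matches(c, qc) for c in candidates):
            return False
    return True


def twig_roots(root, q):
    """Return every element in the document at which the twig matches."""
    return [e for e in root.iter() if matches(e, q)]


# Assumed sample data: two books, only the first has a nested <name>.
DOC = (
    "<lib>"
    "<book><title>A</title><author><name>X</name></author></book>"
    "<book><title>B</title></book>"
    "</lib>"
)

# Twig query: book[title]//name
query = QNode("book", children=[QNode("title", "/"), QNode("name", "//")])
hits = twig_roots(ET.fromstring(DOC), query)
print([b.find("title").text for b in hits])  # prints ['A']
```

This brute-force version visits every element and rescans subtrees repeatedly, so it touches many nodes that never appear in the answer, which is the kind of wasted work the abstract lists among the problems of existing approaches.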
This paper describes the design, prototype implementation and performance characteristics of a Disease Outbreak Notification System (DONS). The prototype was implemented in a hybrid cloud environment as an online, real-time system. It detects potential outbreaks of both listed and unknown diseases. It uses data mining techniques to choose the correct algorithm to detect outbreaks of unknown diseases. Our experiments showed that the proposed system has a very high accuracy rate in choosing the correct detection algorithm. To the best of our knowledge, DONS is the first system of its kind to detect outbreaks of unknown diseases using data mining techniques.
The data needed to detect outbreaks of known and unknown diseases are often gathered from sources scattered across many geographical locations. These scattered data often exist in a wide variety of formats, structures, and models. Collecting, pre-processing, and analyzing these data to detect potential disease outbreaks is very challenging, time-consuming and error-prone. To fight disease outbreaks, healthcare practitioners, epidemiologists and researchers need to access the scattered data in a secure and timely manner. They also require a uniform and logical framework or methodology to access the relevant data. In this paper, the authors propose a federated framework for Disease Outbreak Notification Systems (DONSFed). The framework, which uses an advanced design and an XML technique patented in the US in 2016 by our team, was tested and validated as part of this work. The proposed approach enables healthcare professionals to quickly and uniformly access the data required to detect potential disease outbreaks. This research focuses on implementing a cloud-based prototype as a proof of concept to demonstrate the functionality and verify the concept of the proposed framework.