Most real systems can be represented as graphs of multi-typed components with a large number of interactions. Heterogeneous Information Networks (HINs) are interconnected structures holding data of multiple types, in which the types of nodes and edges carry rich semantic meaning. In an HIN, information may be presented using different types and forms of data yet be the same or complementary, so there is knowledge to be discovered. Terminology Knowledge Structures (TKS), as terminology products, can be sources of linguistic representations and knowledge used to enrich the HIN and to create a similarity measure that retrieves documents similar to each other, even when those documents are of different types (for example, finding medical articles that are in some way related to medical records). In this sense, this work presents the creation of a Heterogeneous Information Network using classical similarity measures, terminology products and the attributes of documents. As contributions, an algorithm called NetworkCreator was created, which builds an HIN of related documents from medical records and scientific articles, along with an algorithm called HeteSimTKSQuery, which calculates similarity measures between documents of different types in the HIN. Terminology products with meta-paths were also explored. The results were efficient, reaching on average 89% accuracy in some cases. It is important to note, however, that all HINs presented in the researched literature were constructed from only one type of data coming from a single source. The results show that the algorithms are feasible for solving the problems of HIN construction and similarity search, but they still need improvement. Future work can address the detection of node granularity in these networks and try to reduce the network construction runtime.
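The idea of relating documents of different types through shared terminology can be illustrated with a small sketch. It is not the NetworkCreator or HeteSimTKSQuery algorithm: the node types, the toy documents, and the function names (`build_hin`, `hetesim_rta`, `rank_articles`) are all hypothetical. It assumes a toy HIN in which medical records and articles are linked only through term nodes, and it computes a HeteSim-style score along the meta-path record-term-article as the cosine between the two documents' uniform random-walk distributions over terms.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical toy data: document id -> set of terminology terms it mentions.
RECORDS = {"rec1": {"diabetes", "insulin"}, "rec2": {"fracture", "cast"}}
ARTICLES = {"art1": {"diabetes", "insulin", "glucose"}, "art2": {"fracture"}}


def build_hin(records, articles):
    """Build an undirected typed graph: nodes are (type, id) tuples."""
    hin = defaultdict(set)
    for node_type, docs in (("record", records), ("article", articles)):
        for doc_id, terms in docs.items():
            for term in terms:
                hin[(node_type, doc_id)].add(("term", term))
                hin[("term", term)].add((node_type, doc_id))
    return hin


def transition(hin, node):
    """Uniform random-walk probabilities from a document node to its terms."""
    neighbors = hin.get(node, set())
    return {n: 1.0 / len(neighbors) for n in neighbors}


def hetesim_rta(hin, record_id, article_id):
    """HeteSim-style score for the meta-path record-term-article.

    Both documents walk one step to the shared middle type (term); the
    score is the cosine of the two resulting probability vectors.
    """
    p = transition(hin, ("record", record_id))
    q = transition(hin, ("article", article_id))
    dot = sum(prob * q.get(term, 0.0) for term, prob in p.items())
    norm = sqrt(sum(v * v for v in p.values())) * sqrt(sum(v * v for v in q.values()))
    return dot / norm if norm else 0.0


def rank_articles(hin, record_id, article_ids):
    """Return (article, score) pairs sorted from most to least similar."""
    scored = [(a, hetesim_rta(hin, record_id, a)) for a in article_ids]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


hin = build_hin(RECORDS, ARTICLES)
ranking = rank_articles(hin, "rec1", sorted(ARTICLES))
```

Here `ranking` places `art1` first for `rec1`, because the two share the terms "diabetes" and "insulin", while `art2` shares none and scores zero. A terminology product could, under the same assumption, add further term-to-term edges (for example synonyms) so that longer meta-paths connect documents that do not share literal terms.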