Soft document clustering using a novel graph covering approach

Dörpinghaus, Jens; Schaaf, Sebastian; Jacobs, Marc

doi:10.1186/s13040-018-0172-x

Cited by 8 publications

(2 citation statements)

References 38 publications

(42 reference statements)


“…Clustering is usually not perceived as a graph problem, although several attempts have been made (e.g. [30]) and here we will show how to generalize it on knowledge graphs. Usually the problem can be formulated in the following way: Given a similarity function for the document or data space D as sim : D × D → ℝ⁺ and an ε ∈ ℝ⁺.…”

Section: Document or Data Clustering (mentioning)

confidence: 99%

Knowledge Detection and Discovery using Semantic Graph Embeddings on Large Knowledge Graphs generated on Text Mining Results

Dörpinghaus

Jacobs

2020

Proceedings of the 2020 Federated Conference on Computer Science and Information Systems

Self Cite


Knowledge graphs play a central role in big data integration, especially for connecting data from different domains. Bringing unstructured texts, e.g. from scientific literature, into a structured, comparable format is one of the key assets. Here, we use knowledge graphs in the biomedical domain together with text-mining-based document data for knowledge extraction and retrieval from text and natural language structures. For example, cause-and-effect models can potentially facilitate clinical decision making or help to drive research towards precision medicine. However, the power of knowledge graphs critically depends on context information. Here we provide a novel semantic approach towards a context-enriched biomedical knowledge graph utilizing data integration with linked data applied to language technologies and text mining. This graph concept can be used for graph embedding applied in different approaches, e.g. with a focus on topic detection, document clustering and knowledge discovery. We discuss algorithmic approaches to tackle these challenges and show results for several applications like search query finding and knowledge discovery. The presented remarkable approaches lead to valuable results on large knowledge graphs.


“…Text clustering, which is also called document clustering [59], is a multidisciplinary clustering technique based on information retrieval, natural language processing and machine learning [38]. In document clustering, document collections are grouped so that documents in the same group have similar topics [39].…”

Section: Text Processing (mentioning)

confidence: 99%

Spectral Clustering Approximation For Large Scale Crew Disruption Data Of An Airline Company For Intelligent Crew Recovery

Herekoglu

Kabak

2023

J. Soft. Comput. Decis. Anal.


In the airline industry, crew costs are the second-highest cost item after fuel. For this reason, an airline needs to manage the valuable crew resource efficiently. Deviations from plans are a fact of the airline business, and fixing deviations from crew schedules that occur during operations, while minimizing crew-related delays and associated costs, is one of the most important operational burdens of airlines. In this context, the analysis of crew disruption data is vital in order to find disruption characteristics. Clustering analysis is one of the key methods for analyzing the disruption characteristics. Although there have been satisfactory studies in the literature and applications in the industry for small and medium-sized airlines, there is no good solution or industry practice for airlines with extensive networks and fleets. This study aims to analyze and categorize large-scale crew disruption data of a European airline. The relationships between categories of crew disruption and variables such as flight and crew types are determined, and the disruption characteristics are revealed. For this purpose, clusters hidden in the large data set are extracted by spectral clustering. Due to the large size of the input data, a new approximation approach for spectral clustering is introduced. With the help of this new approximation approach, spectral clustering techniques are applied within limited computational power and time, as most real-world scenarios require. Even though the data set is gathered from one airline, the characteristics derived from the data represent most of the cases an airline may face today and will serve as a basis for further estimation and analysis of crew disruption.


Topic Modelling-Based Approach for Clustering Legal Documents

Halgekar

Khankhoje

et al. 2022

Information and Communication Technology for Competitive Strategies (ICTCS 2021)

