Large-scale Data Exploration Using Explanatory Regression Functions

Savva, Fotis; Anagnostopoulos, Christos; Triantafillou, Peter; Kolomvatsos, Kostas

doi:10.1145/3410448

Cited by 4 publications

(3 citation statements)

References 49 publications


“…Hence, a node selection mechanism is required to determine which node could improve the model performance and which can lead to a model that can forget what it has learned. Unfortunately, upon any incoming analytics query [13], we do not have full access to data of all participants; thus, we cannot extract the data space and patterns. In order to extract this knowledge per incoming query, it is deemed appropriate to define a pre-test mechanism to check if participants have a similar or different data patterns and relevant to the current query or not.…”

Section: Related Work and Rationale (mentioning)

confidence: 99%

Query-driven Edge Node Selection in Distributed Learning Environments

Aladwani

Anagnostopoulos

Kolomvatsos

et al. 2023

2023 IEEE 39th International Conference on Data Engineering Workshops (ICDEW)

Self Cite


“…For a particular data mining application, the importance of the prediction and description may differ some times. The goals of description and prediction are accomplished by using variety of mining methods of data [7] [8].…”

Section: Introduction (mentioning)

confidence: 99%

CFLCA: High Performance based Heart disease Prediction System using Fuzzy Learning with Neural Networks

Damodharan

Goel

Agrawal

et al. 2023

IJRITCC


Human diseases are increasing rapidly in today's generation, mainly due to lifestyle factors such as poor diet, lack of exercise, and drug and alcohol consumption. The most widespread is heart disease, which is, directly or indirectly, behind around 80% of deaths. In the future (approximately after 10 years), the largest number of people may die of heart disease. For these reasons, many researchers have proposed remedies and data-analysis technologies for diagnosing heart disease using the large volume of related medical data. The field of medicine regularly receives a very wide range of medical data in the form of text, image, audio, video, signal packets, etc. Such databases contain raw datasets with inconsistent and redundant data. The health care system is no doubt very rich in stored data but at the same time very poor at extracting knowledge from it. Data mining (DM) methods can help extract valuable knowledge by applying DM techniques such as clustering, regression, segmentation, and classification. When the collected dataset becomes larger and more complex, data mining and clustering algorithms (D-Tree, Neural Networks, K-means, etc.) are used. The proposed Cognitive Fuzzy Learning based Clustering Algorithm (CFLCA) aims to improve accuracy and precision. The CFLCA methodology creates advanced meta indexing for n-dimensional unstructured data. The heart disease dataset, used after data enrichment and feature engineering with the UCI machine learning algorithm, attains a high level of accuracy and prediction rate. The proposed CFLCA algorithm achieves high accuracy, precision, and recall in data analysis for heart disease detection.


“…Typical use case scenarios include: (i) data scientists analyzing a large volume of data to obtain in-depth knowledge about the data for follow-up tasks like, data trend explanation (Savva et al, 2020), report summarization (Marcel and Negre, 2011) and (ii) users gathering information by discovering scholarly articles to conduct literature review using online services (Krause and Guestrin, 2011). These tasks can be conceptualized as the system recommending queries and the user either accepting or rejecting these recommendations, thus forming a closed loop interactive environment.…”

Section: Introduction (mentioning)

confidence: 99%

Max-Utility Based Arm Selection Strategy For Sequential Query Recommendations

Parambath

Anagnostopoulos

Murray-Smith

et al. 2021

Preprint

Self Cite


We consider the query recommendation problem in closed loop interactive learning settings like online information gathering and exploratory analytics. The problem can be naturally modelled using the Multi-Armed Bandits (MAB) framework with countably many arms. The standard MAB algorithms for countably many arms begin with selecting a random set of candidate arms and then applying standard MAB algorithms, e.g., UCB, on this candidate set downstream. We show that such a selection strategy often results in higher cumulative regret and to this end, we propose a selection strategy based on the maximum utility of the arms. We show that in tasks like online information gathering, where sequential query recommendations are employed, the sequences of queries are correlated and the number of potentially optimal queries can be reduced to a manageable size by selecting queries with maximum utility with respect to the currently executing query. Our experimental results using a recent real online literature discovery service log file demonstrate that the proposed arm selection strategy improves the cumulative regret substantially with respect to the state-of-the-art baseline algorithms. Our data model and source code are available at https://anonymous.4open.science/r/0e5ad6b7-ac02-4577-9212-c9d505d3dbdb/.
