2019
DOI: 10.1186/s13174-019-0121-z

DOD-ETL: distributed on-demand ETL for near real-time business intelligence

Abstract: The competitive dynamics of the globalized market demand information on the internal and external reality of corporations. Information is a precious asset and is responsible for establishing key advantages to enable companies to maintain their leadership. However, reliable, rich information is no longer the only goal. The time frame to extract information from data determines its usefulness. This work proposes DOD-ETL, a tool that addresses, in an innovative manner, the main bottleneck in Business Intelligence…

Cited by 25 publications (11 citation statements)
References 21 publications
“…Generating materialized and virtualized RDF is constrained with respect to execution time [30], computing resources [18,30], bandwidth [31], performance [32], and query execution [31] because producers do not know which generation approach is the most suitable given its own resources and the RDF use. Since the producer needs to provide most resources for generating RDF from heterogeneous data sources and answering consumers' queries, guidelines for selecting the right approach are needed to minimize its effort.…”
Section: Problem Statement and Contributions (citation type: mentioning)
confidence: 99%
“…1) Requirements/challenges for real-time stream processing for real-time DWH Following requirements and challenges for implementation of real-time stream processing for real-time DWH were identified after exploring various studies [4], [14]-[18], [20]-[31], [34]-[36], [39]-[41], [43]-[50], [52], [59], [63]-[70], [78]:…”
Section: Assessment Of Rq3: Which Approaches/tools Have Been Repor (citation type: mentioning)
confidence: 99%
“…• stream-disk join for structured data
• stream-stream join
• sql query decomposition
• multi-join query processing in cloud DWHs
• survey of design approaches from distributed systems, social media and real-time ETL tools
• architecture/framework for supporting distributed streaming ETL and data integration in real-time DWH
• development of stream ETL engine
• distributed on demand ETL framework
• code-based real-time ETL tools

Other emerging concept related to near real-time ETL has been addressed recently in [78]. They identified and proposed a solution for distributed on demand ETL, and developed a stream processing framework based on Kafka, Beam and Spark Streaming.…”
Section: Relational (citation type: mentioning)
confidence: 99%
“…To improve the performance of data warehousing a conceptual model is presented, which defines different dimensionality and stereotypes. Similarly, a Data on Demand ETL (DOD-ETL) model is proposed in Machado et al. (Dec. 2019), which combines on demand data stream and pipelines in distributed and parallel memory caches to support effective portioning of data. An event-driven architecture is presented in Rieke et al. (2018), to manage spatial data obtained from different capturing devices to maintain the geospatial information according to the real-time data (Bouali et al. 2019).…”
Section: Introduction (citation type: mentioning)
confidence: 99%