OATAO is an open access repository that collects the work of Toulouse researchers and makes it freely available over the web where possible. This is an author-deposited version published in: http://oatao.univ-toulouse.fr/ (Eprints ID: 17173). The contribution was presented at MEDI 2016: http://indalog.ual.es/MEDI2016/HOME.html
Introduction

In today's highly dynamic business context, decision-makers need access to both internal and external sources to obtain an overall perspective on an organization [2]. Data Warehouses (DWs) have been widely used as internal sources to support online, interactive analyses, while Linked Open Data (LOD)¹ have become one of the most important external information sources, enhancing business analyses on a web scale [12]. However, warehoused data and LOD follow different models in each domain, which makes it difficult to analyze both types of data in a unified way. Moreover, because related data are dispersed across different schemas, analysts must search repeatedly for relevant information in different sources, which reduces the efficiency of analysis.

¹ http://linkeddata.org

Motivating example. In a company selling home appliances, a decision-maker consults an internal R-OLAP DW to assess the performance of the sales staff. The DW relates to an analysis subject (i.e. fact), named Sales Analysis, which contains a set of numeric indicators (i.e. measures), namely unit price and quantity. Each measure can be computed according to three analysis axes (i.e. dimensions): salesman, product and time (cf. figure 1(a)). The R-OLAP DW alone does not provide enough information to support effective and well-informed decisions. The decision-maker must search for additional information to obtain complementary perspectives on the sales activities. Since the sales of some home appliances (e.g., heaters) are strongly influenced by changes in climate, the decision-maker browses an online dataset, denoted LOD1, that gives the monthly average temperature by country. The LOD are published in the RDF Data Cube Vocabulary (QB)² format, which is the current W3C standard for publishing multidimensional statistical data.
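The two schemas described so far can be written down side by side as a minimal Python sketch. The `CubeSchema` class and the `shared_dimensions` helper are hypothetical names introduced here for illustration only, and the dimension names given to LOD1 ("country", "month") are assumptions read off the phrase "monthly average temperature by country", not identifiers taken from the dataset.

```python
from dataclasses import dataclass
from typing import Set, Tuple


@dataclass(frozen=True)
class CubeSchema:
    """Hypothetical, simplified view of a multidimensional schema."""
    name: str
    measures: Tuple[str, ...]
    dimensions: Tuple[str, ...]


# Internal R-OLAP DW, as described in the motivating example.
sales_analysis = CubeSchema(
    name="Sales Analysis",
    measures=("unit price", "quantity"),
    dimensions=("salesman", "product", "time"),
)

# LOD1; the dimension names are assumptions, not taken from the dataset.
lod1 = CubeSchema(
    name="LOD1",
    measures=("average temperature",),
    dimensions=("country", "month"),
)


def shared_dimensions(a: CubeSchema, b: CubeSchema) -> Set[str]:
    """Return the dimension names that appear verbatim in both schemas."""
    return set(a.dimensions) & set(b.dimensions)


# Naive name matching finds nothing in common between the two schemas.
print(shared_dimensions(sales_analysis, lod1))
```

Under these assumed names, matching dimensions by name alone finds nothing in common, even though "month" is plausibly a level of the "time" axis. This is one way to see why the two sources cannot be analyzed together without an explicit correspondence between their schemas.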
Moreover, since retail sales may compete with the company's promotions in the same catchment area, the decision-maker consults another online dataset, denoted LOD2, about the outlet prices offered by rival retailers. The LOD2 dataset is published in QB4OLAP; it gives the retail price for a class (i.e., type) of merchandise offered by a retailer's shop. Extracts of the LOD datasets in tabular form are available in figure 1(b) and (c).

Without a comprehensive representation of related data, analyses involving several sources are carried out sequentially. Decision-makers must explore all data sources one after another before obtaining an overall view of an analysis subject. Carrying out such analyses is inefficient and difficult, because the schemas do not all include the same information at the same analytical granularities: (a) the same analysis axes present in different sources may in...