Organising historical data

A database is often needed to provide long-term historical storage of data coming in from multiple systems: ERP, OSS, CRM, legacy applications and the Data Warehouse. Let us call it the Household data set.

From a design point of view, the Household takes the form of a set of normalized tables that support one or more functional areas of the business. The data contained in such a database is detail-oriented, historically tracked and uniquely linked.
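As a minimal sketch of what such a table can look like (all names and columns below are hypothetical), a detail-oriented, history-tracked customer table could be declared as follows:

```sql
-- Hypothetical normalized table from the customer functional area.
-- Each row carries its own validity interval, so the full history
-- of every customer record is preserved and uniquely keyed.
CREATE TABLE customer_history (
    customer_key   BIGINT       NOT NULL,  -- surrogate key, unique per version
    customer_id    VARCHAR(32)  NOT NULL,  -- business key from the source system
    source_system  VARCHAR(16)  NOT NULL,  -- ERP, CRM, OSS, ...
    customer_name  VARCHAR(200) NOT NULL,
    segment        VARCHAR(50),
    valid_from     DATE         NOT NULL,  -- start of this version's validity
    valid_to       DATE,                   -- NULL while the version is current
    PRIMARY KEY (customer_key)
);
```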

[Figure: dtx_bi_models]

In line with many data-driven business transformation practices, most companies today capture data to build Households for applications that look at historical data for many purposes, including Data Vault modeling, auditability, transparency, traceability, governance and review.

Challenges and issues

Today, most Household implementations turn into a structured project, often planned as part of a wider Data Lake or Master Data Management initiative.

Unfortunately, apart from the modeling aspects, these projects have to deal with issues such as auditing, data tracing, loading speed and resilience to application changes. For these reasons, the data management capabilities of proprietary BI platforms (Pentaho, Qlik, SAS, etc.) often cannot sustain the resulting data preparation or access workload.

To get around these issues, a typical approach is to build the Household in a secondary RDBMS or in a database appliance that is filled by complex ETL or data preparation jobs. But designing, optimizing and managing these databases simply takes too much time. Furthermore, such a workaround goes against what a platform like Qlik or SAS should give you: agility and flexibility.

Worse, analyzing data directly on such a Household inherently requires very complex SQL statements to be generated and run, because the data is distributed across many database tables. This inevitably leads to poor performance.
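As a rough illustration of the problem (table and column names are hypothetical), a question as simple as "revenue per customer segment for a given month" already forces the query to walk several normalized, history-tracked tables:

```sql
-- Hypothetical query over a normalized Household: every fact needs joins
-- back through history tables, each with its own validity window.
SELECT c.segment,
       SUM(ol.quantity * ol.unit_price) AS revenue
FROM   order_header     oh
JOIN   order_line       ol ON ol.order_key = oh.order_key
JOIN   customer_history c  ON c.customer_id = oh.customer_id
                          AND oh.order_date >= c.valid_from
                          AND (c.valid_to IS NULL OR oh.order_date < c.valid_to)
JOIN   order_status_history s
       ON s.order_key = oh.order_key
      AND s.valid_to IS NULL            -- current status only
      AND s.status   = 'SHIPPED'
WHERE  oh.order_date >= DATE '2020-01-01'
AND    oh.order_date <  DATE '2020-02-01'
GROUP  BY c.segment;
```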

The Solution

We have experimented with a solution that integrates the datonix data scan capability with WordPress.

Using datonix, Household modeling becomes much more agile and flexible than ever before.

The ETL is simplified, since all that is needed is a Scan layer implemented on top of your input data. The datonix technology automatically builds hubs, satellites and links using its Data Discovery Fractal Engine, and thanks to its automatic, intelligent fractal-based optimization, queries on the Scan layer run more efficiently.
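For readers less familiar with the Data Vault vocabulary used here, the sketch below shows what hub, satellite and link tables typically look like in plain SQL. It is a generic illustration of the pattern, with hypothetical names, not a description of the structures the Fractal Engine builds internally:

```sql
-- Hub: one row per business key, and nothing else.
CREATE TABLE hub_customer (
    customer_hkey  BIGINT      NOT NULL PRIMARY KEY,
    customer_id    VARCHAR(32) NOT NULL,   -- business key
    load_ts        TIMESTAMP   NOT NULL,
    record_source  VARCHAR(16) NOT NULL
);

-- Satellite: descriptive attributes, historized against the hub.
CREATE TABLE sat_customer_details (
    customer_hkey  BIGINT       NOT NULL REFERENCES hub_customer (customer_hkey),
    load_ts        TIMESTAMP    NOT NULL,
    customer_name  VARCHAR(200),
    segment        VARCHAR(50),
    record_source  VARCHAR(16)  NOT NULL,
    PRIMARY KEY (customer_hkey, load_ts)
);

-- Link: relationships between hubs (e.g. customer places order).
CREATE TABLE link_customer_order (
    link_hkey      BIGINT      NOT NULL PRIMARY KEY,
    customer_hkey  BIGINT      NOT NULL REFERENCES hub_customer (customer_hkey),
    order_hkey     BIGINT      NOT NULL,   -- would reference a hub_order table
    load_ts        TIMESTAMP   NOT NULL,
    record_source  VARCHAR(16) NOT NULL
);
```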


Once the original data warehouses have been slimmed down, because most of the historical data has been scanned, DW views can be unioned with the datonix archives in full transparency, resulting in significant space compression and performance improvement.
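A minimal sketch of how such a transparent union can be exposed, assuming a slimmed DW table dw_sales holding recent rows and a datonix archive surfaced as arch_sales (both names, and the cut-off date, are hypothetical):

```sql
-- Reporting tools keep querying a single object; behind it, recent rows
-- come from the slimmed DW and older rows from the scanned archive.
CREATE VIEW sales_all AS
SELECT order_key, customer_id, order_date, amount
FROM   dw_sales                           -- recent data kept in the DW
UNION ALL
SELECT order_key, customer_id, order_date, amount
FROM   arch_sales                         -- historical data served by datonix
WHERE  order_date < DATE '2018-01-01';    -- cut-off used when slimming the DW
```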

Existing BI and reporting have been improved by adopting the native data storytelling features of datonix.

Simple PHP blocks have been written and embedded into WordPress so that query results can be rendered inside WordPress pages with unprecedented performance.

The result is an implementation that offers maximum performance both when running queries and during development, since the overall complexity of the Household is reduced by using a simple ETL process.

With this solution you can now exploit the full potential of the Household without sacrificing the benefits of modeling techniques or having to hunt for performance workarounds.