Monday, March 3, 2014

Big Data is a Technology & Data Warehouse is an Architecture

Big Data (often equated with Hadoop) has been gaining popularity in recent years. I often hear people say that we don't need a data warehouse if we have Big Data.

I do agree that there are some similarities between a data warehouse and a big data solution.

- Both can be used for reporting.

- Both store their data on electronic storage devices.

- Both can hold a lot of data.


So if a company starts to build a Big Data solution, doesn't that obviate the need for a data warehouse?

What Big Data offers to an organization


- Technology capable of holding very large amounts of data.

- Technology that can hold the data on inexpensive storage devices.

- Technology where processing is done by the "Roman Census" method: the processing is sent to where the data lives, instead of the data being moved to a central processor.

- Technology where the data is stored in an unstructured format.
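
To make the "Roman Census" idea a little more concrete, here is a minimal sketch in plain Python. It is not how Hadoop is implemented. The `shards` dictionary, the node names, and the functions `count_locally` and `roman_census` are all made up for illustration. The point is only that the counting runs against each shard where it sits, and just the small tallies are brought together.

```python
from collections import Counter

# Hypothetical data: each "node" holds its own shard of raw records.
shards = {
    "node_a": ["good product", "bad service", "good price"],
    "node_b": ["good support", "bad product"],
    "node_c": ["good product"],
}

def count_locally(records):
    """The step that would run on the node holding the data: count words in one shard."""
    counts = Counter()
    for record in records:
        counts.update(record.split())
    return counts

def roman_census(shards):
    """Send the counting step to every shard, then merge the small tallies."""
    total = Counter()
    for records in shards.values():
        # Only the tally is combined centrally; the raw records stay in their shard.
        total += count_locally(records)
    return total

totals = roman_census(shards)
```

In a real cluster the local step would run on separate machines in parallel. Here it is a simple loop so that the example stays self-contained.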

What a Data Warehouse offers to an organization

In principle, there are two approaches to building a data warehouse: the Kimball approach and the Inmon approach.

The Inmon approach defines a data warehouse as a subject-oriented, non-volatile, integrated, time-variant collection of data created for the purpose of management decision making.
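
As a rough illustration of those four properties, here is a small Python sketch. It is not a real warehouse design, and every name in it (`CustomerSnapshot`, `CustomerSubjectArea`, `load`, `region_as_of`) is hypothetical. Rows are grouped around one subject (customers), cleaned into one convention on load (integrated), never changed after loading (non-volatile), and stamped with a date (time-variant).

```python
from dataclasses import dataclass
from datetime import date

@dataclass(frozen=True)  # frozen: a loaded row is never modified (non-volatile)
class CustomerSnapshot:
    customer_id: int
    region: str
    as_of: date  # every row carries its effective date (time-variant)

class CustomerSubjectArea:
    """Hypothetical store organized around one subject: customers."""

    def __init__(self):
        self._rows = []

    def load(self, customer_id, region, as_of):
        # Integration: source systems may spell regions differently,
        # so values are normalized to one convention before they are stored.
        row = CustomerSnapshot(customer_id, region.strip().upper(), as_of)
        self._rows.append(row)  # append only, never update in place

    def region_as_of(self, customer_id, when):
        """Return the customer's region as it was known on a given date."""
        candidates = [
            r for r in self._rows
            if r.customer_id == customer_id and r.as_of <= when
        ]
        if not candidates:
            return None
        return max(candidates, key=lambda r: r.as_of).region

wh = CustomerSubjectArea()
wh.load(1, " east ", date(2013, 1, 1))
wh.load(1, "West", date(2014, 1, 1))
```

Because old rows are kept rather than overwritten, the same question can be answered for any past date, which is what makes historical reporting possible.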

In simple terms, a data warehouse provides a single version of the truth for decision making in the corporation.

Companies need a data warehouse in order to make informed decisions from data that is reliable, believable, readily available and accessible to everyone.

What Big Data offers in addition to a data warehouse

In large corporations there is a lot of data that is never transported into the data warehouse.
 
There are numerous reasons for not exporting this data to the data warehouse.

Some of this data cannot be denormalized, and some would require even more data to be imported into the data warehouse along with it.
 
For example, tweets and Facebook posts in which consumers discuss a product or service really help companies understand the consumers' opinion of that product or service.
 
By understanding this feedback, these companies can make changes accordingly.
 
If a company can turn this valuable unstructured data from various sources into meaningful information, and then combine it with the reports from its data warehouse, it can predict more accurately what its customers want and how that is reflected in its sales and revenue.
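
Here is a toy Python sketch of that combination. All of the data is invented, and the keyword-based scoring in `sentiment_by_product` is a deliberately naive stand-in for real text analytics. It only shows the shape of the idea: score the unstructured posts, then put the result next to a figure that would come from the warehouse.

```python
# Hypothetical warehouse report: units sold per product.
warehouse_sales = {"widget": 1200, "gadget": 300}

# Hypothetical raw social media posts (unstructured text).
posts = [
    "love my new widget",
    "the gadget is terrible",
    "widget works great",
    "gadget broke, terrible support",
]

# Assumed keyword lists; a real system would use proper sentiment analysis.
POSITIVE = {"love", "great"}
NEGATIVE = {"terrible", "broke"}

def sentiment_by_product(posts, products):
    """Add up a crude sentiment score for each product mentioned in the posts."""
    scores = {product: 0 for product in products}
    for post in posts:
        words = set(post.lower().replace(",", " ").split())
        score = len(words & POSITIVE) - len(words & NEGATIVE)
        for product in products:
            if product in words:
                scores[product] += score
    return scores

sentiment = sentiment_by_product(posts, warehouse_sales)

# Combine the structured report with the signal from the unstructured data.
combined = {
    product: {"units_sold": warehouse_sales[product], "sentiment": sentiment[product]}
    for product in warehouse_sales
}
```

Neither source answers the question alone: the warehouse says what sold, and the posts hint at why.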

The difference between Big Data and a data warehouse is like the difference between a hammer and a nail.
 
Big Data is a technology and a data warehouse is an architecture. A technology is just a means to store and manage large amounts of data.
 
A data warehouse is a way of organizing data so that it has credibility and integrity. For compliance reporting such as Sarbanes-Oxley, Basel II, or other kinds of regulatory reporting, we can depend on a data warehouse.

For all practical purposes, a data warehouse and Big Data have little or no relationship. To conclude: a data warehouse is an architecture, and Big Data is a technology.