How to architect the perfect Data Warehouse

Key warehousing techniques made simple

Lewis Gavin
7 min read · Jul 26, 2018
Data Warehousing: Simplified

On the surface, it may seem like a lot has changed in recent years with regard to data collection, storage and warehousing. The rise of NoSQL, “Big Data”, graph and streaming technologies may appear to have changed the landscape, but some fundamentals remain.

In my current role, we use Amazon Redshift for our data warehousing. However, whether we built a traditional data warehouse using Oracle or a data lake in Hadoop, the core architecture would remain the same.

The core architecture boils down to some preprocessing and three separate areas (schemas if you’re using Redshift) called Staging, Master and Reporting. In this post I’ll talk through each in detail.

Preprocessing

Unfortunately, not all data is created equal, but it’s data nonetheless and therefore carries value.

To deal with the complexities of external data, some preprocessing is almost inevitable, especially when collecting from a number of different sources. The main goal of the preprocessing step is to get the data into a consistent format that can be loaded by the data warehouse.
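As a rough illustration of that goal, here is a minimal Python sketch that takes records from two differently shaped sources (one CSV, one JSON) and emits a single CSV with consistent column names, number formatting and ISO dates. The source layouts, the target columns (`customer_id`, `amount`, `created_at`) and the helper names are all assumptions made up for this example, not the schema or tooling we actually use.

```python
import csv
import io
import json
from datetime import datetime

# Hypothetical target columns; a real warehouse would define its own.
TARGET_FIELDS = ["customer_id", "amount", "created_at"]


def normalise_record(raw, field_map, date_format):
    """Rename source fields and coerce values into one consistent shape."""
    renamed = {target: raw[source] for source, target in field_map.items()}
    created = datetime.strptime(renamed["created_at"], date_format)
    return {
        "customer_id": str(renamed["customer_id"]).strip(),
        "amount": f"{float(renamed['amount']):.2f}",
        "created_at": created.date().isoformat(),
    }


def to_csv(records):
    """Serialise normalised records as CSV text ready for a bulk load."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=TARGET_FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


# Two made-up sources with different field names and date formats.
csv_source = "id,total,date\n 42 ,19.5,26/07/2018\n"
json_source = '[{"customerId": 7, "amountPaid": "3", "createdAt": "2018-07-26T10:15:00"}]'

csv_rows = [
    normalise_record(
        row,
        {"id": "customer_id", "total": "amount", "date": "created_at"},
        "%d/%m/%Y",
    )
    for row in csv.DictReader(io.StringIO(csv_source))
]
json_rows = [
    normalise_record(
        row,
        {"customerId": "customer_id", "amountPaid": "amount", "createdAt": "created_at"},
        "%Y-%m-%dT%H:%M:%S",
    )
    for row in json.loads(json_source)
]

load_file = to_csv(csv_rows + json_rows)
print(load_file)
```

Each source only needs its own field mapping and date format; everything downstream sees one layout, which is the property the warehouse load relies on.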


Written by Lewis Gavin

Data and Productivity Writer — Data Architect at easyfundraising.org.uk
