Lewis Gavin
1 min read · Jan 6, 2019

Yeah, it’s a fair point, and it depends on your use case. If you want to use it as a data lake, there’s no reason you couldn’t keep the staging data forever.

In the past I’ve also kept the raw data in S3 for the reasons you mentioned. You’re right, it is better to have a copy just in case, and if it’s not being accessed regularly, you can move it to a cheaper S3 storage class to significantly reduce the cost.
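
As a rough illustration of moving rarely accessed raw data to cheaper storage, here is a minimal sketch of an S3 lifecycle rule built in Python. The `raw/` prefix, the 30-day threshold, the `GLACIER` target, and the helper names are all assumptions for the example, not a description of any particular setup.

```python
def build_archive_rule(prefix: str, days: int = 30,
                       storage_class: str = "GLACIER") -> dict:
    """Build a lifecycle configuration that transitions objects under
    `prefix` to `storage_class` once they are `days` old."""
    rule_id = "archive-" + prefix.strip("/").replace("/", "-")
    return {
        "Rules": [
            {
                "ID": rule_id,
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Transitions": [
                    {"Days": days, "StorageClass": storage_class}
                ],
            }
        ]
    }


def apply_rule(bucket: str, config: dict) -> None:
    """Apply the configuration to a bucket (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so building the rule has no AWS dependency

    s3 = boto3.client("s3")
    s3.put_bucket_lifecycle_configuration(
        Bucket=bucket, LifecycleConfiguration=config
    )


# Hypothetical usage: archive everything under raw/ after 30 days.
config = build_archive_rule("raw/", days=30)
# apply_rule("my-example-bucket", config)  # bucket name is a placeholder
```

Note that applying a lifecycle configuration this way replaces whatever configuration the bucket already has, so any existing rules would need to be included in the same call.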

Keeping lots of copies of data sitting in Redshift can be more costly, which is why I’ve recently tended to transform the data and then purge the staging area after a period of time. It’s more for cost saving than anything else, especially when working for smaller businesses.
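
To make the purge step concrete, here is a small sketch that builds a retention-based `DELETE` for a staging table. The table name `staging.events`, the `loaded_at` column, and the 90-day window are hypothetical; the statement would be run through whatever Redshift client you already use.

```python
import re

# Accept only plain identifiers, optionally schema-qualified, so nothing
# unexpected ends up interpolated into the SQL text.
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*)?$")


def purge_statement(table: str, ts_column: str, retention_days: int) -> str:
    """Return a DELETE that removes staging rows older than `retention_days`."""
    if not _IDENT.match(table) or not _IDENT.match(ts_column):
        raise ValueError("table and column must be plain identifiers")
    if retention_days <= 0:
        raise ValueError("retention_days must be positive")
    return (
        f"DELETE FROM {table} "
        f"WHERE {ts_column} < DATEADD(day, -{int(retention_days)}, GETDATE());"
    )


# Hypothetical usage: keep 90 days of staging data.
print(purge_statement("staging.events", "loaded_at", 90))
```

Deleted rows in Redshift only free up space once the table is vacuumed, so a purge like this is usually paired with a `VACUUM` on the same table.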

Written by Lewis Gavin

Data and Productivity Writer — Data Architect at easyfundraising.org.uk
