Building A Data Lakehouse On Google Cloud Platform

Whitepaper

Published January 2022

For over a decade, the technology industry has searched for optimal ways to store and analyze vast amounts of data. These storage solutions need to handle the volume, latency, resilience, and varying data access requirements demanded by organizations.

To tackle these issues, companies have been making the best of their existing technology stacks. This typically means either making a data lake behave like an interactive data warehouse, or making a data warehouse act like a data lake that processes and stores vast amounts of semi-structured data. Both approaches have resulted in unhappy users, high costs, and data duplication across the enterprise. The Google Cloud data lakehouse pattern solves these shortcomings.