“Zero-Copy” Hybrid Bursting With No App Changes

Whitepaper

Published January 2022


Many companies store data in a Hadoop Distributed File System (HDFS) cluster in their on-premises environment. As data-driven transformation efforts advance, both the amount of data stored and the number of queries are growing fast, which puts more load on the HDFS systems. As the number of frameworks for deriving insights from data has increased over the past few years, enterprise platform engineering teams have been pushed to support newer, popular frameworks on their already busy data lakes. In addition, the data lake becomes the landing zone for all enterprise data. It is not uncommon to see Hadoop-based data lakes running at beyond 100% utilization. All of this leads to very large and busy Hadoop clusters.