Performing Fast Data Analytics Using Apache Kudu - Episode 64

The Hadoop platform is purpose built for processing large, slow moving data in long-running batch jobs. As the ecosystem around it has grown, so has the need for fast data analytics on fast moving data. To fill this need the Kudu project was created with a column oriented table format that was tuned for high volumes of writes and rapid query execution across those tables. For a perfect pairing, they made it easy to connect to the Impala SQL engine. In this episode Brock Noland and Jordan Birdsell from PhData explain how Kudu is architected, how it compares to other storage systems in the Hadoop orbit, and how to start integrating it into you analytics pipeline.

2356 232

Suggested Podcasts

Comedy Central

Brigit Esselmont: Founder of Biddy Tarot, Tarot Teacher a Mentor, and Intui

Ravneet Bawa

Steve Porino

Daana Townsend & Carlos Kareem Windham

Medieval Death Trip

The Insurgents

Riveting Stories