Introduction to Apache Kudu
CDH

About This Course
This course teaches the basics of Apache Kudu, a new data storage system for the Hadoop platform that is optimized for analytical queries. It covers common Kudu use cases and Kudu architecture. Students will learn how to create, manage, and query Kudu tables, and how to develop Spark applications that use Kudu.
View the full course outline
Payment and Registration
You can purchase this course on its own, or as part of our Full Library subscription.
- Purchase this course alone
- Purchase the full OnDemand library (includes courses for developers, administrators, and data analysts)
Course Length
This course includes over one hour of video content. Students who purchase this course on its own receive up to 5 hours of lab time. Students with a full OnDemand library subscription receive 100 hours of lab time across all courses.
Course Outline
Through videos and hands-on exercises, participants will learn the basics of using Apache Kudu and integrating it with Apache Impala and Apache Spark. Topics include:
- Kudu Overview and Architecture
- Apache Kudu Tables
- Using Apache Kudu with Apache Impala
- Developing Apache Spark Applications with Apache Kudu
Audience and Prerequisites
These modules are intended for a broad audience of students involved with either software development or data analysis, including software developers, data engineers, DBAs, data scientists, and data analysts.
Students should know SQL. Familiarity with Impala is preferred but not required. Students should also know how to develop Apache Spark applications using either Python or Scala. Basic Linux experience is expected.