Cloudera Data Analyst Training

About This Course

Cloudera University’s four-day data analyst training course will teach you to apply traditional data analytics and business intelligence skills to big data tools like Apache Impala (incubating), Apache Hive, and Apache Pig. Cloudera presents the tools data professionals need to access, manipulate, transform, and analyze complex data sets using SQL and familiar scripting languages.

Apache Impala (incubating) enables instant interactive analysis of the data stored in Hadoop via a native SQL environment. Apache Hive provides a SQL-like query language, HiveQL, that makes data accessible to analysts, database administrators, and others without Java programming expertise. Apache Pig applies the fundamentals of familiar scripting languages to the Hadoop cluster.

Payment and Registration

This course is available on its own as Cloudera Data Analyst Training: Using Pig, Hive, and Impala with Hadoop - OnDemand (180-day Subscription), or as part of a complete catalog through a Cloudera University OnDemand Library (365-day Subscription).

Course Length

This course includes 6.5 hours of video content. Students who purchase this course on its own are allowed up to 20 hours of lab time. (Subscribers to the full OnDemand library are given 100 hours of lab time to use across all courses.)

Course Outline

Through videos and hands-on exercises, participants will navigate the Hadoop ecosystem, learning how to:

  • Acquire, store, and analyze data using features in Pig, Hive, and Impala
  • Perform fundamental ETL (extract, transform, and load) tasks with Hadoop tools
  • Use Pig, Hive, and Impala to improve productivity for typical analysis tasks
  • Join diverse datasets to gain valuable business insight
  • Perform interactive, complex queries on datasets

Audience and Prerequisites

This course is designed for data analysts, business intelligence specialists, developers, system architects, and database administrators. Prior knowledge of Apache Hadoop is not required. Knowledge of SQL is assumed. Basic familiarity with the Linux command line is expected. Knowledge of a scripting language (such as Bash scripting, Perl, Python, or Ruby) is helpful but not essential.


Upon completion of the course, attendees are encouraged to continue their study and register for the CCA Data Analyst exam. Certification is a great differentiator. It helps establish you as a leader in the field, providing employers and customers with tangible evidence of your skills and expertise.