About This Course
Cloudera University’s Search training course is for developers and data engineers who want to index data in Hadoop for more powerful real-time queries. Participants will learn to get more value from their data by integrating Cloudera Search with external applications.
Payment and Registration
You can purchase this course on its own, or as part of our Full Library subscription.
- Purchase this course alone
- Purchase the full OnDemand library (includes courses for developers, administrators, and data analysts)
This course includes 6 hours of video content. Students who purchase this course on its own receive up to 15 hours of lab time. Subscribers to the full OnDemand library receive 100 hours of lab time across all courses.
Through videos and hands-on exercises, participants will navigate the Hadoop ecosystem, learning how to:
- Perform batch indexing of data stored in HDFS and HBase
- Index streaming data in near real time with Flume
- Index content in multiple languages and file formats
- Process and transform incoming data with Morphlines
- Create a user interface for your index using Hue
- Integrate Cloudera Search with external applications
- Improve the Search experience using features such as faceting, highlighting, and spelling correction
Audience and Prerequisites
This course is intended for developers and data engineers with at least basic familiarity with Hadoop and experience programming in a general-purpose language such as Java, C, C++, Perl, or Python. Participants should be comfortable with the Linux command line and should be able to perform basic tasks such as creating and removing directories, viewing and changing file permissions, executing scripts, and examining file output. No prior experience with Apache Solr or Cloudera Search is required, nor is any experience with HBase or SQL.