Hadoop Training

Course Description: This training course is designed for developers who want to better understand how to create Apache Hadoop solutions. The 35-hour course gives Java programmers the training they need to create enterprise solutions using Apache Hadoop. It combines interactive lectures with extensive hands-on lab exercises.

Course Highlights

  • Write a MapReduce program using the Hadoop API.
  • Learn how to configure Hadoop on a single machine or on multiple machines.
  • Perform common Hadoop administration tasks on a Hadoop cluster.
  • Use Pig, Hive, HBase and HCatalog effectively.

Course Duration: 35 hours. Class Delivery: Online (interactive, web-based). Contents:

Big Data

  • The problem space and example applications
  • Why don't traditional approaches scale?
  • Requirements

Hadoop Background

  • Hadoop History
  • The ecosystem and stack: HDFS, MapReduce, Hive, Pig
  • Cluster architecture overview

Development Environment

  • Hadoop distribution and basic commands
  • Eclipse development

HDFS Introduction

  • The HDFS command line and web interfaces
  • The HDFS Java API (lab)

MapReduce Introduction

  • Key philosophy: move computation, not data
  • Core concepts: mappers, reducers, drivers
  • The MapReduce Java API (lab)
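
To make the mapper, reducer and driver roles listed above concrete, here is a minimal word-count sketch. It is an illustration only and not course material: it uses plain Java collections rather than the Hadoop API, and the class name WordCountSketch and its method names are hypothetical. On a real cluster the framework performs the grouping step and runs the map step on the nodes that hold the data.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java illustration only: no Hadoop classes are used here.
class WordCountSketch {

    // "Mapper": turn one input line into (word, 1) pairs.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : line.toLowerCase().split("\\s+")) {
            if (!word.isEmpty()) {
                pairs.add(Map.entry(word, 1));
            }
        }
        return pairs;
    }

    // "Reducer": sum all counts that were emitted for one key.
    static int reduce(String key, List<Integer> values) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum;
    }

    // "Driver": wire the phases together. The grouping loop stands in
    // for the shuffle that a real framework would perform.
    static Map<String, Integer> run(List<String> lines) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String line : lines) {
            for (Map.Entry<String, Integer> pair : map(line)) {
                grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                       .add(pair.getValue());
            }
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> e : grouped.entrySet()) {
            result.put(e.getKey(), reduce(e.getKey(), e.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(run(List.of("move computation not data", "move data less")));
    }
}
```

The map and reduce methods hold no shared state, which is what lets a framework run many copies of each in parallel on different machines.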

Real-World MapReduce

  • Optimizing with Combiners and Partitioners (lab)
  • More common algorithms: sorting, indexing and searching (lab)
  • Relational manipulation: map-side and reduce-side joins (lab)
  • Chaining Jobs
  • Testing with MRUnit
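
As a rough illustration of the first topic in this module, the sketch below shows what a combiner and a partitioner each do. It is plain Java, not Hadoop code, and the class and method names are hypothetical. The partition formula mirrors the one used by Hadoop's default HashPartitioner; the rest is a simplified stand-in.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java illustration only: no Hadoop classes are used here.
class CombinerPartitionerSketch {

    // Combiner: pre-aggregate map output locally so that fewer
    // (word, count) pairs have to cross the network to the reducers.
    static Map<String, Integer> combine(List<String> words) {
        Map<String, Integer> local = new TreeMap<>();
        for (String w : words) {
            local.merge(w, 1, Integer::sum);
        }
        return local;
    }

    // Partitioner: choose which reducer receives a key. Masking with
    // Integer.MAX_VALUE keeps the result non-negative.
    static int partition(String key, int numReducers) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReducers;
    }

    public static void main(String[] args) {
        List<String> mapOutput = List.of("pig", "hive", "pig", "pig");
        for (Map.Entry<String, Integer> e : combine(mapOutput).entrySet()) {
            System.out.println(e.getKey() + "=" + e.getValue()
                    + " -> reducer " + partition(e.getKey(), 2));
        }
    }
}
```

Here four map outputs shrink to two combined records before partitioning. A combiner of this kind is only safe when the reduce operation is associative and commutative, as summation is.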

Higher-level Tools

  • Patterns to abstract "thinking in MapReduce"
  • The Cascading library (lab)
  • The Hive database (lab)

 

 

 

E-mail me when people leave their comments –

You need to be a member of Hadoop360 to add comments!

Join Hadoop360

Resources

Research