Course Overview

Big Data Hadoop Developer

Collabera TACT's Big Data Hadoop Developer certification covers the key concepts and skills needed to build a strong foundation for managing Big Data with Apache's open-source Hadoop. You gain in-depth knowledge of the core concepts through the course and apply it to wide-ranging industry use cases. Big Data presents new opportunities and challenges to organizations of all sizes, and the course equips you with in-depth knowledge of writing code on the MapReduce framework. The course also includes advanced modules such as YARN, Zookeeper, Oozie, Flume, Sqoop, Spark, MongoDB, Cassandra and Neo4J.

Course Content

Introduction to Hadoop
Big Data Introduction
Hadoop Introduction
What is Hadoop?
Why Hadoop?
Hadoop History
Components of the Hadoop Ecosystem
HDFS, MapReduce, PIG, Hive, SQOOP, HBASE, OOZIE, Flume, Zookeeper and so on…
What is the scope of Hadoop?

Deep Dive into HDFS (Storing the Data)
Introduction of HDFS
HDFS Design
HDFS role in Hadoop
Features of HDFS
Daemons of Hadoop and its functionality
Anatomy of a File Write
Anatomy of a File Read
Network Topology
Parallel Copying using DistCp
Basic Configuration for HDFS
Data Organization
Rack Awareness
Heartbeat Signal
How to Store the Data into HDFS
How to Read the Data from HDFS
Accessing HDFS (Introduction of Basic UNIX commands)
CLI commands
Planning your Hadoop cluster
Planning a Hadoop cluster and its capacity
Hadoop software and hardware configuration
HDFS Block replication and rack awareness
Network topology for Hadoop cluster
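
As a rough illustration of the capacity-planning and block-replication topics above, the sketch below estimates raw disk needs and block counts. It assumes a replication factor of 3 and a 128 MB block size (the usual HDFS defaults in Hadoop 2.x); the 25% headroom figure, the class name and the method names are hypothetical choices made for this example, not part of the course material.

```java
public class CapacitySketch {
    // Raw disk needed (in TB): every block is stored replicationFactor times,
    // plus a fraction of headroom for intermediate and temporary data (assumed value).
    public static double rawStorageTb(double dataTb, int replicationFactor, double headroomFraction) {
        return dataTb * replicationFactor * (1.0 + headroomFraction);
    }

    // Number of HDFS blocks a file occupies (ceiling division);
    // the last block may be only partially filled.
    public static long blockCount(long fileSizeBytes, long blockSizeBytes) {
        return (fileSizeBytes + blockSizeBytes - 1) / blockSizeBytes;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        // 100 TB of user data, replication 3, 25% headroom
        System.out.println(rawStorageTb(100.0, 3, 0.25)); // prints 375.0
        // A 300 MB file with 128 MB blocks
        System.out.println(blockCount(300 * mb, 128 * mb)); // prints 3
    }
}
```

Real cluster sizing also has to account for data growth, compression and per-node disk layout, which this sketch leaves out.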

MapReduce using Java (Processing the Data)
Introduction to MapReduce
MapReduce Architecture
Data flow in MapReduce
Understanding the Difference Between Block and InputSplit
Role of RecordReader
Basic Configuration of MapReduce
MapReduce life cycle
How MapReduce Works
Writing and Executing the Basic MapReduce Program using Java
Submission & Initialization of a MapReduce Job
File Input/Output Formats in MapReduce Jobs
Map-side Joins
Reduce-side Joins
Word Count Example
Partition MapReduce Program
Side Data Distribution
Distributed Cache (with Program)
Counters (with Program)
Job Scheduling
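
To give a feel for the data flow covered in this module, here is a minimal in-memory sketch of the word count example. It only simulates the map, shuffle and reduce phases in plain Java and deliberately does not use the Hadoop API; the class and method names are hypothetical, and a real job would extend Hadoop's Mapper and Reducer classes and run on a cluster.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class WordCountSketch {
    // Map phase: emit a (word, 1) pair for every word in one input line.
    public static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (String token : line.toLowerCase().split("\\s+")) {
            if (!token.isEmpty()) {
                out.add(new SimpleEntry<>(token, 1));
            }
        }
        return out;
    }

    // Shuffle phase: group all emitted values by key, sorted by key.
    public static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> p : pairs) {
            grouped.computeIfAbsent(p.getKey(), k -> new ArrayList<>()).add(p.getValue());
        }
        return grouped;
    }

    // Reduce phase: sum the values collected for one key.
    public static int reduce(List<Integer> values) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum;
    }

    // Driver: run map over every line, shuffle, then reduce each group.
    public static Map<String, Integer> wordCount(List<String> lines) {
        List<Map.Entry<String, Integer>> emitted = new ArrayList<>();
        for (String line : lines) {
            emitted.addAll(map(line));
        }
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> e : shuffle(emitted).entrySet()) {
            counts.put(e.getKey(), reduce(e.getValue()));
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("big data hadoop", "hadoop stores big data");
        System.out.println(wordCount(lines)); // prints {big=2, data=2, hadoop=2, stores=1}
    }
}
```

In Hadoop itself the shuffle step is performed by the framework between the map and reduce tasks, so a developer normally writes only the map and reduce logic.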

Introduction to Apache PIG
Introduction to PIG Data Flow Engine
MapReduce vs. PIG in detail
When should PIG be used?
Data Types in PIG
Basic PIG programming
Modes of Execution in PIG
Execution Mechanisms
Grunt Shell
Operators/Transformations in PIG
PIG UDFs with Program
Word Count Example in PIG
Differences between MapReduce and PIG

Introduction to SQOOP
Use of SQOOP
Connecting to a MySQL database
SQOOP commands
Joins in SQOOP
Export to MySQL
Export to HBase

Introduction to HIVE
HIVE Meta Store
HIVE Architecture
Tables in HIVE
Hive Data Types
Primitive Types
Complex Types
Joins in HIVE
HIVE UDFs and UDAFs with Programs
Word Count Example

Introduction to HBASE
Basic Configurations of HBASE
Fundamentals of HBase
What is NoSQL?
HBase Data Model
Categories of NoSQL Databases
Key-Value Database
Document Database
Column Family Database
HBASE Architecture
How HBASE differs from an RDBMS
HDFS vs. HBase
Client-side buffering or bulk uploads
HBase Designing Tables
HBase Operations
What is MongoDB?
Where to Use?
Configuration On Windows
Inserting data into MongoDB
Reading data from MongoDB

Cluster Setup
Downloading and installing Ubuntu 12.x
Installing Java
Installing Hadoop
Creating Cluster
Increasing and Decreasing the Cluster Size
Monitoring the Cluster Health
Starting and Stopping the Nodes

Introduction to Zookeeper
Data Model

Introduction to OOZIE
Use of OOZIE
Where to use?

Introduction to Flume
Uses of Flume
Flume Architecture

Project Explanation with Architecture

Prerequisites

Hands-on experience in Core Java and good analytical skills
Experience with a Linux environment will help

Required Exam

Big Data Hadoop Developer

Duration
Regular Track: 6 weeks
Fast Track: 2 weeks

Success Stories

Trained 1000+ Students From 10+ Countries