
Training Details

Course Duration: 30-35 hours of training, plus assignments and actual project-based case studies

Training Materials: All attendees will receive:

  • An assignment after each module and a video recording of every session
  • Notes and study material for the examples covered
  • Access to the training blog and repository of materials

Audience:

This course is designed for anyone who:

  • Wants to architect a project using Hadoop and its ecosystem components.
  • Wants to develop MapReduce programs.
  • Is a Business Analyst or Data Warehousing professional looking for an alternative approach to data analysis and storage.

Pre-Requisites:

  • Participants should have at least a basic knowledge of Java.
  • Any experience with a Linux environment will be very helpful.
  • We will provide pre-recorded videos of Core Java if you need them.

Training Format:

This course is delivered as highly interactive sessions with extensive live examples. It is live, instructor-led online training delivered using the Cisco WebEx Meeting Center web and audio conferencing tool.

Timing: Weekday evenings (after work hours) and weekends.

Training Highlights

  • Focus on hands-on training
  • 30 hours of assignments and live case studies
  • Video recordings of sessions provided
  • Demonstration of concepts using different tools
  • One problem statement discussed across the whole training program
  • Hadoop certification guidance
  • Resume preparation and interview questions provided
  • Introduction to Hadoop and Big Data
  • Covers all important Hadoop ecosystem products

[Hadoop Training Roadmap diagram]

How are we Different from other Training Institutes?

Role-Specific Training – Traditional training focuses on an overall business process, without regard for the individual’s role. By contrast, Zaran Tech’s series of role-based courses delivers in-depth training on a specific role within the business process.

Hands-on Mentoring – Our expert trainers work side by side with each trainee and teach proven disciplines and techniques for successful delivery. Mentoring is “hands-on” as part of actual projects, so trainees gain real-time experience.

Real-time Project Experience – Comprehension of practical skills is both reinforced and demonstrated in our training programs by completing Case studies of actual projects.

Certification Exams – Our trainers are all certified and provide expert, industry-level guidance on certification exams, with tips and tricks to ensure consistency of skills and practices for successful completion of the certification.

Ongoing Education – Zaran Tech continues to offer new courses so you can stay up to date on the latest technologies and best practices.

After reviewing your goals and assessing your current IT skills, ZaranTech will tailor a post-training and placement plan to meet your needs. This includes resume preparation assistance, mock interviews (technical and project-based), soft-skills coaching, and on-the-job support.

Modules Covered in this Training

BASIC HADOOP

  1. Introduction and Overview of Hadoop
  2. Hadoop Distributed FileSystem (HDFS)
  3. HBase – The Hadoop Database
  4. Map/Reduce 2.0/YARN
  5. MapReduce Workflows
  6. Pig
  7. Hive
  8. Putting it all together

ADVANCED HADOOP

  1. Integrating Hadoop Into The Workflow
  2. Delving Deeper Into The Hadoop API
  3. Common Map Reduce Algorithms
  4. Using Hive and Pig
  5. Practical Development Tips and Techniques
  6. More Advanced Map Reduce Programming
  7. Joining Data Sets in Map Reduce
  8. Graph Manipulation in Hadoop
  9. Creating Workflows With Oozie
  10. Hands-On Exercise

Attendees also learn:

  1. Resume Preparation Guidelines and Tips
  2. Mock Interviews and Interview Preparation Tips

Topics Covered

BASIC HADOOP

Introduction and Overview of Hadoop

  • What is Hadoop?
  • History of Hadoop.
  • Building Blocks – Hadoop Eco-System.
  • Who is behind Hadoop?
  • What is Hadoop good for, and what is it not?

Hadoop Distributed FileSystem (HDFS)

  • HDFS Overview and Architecture
  • HDFS Installation
  • HDFS Use Cases
  • Hadoop FileSystem Shell
  • FileSystem Java API
  • Hadoop Configuration

HBase – The Hadoop Database

  • HBase Overview and Architecture
  • HBase Installation
  • HBase Shell
  • Java Client API
  • Java Administrative API
  • Filters
  • Scan Caching and Batching
  • Key Design
  • Table Design

Map/Reduce 2.0/YARN

  • MapReduce 2.0 and YARN Overview
  • MapReduce 2.0 and YARN Architecture
  • Installation
  • YARN and MapReduce Command Line Tools
  • Developing MapReduce Jobs
  • Input and Output Formats
  • HDFS and HBase as Source and Sink
  • Job Configuration
  • Job Submission and Monitoring
  • Anatomy of Mappers, Reducers, Combiners and Partitioners
  • Anatomy of Job Execution on YARN
  • Distributed Cache
  • Hadoop Streaming
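To give a feel for the mapper/reducer anatomy listed above, here is a hypothetical, simplified sketch in Python (the course itself uses the Java MapReduce API) of the map → shuffle → reduce flow behind a word count:

```python
from collections import defaultdict

def mapper(line):
    # Map phase: emit (word, 1) for every word in the input line.
    for word in line.lower().split():
        yield word, 1

def reducer(word, counts):
    # Reduce phase: sum all counts for a given key.
    return word, sum(counts)

def run_job(lines):
    # Shuffle phase: group mapper output by key. Hadoop does this
    # between the map and reduce stages, sorted by key.
    groups = defaultdict(list)
    for line in lines:
        for word, count in mapper(line):
            groups[word].append(count)
    return dict(reducer(w, c) for w, c in sorted(groups.items()))

print(run_job(["big data is big", "hadoop handles big data"]))
# {'big': 3, 'data': 2, 'hadoop': 1, 'handles': 1, 'is': 1}
```

In real Hadoop the shuffle happens inside the framework, a combiner can pre-aggregate counts on the map side, and a partitioner decides which reducer receives each key.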

MapReduce Workflows

  • Decomposing Problems into MapReduce Workflow
  • Using Job Control
  • Oozie Introduction and Architecture
  • Oozie Installation
  • Developing, Deploying, and Executing Oozie Workflows

Pig

  • Pig Overview
  • Installation
  • Pig Latin
  • Developing Pig Scripts
  • Processing Big Data with Pig
  • Joining data-sets with Pig
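As a rough illustration of Pig’s dataflow style (LOAD → FILTER → GROUP → COUNT), here is a hypothetical sketch in Python of what a few lines of Pig Latin express; the field names and data are invented for the example:

```python
# Hypothetical sketch of a Pig-style dataflow expressed as chained
# Python steps. Each step corresponds to one Pig Latin statement.
records = [("alice", 34), ("bob", 17), ("carol", 52), ("dave", 17)]

adults = [r for r in records if r[1] >= 18]          # FILTER BY age >= 18
by_age = {}
for name, age in adults:                             # GROUP BY age
    by_age.setdefault(age, []).append(name)
counts = {age: len(names) for age, names in by_age.items()}  # COUNT
print(counts)  # {34: 1, 52: 1}
```

In Pig, each of these steps is a relation, and the whole script compiles down to one or more MapReduce jobs.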

Hive

  • Hive Overview
  • Installation
  • Hive QL
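HiveQL is closely modeled on SQL. As a rough analogy only (using Python’s built-in SQLite rather than Hive itself, with an invented table), an aggregation that Hive would compile into MapReduce jobs reads much like this:

```python
import sqlite3

# Hypothetical analogy: a GROUP BY aggregation in SQLite, written the
# way a similar HiveQL query over an HDFS-backed table would look.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE page_views (page TEXT, user TEXT)")
conn.executemany("INSERT INTO page_views VALUES (?, ?)",
                 [("home", "a"), ("home", "b"), ("docs", "a")])
rows = conn.execute(
    "SELECT page, COUNT(*) FROM page_views GROUP BY page ORDER BY page"
).fetchall()
print(rows)  # [('docs', 1), ('home', 2)]
```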

Putting it all together

  • Distributed installations
  • Best Practices

ADVANCED HADOOP

Outline:

Our Advanced Hadoop module is an extension of the Basic Hadoop module, designed with the objective of in-depth coverage illustrated with case studies.

Integrating Hadoop into the Workflow

  • Relational Database Management Systems
  • Storage Systems
  • Importing Data from RDBMSs With Sqoop
  • Hands-on exercise
  • Importing Real-Time Data with Flume
  • Accessing HDFS Using FuseDFS and Hoop

Delving Deeper Into the Hadoop API

  • More about ToolRunner
  • Testing with MRUnit
  • Reducing Intermediate Data With Combiners
  • The configure and close methods for Map/Reduce Setup and Teardown
  • Writing Partitioners for Better Load Balancing
  • Hands-On Exercise
  • Directly Accessing HDFS
  • Using the Distributed Cache
  • Hands-On Exercise

Common MapReduce Algorithms

  • Sorting and Searching
  • Indexing
  • Machine Learning With Mahout
  • Term Frequency – Inverse Document Frequency
  • Word Co-Occurrence
  • Hands-On Exercise
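For instance, the TF-IDF algorithm listed above can be sketched outside Hadoop as follows (a hypothetical, single-machine Python version; in class it is expressed as MapReduce jobs over document collections):

```python
import math

def tf_idf(docs):
    # docs: list of token lists. Returns {(term, doc_index): score}.
    n = len(docs)
    # Document frequency: number of documents containing each term.
    df = {}
    for doc in docs:
        for term in set(doc):
            df[term] = df.get(term, 0) + 1
    scores = {}
    for i, doc in enumerate(docs):
        for term in set(doc):
            tf = doc.count(term) / len(doc)    # term frequency
            idf = math.log(n / df[term])       # inverse document frequency
            scores[(term, i)] = tf * idf
    return scores

docs = [["hadoop", "hdfs", "hadoop"], ["hive", "hdfs"]]
s = tf_idf(docs)
# "hdfs" appears in every document, so its IDF (and score) is 0.
```

Distributed, each pass (term counts, document frequencies, final scores) becomes its own MapReduce job.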

Using Hive and Pig

  • Hive Basics
  • Pig Basics
  • Hands-on exercise

Practical Development Tips and Techniques

  • Debugging MapReduce Code
  • Using LocalJobRunner Mode For Easier Debugging
  • Retrieving Job Information with Counters
  • Logging
  • Splittable File Formats
  • Determining the Optimal Number of Reducers
  • Map-Only MapReduce Jobs
  • Hands-On Exercise

More Advanced MapReduce Programming

  • Custom Writables and WritableComparables
  • Saving Binary Data using SequenceFiles and Avro Files
  • Creating InputFormats and OutputFormats
  • Hands-On Exercise

Joining Data Sets in MapReduce

  • Map-Side Joins
  • The Secondary Sort
  • Reduce-Side Joins
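The reduce-side join pattern can be previewed with a hypothetical, simplified Python sketch (the course implements it with the Java MapReduce API; the tables and fields here are invented):

```python
from collections import defaultdict

def reduce_side_join(users, orders):
    # Mappers tag each record with its source table; the shuffle groups
    # records by the join key; the reducer pairs them up.
    grouped = defaultdict(lambda: {"user": None, "orders": []})
    for user_id, name in users:                 # map side, table A
        grouped[user_id]["user"] = name
    for user_id, item in orders:                # map side, table B
        grouped[user_id]["orders"].append(item)
    # Reduce side: emit one joined row per (user, order) pair.
    return [(g["user"], item)
            for g in grouped.values() if g["user"]
            for item in g["orders"]]

users = [(1, "asha"), (2, "ravi")]
orders = [(1, "book"), (1, "pen"), (3, "lamp")]
print(reduce_side_join(users, orders))
# [('asha', 'book'), ('asha', 'pen')]
```

Map-side joins avoid the shuffle entirely by loading the smaller table into memory (e.g. via the Distributed Cache), and the secondary sort controls the order in which grouped values reach the reducer.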

Graph Manipulation in Hadoop

  • Introduction to graph techniques
  • Representing graphs in Hadoop
  • Implementing a sample algorithm: Single Source Shortest Path
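The Single Source Shortest Path algorithm above is typically run as a chain of MapReduce jobs, each relaxing distances one hop further. A hypothetical single-machine sketch in Python (unweighted edges, invented graph):

```python
def shortest_paths(graph, source):
    # graph: {node: [neighbors]}. Each pass of the loop relaxes
    # distances one hop further, the way chained MapReduce jobs would.
    dist = {node: float("inf") for node in graph}
    dist[source] = 0
    changed = True
    while changed:                      # one loop pass ~ one MR job
        changed = False
        for node, neighbors in graph.items():
            for n in neighbors:
                if dist[node] + 1 < dist[n]:
                    dist[n] = dist[node] + 1
                    changed = True
    return dist

graph = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
print(shortest_paths(graph, "A"))
# {'A': 0, 'B': 1, 'C': 1, 'D': 2}
```

In Hadoop, the adjacency list is the record format for representing the graph, and the “changed” flag becomes a counter that decides whether another job iteration is needed.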

Creating Workflows with Oozie

  • The Motivation for Oozie
  • Oozie’s Workflow Definition Format
  • Hands-On Exercise

About the trainer

  • 14 years of experience consulting, training, and mentoring participants on the design, infrastructure, and integration aspects covered in the training.
  • Has trained more than 5,000 participants in the areas of Java, J2EE, Android, and BPM, and always looks forward to sharing his knowledge of the IT domain with anyone.
  • Has travelled extensively and mentored participants at organizations in several countries, including RBC [Luxembourg], Motorola [Germany], PayPal [Dublin], GVT [Brazil], Virtusa [Sri Lanka], Damac [Dubai], Rogers Telecom [Canada], and D&B, HBO, Micron, EMC, e-Rewards, and Maximus [USA].
  • Has assisted and provided consulting to ADP, Diebold, Level 3 Communications, e-Rewards, Southwest Airlines, and other corporations on their process requirements in the area of BPM.
  • Has been on the code review panel for multiple organizations’ product development efforts and has brainstormed multiple new ideas which have turned into reality.
  • Was part of the core initial team exploring HDInsight [Hadoop on Windows] for the Microsoft India Development Center and has mentored multiple batches of developers, project managers, and development testers.
  • Has mentored participants at J.P. Morgan, TCS, HCL, and Accenture in Hadoop and its ecosystem components such as Hive, HBase, Pig, and Sqoop, and has also assisted these organizations in setting up their initial Hadoop teams.

CASE STUDY # 1 – “Healthcare System”

Healthcare System Application:

As the Product Manager for Inner Expressions, you are asked to provide one of your largest clients with additional features in the EMR (Electronic Medical Records) system. The client has requested an integrated Referral Management System that tracks patients from primary care into the specialist departments. Appointments are created either by the Primary Care Physicians themselves or by other clinical staff such as Nurse Practitioners or Clinical Assistants. Each appointment must go through the appropriate checks, including whether the patient has active insurance with the client, whether the insurance program covers the patient’s condition, the patient’s preference for location and timing, and the availability of the specialist doctor.

Some appointments may have to be reviewed by the specialists themselves before they can be approved; the administrator of the facility (hospital) must be able to configure each appointment type as either directly bookable by the primary care staff or requiring review by the specialist. The system should also allow the primary care staff and specialist departments to exchange notes and comments about a particular appointment. If the specialist department marks tests or reports as mandatory for the appointment, the system must ensure that the patient has these available on the date of the appointment.

The system shall also allow users to track the status of patients’ appointments and must store the entire clinical history of each patient. This history serves two main purposes: the specialist and the primary care providers have access to the patient’s complete medical history before the patient walks in for the appointment, allowing for better patient care; and the hospital also stores this data in a general data warehouse (without Protected Health Information) to run analytics on it and develop local disease management programs for the area. This is aligned with the hospital’s mission of providing top-quality preventive medical care.

The hospital schedules about 300 appointments per day and must support about 50 concurrent users. The existing EMR system is based on Java and an Oracle database system.

TASKS

  • Identify Actors, Use Cases, Relationships,
  • Draw Use Case Diagrams
  • Identify Ideal, Alternate and Exception Flows
  • Write a Business Requirements Document

CASE STUDY # 2 – “Asset Management System”

Asset Management Application:

The asset management system keeps track of a number of assets that can be borrowed: their ownership, availability, current location, current borrower, and history. Assets include books, software, computers, and peripherals. Assets are entered in the database when acquired and deleted from the database when disposed of. Availability is updated whenever an asset is borrowed or returned.
When a borrower fails to return an asset on time, the asset management system sends a reminder to the borrower and informs the asset owner.

The administrator enters new assets in the database, deletes obsolete ones, and updates any information related to assets. Borrowers search for assets in the database to determine their availability, and borrow and return assets. The asset owner loans assets to borrowers. Each system has exactly one administrator, one or more asset owners, and one or more borrowers. When referring to any of the above actors, we use the term “user”. All users are known to the system by their name and email address. The system may keep track of other attributes such as the owner’s telephone number, title, address, and position in the organization.

The system should support at least 200 borrowers and 2000 assets. The system should be extensible to other types of assets. The system should checkpoint the state of the database every day such that it can be recovered in case of data loss. Owners and the administrator are authenticated using a user/password combination. Actors interact with the system via a web browser capable of rendering HTML and HTTP without support for JavaScript and Java.

The persistent storage is realized using an SQL database. The business logic is realized using the WebObjects runtime system.

TASKS

  • Identify Actors, Use Cases, Relationships,
  • Draw Use Case Diagrams
  • Identify Ideal, Alternate and Exception Flows
  • Write a Business Requirements Document

OTHER CASE STUDIES:

Social Networking, Cruise Management System, Collegiate Sporting system


What is the Difference between Live training and Video training?

These videos will help you understand the difference:
VIDEO – What is Instructor-led LIVE Training – http://www.youtube.com/watch?v=G908QvF-gVA
VIDEO – What is Instructor-led VIDEO Training – http://www.youtube.com/watch?v=naPdAyKvAI0

How soon after I Enroll would I get access to the Training Program and Content?

Right after you enroll, we will send an email to your Gmail ID with a video on how to log in to the training blog and access the training program and content.

What are the pre-requisites of taking this training?

- Participants should have at least a basic knowledge of Java.
- Any experience with a Linux environment will be very helpful.

Who are the instructors and what are their qualifications?

All our instructors are Senior Consultants themselves with a minimum of 10 years of real-time experience in their respective fields. Each trainer has also trained more than 100 students in the individual and/or corporate training programs.

How will the practicals/assignments be done?

Practicals/assignments will be done using the training blog. Instructions will be sent after you enroll.

When are the classes held, and how many hours of effort would I need to put in every day/week?

Online live sessions are held on weekday evenings CST (Central Standard Time, GMT-6) or on weekends. The schedule is posted for each batch on the website. You will need to put in 8-10 hours of effort per week, going through the videos again and completing your assignments.

What if I miss a class?

We video-record every live session, and after the session is complete we post the recording on the blog. You will have access to these video recordings for 6 months from the date you start your training. Material access is provided via Google Drive cloud storage for a lifetime.

How can I request a support session?

You can do that by posting a question on the training blog.

What If I have queries after I complete this course?

You can post those questions on the training blog.

Will I get 24*7 support?

You will get 24*7 access to the blog to post your questions. Trainers will answer your questions within 24 hours, and normally much sooner, in about 1-2 hours. You can also approach your training coordinator for the same.

Can I get the recorded sessions of a class from some other batches before attending a live class?

Yes, you can. You can also see our YouTube page for previous batch session recordings.

How will I get the recorded sessions?

They will be provided to you through the training blog.

How can I download the class videos?

You won’t be able to download the videos. They are available for you to view on the training blog 24*7.

Is the course material accessible to the students even after the course training finishes?

Yes.

Do you provide Placements as well?

We are, in fact, a consulting company that provides training, so we are mainly looking for trainees who are seeking placement after training.
After the Training Process explained (Video): http://www.youtube.com/watch?v=BrBJjoH46VI
Our 6-step training to placement process (Video): http://www.youtube.com/watch?v=BrBJjoH46VI

How can I complete the course in a shorter duration?

You can, by going for the video training.
VIDEO – What is Instructor led VIDEO Training – http://www.youtube.com/watch?v=naPdAyKvAI0

Do you provide any Certification? If yes, what is the Certification process?

We provide Certification guidance at the end of each course. You will also receive a “Certificate of Completion” from ZaranTech at the end of the course.

Are these classes conducted via LIVE video streaming?

We have both options available.

What internet speed is required to attend the LIVE classes?

An internet speed of 1 Mbps is recommended to attend the LIVE classes. However, we have seen people attend the classes with much slower connections.

What are the payment options?

We accept credit cards, PayPal, bank payments from anywhere in the USA, money orders, international wire transfers, ACH transfers, Chase QuickPay, Bank of America transfers, and Wells Fargo SurePay. All payment details are mentioned on the Enrollment page.

What if I have more queries?

Call the number listed on the Course Details page of our website.