BIG DATA CERTIFICATION
- 40 days of instructor-led training
- Certification exam
- Soft Copy of Course material
- 100% Success Rate
- Highly experienced and certified trainers
- Free refresher classes within 6 months
This comprehensive Big Data course was designed by industry experts around current industry job requirements to help you learn the Big Data Hadoop and Spark modules. It is an industry-recognized Big Data Hadoop certification course that combines training in Hadoop development, Hadoop administration, Hadoop testing, and analytics with Apache Spark.
What you will learn:
- Fundamentals of Hadoop and YARN, and writing applications using them
- HDFS, MapReduce, Hive, Pig, Sqoop, Flume, and ZooKeeper
- Writing Spark applications using Spark SQL, Spark Streaming, DataFrames, RDDs, GraphX, and MLlib
- Working with Avro data formats
- Practicing real-life projects using Hadoop and Apache Spark
- Being equipped to clear the Big Data Hadoop certification exam
Who should take this course:
- Programming Developers and System Administrators
- Experienced Working Professionals and Project Managers
- Big Data Hadoop Developers eager to learn other verticals such as testing, analytics, and administration
- Mainframe Professionals, Architects, and Testing Professionals
- Business Intelligence, Data Warehousing, and Analytics Professionals
- Graduates and undergraduates eager to learn Big Data
Skill Development with Certification
01 Key Features
- 48 Hours of Live Virtual Training
- 4 Industry-Based Projects
- Hands-on Assignments for Each Module
- Lifetime Access to All Recorded Lessons
- 44 PDUs offered
- 100% Money Back Guarantee*
02 Course Content
Lesson 01- Introduction to Big Data and Hadoop
- Introduction to Big Data and Hadoop
- Introduction to Big Data
- Big Data Analytics
- What is Big Data?
- The Four Vs of Big Data
- Case Study
- Challenges of Traditional System
- Distributed Systems
- Introduction to Hadoop
- Components of Hadoop Ecosystem Part One
- Components of Hadoop Ecosystem Part Two
- Components of Hadoop Ecosystem Part Three
- Commercial Hadoop Distributions
- Demo
- Key Takeaways
- Knowledge Check
Lesson 02- Hadoop Architecture, Distributed Storage (HDFS), and YARN
- Hadoop Architecture, Distributed Storage (HDFS), and YARN
- What is HDFS
- Need for HDFS
- Regular File System vs HDFS
- Characteristics of HDFS
- HDFS Architecture and Components
- Data Block Split
- Data Replication Topology
- HDFS Command Line
- Demo: Common HDFS Commands
- Practice Project: HDFS Command Line
- YARN Introduction
- YARN Use Case
- YARN and Its Architecture
- Resource Manager
- How Resource Manager Operates
- Application Master
- How YARN Runs an Application
- Tools for YARN Developers
- Demo: Walkthrough of Cluster Part One
- Demo: Walkthrough of Cluster Part Two
- Key Takeaways
- Practice Project: Hadoop Architecture, Distributed Storage (HDFS), and YARN
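For readers who want a preview of the hands-on work in this lesson, the short Scala sketch below lists a directory and reads a file through the Hadoop FileSystem API, the programmatic counterpart of the HDFS command line covered above. It is illustrative only: the /user/train paths are placeholders, and it assumes a cluster whose core-site.xml and hdfs-site.xml are on the classpath.

    import org.apache.hadoop.conf.Configuration
    import org.apache.hadoop.fs.{FileSystem, Path}
    import scala.io.Source

    object HdfsPreview {
      def main(args: Array[String]): Unit = {
        val conf = new Configuration()                        // picks up core-site.xml / hdfs-site.xml from the classpath
        val fs   = FileSystem.get(conf)

        val dir = new Path("/user/train")                     // placeholder directory
        fs.listStatus(dir).foreach(s => println(s.getPath))   // roughly: hdfs dfs -ls /user/train

        val in = fs.open(new Path("/user/train/sample.txt"))  // roughly: hdfs dfs -cat, placeholder file
        Source.fromInputStream(in).getLines().take(5).foreach(println)
        in.close()
        fs.close()
      }
    }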
Lesson 03- Data Ingestion into Big Data Systems and ETL
- Data Ingestion Into Big Data Systems and ETL
- Data Ingestion Overview Part One
- Data Ingestion Overview Part Two
- Apache Sqoop
- Sqoop and Its Uses
- Sqoop Processing
- Sqoop Import Process
- Sqoop Connectors
- Demo: Importing and Exporting Data from MySQL to HDFS
- Practice Project: Apache Sqoop
- Apache Flume
- Flume Model
- Scalability in Flume
- Components in Flume’s Architecture
- Configuring Flume Components
- Demo: Ingest Twitter Data
- Apache Kafka
- Aggregating User Activity Using Kafka
- Kafka Data Model
- Partitions
- Apache Kafka Architecture
- Demo: Setup Kafka Cluster
- Producer Side API Example
- Consumer Side API & Example
- Kafka Connect
- Practice Project: Data Ingestion into Big Data Systems and ETL
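As a taste of the producer-side API topic listed above, here is a minimal, hedged Scala sketch of a Kafka producer built on the standard kafka-clients library. The broker address localhost:9092, the topic name user-activity, and the sample event are placeholders, not part of the course lab.

    import java.util.Properties
    import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}

    object ActivityProducer {
      def main(args: Array[String]): Unit = {
        val props = new Properties()
        props.put("bootstrap.servers", "localhost:9092")   // placeholder broker address
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer")

        val producer = new KafkaProducer[String, String](props)
        val record = new ProducerRecord[String, String]("user-activity", "user42", "page_view:/home")
        producer.send(record)       // asynchronously publish one event to the user-activity topic
        producer.flush()
        producer.close()
      }
    }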
Lesson 04- Distributed Processing MapReduce Framework and Pig
- Distributed Processing MapReduce Framework and Pig
- Distributed Processing in MapReduce
- Word Count Example
- Map Execution Phases
- Map Execution in a Distributed Two-Node Environment
- MapReduce Jobs
- Hadoop MapReduce Job Work Interaction
- Setting up the environment for MapReduce Development
- Set of Classes
- Creating a New Project
- Advanced MapReduce
- Data Types in Hadoop
- Output formats in MapReduce
- Using Distributed Cache
- Joins in MapReduce
- Replicated Join
- Introduction to Pig
- Components of Pig
- Pig Data Model
- Pig Interactive Modes
- Pig Operations
- Various Relations Performed by Developers
- Demo: Analyzing Web Log Data Using MapReduce
- Demo: Analyzing Sales Data and Solving KPIs Using Pig
- Practice Project: Apache Pig
- Demo: Word Count
- Key Takeaways
- Knowledge Check
- Practice Project: Distributed Processing - MapReduce Framework and Pig
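The word count demo listed above is the canonical MapReduce example. Below is an illustrative sketch of it written in Scala against the Hadoop MapReduce Java API (the course demos may well use Java instead); the class names TokenMapper, SumReducer, and WordCount are made up for this example, and Scala 2.13 is assumed for the collection converters.

    import org.apache.hadoop.conf.Configuration
    import org.apache.hadoop.fs.Path
    import org.apache.hadoop.io.{IntWritable, LongWritable, Text}
    import org.apache.hadoop.mapreduce.{Job, Mapper, Reducer}
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat
    import scala.jdk.CollectionConverters._

    class TokenMapper extends Mapper[LongWritable, Text, Text, IntWritable] {
      private val one  = new IntWritable(1)
      private val word = new Text()
      override def map(key: LongWritable, value: Text,
                       context: Mapper[LongWritable, Text, Text, IntWritable]#Context): Unit = {
        // emit (word, 1) for every token on the input line
        value.toString.split("\\s+").filter(_.nonEmpty).foreach { token =>
          word.set(token)
          context.write(word, one)
        }
      }
    }

    class SumReducer extends Reducer[Text, IntWritable, Text, IntWritable] {
      override def reduce(key: Text, values: java.lang.Iterable[IntWritable],
                          context: Reducer[Text, IntWritable, Text, IntWritable]#Context): Unit = {
        val total = values.asScala.map(_.get).sum   // add up all counts for this word
        context.write(key, new IntWritable(total))
      }
    }

    object WordCount {
      def main(args: Array[String]): Unit = {
        val job = Job.getInstance(new Configuration(), "word count")
        job.setJarByClass(classOf[TokenMapper])
        job.setMapperClass(classOf[TokenMapper])
        job.setReducerClass(classOf[SumReducer])
        job.setOutputKeyClass(classOf[Text])
        job.setOutputValueClass(classOf[IntWritable])
        FileInputFormat.addInputPath(job, new Path(args(0)))      // input directory in HDFS
        FileOutputFormat.setOutputPath(job, new Path(args(1)))    // output directory (must not exist yet)
        System.exit(if (job.waitForCompletion(true)) 0 else 1)
      }
    }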
Lesson 05- Apache Hive
- Apache Hive
- Hive SQL over Hadoop MapReduce
- Hive Architecture
- Interface to Run Hive Queries
- Running Beeline From Command Line
- Hive Metastore
- Hive DDL and DML
- Creating New Table
- Data Types
- Validation of Data
- File Format Types
- Data Serialization
- Hive Table and Avro Schema
- Hive Optimization: Partitioning, Bucketing, and Sampling
- Non-Partitioned Table
- Data Insertion
- Dynamic Partitioning in Hive
- Bucketing
- What Do Buckets Do?
- Hive Analytics: UDF and UDAF
- Other Functions of Hive
- Demo: Real-Time Analysis and Data Filtration
- Demo: Real-World Problem
- Demo: Data Representation and Import Using Hive
- Key Takeaways
- Knowledge Check
- Practice Project: Apache Hive
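To connect the partitioning and bucketing topics above to something concrete, here is a hedged Scala sketch that talks to HiveServer2 over JDBC (the course demos themselves use Beeline and the Hive shell; this is just one programmatic alternative). The connection URL, the sales and staging_sales table names, and the ORC/bucket settings are illustrative assumptions.

    import java.sql.DriverManager

    object HivePartitionDemo {
      def main(args: Array[String]): Unit = {
        Class.forName("org.apache.hive.jdbc.HiveDriver")   // Hive JDBC driver (hive-jdbc on the classpath)
        val conn = DriverManager.getConnection("jdbc:hive2://localhost:10000/default", "", "")  // placeholder HiveServer2 URL
        val stmt = conn.createStatement()

        // Partitioned, bucketed table stored as ORC, mirroring the topics listed above
        stmt.execute(
          """CREATE TABLE IF NOT EXISTS sales (id INT, amount DOUBLE)
            |PARTITIONED BY (sale_date STRING)
            |CLUSTERED BY (id) INTO 4 BUCKETS
            |STORED AS ORC""".stripMargin)

        // Dynamic-partition insert from a staging table (staging_sales is a placeholder name)
        stmt.execute("SET hive.exec.dynamic.partition.mode=nonstrict")
        stmt.execute("INSERT INTO TABLE sales PARTITION (sale_date) SELECT id, amount, sale_date FROM staging_sales")

        val rs = stmt.executeQuery("SELECT sale_date, SUM(amount) FROM sales GROUP BY sale_date")
        while (rs.next()) println(s"${rs.getString(1)} -> ${rs.getDouble(2)}")
        conn.close()
      }
    }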
Lesson 06- NoSQL Databases: HBase
- NoSQL Databases: HBase
- NoSQL Introduction
- Demo: YARN Tuning
- HBase Overview
- HBase Architecture
- Data Model
- Connecting to HBase
- Practice Project: HBase Shell
- Key Takeaways
- Knowledge Check
- Practice Project: NoSQL Databases - HBase
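The HBase shell exercise above has a direct counterpart in the HBase client API; the Scala sketch below shows the same put/get round trip. The user_profiles table and its info column family are assumed to exist already (for example, created beforehand in the shell), and hbase-site.xml must point at the cluster's ZooKeeper quorum.

    import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
    import org.apache.hadoop.hbase.client.{ConnectionFactory, Get, Put}
    import org.apache.hadoop.hbase.util.Bytes

    object HBaseQuickstart {
      def main(args: Array[String]): Unit = {
        val conf = HBaseConfiguration.create()              // reads hbase-site.xml for the ZooKeeper quorum
        val connection = ConnectionFactory.createConnection(conf)
        val table = connection.getTable(TableName.valueOf("user_profiles"))   // placeholder table with family "info"

        val put = new Put(Bytes.toBytes("row-001"))
        put.addColumn(Bytes.toBytes("info"), Bytes.toBytes("name"), Bytes.toBytes("Asha"))
        table.put(put)                                      // roughly: put 'user_profiles','row-001','info:name','Asha'

        val result = table.get(new Get(Bytes.toBytes("row-001")))
        println(Bytes.toString(result.getValue(Bytes.toBytes("info"), Bytes.toBytes("name"))))
        table.close()
        connection.close()
      }
    }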
Lesson 07- Basics of Functional Programming and Scala
- Basics of Functional Programming and Scala
- Introduction to Scala
- Demo: Scala Installation and Functional Programming
- Programming with Scala
- Demo: Basic Literals and Arithmetic Programming
- Demo: Logical Operators
- Type Inference, Classes, Objects, and Functions in Scala
- Demo: Type Inference, Functions, Anonymous Functions, and Classes
- Collections
- Types of Collections
- Demo: Five Types of Collections
- Demo: Operations on List
- Scala REPL
- Demo: Features of Scala REPL
- Key Takeaways
- Knowledge Check
- Practice Project: Basics of Functional Programming and Scala
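A compact, self-contained Scala example tying together several of the topics above (type inference, an anonymous function, and operations on an immutable List); the numbers are arbitrary.

    object ScalaBasics {
      def main(args: Array[String]): Unit = {
        val rate = 1.07                                   // type inferred as Double
        val double: Int => Int = x => x * 2               // anonymous function bound to a value

        val prices  = List(100, 250, 80)                  // immutable List collection
        val withTax = prices.map(p => p * rate)           // higher-order function over a collection
        val total   = withTax.sum

        println(s"After tax: $withTax, total: $total")
        println(s"Doubled: ${prices.map(double)}")
      }
    }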
Lesson 08- Apache Spark: Next-Generation Big Data Framework
- Apache Spark: Next-Generation Big Data Framework
- History of Spark
- Limitations of MapReduce in Hadoop
- Introduction to Spark
- Application of In-Memory Processing
- Hadoop Ecosystem vs Spark
- Advantages of Spark
- Spark Architecture
- Spark Cluster in the Real World
- Demo: Running a Scala Program in Spark Shell
- Demo: Setting Up the Execution Environment in an IDE
- Demo: Spark Web UI
- Key Takeaways
- Practice Project: Apache Spark - Next-Generation Big Data Framework
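The Spark shell demo listed above boils down to a few lines like the following, which could be pasted into spark-shell, where the SparkSession named spark is already created for you; the HDFS input path is a placeholder.

    // Typed into spark-shell, where `spark` (a SparkSession) is predefined
    val lines   = spark.sparkContext.textFile("hdfs:///user/train/sample.txt")  // placeholder path
    val lengths = lines.map(_.length)
    println(s"Lines: ${lines.count()}, longest line: ${lengths.max()} characters")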
Lesson 09-Spark Core Processing RDD
- Introduction to Spark RDD
- RDD in Spark
- Creating Spark RDDs
- Pair RDD
- RDD Operations
- Demo: Spark Transformation Detailed Exploration Using Scala Examples
- Demo: Spark Action Detailed Exploration Using Scala
- Caching and Persistence
- Storage Levels
- Lineage and DAG
- Need for DAG
- Debugging in Spark
- Partitioning in Spark
- Scheduling in Spark
- Shuffling in Spark
- Sort Shuffle
- Aggregating Data with Paired RDD
- Demo: Spark Application with Data Written Back to HDFS and Spark UI
- Demo: Changing Spark Application Parameters
- Demo: Handling Different File Formats
- Demo: Spark RDD with Real-World Application
- Demo: Optimizing Spark Jobs
- Key Takeaways
- Knowledge Check
- Practice Project: Spark Core Processing RDD
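As an illustrative sketch of pair-RDD aggregation, explicit storage levels, and writing results back to HDFS (all covered above), here is a small standalone Spark application in Scala; the sales records and the HDFS output path are made up for the example.

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.storage.StorageLevel

    object RddAggregation {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder().appName("rdd-aggregation").getOrCreate()
        val sc = spark.sparkContext

        // Placeholder sales records: (product, amount)
        val sales = sc.parallelize(Seq(("tv", 300.0), ("phone", 120.0), ("tv", 450.0)))
        val byProduct = sales.reduceByKey(_ + _)            // pair-RDD aggregation; triggers a shuffle
          .persist(StorageLevel.MEMORY_ONLY)                // explicit storage level, as covered above

        byProduct.collect().foreach { case (p, total) => println(s"$p -> $total") }
        byProduct.saveAsTextFile("hdfs:///user/train/sales_totals")   // placeholder output path in HDFS
        spark.stop()
      }
    }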
Lesson 10- Spark SQL: Processing DataFrames
- Spark SQL: Processing DataFrames
- Spark SQL Introduction
- DataFrames
- Demo: Handling Different Data Formats
- Demo: Implementing Various DataFrame Operations
- Demo: UDF and UDAF
- Interoperating with RDDs
- Demo: Processing a DataFrame Using a SQL Query
- RDD vs DataFrame vs Dataset
- Practice Project: Processing DataFrames
- Key Takeaways
- Knowledge Check
- Practice Project: Spark SQL - Processing DataFrames
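The DataFrame, UDF, and SQL-query topics above can be previewed with a sketch like the one below; the orders data, the 13% VAT rate, and the temp view name are illustrative assumptions rather than course material.

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions.udf

    object DataFrameDemo {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder().appName("dataframe-demo").getOrCreate()
        import spark.implicits._

        val orders = Seq(("o1", "tv", 300.0), ("o2", "phone", 120.0)).toDF("order_id", "product", "amount")

        val withVat = udf((amount: Double) => amount * 1.13)      // simple UDF (13% VAT is just an illustrative rate)
        orders.withColumn("amount_with_vat", withVat($"amount")).show()

        orders.createOrReplaceTempView("orders")                  // query the same DataFrame with SQL
        spark.sql("SELECT product, SUM(amount) AS total FROM orders GROUP BY product").show()

        val asDataset = orders.as[(String, String, Double)]       // DataFrame vs Dataset: a typed view of the same rows
        println(asDataset.first())
        spark.stop()
      }
    }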
Lesson 11- Spark MLlib: Modeling Big Data with Spark
- Spark MLlib: Modeling Big Data with Spark
- Role of Data Scientist and Data Analyst in Big Data
- Analytics in Spark
- Machine Learning
- Supervised Learning
- Demo: Classification with Linear SVM
- Demo: Linear Regression with Real-World Case Studies
- Unsupervised Learning
- Demo: Unsupervised Clustering with K-Means
- Reinforcement Learning
- Semi-Supervised Learning
- Overview of MLlib
- MLlib Pipelines
- Key Takeaways
- Knowledge Check
- Practice Project: Spark MLlib - Modeling Big Data with Spark
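For a flavour of the unsupervised-learning and pipeline topics above, here is a hedged Scala sketch that clusters a tiny made-up customer dataset with spark.ml's KMeans; the column names and the choice of k = 2 are illustrative only.

    import org.apache.spark.ml.clustering.KMeans
    import org.apache.spark.ml.feature.VectorAssembler
    import org.apache.spark.sql.SparkSession

    object KMeansSketch {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder().appName("kmeans-sketch").getOrCreate()
        import spark.implicits._

        // Tiny made-up dataset: customer (annual_spend, visits)
        val customers = Seq((150.0, 3.0), (160.0, 4.0), (900.0, 20.0), (950.0, 22.0)).toDF("annual_spend", "visits")

        val assembler = new VectorAssembler().setInputCols(Array("annual_spend", "visits")).setOutputCol("features")
        val features  = assembler.transform(customers)

        val model = new KMeans().setK(2).setSeed(1L).fit(features)   // unsupervised clustering into 2 groups
        model.transform(features).select("annual_spend", "visits", "prediction").show()
        spark.stop()
      }
    }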
Star Certification – https://www.starcertification.org/
"One of the best place to learn tech. Great support and amazing teachers! I'd recommend it to others, 10/10."
Abhinav Gyawali
"I completed my RHCSA and RHCE from Computer Point Nepal. The learning environment here is suited for both working professionals as well as students. From my experience tutors and staff members are very helpful. With their proper guidance I was able to complete RHCSA certification."
Avishek Pradhan
"CPN has always been a place of great learning and place to find proper guidance. They have experienced instructor who can provide guidance and suggestion about career and courses which is really helpful for beginners who want to know about career options in IT. CPN is highly recommended in my book as a place to learning and developing skills needed by both beginners and professionals alike."
Pravesh Shrestha
Red Hat Student
We accept payment by Cash, Bank Transfer, Cheque, Credit Cards, eSewa and Fonepay.
Jame Market, Ghantaghar, Kathmandu, Nepal
01-5333117 / 01-5333121
info@computerpoint.com.np
Sunday – Friday, 07:00 AM – 07:00 PM