BIG DATA CERTIFICATION
- 40 Days of Instructor-Led Training
- Certification Exam
- Soft Copy of Course Material
- 100% Success Rate
- Highly Experienced and Certified Trainers
- Free Refresher Classes Within 6 Months
This comprehensive Big Data course was designed by industry experts around current industry job requirements to help you learn the Big Data Hadoop and Spark modules. It is an industry-recognized Big Data Hadoop certification course that combines Hadoop developer, Hadoop administrator, Hadoop testing, and analytics with Apache Spark.
- Fundamentals of Hadoop and YARN and write applications using them
- HDFS, MapReduce, Hive, Pig, Sqoop, Flume, and ZooKeeper
- Spark, Spark SQL, Streaming, DataFrame, RDD, GraphX and MLlib, and writing Spark applications
- Working with Avro data formats
- Practicing real-life projects using Hadoop and Apache Spark
- Be equipped to clear the Big Data Hadoop certification exam
- Programming Developers and System Administrators
- Experienced working professionals and Project Managers
- Big Data Hadoop Developers eager to learn other verticals like testing, analytics and administration
- Mainframe Professionals, Architects and Testing Professionals
- Business Intelligence, Data Warehousing and Analytics Professionals
- Graduates and undergraduates eager to learn Big Data
Skill Development with Certification
01 Key Features
- 48 Hours of Live Virtual Training
- 4 Industry-Based Projects
- Hands-on Assignments for Each Module
- Lifetime Access to All Recorded Lessons
- 44 PDUs Offered
- 100% Money Back Guarantee*
02 Course Content
Lesson 01 - Introduction to Big Data and Hadoop
- Introduction to Big Data and Hadoop
- Introduction to Big Data
- Big Data Analytics
- What is Big Data?
- Four Vs of Big Data
- Case Study
- Challenges of Traditional System
- Distributed Systems
- Introduction to Hadoop
- Components of Hadoop Ecosystem Part One
- Components of Hadoop Ecosystem Part Two
- Components of Hadoop Ecosystem Part Three
- Commercial Hadoop Distributions
- Demo
- Key Takeaways
- Knowledge Check
Lesson 02 - Hadoop Architecture, Distributed Storage (HDFS) and YARN
- Hadoop Architecture, Distributed Storage (HDFS) and YARN
- What is HDFS?
- Need for HDFS
- Regular File System vs HDFS
- Characteristics of HDFS
- HDFS Architecture and Components
- Data Block Split
- Data Replication Topology
- HDFS Command Line
- Demo: Common HDFS Commands
- Practice Project: HDFS Command Line
- YARN Introduction
- YARN Use Case
- YARN and Its Architecture
- Resource Manager
- How Resource Manager Operates
- Application Master
- How YARN Runs an Application
- Tools for YARN Developers
- Demo: Walkthrough of Cluster Part One
- Demo: Walkthrough of Cluster Part Two
- Key Takeaways
- Practice Project: Hadoop Architecture, Distributed Storage (HDFS) and YARN
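The HDFS topics above center on block splitting and replication. As a conceptual sketch only (not the Hadoop API; the 128 MB block size and replication factor of 3 are the HDFS defaults, and the DataNode names and round-robin placement are illustrative assumptions), the idea can be simulated in plain Python:

```python
# Conceptual sketch of HDFS storage: a file is split into fixed-size blocks,
# and each block is copied to `replication` different DataNodes.
BLOCK_SIZE_MB = 128   # HDFS default block size
REPLICATION = 3       # HDFS default replication factor

def split_into_blocks(file_size_mb, block_size_mb=BLOCK_SIZE_MB):
    """Return the sizes of the blocks a file is split into."""
    blocks = []
    remaining = file_size_mb
    while remaining > 0:
        blocks.append(min(block_size_mb, remaining))
        remaining -= block_size_mb
    return blocks

def place_replicas(num_blocks, datanodes, replication=REPLICATION):
    """Toy round-robin placement: each block lands on `replication` distinct nodes."""
    return {
        b: [datanodes[(b + r) % len(datanodes)] for r in range(replication)]
        for b in range(num_blocks)
    }

blocks = split_into_blocks(300)  # a 300 MB file
print(blocks)                    # [128, 128, 44]
print(place_replicas(len(blocks), ["dn1", "dn2", "dn3", "dn4"]))
```

The real NameNode uses rack-aware placement rather than round-robin; the point here is only that no single node holds the only copy of a block.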
Lesson 03 - Data Ingestion into Big Data Systems and ETL
- Data Ingestion Into Big Data Systems and ETL
- Data Ingestion Overview Part One
- Data Ingestion Overview Part Two
- Apache Sqoop
- Sqoop and Its Uses
- Sqoop Processing
- Sqoop Import Process
- Sqoop Connectors
- Demo: Importing and Exporting Data from MySQL to HDFS
- Practice Project: Apache Sqoop
- Apache Flume
- Flume Model
- Scalability in Flume
- Components in Flume’s Architecture
- Configuring Flume Components
- Demo: Ingest Twitter Data
- Apache Kafka
- Aggregating User Activity Using Kafka
- Kafka Data Model
- Partitions
- Apache Kafka Architecture
- Demo: Setup Kafka Cluster
- Producer Side API Example
- Consumer Side API & Example
- Kafka Connect
- Practice Project: Data Ingestion into Big Data Systems and ETL
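The Kafka topics above (data model, partitions) rest on one idea: messages with the same key always go to the same partition, which preserves per-key ordering. A minimal sketch, assuming a hypothetical four-partition topic and using `crc32` as a stand-in for the murmur2 hash the real Kafka Java client uses:

```python
# Conceptual sketch of Kafka key-based partitioning (not the real client API).
import zlib

NUM_PARTITIONS = 4  # hypothetical topic with four partitions

def partition_for(key, num_partitions=NUM_PARTITIONS):
    # Kafka's Java client hashes the key with murmur2; crc32 stands in here.
    return zlib.crc32(key.encode("utf-8")) % num_partitions

events = ["user-1", "user-2", "user-1", "user-3", "user-1"]
assignments = [partition_for(k) for k in events]
# Every occurrence of "user-1" is assigned the same partition.
```

Because the mapping is a deterministic hash, consumers reading one partition see each user's events in the order they were produced.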
Lesson 04 - Distributed Processing: MapReduce Framework and Pig
- Distributed Processing: MapReduce Framework and Pig
- Distributed Processing in MapReduce
- Word Count Example
- Map Execution Phases
- Map Execution in a Distributed Two-Node Environment
- MapReduce Jobs
- Hadoop MapReduce Job Work Interaction
- Setting up the environment for MapReduce Development
- Set of Classes
- Creating a New Project
- Advanced MapReduce
- Data Types in Hadoop
- Output formats in MapReduce
- Using Distributed Cache
- Joins in MapReduce
- Replicated Join
- Introduction to Pig
- Components of Pig
- Pig Data Model
- Pig Interactive Modes
- Pig Operations
- Various Relations Performed by Developers
- Demo: Analyzing Web Log Data Using MapReduce
- Demo: Analyzing Sales Data and Solving KPIs Using Pig
- Practice Project: Apache Pig
- Demo: Word Count
- Key Takeaways
- Knowledge Check
- Practice Project: Distributed Processing - MapReduce Framework and Pig
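The word-count example listed above is the classic way to see MapReduce's three phases. As a sketch in plain Python (a real Hadoop job would implement Mapper and Reducer classes in Java; the sample lines are illustrative), the same flow looks like this:

```python
# Word count walked through MapReduce's phases: map, shuffle/sort, reduce.
from collections import defaultdict

def map_phase(lines):
    # Map: emit a (word, 1) pair for every word in every input line.
    for line in lines:
        for word in line.split():
            yield (word.lower(), 1)

def shuffle_phase(pairs):
    # Shuffle: group all values by key, as the framework does between phases.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Reduce: aggregate each key's values into a final count.
    return {key: sum(values) for key, values in groups.items()}

lines = ["Deer Bear River", "Car Car River", "Deer Car Bear"]
counts = reduce_phase(shuffle_phase(map_phase(lines)))
print(counts)  # {'deer': 2, 'bear': 2, 'river': 2, 'car': 3}
```

In Hadoop the map and reduce functions run in parallel across the cluster and the shuffle moves data over the network; the logic per record is exactly this simple.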
Lesson 05 - Apache Hive
- Apache Hive
- Hive SQL over Hadoop MapReduce
- Hive Architecture
- Interface to Run Hive Queries
- Running Beeline From Command Line
- Hive Metastore
- Hive DDL and DML
- Creating New Table
- Data Types
- Validation of Data
- File Format Types
- Data Serialization
- Hive Table and Avro Schema
- Hive Optimization: Partitioning, Bucketing and Sampling
- Non-Partitioned Table
- Data Insertion
- Dynamic Partitioning in Hive
- Bucketing
- What Do Buckets Do?
- Hive Analytics: UDF and UDAF
- Other Functions of Hive
- Demo: Real-Time Analysis and Data Filtration
- Demo: Real-World Problem
- Demo: Data Representation and Import Using Hive
- Key Takeaways
- Knowledge Check
- Practice Project: Apache Hive
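The partitioning and bucketing topics above come down to how Hive lays a table out on disk: partition columns become directories, and rows are hashed into a fixed number of bucket files. A sketch under illustrative assumptions (the `users` table, `country` partition column, integer `user_id` bucket column, and path layout are hypothetical; Hive's hash of an integer column is the integer itself):

```python
# Conceptual sketch of a Hive table partitioned by country and
# CLUSTERED BY (user_id) INTO 4 BUCKETS.
NUM_BUCKETS = 4

def storage_path(row, partition_col="country", bucket_col="user_id"):
    partition_dir = f"{partition_col}={row[partition_col]}"   # one directory per value
    bucket = row[bucket_col] % NUM_BUCKETS                    # hash -> bucket file
    return f"warehouse/users/{partition_dir}/bucket_{bucket:05d}"

for row in [{"country": "NP", "user_id": 7}, {"country": "IN", "user_id": 2}]:
    print(storage_path(row))
```

Because a query's partition filter maps directly to a directory and a bucket-column filter maps to one file, Hive can skip most of the data instead of scanning the whole table.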
Lesson 06 - NoSQL Databases: HBase
- NoSQL Databases: HBase
- NoSQL Introduction
- Demo: YARN Tuning
- HBase Overview
- HBase Architecture
- Data Model
- Connecting to HBase
- Practice Project: HBase Shell
- Key Takeaways
- Knowledge Check
- Practice Project: NoSQL Databases - HBase
Lesson 07 - Basics of Functional Programming and Scala
- Basics of Functional Programming and Scala
- Introduction to Scala
- Demo: Scala Installation and Functional Programming
- Programming with Scala
- Demo: Basic Literals and Arithmetic Programming
- Demo: Logical Operators
- Type Inference, Classes, Objects and Functions in Scala
- Demo: Type Inference, Functions, Anonymous Functions and Classes
- Collections
- Types of Collections
- Demo: Five Types of Collections
- Demo: Operations on List
- Scala REPL
- Demo: Features of Scala REPL
- Key Takeaways
- Knowledge Check
- Practice Project: Basics of Functional Programming and Scala
Lesson 08 - Apache Spark: Next-Generation Big Data Framework
- Apache Spark: Next-Generation Big Data Framework
- History of Spark
- Limitations of MapReduce in Hadoop
- Introduction to Spark
- Application of In-Memory Processing
- Hadoop Ecosystem vs Spark
- Advantages of Spark
- Spark Architecture
- Spark Cluster in Real World
- Demo: Running a Scala Program in Spark Shell
- Demo: Setting Up Execution Environment in IDE
- Demo: Spark Web UI
- Key Takeaways
- Practice Project: Apache Spark - Next-Generation Big Data Framework
Lesson 09 - Spark Core Processing RDD
- Introduction to Spark RDD
- RDD in Spark
- Creating Spark RDDs
- Pair RDD
- RDD Operations
- Demo: Spark Transformation Detailed Exploration Using Scala Examples
- Demo: Spark Action Detailed Exploration Using Scala
- Caching and Persistence
- Storage Levels
- Lineage and DAG
- Need for DAG
- Debugging in Spark
- Partitioning in Spark
- Scheduling in Spark
- Shuffling in Spark
- Sort Shuffle
- Aggregating Data with Paired RDD
- Demo: Spark Application with Data Written Back to HDFS and Spark UI
- Demo: Changing Spark Application Parameters
- Demo: Handling Different File Formats
- Demo: Spark RDD with Real-World Application
- Demo: Optimizing Spark Jobs
- Key Takeaways
- Knowledge Check
- Practice Project: Spark Core Processing RDD
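The RDD operations and lineage/DAG topics above hinge on lazy evaluation: transformations only record what should happen, and nothing runs until an action is called. A toy Python sketch of that behavior (the `FakeRDD` class is an invented illustration, not Spark's API; in real Spark this would be `sc.parallelize(...).map(...).filter(...).collect()`):

```python
# Toy model of Spark's lazy RDD evaluation: transformations build up a
# lineage of recorded operations; only an action executes them.
class FakeRDD:
    def __init__(self, data, ops=None):
        self.data = data
        self.ops = ops or []            # recorded lineage, not yet executed

    def map(self, fn):                  # transformation: returns a new RDD, lazy
        return FakeRDD(self.data, self.ops + [("map", fn)])

    def filter(self, fn):               # transformation: lazy
        return FakeRDD(self.data, self.ops + [("filter", fn)])

    def collect(self):                  # action: replays the whole lineage
        out = self.data
        for kind, fn in self.ops:
            out = [fn(x) for x in out] if kind == "map" else [x for x in out if fn(x)]
        return out

rdd = FakeRDD([1, 2, 3, 4, 5]).map(lambda x: x * 10).filter(lambda x: x > 20)
# No work has happened yet; collect() materializes the result.
print(rdd.collect())  # [30, 40, 50]
```

Recording the lineage instead of executing eagerly is also what lets Spark recompute a lost partition from its DAG rather than replicating data.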
Lesson 10 - Spark SQL: Processing DataFrames
- Spark SQL: Processing DataFrames
- Spark SQL Introduction
- DataFrames
- Demo: Handling Different Data Formats
- Demo: Implement Various DataFrame Operations
- Demo: UDF and UDAF
- Interoperating with RDDs
- Demo: Process DataFrame Using SQL Query
- RDD vs DataFrame vs Dataset
- Practice Project: Processing DataFrames
- Key Takeaways
- Knowledge Check
- Practice Project: Spark SQL - Processing DataFrames
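The "Process DataFrame Using SQL Query" topic above is the core of Spark SQL: register structured data as a table, then query it with plain SQL. To show the idea without a cluster, this sketch uses Python's built-in sqlite3 in place of Spark (the `sales` table and its columns are illustrative; in Spark you would call `df.createOrReplaceTempView("sales")` and `spark.sql(...)` instead):

```python
# SQL over structured data, with sqlite3 standing in for Spark SQL.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("east", 100), ("west", 250), ("east", 50)],
)
rows = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()
print(rows)  # [('east', 150), ('west', 250)]
```

The query is identical in Spark SQL; the difference is that Spark's Catalyst optimizer plans it to run distributed across partitions of a DataFrame.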
Lesson 11 - Spark MLlib: Modeling Big Data with Spark
- Spark MLlib: Modeling Big Data with Spark
- Role of Data Scientist and Data Analyst in Big Data
- Analytics in Spark
- Machine Learning
- Supervised Learning
- Demo: Classification of Linear SVM
- Demo: Linear Regression with Real-World Case Studies
- Unsupervised Learning
- Demo: Unsupervised Clustering with K-Means
- Reinforcement Learning
- Semi-Supervised Learning
- Overview of MLlib
- MLlib Pipelines
- Key Takeaways
- Knowledge Check
- Practice Project: Spark MLlib - Modeling Big Data with Spark
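The linear regression demo above fits the same model MLlib's `LinearRegression` learns at cluster scale: a line y = slope·x + intercept minimizing squared error. As a miniature sketch in plain Python (the sample points are made up and chosen to lie exactly on y = 2x + 1; this is the closed-form ordinary least squares solution, not MLlib's iterative solver):

```python
# Ordinary least squares for a single feature: fit y = slope * x + intercept.
def fit_line(xs, ys):
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    # slope = covariance(x, y) / variance(x)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) \
        / sum((x - mean_x) ** 2 for x in xs)
    intercept = mean_y - slope * mean_x
    return slope, intercept

slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])  # points on y = 2x + 1
print(slope, intercept)  # 2.0 1.0
```

MLlib arrives at the same coefficients by gradient-based optimization over partitioned data, which is what makes the approach scale to inputs far larger than one machine's memory.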
Star Certification – https://www.starcertification.org/
We accept payment by Cash, Bank Transfer, Cheque, Credit Cards, eSewa and Fonepay.
Jame Market, Ghantaghar, Kathmandu, Nepal
01-5333117 / 01-5333121
info@computerpoint.com.np
Sunday – Friday, 07:00 AM – 07:00 PM
