Email: [email protected]
Phone: 080-42041080, +91 9611824441



2000 Satisfied Learners


    Hadoop – Advance
    Spark & Scala Course Content
    Module 1
    Introduction to Scala
    Learning Objectives – In this module, you will understand the basic concepts of Scala
    and the motivation for learning a new language, and get your setup ready.

    • Why Scala?
    • What is Scala?
    • Introducing Scala
    • Installing Scala
    • Journey – Java to Scala
    • First Dive – Interactive Scala
    • Writing Scala Scripts – Compiling Scala Programs
    • Scala Basics
    • Scala Basic Types
    • Defining Functions
    • IDE for Scala, Scala Community

    Module 2
    Scala Essentials
    Learning Objectives – In this module, you will learn the essentials of Scala that are
    needed to work with it.

    • Immutability in Scala – Semicolons
    • Method Declaration, Literals
    • Lists
    • Tuples
    • Options
    • Maps
    • Reserved Words
    • Operators
    • Precedence Rules
    • If statements
    • Scala For Comprehensions
    • While Loops
    • Do-While Loops
    • Conditional Operators
    • Pattern Matching
    • Enumerations

    Module 3
    Traits and OOPs in Scala
    Learning Objectives – In this module, you will understand how OOP concepts are
    implemented in Scala and how to use traits as mixins.

    • Traits Intro – Traits as Mixins
    • Stackable Traits
    • Creating Traits
    • Basic OOP – Class and Object Basics
    • Scala Constructors
    • Nested Classes
    • Visibility Rules

    Module 4
    Functional Programming in Scala
    Learning Objectives – In this module, you will gain functional programming
    know-how for Scala.

    • What is Functional Programming?
    • Functional Literals and Closures
    • Recursion
    • Tail Calls
    • Functional Data Structures
    • Implicit Function Parameters
    • Call by Name
    • Call by Value

    Module 5 
    Introduction to Big Data and Spark
    Learning Objectives – In this module, you will understand what Big Data is, its
    associated challenges and the various frameworks available, and get a first-hand introduction
    to Spark.

    • Introduction to Big Data
    • Challenges with Big Data
    • Batch Vs. Real Time Big Data Analytics
    • Batch Analytics – Hadoop Ecosystem Overview
    • Real Time Analytics Options, Streaming Data – Storm
    • In Memory Data – Spark
    • What is Spark?
    • Modes of Spark
    • Spark Installation Demo
    • Overview of Spark on a cluster
    • Spark Standalone Cluster

    Module 6
    Spark Baby Steps
    Learning Objectives – In this module, you will learn how to invoke Spark shell and
    use it for various common operations.

    • Invoking Spark Shell
    • Loading a File in Shell
    • Performing Some Basic Operations on Files in Spark Shell
    • Building and Running a Spark Project with sbt
    • Caching Overview, Distributed Persistence
    • Spark Streaming Overview
    • Example: Streaming Word Count

    Module 7
    Playing with RDDs
    Learning Objectives – In this module, you will learn about RDDs, one of the building blocks of
    Spark, and the related manipulations for implementing business logic.

    • RDDs
    • Transformations in RDD
    • Actions in RDD
    • Loading Data in RDD
    • Saving Data through RDD
    • Scala and Hadoop Integration Hands on

    Module 8
    Shark – When Spark meets Hive (Spark SQL)
    Learning Objectives – In this module, you will see various offshoots of Spark such as
    Shark, Spark SQL and MLlib. This session is primarily interactive and discusses industrial use
    cases of Spark and the latest developments in this area.

    • Why Shark?
    • Installing Shark
    • Running Shark
    • Loading of Data
    • Hive Queries through Spark
    • Testing Tips in Scala
    • Performance Tuning Tips in Spark
    • Shared Variables: Broadcast Variables
    • Shared Variables: Accumulators

    Module 9
    Spark Streaming
    Learning Objectives – In this module, you will learn about the major APIs that Spark
    offers. You will get an opportunity to work with Spark Streaming, which makes it easy to build
    scalable, fault-tolerant streaming applications.

    • Spark Streaming Architecture
    • First Spark Streaming Program
    • Transformations in Spark Streaming
    • Fault tolerance in Spark Streaming
    • Checkpointing
    • Parallelism level

    Module 10 
    Spark MLlib
    Learning Objectives – In this module, you will learn about the machine learning
    concepts in Spark

    • Classification Algorithm
    • Clustering Algorithm
    • Sequence Mining Algorithm
    • Collaborative Filtering

    Module 11
    Spark GraphX
    Learning Objectives – In this module, you will learn about Graph Analysis concepts in
    Spark.

    • Graph analysis with Spark
    • GraphX for graphs
    • Graph-parallel computation

    Module 12
    Project and Installation

    • Installation of Spark and Scala
    • Discussion of real time use cases using Spark
    • Mini project implementation in Spark

    DataQubez University creates meaningful Big Data and Data Science certifications that are recognized in the industry as a confident measure of qualified, capable Big Data experts. How do we accomplish that mission? DataQubez certifications are exclusively hands-on, performance-based exams that require you to complete a set of tasks. Demonstrate your expertise with the most sought-after technical skills. Big Data success requires professionals who can prove their mastery of the tools and techniques of the Hadoop stack. However, experts predict a major shortage of advanced analytics skills over the next few years. At DataQubez, we’re drawing on our industry leadership and early corpus of real-world experience to address the Big Data and Data Science talent gap.

    How to Become a Certified Apache Spark Developer


    Certification Code – DQCP – 504

    Certification Description – DataQubez Certified Professional Apache – Spark Developer

    Exam Objectives

    Configuration :-

    Define and deploy a rack topology script, Change the configuration of a service using Apache Hadoop, Configure the Capacity Scheduler, Create a home directory for a user and configure permissions, Configure the include and exclude DataNode files

    Troubleshooting :-

    Demonstrate ability to find the root cause of a problem, optimize inefficient execution, and resolve resource contention scenarios, Resolve errors/warnings in a Hadoop cluster, Resolve performance problems/errors in cluster operation, Determine the reason for an application failure, Configure the Fair Scheduler to resolve application delays, Restart a cluster service, View an application’s log file, Configure and manage alerts, Troubleshoot a failed job

    High Availability :-

    Configure NameNode, Configure ResourceManager, Copy data between two clusters, Create a snapshot of an HDFS directory, Recover a snapshot, Configure HiveServer2

    Data Ingestion - with Sqoop & Flume :-

    Import data from a table in a relational database into HDFS, Import the results of a query from a relational database into HDFS, Import a table from a relational database into a new or existing Hive table, Insert or update data from HDFS into a table in a relational database, Given a Flume configuration file, start a Flume agent, Given a configured sink and source, configure a Flume memory channel with a specified capacity

    Data Processing through Spark, Spark SQL & Python :-

    Frame big data analysis problems as Apache Spark scripts, Optimize Spark jobs through partitioning, caching, and other techniques, Develop distributed code using the Scala programming language, Build, deploy, and run Spark scripts on Hadoop clusters, Transform structured data using SparkSQL and DataFrames

    Recommendation Engine using Spark MLlib & Python :-

    Use MLlib to produce a recommendation engine, Run the PageRank algorithm, Use DataFrames with MLlib, Machine Learning with Spark

    Stream Data Processing using Spark Streaming & Python :-

    Process stream data using Spark Streaming.

    Regression with Spark & Python :-

    Introduction to Linear Regression, Introduction to Regression Section, Linear Regression Documentation, Alternate Linear Regression Data CSV File, Linear Regression Walkthrough, Linear Regression Project


    The Spark & Scala trainer has 17 years of experience in IT, including 10 years in data warehousing and ETL. For the past six years he has worked extensively with the Big Data ecosystem tool set for several banking, retail and manufacturing clients. He is a certified HDP-Spark Developer and a Cloudera-certified HBase specialist. He has also conducted corporate sessions and seminars both in India and abroad. Recently he was engaged by Pune University to deliver 40 hours of sessions on Big Data analytics to the senior professors of Pune. All faculty members at our organization currently work with these technologies at reputed organizations. The curriculum is not just theory or a talk with some slides. We frame the sessions so that the lessons are delivered in plain language and the content is well absorbed by the candidates. The sessions are backed by hands-on assignments. Because the faculty have industry experience, the trainer also shares practical stories from his own work during the course.
    • Spark & Scala Training By 17+ Years experienced Real Time Trainer
    • A pool of 200+ real time Practical Sessions on Spark & Scala
    • Scenarios and Assignments to make sure you compete with current Industry standards
    • World class training methods
    • Training until the candidate gets placed
    • Certification and Placement Support until you get certified and placed
    • All training at a reasonable cost
    • 10000+ Satisfied candidates
    • 5000+ Placement Records
    • Corporate and Online Training at a Reasonable Cost
    • Complete End-to-End Project with Each Course
    • World-Class Lab Facility with i3/i5/i7 Servers and Cisco UCS Servers
    • Covers Topics beyond the Books that are required by the IT Industry
    • Resume And Interview preparation with 100% Hands-on Practical sessions
    • Doubt clearing sessions any time after the course
    • Happy to help you any time after the course