Learn Apache Spark

Apache Spark Course Introduction

Apache Spark is an open-source cluster computing technology designed mainly for fast computation. In this Apache Spark tutorial, we will study how Spark has become a boon for real-time data analytics at e-commerce and social media companies such as Facebook, eBay, YouTube, Amazon and Twitter.

Prerequisites to Learn Spark

  • Basic knowledge of core Java and SQL
  • Hadoop (HDFS & YARN) knowledge is highly recommended

How Can Spark Help?

Problem: As shown below, Facebook, YouTube, Amazon and Twitter generate data every second. How can this real-time data be analysed in a Hadoop-based big data environment?

[Figure: Why Spark]

Solution: Apache Spark is an open-source cluster computing framework designed for real-time data analytics.

[Figure: Spark as a solution]

Contents

1. Spark Application

2. Apache Spark Features

3. Apache Spark Architecture

4. Apache Spark Components

4.1 Apache Spark SQL

4.2 Spark Streaming

4.3 SparkR

4.4 Spark GraphX

4.5 Spark MLlib

5. Spark RDD

6. Spark Shell

7. Spark Shell with Scala