Learn Apache Spark
Apache Spark Course Introduction
Apache Spark is an open-source, lightning-fast cluster computing technology designed mainly for fast computation. In this Apache Spark tutorial, we will study how Apache Spark has become a boon for real-time data analytics at e-commerce and social media companies such as Facebook, eBay, YouTube, Amazon and Twitter.
Prerequisites to learn Spark
- Basic knowledge of core Java and SQL
- Hadoop (HDFS & YARN) knowledge is highly recommended
How can Spark help?
Problem: Facebook, YouTube, Amazon and Twitter generate data every second. How can this real-time data be analysed in a Hadoop big data environment?
Apache Spark is an open-source cluster computing framework designed for real-time data analytics.
4.1 Apache Spark SQL
4.2 Spark Streaming
4.4 Spark GraphX
4.5 Spark MLlib
5. Spark RDD
6. Spark Shell