Spark Shell

Introduction

    Spark provides an interactive shell called the Spark shell. It is a powerful tool for analyzing data interactively and is available in either Scala or Python. Spark’s primary abstraction is a distributed collection of items called a Resilient Distributed Dataset (RDD). RDDs can be created from Hadoop InputFormats (such as HDFS files) or by transforming other RDDs. Spark Streaming is also built on top of RDDs.
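A minimal spark-shell session sketching these ideas. The file name `data.txt` is hypothetical; the `sc` SparkContext is created automatically when the shell starts:

```scala
// Inside spark-shell, the SparkContext `sc` is pre-created.
// Create an RDD from a text file on HDFS or the local filesystem.
val lines = sc.textFile("data.txt")            // hypothetical input file

// Transform one RDD into another (lazy: nothing runs yet).
val lengths = lines.map(line => line.length)

// An action triggers computation and returns a result to the driver.
val totalChars = lengths.reduce(_ + _)
```

Transformations such as `map` are lazy; work is only performed when an action such as `reduce` is called.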

Spark Shell Usage

  • It provides an easy and convenient way to perform operations quickly.
  • In the Spark shell there is no need to develop a full program, then package and deploy it.
  • The shell is particularly helpful for fast interactive prototyping.
  • It is a standalone Spark application, written in Scala, that offers an environment with TAB-key auto-completion.
  • It can be used to run ad-hoc queries.
  • The shell is a convenient tool to explore the many features available in Spark with immediate feedback.
  • It is available for different languages:
  1. spark-shell for Scala
  2. pyspark for Python
  3. sparkR for R.
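Assuming Spark is installed and its `bin` directory is on the PATH, the three shells above can be launched from a terminal as follows:

```shell
# Launch the interactive Scala shell
spark-shell

# Launch the interactive Python shell
pyspark

# Launch the interactive R shell
sparkR
```

Each command starts a REPL with a SparkContext already initialized, so you can begin running queries immediately.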