Pig Execution Modes and Execution Mechanisms

Pig has two execution modes. They are,

  1. Local Mode
  2. MapReduce Mode

Local Mode

To run Pig in Local mode, all the files are installed and run from the local host and the local file system. There is no need for Hadoop or HDFS.

* This mode is generally used for testing purposes.

* It is suitable for small data sets.

* In Local mode, Pig runs in a single JVM and accesses the local file system.

* The following command starts the Local mode of execution.

 $ pig -x local

grunt>

MapReduce Mode

In MapReduce mode, Pig reads its input from HDFS paths only and, after processing the data, writes the output files to HDFS.

* It is the default mode in Apache Pig.

* In this mode, whenever we execute Pig Latin statements to process data, a MapReduce job is invoked in the back end to perform the operation on the data that exists in HDFS.

* It is also called HDFS mode or Clustered mode.

* To start the MapReduce mode of execution, use the following command.

 $ pig

grunt>

Pig Execution Mechanisms 

Pig scripts can be executed in two ways. They are,

  1. Interactive Mode
  2. Batch Mode

Interactive Mode

* In Interactive mode, we can run Apache Pig using the Grunt shell.

* In the Grunt shell, we can enter Pig Latin statements and get the output using the Dump operator.

Example

This example loads the data from the file system path 'user/beyondcorner/emp.txt' and displays the contents in the terminal using the Dump operator.

grunt> A = load 'user/beyondcorner/emp.txt' using PigStorage(',');

grunt> Dump A;

Local Mode

$ pig -x local

– Connecting to …

grunt>

MapReduce Mode

$ pig

– Connecting to …

grunt>

Batch Mode

* Apache Pig can run in Batch mode by writing the Pig Latin statements in a single script file with the .pig extension.

Example

The Pig Latin script below is named "name.pig". It loads the employee details from the '/user/beyondemp/emp.txt' path, extracts only the employee names, and stores the result in the '/home/beyondcorner/name.out' path.

/* name.pig */

A = load '/user/beyondemp/emp.txt' using PigStorage(',')
    AS (name:chararray, id:int, salary:int);   -- load the emp.txt file

X = foreach A generate name;                    -- extract the employee names

store X into '/home/beyondcorner/name.out';     -- write the results to name.out
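
To try the script on a small input, a test file with the name, id, salary layout is needed. The shell sketch below creates one and previews the name projection with cut. The three rows, the temporary directory, and the variable names are made up for illustration and are not from the original data set; cut is only a stand-in for the foreach statement, not how Pig evaluates it.

```shell
# Hypothetical sample input: the rows below are invented for illustration.
# Column layout follows the schema in name.pig: name,id,salary
workdir=$(mktemp -d)
printf '%s\n' 'alice,1,50000' 'bob,2,42000' 'carol,3,61000' > "$workdir/emp.txt"

# Shell stand-in for "X = foreach A generate name;": keep only the first field.
names=$(cut -d, -f1 "$workdir/emp.txt")
echo "$names"
```

To run name.pig against this file in Local mode, the load path in the script would have to point at it.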

Local Mode

$ pig -x local name.pig

MapReduce Mode

$ pig name.pig
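
The four invocations shown in this article differ only in the -x flag and the optional script name. As a rough sketch, a hypothetical shell helper can assemble them. The name pig_command is an assumption for illustration and is not part of Pig; the function only prints the command and does not run Pig.

```shell
# Hypothetical helper: prints the pig command line for a mode and optional script.
# It echoes the command only; nothing is executed.
pig_command() {
  mode="$1"        # "local" or "mapreduce"
  script="${2:-}"  # optional script file; empty means start the Grunt shell
  case "$mode" in
    local)     cmd="pig -x local" ;;
    mapreduce) cmd="pig" ;;  # MapReduce is the default mode, so no -x flag is needed
    *) echo "unknown mode: $mode" >&2; return 1 ;;
  esac
  if [ -n "$script" ]; then
    cmd="$cmd $script"       # Batch mode: append the script name
  fi
  echo "$cmd"
}

pig_command local               # prints: pig -x local
pig_command mapreduce name.pig  # prints: pig name.pig
```

Leaving out the script argument gives the Interactive mode commands, and passing one gives the Batch mode commands.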
