Pig Execution Modes
Pig has two execution modes:
- Local Mode
- MapReduce Mode
In Local mode, all files are installed and run from the local host and the local file system. No Hadoop cluster or HDFS is needed.
* This mode is generally used for testing.
* It is suitable for small data sets.
* In Local mode, Pig runs in a single JVM and accesses the local file system.
* The following command starts Pig in Local mode:
$ pig -x local
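As a quick way to try Local mode without a cluster, the sketch below writes a tiny comma-separated file to a temporary directory and, if a `pig` executable is on the PATH, loads and dumps it in Local mode using Pig's `-e` option to execute quoted statements. The file name emp.txt matches the later examples, but the sample rows and the temporary directory are invented for illustration.

```shell
# Create a throwaway working directory with a small sample data set.
# The rows below are made up for illustration only.
workdir=$(mktemp -d)
printf '%s\n' 'alice,1,5000' 'bob,2,6000' 'carol,3,7000' > "$workdir/emp.txt"

# Load and dump the file in Local mode, but only if Pig is installed.
if command -v pig >/dev/null 2>&1; then
    pig -x local -e "A = LOAD '$workdir/emp.txt' USING PigStorage(','); DUMP A;"
else
    echo "pig not found on PATH; skipping local-mode run"
fi
```

Because everything lives under a temporary directory on the local file system, the run needs neither HDFS nor a running Hadoop cluster.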
In MapReduce mode, Pig reads its input only from HDFS paths and, after processing the data, writes the output files back to HDFS.
* It is the default mode in Apache Pig.
* In this mode, whenever Pig Latin statements are executed to process data, MapReduce jobs are invoked in the back end to perform the operations on the data stored in HDFS.
* It is also called HDFS mode or Clustered mode.
* The following command starts Pig in MapReduce mode (it is equivalent to pig -x mapreduce):
$ pig
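Since MapReduce mode resolves paths against HDFS, the input has to be staged there before Pig can read it. The following is a minimal sketch of that step, assuming a configured Hadoop client; the HDFS directory `pig_demo` and the sample rows are hypothetical and not taken from the text above.

```shell
# Hypothetical names, not from the original text: the sample file and the HDFS directory.
local_file=$(mktemp)
printf '%s\n' 'alice,1,5000' 'bob,2,6000' > "$local_file"
hdfs_dir="/user/$(whoami)/pig_demo"

if command -v hdfs >/dev/null 2>&1 && command -v pig >/dev/null 2>&1; then
    # Stage the input in HDFS, because MapReduce mode reads from HDFS paths.
    hdfs dfs -mkdir -p "$hdfs_dir"
    hdfs dfs -put -f "$local_file" "$hdfs_dir/emp.txt"
    # -x mapreduce is the default and is spelled out here only for clarity.
    pig -x mapreduce -e "A = LOAD '$hdfs_dir/emp.txt' USING PigStorage(','); DUMP A;"
else
    echo "hdfs or pig not found on PATH; skipping MapReduce-mode run"
fi
```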
Pig Execution Mechanisms
Pig scripts can be executed in two ways:
- Interactive Mode
- Batch Mode
* In Interactive mode, Apache Pig is run from the Grunt shell.
* In the Grunt shell, we can enter Pig Latin statements and view their output with the Dump operator.
The following example loads the data from the path ‘user/beyondcorner/emp.txt’ and displays its contents in the terminal using the Dump operator.
grunt> A = load 'user/beyondcorner/emp.txt' using PigStorage(',');
grunt> Dump A;
$ pig -x local
– Connecting to …
– Connecting to …
* Apache Pig can run in Batch mode by writing the Pig Latin statements in a single script file with the .pig extension.
The Pig Latin script below is named “name.pig”. It loads the employee details from the path ‘/user/beyondemp/emp.txt’, extracts only the employee names, and stores the result at the path ‘/home/beyondcorner/name.out’.
/* name.pig */
A = load '/user/beyondemp/emp.txt' using PigStorage(',')
    AS (name:chararray, id:int, salary:int); -- load the emp.txt file
X = foreach A generate name; -- extract the employee names
store X into '/home/beyondcorner/name.out'; -- write the results to name.out
The script can then be run in Local mode or in MapReduce mode, respectively:
$ pig -x local name.pig
$ pig name.pig
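To see the batch workflow end to end, the sketch below creates a small input file and a copy of the script in a temporary directory, then runs it in Local mode when `pig` is available. The sample rows, the temporary directory, and the relative paths inside the script are assumptions made for illustration; they replace the absolute paths used in name.pig above.

```shell
# Temporary sandbox holding both the data and the script (hypothetical location).
batchdir=$(mktemp -d)
printf '%s\n' 'alice,1,5000' 'bob,2,6000' 'carol,3,7000' > "$batchdir/emp.txt"

# Write the batch script; the quoted heredoc delimiter keeps the shell from expanding anything.
cat > "$batchdir/name.pig" <<'EOF'
/* name.pig */
A = LOAD 'emp.txt' USING PigStorage(',') AS (name:chararray, id:int, salary:int);
X = FOREACH A GENERATE name;   -- keep only the employee names
STORE X INTO 'name.out';       -- name.out is created as a directory of part files
EOF

if command -v pig >/dev/null 2>&1; then
    # Run from inside the sandbox so the relative paths in the script resolve there.
    ( cd "$batchdir" && pig -x local name.pig && cat name.out/part-* )
else
    echo "pig not found on PATH; script written to $batchdir/name.pig"
fi
```

Note that STORE writes a directory rather than a single file, which is why the results are read back from the part files inside name.out.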