Apache Pig Diagnostic Operators

Let’s study about Apache Pig Diagnostic Operators.

Diagnostic operators used to verify the loaded data in Apache pig. There are four different types of diagnostic operators as shown below.

1. Dump operator

* The Dump operator is used to run the Pig Latin statements and display the results on the screen.

* It is used for debugging Purpose.

Example

Assume we have a file called “employee.txt” in HDFS with the following content.

100,Roshan,23,HR

101,Roy,27,CS

102,Shruthi,31,IT

103,Disha,28,EC

104,Gowri,30,HR

 Step 1: In this step will load the data using “load” operator into the pig.

grunt> empdata = LOAD ‘hdfs://localhost:9000/emp_pigdata/employee.txt’  USING   PigStorage(‘,’);

Step 2: In this step using “dump” operator will display the results on the screen.

grunt> Dump empdata

2. Describe operator

* The describe operator is used to view the schema of a relation.

Example                                           

Let us consider a previous example file called “employee.txt” in HDFS.

Step 1: In this step will load the data using “load” operator into the pig.

grunt> empdata = LOAD ‘hdfs://localhost:9000/emp_pigdata/employee.txt’  USING PigStorage(‘,’);

Step 2: In this step view the schema of a relation using “describe” operator.

grunt> describe empdata;

3. Explain operator

* We can display the physical, logical, and MapReduce execution plans of a relation using explain operator.

Example                    

Let us consider a previous example file called “employee.txt” in HDFS.

Step 1: In this step will load the data using “load” operator into the pig.

grunt> empdata = LOAD ‘hdfs://localhost:9000/emp_pigdata/employee.txt’  USING PigStorage(‘,’);

Step 2: In this step will display the logical, physical, and MapReduce execution plans of a relation using explain operator.

grunt> explain empdata;

 4. Illustration operator                        

* The illustrate operator get the step-by-step execution of a sequence of statements.

Example

Let us consider a previous example file called “employee.txt” in HDFS.

Step 1: In this step will load the data using “load” operator into the pig.

grunt> empdata = LOAD ‘hdfs://localhost:9000/emp_pigdata/employee.txt’  USING PigStorage(‘,’);

Step 2: In this step will see the step-by-step execution of a sequence of statements using illustration operators.

grunt> illustrate empdata;

“That’s all about the Apache Pig – Diagnostic Operators”