Pig v/s Hive v/s SQL v/s MapReduce

Let’s compare Pig with Hive, SQL, and MapReduce.

Let’s see the differences between Pig and Hive.

Pig uses a language called Pig Latin. Hive uses a language called HiveQL.
Pig Latin is a data flow language. HiveQL is a query processing language.
Pig was originally created at Yahoo. Hive was originally created at Facebook.
Pig can handle structured, semi-structured, and unstructured data. Hive is designed mainly for structured data.
Pig is generally used by researchers and programmers. Hive is mainly used by data analysts.
Pig is mainly used for programming. Hive is mainly used for creating reports.
Pig operates on the client side of a cluster. Hive operates on the server side of a cluster.
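Pig supports the Avro file format. Hive also reads and writes Avro through its AvroSerDe, so Avro support is not a real difference between the two.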

Let’s see the differences between Pig and SQL.

Pig Latin is a procedural language. SQL is a declarative language.
Pig uses a nested relational data model. SQL uses a flat relational data model.
In Pig, the schema is optional. In SQL, the schema is mandatory.
Pig provides limited opportunity for query optimization. SQL provides more opportunity for query optimization.
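
To make the procedural versus declarative difference concrete, here is a small sketch in plain Python. It is not Pig itself: the first function only imitates a Pig-style data flow (group, then aggregate, then filter), and the second one hands a single SQL query to SQLite from the Python standard library. The `orders` data, the column names, and both function names are made up for illustration.

```python
import sqlite3
from collections import defaultdict

# Hypothetical sample data: (customer, amount) rows.
orders = [("alice", 30), ("bob", 5), ("alice", 25), ("carol", 40), ("bob", 10)]


def dataflow_totals(rows, min_total):
    """Procedural, Pig-style pipeline: every step builds a named intermediate result."""
    # Step 1: group rows by customer. The result is nested: customer -> bag of amounts.
    grouped = defaultdict(list)
    for customer, amount in rows:
        grouped[customer].append(amount)
    # Step 2: for each group, generate the customer and the sum of its bag.
    totals = [(customer, sum(bag)) for customer, bag in grouped.items()]
    # Step 3: filter the totals, then order them by customer.
    return sorted(t for t in totals if t[1] >= min_total)


def declarative_totals(rows, min_total):
    """Declarative SQL: describe the result and let the engine choose the steps."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (customer TEXT, amount INTEGER)")
    conn.executemany("INSERT INTO orders VALUES (?, ?)", rows)
    result = conn.execute(
        "SELECT customer, SUM(amount) FROM orders "
        "GROUP BY customer HAVING SUM(amount) >= ? ORDER BY customer",
        (min_total,),
    ).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    print(dataflow_totals(orders, 20))
    print(declarative_totals(orders, 20))
```

Both functions return the same rows. In the first one the order of the steps is fixed by the programmer, and the grouped result is nested (one bag of amounts per customer). In the second one the table needs a schema before any row can be loaded, and the engine is free to decide how to execute the query.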

Let’s see the differences between Pig and MapReduce.

Pig is a data flow language. MapReduce is a data processing paradigm.
Pig is a high-level language. MapReduce is low level.
In Pig, performing a join is simple. In MapReduce, performing a join between datasets is quite difficult.
Basic knowledge of SQL is enough to work conveniently with Apache Pig. Exposure to Java is a must to work with MapReduce.
Pig uses a multi-query approach, which reduces the length of the code. MapReduce programs need many more lines of code for the same task.
Pig needs no separate compilation step, because on every execution the Pig operators are converted internally into MapReduce jobs. MapReduce jobs have a long compilation process.
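
To see why a join takes so much more work in MapReduce, here is a minimal sketch of a reduce-side join written in plain Python. It only simulates the map, shuffle, and reduce phases in memory; it does not use the Hadoop API, and the sample data, the tags "U" and "O", and all function names are assumptions made for this example.

```python
from collections import defaultdict

# Hypothetical sample datasets sharing a user id as the join key.
users = [(1, "alice"), (2, "bob"), (3, "carol")]
orders = [(1, "book"), (3, "pen"), (1, "lamp")]


def map_phase(users, orders):
    """Emit (join_key, (source_tag, value)) pairs, as a mapper would."""
    for user_id, name in users:
        yield user_id, ("U", name)
    for user_id, item in orders:
        yield user_id, ("O", item)


def shuffle(pairs):
    """Group all values by key, imitating the shuffle between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """For each key, pair every user record with every order record."""
    joined = []
    for key in sorted(groups):
        names = [value for tag, value in groups[key] if tag == "U"]
        items = [value for tag, value in groups[key] if tag == "O"]
        for name in names:
            for item in items:
                joined.append((key, name, item))
    return joined


def reduce_side_join(users, orders):
    return reduce_phase(shuffle(map_phase(users, orders)))


if __name__ == "__main__":
    for row in reduce_side_join(users, orders):
        print(row)
```

All of the tagging, grouping, and pairing logic above has to be written by hand in a MapReduce program, usually in Java. In Pig Latin the same inner join is a single statement of the form `JOIN users BY id, orders BY user_id;`, which Pig converts into MapReduce jobs for us.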



That’s all about the comparison of Pig, Hive, SQL, and MapReduce. Depending on the project requirements, we can choose whichever of these tools fits best in a Hadoop big data setup.