Apache Pig Course Introduction

Apache Pig is an open source platform built on top of Hadoop for analyzing large data sets. In this Apache Pig tutorial, we will study how Pig helps to handle structured, semi-structured, and unstructured data, and why developers often choose Apache Pig for analyzing large data sets.

Prerequisites to learn Apache Pig

  • Knowledge of SQL is required
  • Basic familiarity with Hadoop and HDFS commands is highly recommended

Why Apache Pig?

Problem: Twitter is an online news and social networking service where users post and interact with messages. Twitter generates around 10 TB of data every day. How can that much data be analyzed?

[Figure: Apache Pig problem]


Apache Pig is a well-suited solution for Twitter to analyze huge data sets of all kinds on Hadoop.

[Figure: Pig as solution]