Learn Apache Sqoop

Apache Sqoop Course Introduction

Apache Sqoop is an open source tool designed mainly to import and export data between relational databases (RDBMS) and Hadoop. In this Apache Sqoop tutorial, we will study how to load large amounts of data from diverse sources into Hadoop clusters and why Sqoop matters in projects.

Prerequisites for Sqoop

  • Basics of computer technology and terminology
  • Familiarity with command-line interfaces like bash
  • Knowledge of relational database management systems (RDBMS)
  • Basic familiarity with the purpose and operation of Hadoop

Where can Sqoop help?

Problem: About 5,000 crime incidents were recorded in the city of Bangalore over the last 3 months. The data is stored in a relational database (RDBMS). How can it be imported into Hadoop?

Solution:

Apache Sqoop is an open source tool used to import RDBMS data into Hadoop for storage and processing.
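
As a rough sketch of what such an import might look like, the Python snippet below assembles a `sqoop import` command line for the crime data. The host, database, table, user name, and paths are hypothetical placeholders, not values from this course. The flags shown (`--connect`, `--username`, `--password-file`, `--table`, `--target-dir`, `--num-mappers`) are standard Sqoop 1 import options. The snippet only builds and prints the command; it does not run Sqoop.

```python
import shlex


def build_sqoop_import(jdbc_url, username, password_file, table,
                       target_dir, num_mappers=4):
    """Assemble the argument list for a `sqoop import` invocation."""
    return [
        "sqoop", "import",
        "--connect", jdbc_url,              # JDBC URL of the source RDBMS
        "--username", username,             # database user
        "--password-file", password_file,   # file holding the password
        "--table", table,                   # source table to import
        "--target-dir", target_dir,         # HDFS directory for the output
        "--num-mappers", str(num_mappers),  # number of parallel map tasks
    ]


# All values below are hypothetical and chosen only for illustration.
cmd = build_sqoop_import(
    jdbc_url="jdbc:mysql://dbhost:3306/crime_db",
    username="analyst",
    password_file="/user/analyst/.db_password",
    table="crime_incidents",
    target_dir="/data/bangalore/crime_incidents",
)

# Print a shell-safe version of the command for review before running it.
print(shlex.join(cmd))
```

On a machine with Sqoop and a matching JDBC driver installed, the printed command could be run from a shell, or the list could be passed to `subprocess.run(cmd, check=True)`.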

Contents

1. Sqoop Introduction and Architecture
   1.1 What is Apache Sqoop
   1.2 Why Sqoop is Required
   1.3 Features and Limitations of Sqoop
   1.4 Sqoop vs. Flume
   1.5 Sqoop 1 and Sqoop 2 Architecture and How They Work

2. Sqoop Commands and Connectors
   2.1 Sqoop Commands: Import and Export
   2.2 Sqoop Export
   2.3 Sqoop List Databases and Tables
   2.4 Sqoop Connectors

3. Sqoop Job

4. Sqoop Eval

5. Sqoop Codegen

6. Sqoop Data Compression Techniques