Apache Hadoop Introduction
Learn Apache Hadoop from the basics
This introduction to Apache Hadoop is important because it lays the foundation for learning Apache Hadoop.
Many engineers have questions about big data and Hadoop, and they often search for “difference between Big Data and Apache Hadoop”. Don’t get confused: Big Data is a generic term for very large amounts of data, while Apache Hadoop is a tool that can store and process such data.
What is Apache Hadoop
Apache Hadoop is an open-source tool for processing large amounts of data efficiently. If you have data in the gigabyte or terabyte range, for example a very large log file, and you want to extract information from it, Apache Hadoop makes that possible.
Real-life Big Data example using Hadoop: Suppose you have a 50 GB log file from your application and you need to know, “How many times did the error occur in one day?” Using Hadoop, you can write a program that counts the errors that occurred in one day.
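To make the log example concrete, here is a minimal Python sketch that counts error lines per day in two steps, loosely mirroring the map and reduce steps that Hadoop uses for processing. It runs locally on a small in-memory list rather than on a Hadoop cluster. The log format, the sample lines, and the function names (map_line, reduce_counts, count_errors_per_day) are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical log lines. The format "YYYY-MM-DD HH:MM:SS LEVEL message"
# is an assumption; a real application log may look different.
SAMPLE_LOG = [
    "2024-01-15 10:02:11 INFO Service started",
    "2024-01-15 10:05:42 ERROR Database connection failed",
    "2024-01-15 18:30:00 ERROR Request timed out",
    "2024-01-16 09:12:05 WARN Disk usage at 85%",
    "2024-01-16 11:47:33 ERROR Null value in payload",
]


def map_line(line):
    """Map step: emit (date, 1) for each ERROR line, nothing otherwise."""
    parts = line.split(" ", 3)
    if len(parts) >= 3 and parts[2] == "ERROR":
        yield parts[0], 1


def reduce_counts(pairs):
    """Reduce step: sum the counts emitted for each date."""
    totals = defaultdict(int)
    for date, count in pairs:
        totals[date] += count
    return dict(totals)


def count_errors_per_day(lines):
    """Run the map step over every line, then reduce the emitted pairs."""
    pairs = (pair for line in lines for pair in map_line(line))
    return reduce_counts(pairs)


if __name__ == "__main__":
    for date, total in sorted(count_errors_per_day(SAMPLE_LOG).items()):
        print(date, total)
```

On a real cluster the same two steps would run in parallel on many machines, each working on a different part of the 50 GB file. This sketch only shows the logic of the counting.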
Key Points for Hadoop
The key points below give an overview of Apache Hadoop:
- Apache Hadoop is an open source technology
- Applications for Apache Hadoop can be written in Java and in many other languages.
- Apache Hadoop is a Java-based programming framework that supports the processing of large data sets in a distributed computing environment. That environment will be discussed in detail later.
- Apache Hadoop's storage layer is based on the design of GFS (Google File System).
Basic Architecture of Apache Hadoop:
Here is just an overview of the architecture as part of this Apache Hadoop introduction. Hadoop mainly consists of two components: one for data storage, called HDFS, and another for data processing, called MapReduce.
HDFS: The Hadoop Distributed File System (HDFS) is a distributed file system used for storing the data. It enables fast data transfer among the nodes.
MapReduce: MapReduce is a distributed computation framework. Using MapReduce, Apache Hadoop can process the data stored in HDFS.
Hope you now understand the “Apache Hadoop Introduction”. Do not worry about the details of the Apache Hadoop architecture,