Apache HDFS Architecture
This topic covers the Apache HDFS architecture and gives an overall picture of how its parts fit together. By the end of it you will know the components of HDFS and what each one does. The forthcoming “HDFS Features” topic describes each of the HDFS components in more detail.
As you can see in the picture below, the components of the Apache HDFS storage architecture are categorized as “Master” and “Slave”.
- A single Master, the NameNode, holds the metadata. Metadata is data that describes other data: here, the storage location of each data block and other information about the stored files.
- There are multiple Slave DataNodes, arranged in different racks in the cluster. The data blocks (the actual data) stored on one DataNode are copied to a couple of other DataNodes, including nodes in a different rack. This duplication process is called replication.
- A data block, once written, cannot be edited in place. For any task, the client contacts the NameNode first, because the NameNode acts as the master, or centerpiece, of the cluster. The block data itself is then read from or written to the DataNodes that the NameNode points to.
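The split between metadata and block storage described in the list above can be sketched as a toy model. This is an illustration only, not Hadoop code: the class names, the `choose_targets` and `allocate_block` methods, the `client_write` helper, the 4-byte block size, and the sample path are all assumptions made for the example, and the placement rule is a simplification of rack-aware replication.

```python
from dataclasses import dataclass, field


@dataclass
class DataNode:
    """Hypothetical stand-in for a DataNode: stores block contents."""
    name: str
    rack: str
    blocks: dict = field(default_factory=dict)  # block id -> bytes


class NameNode:
    """Hypothetical stand-in for the NameNode: stores metadata only."""

    def __init__(self, datanodes, replication=3):
        self.datanodes = list(datanodes)
        self.replication = replication
        self.file_blocks = {}      # path -> list of block ids
        self.block_locations = {}  # block id -> list of DataNode names

    def choose_targets(self):
        # Simplified placement (an assumption): start with the first node,
        # then prefer nodes on other racks so replicas span more than one rack.
        first = self.datanodes[0]
        off_rack = [d for d in self.datanodes if d.rack != first.rack]
        same_rack = [d for d in self.datanodes[1:] if d.rack == first.rack]
        targets = [first]
        for candidate in off_rack + same_rack:
            if len(targets) >= self.replication:
                break
            targets.append(candidate)
        return targets

    def allocate_block(self, path):
        # Record where the next block of the file will live; no data passes through here.
        blocks = self.file_blocks.setdefault(path, [])
        block_id = f"{path}#blk{len(blocks)}"
        targets = self.choose_targets()
        blocks.append(block_id)
        self.block_locations[block_id] = [node.name for node in targets]
        return block_id, targets


def client_write(namenode, path, data, block_size=4):
    # The client asks the NameNode where each block goes, then stores the
    # bytes on the DataNodes. The tiny block size is for demonstration only.
    for offset in range(0, len(data), block_size):
        block_id, targets = namenode.allocate_block(path)
        for node in targets:
            node.blocks[block_id] = data[offset:offset + block_size]


nodes = [
    DataNode("dn1", "rack1"), DataNode("dn2", "rack1"),
    DataNode("dn3", "rack2"), DataNode("dn4", "rack2"),
]
nn = NameNode(nodes)
client_write(nn, "/logs/a.txt", b"hello hdfs")
print(nn.file_blocks)
print(nn.block_locations)
```

In this sketch the `NameNode` object never holds file contents; it only records which blocks belong to a file and which DataNodes hold each replica, which mirrors the master and slave roles described above.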
The upcoming topics will give us a clearer picture of the HDFS architecture.
For example, consider a website whose home page shows a list of names: “Raja, David, Siva”. When you click on one of those names, it takes you to a sub page where you can access all the data (age, DOB, address, parents, education, etc.) about that person. In this example, the home page of the website is the Master NameNode, the names (Raja, David, Siva, etc.) are the metadata, each sub page is a Slave DataNode, and the data (age, DOB, address, parents, education, etc.) are the data blocks stored in the DataNodes. The example is framed only as an aid to understanding; relate it loosely to the HDFS architecture rather than treating it as an exact match.
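The analogy can also be written down as a small lookup, which may make the read path easier to follow. This is a sketch under stated assumptions, not Hadoop code: the dictionaries, the `read_file` function, the `down` parameter, the path, the block ids, and the sample values are all made up for illustration.

```python
# "Home page": the NameNode's metadata. For each file it lists the blocks
# and the DataNodes that hold a replica of each block (hypothetical values).
namenode_metadata = {
    "/people/raja.txt": [
        ("blk_1", ["dn1", "dn3"]),
        ("blk_2", ["dn2", "dn3"]),
    ],
}

# "Sub pages": the DataNodes, which hold the actual block contents.
datanodes = {
    "dn1": {"blk_1": b"Age: 30; "},
    "dn2": {"blk_2": b"Education: BSc"},
    "dn3": {"blk_1": b"Age: 30; ", "blk_2": b"Education: BSc"},
}


def read_file(path, metadata, nodes, down=frozenset()):
    """Look up block locations first, then fetch each block from a DataNode.

    `down` names DataNodes treated as unavailable, to show why replicas help.
    """
    data = b""
    for block_id, locations in metadata[path]:
        for node_name in locations:
            if node_name not in down and block_id in nodes.get(node_name, {}):
                data += nodes[node_name][block_id]
                break
        else:
            # No replica of this block could be reached.
            raise OSError(f"no live replica for {block_id}")
    return data


print(read_file("/people/raja.txt", namenode_metadata, datanodes))
print(read_file("/people/raja.txt", namenode_metadata, datanodes, down={"dn1"}))
```

Both calls return the same bytes: when `dn1` is treated as unavailable, the block is served from its replica on `dn3`, which is the practical benefit of replication.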
It is very important to understand the Apache HDFS architecture, as it is the base for what follows; read this topic again if you have any doubts. It is worth the effort. The next section covers the HDFS features.