Components of Big Data - Hadoop System
In this blog I will explain the important components that make up a Hadoop system, with a very brief overview of each. The diagram below shows the components at a very high level.

Master Node (MN)

Name Node (NN)

The Name Node is a daemon process that runs on the Master Node. It takes care of reading the data file to be analyzed and splits it according to the configured block size; the default is 64 MB in Hadoop 1.x and 128 MB in Hadoop 2.x and later. It distributes the resulting blocks across multiple Data Nodes and maintains an index that keeps track of where each block has been placed. Think of this index as the "table of contents" in a book. The Name Node gives the Job Tracker the location of the data files on the Data Nodes. It is one part of HDFS in Hadoop.

Job Tracker (JT)

The Job Tracker is also a daemon process. It is part of the processing engine of a Hadoop system and is responsible for running the program that analyzes the data and produces results. The Job Tracker communicates with the NN to identify the location of the data file. ...
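To make the Name Node's bookkeeping more concrete, here is a minimal Python sketch of the steps described earlier: split a file by block size, spread the blocks over Data Nodes, and keep an index that can be queried later. This is a toy model, not the real HDFS API. The class `ToyNameNode`, the node names `dn1` to `dn3`, the file name `sales.csv`, and the round-robin placement are all assumptions made for illustration, and the sketch ignores block replication, which real HDFS performs.

```python
from dataclasses import dataclass, field

BLOCK_SIZE = 64 * 1024 * 1024  # 64 MB, the Hadoop 1.x default block size


@dataclass
class ToyNameNode:
    """Toy model of the Name Node's index (hypothetical, not the HDFS API)."""
    data_nodes: list
    block_size: int = BLOCK_SIZE
    # The "table of contents": file name -> list of (block number, data node)
    index: dict = field(default_factory=dict)

    def add_file(self, name, size_bytes):
        # Ceiling division: the last block may be only partly full.
        num_blocks = -(-size_bytes // self.block_size)
        placements = []
        for block_no in range(num_blocks):
            # Round-robin placement is an assumption for this sketch.
            node = self.data_nodes[block_no % len(self.data_nodes)]
            placements.append((block_no, node))
        self.index[name] = placements
        return placements

    def locate(self, name):
        # The kind of lookup the Job Tracker needs: where does each block live?
        return self.index[name]


name_node = ToyNameNode(data_nodes=["dn1", "dn2", "dn3"])
blocks = name_node.add_file("sales.csv", 200 * 1024 * 1024)  # a 200 MB file
print(blocks)  # [(0, 'dn1'), (1, 'dn2'), (2, 'dn3'), (3, 'dn1')]
```

A 200 MB file with a 64 MB block size needs four blocks, the last one only partly full. Calling `name_node.locate("sales.csv")` returns the same list, which is the information the Job Tracker asks the Name Node for before it schedules work.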