A Hadoop cluster is a special type of computational cluster designed specifically for storing and analyzing huge amounts of unstructured data in a distributed computing environment.
These clusters run Hadoop's distributed processing software on low-cost commodity computers. Typically, one machine in the cluster is designated as the NameNode and another as the JobTracker; these are known as the masters. The rest of the machines in the cluster act as both DataNode and TaskTracker; these are called slaves. Hadoop clusters are referred to as shared-nothing systems because the only thing shared between nodes is the network that connects them.
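As a sketch of how workers find the masters described above, a classic Hadoop 1.x deployment points every node at the NameNode and JobTracker through two configuration files. The hostnames and ports below (namenode.example.com:9000, jobtracker.example.com:9001) are placeholders, not values from the original text:

```xml
<!-- core-site.xml: tells every node where the NameNode (HDFS master) lives -->
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://namenode.example.com:9000</value>
  </property>
</configuration>

<!-- mapred-site.xml: tells every node where the JobTracker (MapReduce master) lives -->
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>jobtracker.example.com:9001</value>
  </property>
</configuration>
```

Every slave node carries copies of these files, which is why a single machine can serve as both a DataNode (reporting to the NameNode) and a TaskTracker (reporting to the JobTracker).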
Hadoop clusters boost the speed of data analysis applications. They are also highly scalable: as the volume of data grows, additional nodes can be added to the cluster. Hadoop clusters are also highly resistant to failure because each piece of data is copied onto other cluster nodes, which ensures that the data is not lost if one node fails.
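The fault tolerance described above is driven by HDFS's replication factor, set with the `dfs.replication` property in hdfs-site.xml. The value 3 shown here is HDFS's usual default; this is an illustrative sketch, not configuration from the original text:

```xml
<!-- hdfs-site.xml: each HDFS block is stored on this many DataNodes -->
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
</configuration>
```

With a replication factor of 3, losing one DataNode still leaves two copies of every block, and the NameNode schedules re-replication to restore the full count.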
Facebook has the largest Hadoop cluster in the world.