A distributed file system is a file system in which each file is divided into blocks that are distributed across a network of machines (data nodes). It consists of data nodes and name nodes.
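
As a rough illustration of the block idea, the sketch below cuts a byte string into fixed-size blocks and spreads them over data nodes. The function names, the node names (`dn1`, `dn2`), the tiny block size, and the round-robin policy are all assumptions made for the example; real systems choose placement differently.

```python
def split_into_blocks(data: bytes, block_size: int) -> list[bytes]:
    """Cut data into consecutive blocks of at most block_size bytes."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def distribute(blocks: list[bytes], data_nodes: list[str]) -> dict[str, list[int]]:
    """Assign block indices to data nodes in round-robin order (assumed policy)."""
    placement: dict[str, list[int]] = {node: [] for node in data_nodes}
    for index in range(len(blocks)):
        node = data_nodes[index % len(data_nodes)]
        placement[node].append(index)
    return placement


# Hypothetical file contents and node names, for illustration only.
blocks = split_into_blocks(b"abcdefghij", block_size=4)
placement = distribute(blocks, ["dn1", "dn2"])
```

Concatenating the blocks in index order gives back the original bytes, which is why the system needs to remember the order and location of every block.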

A data node stores the data.

A [name node](2501181855#Name Node) acts as a master node and is responsible for:

  • metadata storing and lookup
  • data replication
  • deciding where a new file gets stored
  • instructing the client and data nodes
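
The responsibilities listed above can be sketched as a toy name node that keeps only metadata and never touches file contents. Everything here is hypothetical: the class and method names, the replication factor of 2, and the round-robin placement are assumptions chosen to keep the example small, not how any particular system works.

```python
from dataclasses import dataclass, field


@dataclass
class NameNode:
    """Toy master node: holds metadata only, never file contents."""
    data_nodes: list[str]
    replication: int = 2  # assumed replication factor
    # file name -> one entry per block -> data nodes holding a replica
    metadata: dict[str, list[list[str]]] = field(default_factory=dict)
    _next: int = 0  # round-robin cursor (assumed placement policy)

    def allocate(self, name: str, num_blocks: int) -> list[list[str]]:
        """Decide where each block of a new file, and its replicas, is stored."""
        if self.replication > len(self.data_nodes):
            raise ValueError("not enough data nodes for the replication factor")
        layout = []
        for _ in range(num_blocks):
            # Pick `replication` distinct, consecutive nodes for this block.
            replicas = [
                self.data_nodes[(self._next + r) % len(self.data_nodes)]
                for r in range(self.replication)
            ]
            self._next = (self._next + 1) % len(self.data_nodes)
            layout.append(replicas)
        self.metadata[name] = layout
        return layout

    def lookup(self, name: str) -> list[list[str]]:
        """Metadata lookup: tell the client which nodes hold each block."""
        return self.metadata[name]


# Hypothetical cluster of three data nodes and a three-block file.
nn = NameNode(data_nodes=["dn1", "dn2", "dn3"])
layout = nn.allocate("report.txt", num_blocks=3)
```

In this sketch the client would call `allocate` before writing and `lookup` before reading, then talk to the listed data nodes directly for the block contents.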