After you have installed the new Hadoop version:

  1. Optionally, update the conf/slaves file before starting to reflect the current set of active nodes.
  2. Optionally, change the port numbers of the NameNode and the JobTracker (i.e. the fs.default.name property in conf/core-site.xml and conf/hdfs-site.xml, and the mapred.job.tracker property in conf/mapred-site.xml, respectively) so that the cluster ignores unreachable nodes that are still running the old version (with the old port numbers). This prevents them from connecting and disrupting system operation.
  3. Start the actual HDFS upgrade process.
    • Upgrade the NameNode by converting the checkpoint to the new version format:
    • $ hadoop-daemon.sh start namenode -upgrade

Note: You need to add the -upgrade switch only once, for the actual upgrade process. Once it has successfully completed, you can start the NameNode via hadoop-daemon.sh, start-dfs.sh, or start-all.sh like you’d normally do. The un-finalized upgrade will remain in effect until you either finalize it to make it permanent or roll it back (see below).

The NameNode log will show messages like the following:

NameNode log file

2011-06-21 14:40:32,579 INFO org.apache.hadoop.hdfs.server.common.Storage: Upgrading image directory /path/to/nn_namespace_dir.

old LV = -18; old CTime = 0.

new LV = -31; new CTime = 1308660032579

2011-06-21 14:40:32,581 INFO org.apache.hadoop.hdfs.server.common.Storage: Image file of size 8447 saved in 0 seconds.

2011-06-21 14:40:32,639 INFO org.apache.hadoop.hdfs.server.common.Storage: Upgrade of /path/to/nn_namespace_dir is complete.

2011-06-21 14:40:32,639 INFO org.apache.hadoop.hdfs.server.common.Storage: Upgrading image directory /path/to/nn_namespace_dir_bk.

old LV = -18; old CTime = 0.

new LV = -31; new CTime = 1308660032579

2011-06-21 14:40:32,644 INFO org.apache.hadoop.hdfs.server.common.Storage: Image file of size 8447 saved in 0 seconds.

2011-06-21 14:40:32,650 INFO org.apache.hadoop.hdfs.server.common.Storage: Upgrade of /path/to/nn_namespace_dir_bk is complete.

2011-06-21 14:40:32,651 INFO org.apache.hadoop.hdfs.server.namenode.NameCache: initialized with 0 entries 0 lookups

2011-06-21 14:40:32,652 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Finished loading FSImage in 441 msecs

2011-06-21 14:40:32,660 INFO org.apache.hadoop.hdfs.StateChange: STATE* Safe mode ON.
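If you have several namespace directories, you may prefer to extract the completion messages instead of reading the log by eye. The following is a minimal sketch that assumes the log format shown above; the function name upgraded_dirs is hypothetical, and you need to pass the path of your own NameNode log file.

```shell
# Print every storage directory that the given log file reports as
# successfully upgraded, one per line. Assumes the message format
# "Storage: Upgrade of <dir> is complete." seen in the log excerpt.
upgraded_dirs() {
  sed -n 's/.*Storage: Upgrade of \(.*\) is complete\.$/\1/p' "$1"
}
```

Run against the excerpt above, it should list both /path/to/nn_namespace_dir and /path/to/nn_namespace_dir_bk. A directory that is missing from the output is a hint to look at the log more closely.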

    • The NameNode upgrade process can take a while depending on how many files you have. You can follow the process by inspecting the NameNode logs, by running “hadoop dfsadmin -upgradeProgress status”, and/or by accessing the NameNode Web UI. Once the upgrade process has completed, the NameNode Web UI will show a message similar to “Upgrade for version -31 has been completed. Upgrade is not finalized.” You will finalize the upgrade in a later step. Right now, the NameNode is in Safe Mode, waiting for the DataNodes to connect.

The NameNode is in Safe Mode waiting for the DataNodes to connect.

An example status output of the upgradeProgress command at this stage:

$ hadoop dfsadmin -upgradeProgress status

Upgrade for version -31 has been completed.

Upgrade is not finalized.
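If you do not want to re-run the status command by hand, you can poll it from a small script. This is only a sketch: the function names, the polling interval and the attempt limit are hypothetical, and the check simply looks for the completion message shown in the example output above.

```shell
# Return 0 if the status text on stdin contains the completion message
# shown in the example output above.
upgrade_completed() {
  grep -q 'Upgrade for version .* has been completed'
}

# Poll the NameNode until the upgrade is reported as completed.
# $1 = seconds between polls, $2 = maximum number of polls
# (both defaults are arbitrary, hypothetical values).
wait_for_upgrade() (
  interval="${1:-30}"
  attempts="${2:-120}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if hadoop dfsadmin -upgradeProgress status 2>/dev/null | upgrade_completed; then
      exit 0
    fi
    i=$((i + 1))
    sleep "$interval"
  done
  exit 1
)
```

For example, "wait_for_upgrade 60 30" would check once a minute for up to half an hour and return a non-zero exit status if the completion message never shows up.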

    • Optionally, save a complete listing of the new HDFS namespace to a local file:
    • $ hadoop dfs -lsr / > dfs-v-new-lsr-0.log

and compare it with dfs-v-old-lsr-1.log you created previously.

    • Start the HDFS cluster. Since the NameNode is already running, only the DataNodes and the SecondaryNameNode will actually be started.
    • $ start-dfs.sh

Note: You do not need to add the -upgrade switch here because it is only passed to the NameNode anyway, and the NameNode has already been instructed to perform an upgrade.

After your DataNodes have completed the upgrade process, you should see the message “Safe mode will be turned off automatically in X seconds.” in the NameNode Web UI. The NameNode should then automatically exit Safe Mode, and HDFS will be in full operation. You can monitor the process via the NameNode Web UI and the NameNode/DataNode logs.

DataNode log file

2011-06-21 14:50:56,103 INFO org.apache.hadoop.hdfs.server.common.Storage: Upgrading storage directory /app/hadoop/tmp/dfs/data.

old LV = -18; old CTime = 0.

new LV = -31; new CTime = 1308660032579

2011-06-21 14:50:56,196 INFO org.apache.hadoop.hdfs.server.common.Storage: HardLinkStats: 1 Directories, including 0 Empty Directories, 0 single Link operations, 1 multi-Link operations, linking 80 files, total 80 linkable files. Also physically copied 1 other files.

2011-06-21 14:50:56,196 INFO org.apache.hadoop.hdfs.server.common.Storage: Upgrade of /app/hadoop/tmp/dfs/data is complete.

You can check in the NameNode Web UI whether the NameNode has already exited Safe Mode (see screenshot below). Alternatively, you can run hadoop dfsadmin -safemode get.

The NameNode has exited Safe Mode, and DataNodes have started to connect to it.
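If you want to use the Safe Mode check in a script, you can wrap the output of the command in a small helper. This is a minimal sketch; the function name safemode_state is hypothetical, and it assumes the command prints “Safe mode is ON” or “Safe mode is OFF”.

```shell
# Read the output of "hadoop dfsadmin -safemode get" on stdin and print
# "on", "off", or "unknown". The function body runs in a subshell so the
# helper variable does not leak into the calling shell.
safemode_state() (
  out="$(cat)"
  case "$out" in
    *"Safe mode is OFF"*) echo off ;;
    *"Safe mode is ON"*)  echo on ;;
    *)                    echo unknown ;;
  esac
)
```

You would call it as “hadoop dfsadmin -safemode get | safemode_state”. If all you need is to block until Safe Mode is off, dfsadmin also accepts “-safemode wait”, which may be simpler than polling yourself.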

Note that the status output of the upgradeProgress command should not have changed at this point:

$ hadoop dfsadmin -upgradeProgress status

Upgrade for version -31 has been completed.

Upgrade is not finalized.

  4. Perform some sanity checks on the new HDFS
    • Create a list of DataNodes participating in the updated cluster.
    • $ hadoop dfsadmin -report > dfs-v-new-report-1.log

and compare it with dfs-v-old-report-1.log to ensure all DataNodes previously belonging to the cluster are up and running.
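The full reports will differ in details such as used capacity, so a plain diff can be noisy. The sketch below compares only the DataNode addresses. It assumes that each DataNode appears in the report on a line of the form “Name: host:port”; please verify this against your own report files. The function names are hypothetical.

```shell
# Print the sorted, de-duplicated DataNode addresses found in a saved
# "hadoop dfsadmin -report" file. Assumes lines of the form
# "Name: host:port" (an assumption about the report format).
datanode_names() {
  sed -n 's/^Name: *//p' "$1" | sort -u
}

# Print the DataNodes that are listed in the old report ($1) but are
# missing from the new report ($2).
missing_datanodes() (
  old_list="$(mktemp)"
  new_list="$(mktemp)"
  datanode_names "$1" > "$old_list"
  datanode_names "$2" > "$new_list"
  # comm -23 keeps only the lines that are unique to the first file.
  comm -23 "$old_list" "$new_list"
  rm -f "$old_list" "$new_list"
)
```

Running “missing_datanodes dfs-v-old-report-1.log dfs-v-new-report-1.log” should print nothing if all previous DataNodes have rejoined the cluster.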

    • Save a complete listing of the new HDFS namespace to a local file:
    • $ hadoop dfs -lsr / > dfs-v-new-lsr-1.log

and compare it with dfs-v-old-lsr-1.log. These files should be identical unless the format of hadoop fs -lsr reporting or the data structures have changed in the new version.
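One simple way to do the comparison is with diff. The helper below is a sketch, and its name compare_listings is hypothetical; it prints a unified diff when the two files differ so that you can see which entries changed.

```shell
# Compare two saved "hadoop dfs -lsr" listings.
# Prints a unified diff and returns 1 if they differ; returns 0 and
# prints a short confirmation if they are identical.
compare_listings() {
  if diff -u "$1" "$2"; then
    echo "OK: $1 and $2 are identical"
  else
    echo "WARNING: $1 and $2 differ" >&2
    return 1
  fi
}
```

For this step you would run “compare_listings dfs-v-old-lsr-1.log dfs-v-new-lsr-1.log”.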

    • Perform a filesystem check:
    • $ hadoop fsck / -files -blocks -locations > dfs-v-new-fsck-1.log

and compare with dfs-v-old-fsck-1.log. These files should be identical, unless the hadoop fsck reporting format has changed in the new version.
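Before comparing the two files in detail, a quick first check is whether the new report declares the filesystem healthy at all. This sketch assumes that the fsck report contains a summary line of the form “The filesystem under path '/' is HEALTHY”; the function name fsck_is_healthy is hypothetical.

```shell
# Return 0 if the saved fsck report ($1) contains the HEALTHY summary
# line, non-zero otherwise (for example if the report says CORRUPT or
# the summary line is missing).
fsck_is_healthy() {
  grep -q "^The filesystem under path '.*' is HEALTHY" "$1"
}
```

For example, “fsck_is_healthy dfs-v-new-fsck-1.log || echo "check the fsck report"” prints a reminder whenever the summary line is not found.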

  5. Start the MapReduce cluster
    • $ start-mapred.sh
  6. Let internal customers perform their own testing on the new HDFS filesystem version.
  7. Roll back or finalize the upgrade (optional).
