Replacing Datanode Disks
Overview
Ozone Datanodes can manage multiple data volumes (disks) simultaneously. This document describes how volume failures are detected, handled, and how to replace failed disks in an Ozone Datanode.
Volume Configuration
Data Volumes
Data volumes are configured using the hdds.datanode.dir property. Each directory represents a volume (disk) that the Datanode will manage.
Example:
<property>
<name>hdds.datanode.dir</name>
<value>/data1,/data2,/data3</value>
</property>
This configuration tells the Datanode to manage three separate volumes: /data1, /data2, and /data3.
Metadata Volumes
Metadata volumes store Ratis metadata (logs) and are configured using hdds.container.ratis.datanode.storage.dir. If not explicitly configured, metadata is stored in the default metadata directories.
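For example, Ratis metadata could be placed on a dedicated disk such as an SSD; the /ratis path below is only illustrative:
<property>
<name>hdds.container.ratis.datanode.storage.dir</name>
<value>/ratis</value>
</property>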
Database Volumes
Database volumes store per-volume RocksDB instances and are configured using hdds.datanode.container.db.dir. This is optional - if not specified, RocksDB instances are stored on the same disk as HDDS data.
By default, metadata and database volumes share the same physical disks as data volumes. They can be configured on separate disks (e.g., SSDs) for better performance, but this is optional.
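For example, the per-volume RocksDB instances could be directed to a dedicated SSD; the /db path below is only illustrative:
<property>
<name>hdds.datanode.container.db.dir</name>
<value>/db</value>
</property>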
Volume Failure Tolerance
Ozone Datanodes can tolerate a configurable number of volume failures before stopping service. The following configuration properties control this behavior:
Data Volume Failure Tolerance
Property: hdds.datanode.failed.data.volumes.tolerated
Default: -1 (unlimited)
Description: The number of data volumes that are allowed to fail before a Datanode stops offering service. When set to -1, unlimited failures are tolerated, but at least one good volume must remain for the Datanode to continue operating.
Example:
<property>
<name>hdds.datanode.failed.data.volumes.tolerated</name>
<value>2</value>
</property>
This allows up to 2 data volumes to fail before the Datanode stops service.
Metadata Volume Failure Tolerance
Property: hdds.datanode.failed.metadata.volumes.tolerated
Default: -1 (unlimited)
Description: The number of metadata volumes that are allowed to fail before a Datanode stops offering service. Similar to data volumes, -1 means unlimited failures are tolerated, but at least one good metadata volume must remain.
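Example:
<property>
<name>hdds.datanode.failed.metadata.volumes.tolerated</name>
<value>1</value>
</property>
This allows one metadata volume to fail before the Datanode stops service.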
Database Volume Failure Tolerance
Property: hdds.datanode.failed.db.volumes.tolerated
Default: -1 (unlimited)
Description: The number of database volumes that are allowed to fail before a Datanode stops offering service. When set to -1, unlimited failures are tolerated, but at least one good database volume must remain.
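Example:
<property>
<name>hdds.datanode.failed.db.volumes.tolerated</name>
<value>1</value>
</property>
This allows one database volume to fail before the Datanode stops service.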
Volume Failure Detection
Startup Checks
During Datanode startup, the system performs comprehensive checks on all configured volumes to determine if any have failed. The initialization process:
- Parses all volume locations from configuration
- Attempts to create and access each volume
- Performs directory existence and permission checks
- Moves failed volumes to a separate failed volume map
If a volume fails during initialization:
- The volume is immediately marked as failed
- It is moved from the active volume map to the failed volume map
- The Datanode continues startup if enough volumes remain (based on tolerance settings)
- If all volumes fail or tolerance limits are exceeded, the Datanode will abort startup
Periodic Disk Checks
After startup, the Datanode runs periodic disk health checks to detect volumes that fail during runtime.
Property: hdds.datanode.periodic.disk.check.interval.minutes
Default: 60 (minutes)
Description: The interval at which periodic disk checks are performed. These checks verify:
- Directory existence and accessibility
- Directory permissions
- Disk I/O functionality (read/write tests)
- For data volumes with RocksDB: database accessibility
Example:
<property>
<name>hdds.datanode.periodic.disk.check.interval.minutes</name>
<value>30</value>
</property>
This configures disk checks to run every 30 minutes instead of the default 60 minutes.
Disk Check Configuration
Additional properties control disk health check behavior:
| Property | Default | Description |
|---|---|---|
| hdds.datanode.disk.check.io.test.count | 3 | Number of I/O tests required to determine failure. Set to 0 to disable I/O checks. |
| hdds.datanode.disk.check.io.failures.tolerated | 1 | I/O test failures allowed before marking a volume as failed |
| hdds.datanode.disk.check.io.file.size | 100B | Size of the temporary file used for I/O health checks |
| hdds.datanode.disk.check.min.gap | 10m | Minimum time gap between successive checks of the same volume |
| hdds.datanode.disk.check.timeout | 10m | Maximum time allowed for a disk check to complete |
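For example, the I/O check could be made stricter by running more tests per volume; the values below are illustrative, not recommendations:
<property>
<name>hdds.datanode.disk.check.io.test.count</name>
<value>5</value>
</property>
<property>
<name>hdds.datanode.disk.check.io.failures.tolerated</name>
<value>2</value>
</property>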
Failed Volume Handling
When a volume fails, it is:
- Immediately removed from the active volume map and added to the failed volume map
- Excluded from container allocation (only healthy volumes are considered by volume choosing policies)
- Tracked in metrics (healthy count decremented, failed count incremented)
If failed volumes exceed the tolerance limit, the Datanode triggers fatal failure handling and may stop offering service.
Replacing Failed Disks
Important Limitation: No Hotswap Support
Ozone Datanodes do not currently support hotswap of disks. To update the disk list (add or remove volumes), you must restart the Datanode process.
Steps to Replace a Failed Disk
1. Identify Failed Volumes
- Check Datanode logs for volume failure messages
- Monitor the Datanode Web UI or metrics to see volume states
- Failed volumes will show as "FAILED" in the volume state
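For example, volume failure messages can often be found by searching the Datanode log; the log path below is a placeholder for wherever your deployment writes Datanode logs, and the exact message wording varies by version:
# Search the Datanode log for volume failure messages (log path is a placeholder)
grep -iE "volume.*fail" /var/log/ozone/ozone-datanode.log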
2. Shut Down the Datanode
# Stop the Datanode service
ozone --daemon stop datanode
3. Update Configuration
- Edit the Datanode configuration file (ozone-site.xml or equivalent)
- Remove the failed volume directory from hdds.datanode.dir
- If replacing with a new disk, add the new volume directory
Example - Removing /data3 and adding /data4:
<!-- Before -->
<property>
<name>hdds.datanode.dir</name>
<value>/data1,/data2,/data3</value>
</property>
<!-- After -->
<property>
<name>hdds.datanode.dir</name>
<value>/data1,/data2,/data4</value>
</property>
4. Restart the Datanode
# Start the Datanode service
ozone --daemon start datanode
5. Verify Volume Status
- Check Datanode logs to confirm volumes are initialized correctly
- Verify in Web UI that the new volume appears as healthy
- Monitor metrics to ensure the failed volume is no longer present
Physical Disk Replacement
When physically replacing a disk:
- Verify containers have been replicated (handled automatically by Ozone)
- Shut down the Datanode cleanly
- Physically replace the disk
- Format and mount the new disk to the desired path (see the example commands after this list)
- Update the hdds.datanode.dir configuration
- Restart the Datanode
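As a sketch of the format-and-mount step above, assuming the replacement disk appears as /dev/sdX (a placeholder) and ext4 is the desired filesystem; adapt the device, filesystem, ownership, and mount options to your environment:
# Create a filesystem on the replacement disk (device name is a placeholder)
mkfs.ext4 /dev/sdX
# Mount it at the path that will be listed in hdds.datanode.dir
mkdir -p /data4
mount /dev/sdX /data4
# Add an /etc/fstab entry if the mount should persist across reboots
# Give ownership to the user running the Datanode (user/group are placeholders)
chown ozone:ozone /data4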
Monitoring Volume Health
Volume status can be viewed via:
- Datanode Web UI: http://<datanode-host>:<datanode-http-port> - Shows volume directory, type, capacity, usage, and state (HEALTHY/FAILED)
- JMX Metrics: http://<datanode-host>:<datanode-http-port>/jmx?qry=Hadoop:service=HddsDatanode,name=VolumeInfoMetrics* - Exposes per-volume metrics including state
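For example, the JMX endpoint can be queried directly; the host and port below are placeholders for your Datanode's HTTP address:
# Query per-volume metrics from the Datanode JMX endpoint (host/port are placeholders)
curl "http://datanode1.example.com:9882/jmx?qry=Hadoop:service=HddsDatanode,name=VolumeInfoMetrics*"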
Best Practices
- Configure tolerance settings based on redundancy requirements (allow 1-2 failures for production)
- Regularly monitor volume health via Web UI and metrics
- Plan maintenance windows for disk replacement (hotswap not supported)
- Verify container replication before removing failed disks
- Adjust periodic disk check interval based on environment needs
- Configure tolerance separately for data, metadata, and database volumes