Configuration Key Appendix
This page provides a comprehensive overview of the configuration keys available in Ozone.
| Name | Default Value | Tags | Description |
|---|---|---|---|
fs.trash.classname | org.apache.hadoop.fs.ozone.OzoneTrashPolicy | OZONE, OZONEFS, CLIENT | Trash Policy to be used. |
hadoop.hdds.db.rocksdb.WAL_size_limit_MB | 0MB | OM, SCM, DATANODE | The total size limit of WAL log files. Once the total log file size exceeds this limit, the earliest files will be deleted. Default 0 means no limit. |
hadoop.hdds.db.rocksdb.WAL_ttl_seconds | 1200 | OM, SCM, DATANODE | The lifetime of WAL log files. Default 1200 seconds. |
hadoop.hdds.db.rocksdb.keep.log.file.num | 10 | OM, SCM, DATANODE | Maximum number of RocksDB application log files. |
hadoop.hdds.db.rocksdb.logging.enabled | false | OM, SCM, DATANODE | Enable/Disable RocksDB logging for OM. |
hadoop.hdds.db.rocksdb.logging.level | INFO | OM, SCM, DATANODE | OM RocksDB logging level (INFO/DEBUG/WARN/ERROR/FATAL) |
hadoop.hdds.db.rocksdb.max.log.file.size | 100MB | OM, SCM, DATANODE | Maximum size of RocksDB application log file. |
hadoop.hdds.db.rocksdb.writeoption.sync | false | OM, SCM, DATANODE | Enable/Disable Sync option. If true write will be considered complete, once flushed to persistent storage. If false, writes are flushed asynchronously. |
hadoop.http.authentication.kerberos.keytab | ${user.home}/httpfs.keytab | | The Kerberos keytab file with the credentials for the HTTP Kerberos principal used by httpfs in the HTTP endpoint. httpfs.authentication.kerberos.keytab is deprecated. Instead use hadoop.http.authentication.kerberos.keytab. |
hadoop.http.authentication.kerberos.principal | HTTP/${httpfs.hostname}@${kerberos.realm} | | The HTTP Kerberos principal used by HttpFS in the HTTP endpoint. The HTTP Kerberos principal MUST start with 'HTTP/' per Kerberos HTTP SPNEGO specification. httpfs.authentication.kerberos.principal is deprecated. Instead use hadoop.http.authentication.kerberos.principal. |
hadoop.http.authentication.signature.secret.file | ${httpfs.config.dir}/httpfs-signature.secret | | File containing the secret to sign HttpFS hadoop-auth cookies. This file should be readable only by the system user running the HttpFS service. If multiple HttpFS servers are used in a load-balancer/round-robin fashion, they should share the secret file. If the secret file specified here does not exist, a random secret is generated at startup time. httpfs.authentication.signature.secret.file is deprecated. Instead use hadoop.http.authentication.signature.secret.file. |
hadoop.http.authentication.type | simple | | Defines the authentication mechanism used by httpfs for its HTTP clients. Valid values are 'simple' or 'kerberos'. If using 'simple', HTTP clients must specify the username with the 'user.name' query string parameter. If using 'kerberos', HTTP clients must use HTTP SPNEGO or delegation tokens. httpfs.authentication.type is deprecated. Instead use hadoop.http.authentication.type. |
hadoop.http.idle_timeout.ms | 60000 | OZONE, PERFORMANCE, S3GATEWAY | OM/SCM/DN/S3GATEWAY Server connection timeout in milliseconds. |
hadoop.http.max.request.header.size | 65536 | | The maximum HTTP request header size. |
hadoop.http.max.response.header.size | 65536 | | The maximum HTTP response header size. |
hadoop.http.max.threads | 1000 | | The maximum number of threads. |
hadoop.http.temp.dir | ${hadoop.tmp.dir}/httpfs | | HttpFS temp directory. |
hdds.block.token.enabled | false | OZONE, HDDS, SECURITY, TOKEN | True if block tokens are enabled, else false. |
hdds.block.token.expiry.time | 1d | OZONE, HDDS, SECURITY, TOKEN | Default value for expiry time of block token. This setting supports multiple time unit suffixes as described in dfs.heartbeat.interval. If no suffix is specified, then milliseconds is assumed. |
hdds.command.status.report.interval | 30s | OZONE, DATANODE, MANAGEMENT | Time interval of the datanode to send status of commands executed since last report. Unit could be defined with postfix (ns,ms,s,m,h,d) |
hdds.container.action.max.limit | 20 | DATANODE | Maximum number of Container Actions sent by the datanode to SCM in a single heartbeat. |
hdds.container.balancer.balancing.iteration.interval | 70m | BALANCER | The interval period between each iteration of Container Balancer. |
hdds.container.balancer.datanodes.involved.max.percentage.per.iteration | 20 | BALANCER | Maximum percentage of healthy, in service datanodes that can be involved in balancing in one iteration. |
hdds.container.balancer.exclude.containers | | BALANCER | List of container IDs to exclude from balancing. For example "1, 4, 5" or "1,4,5". |
hdds.container.balancer.exclude.datanodes | | BALANCER | A comma-separated list of Datanode hostnames or IP addresses. The Datanodes specified in this list are excluded from balancing. This configuration is empty by default. |
hdds.container.balancer.include.containers | | BALANCER | List of container IDs to include in balancing. Only these containers will be included in balancing. For example "1, 4, 5" or "1,4,5". |
hdds.container.balancer.include.datanodes | | BALANCER | A comma-separated list of Datanode hostnames or IP addresses. Only the Datanodes specified in this list are balanced. This configuration is empty by default and applies only when non-empty. |
hdds.container.balancer.iterations | 10 | BALANCER | The number of iterations that Container Balancer will run for. |
hdds.container.balancer.move.networkTopology.enable | false | BALANCER | Whether to take network topology into account when selecting a target for a source. This configuration is false by default. |
hdds.container.balancer.move.replication.timeout | 50m | BALANCER | The amount of time to allow a single container's replication from source to target as part of container move. For example, if "hdds.container.balancer.move.timeout" is 65 minutes, then out of those 65 minutes 50 minutes will be the deadline for replication to complete. |
hdds.container.balancer.move.timeout | 65m | BALANCER | The amount of time to allow a single container to move from source to target. |
hdds.container.balancer.size.entering.target.max | 26GB | BALANCER | The maximum size that can enter a target datanode in each iteration while balancing. This is the sum of data from multiple sources. The value must be greater than the configured (or default) ozone.scm.container.size. |
hdds.container.balancer.size.leaving.source.max | 26GB | BALANCER | The maximum size that can leave a source datanode in each iteration while balancing. This is the sum of data moving to multiple targets. The value must be greater than the configured (or default) ozone.scm.container.size. |
hdds.container.balancer.size.moved.max.per.iteration | 500GB | BALANCER | The maximum size of data in bytes that will be moved by Container Balancer in one iteration. |
hdds.container.balancer.trigger.du.before.move.enable | false | BALANCER | Whether to send a command to all healthy, in-service datanodes to run du immediately before starting a balance iteration. Note that running du is very time-consuming, especially when the disk usage rate of a datanode is very high. |
hdds.container.balancer.utilization.threshold | 10 | BALANCER | Threshold is a percentage in the range of 0 to 100. A cluster is considered balanced if, for each datanode, the utilization of the datanode (used space to capacity ratio) differs from the utilization of the cluster (used space to capacity ratio of the entire cluster) by no more than the threshold. |
hdds.container.checksum.verification.enabled | true | OZONE, DATANODE | To enable/disable checksum verification of the containers. |
hdds.container.chunk.write.sync | false | OZONE, CONTAINER, MANAGEMENT | Determines whether the chunk writes in the container happen as sync I/O or buffered I/O operations. |
hdds.container.close.threshold | 0.9f | OZONE, DATANODE | This determines the threshold to be used for closing a container. When the container used percentage reaches this threshold, the container will be closed. Value should be a positive, non-zero percentage in float notation (X.Yf), with 1.0f meaning 100%. |
hdds.container.ipc.port | 9859 | OZONE, CONTAINER, MANAGEMENT | The ipc port number of container. |
hdds.container.ipc.random.port | false | OZONE, DEBUG, CONTAINER | Allocates a random free port for ozone container. This is used only while running unit tests. |
hdds.container.ratis.admin.port | 9857 | OZONE, CONTAINER, PIPELINE, RATIS, MANAGEMENT | The ipc port number of container for admin requests. |
hdds.container.ratis.datanode.storage.dir | | OZONE, CONTAINER, STORAGE, MANAGEMENT, RATIS | This directory is used for storing Ratis metadata like logs. If this is not set, the default metadata directory is used and a warning will be logged. Ideally, this should be mapped to a fast disk like an SSD. |
hdds.container.ratis.datastream.enabled | false | OZONE, CONTAINER, RATIS, DATASTREAM | It specifies whether to enable data stream of container. |
hdds.container.ratis.datastream.port | 9855 | OZONE, CONTAINER, RATIS, DATASTREAM | The datastream port number of container. |
hdds.container.ratis.datastream.random.port | false | OZONE, CONTAINER, RATIS, DATASTREAM | Allocates a random free port for ozone container datastream. This is used only while running unit tests. |
hdds.container.ratis.enabled | false | OZONE, MANAGEMENT, PIPELINE, RATIS | Ozone supports different kinds of replication pipelines. Ratis is one of the replication pipelines supported by Ozone. |
hdds.container.ratis.ipc.port | 9858 | OZONE, CONTAINER, PIPELINE, RATIS | The ipc port number of container for clients. |
hdds.container.ratis.ipc.random.port | false | OZONE, DEBUG | Allocates a random free port for ozone ratis port for the container. This is used only while running unit tests. |
hdds.container.ratis.leader.pending.bytes.limit | 1GB | OZONE, RATIS, PERFORMANCE | Limit on the total bytes of pending requests after which leader starts rejecting requests from client. |
hdds.container.ratis.log.appender.queue.byte-limit | 32MB | OZONE, DEBUG, CONTAINER, RATIS | Byte limit for ratis leader's log appender queue. |
hdds.container.ratis.log.appender.queue.num-elements | 1024 | OZONE, DEBUG, CONTAINER, RATIS | Limit for number of append entries in ratis leader's log appender queue. |
hdds.container.ratis.log.purge.gap | 1000000 | OZONE, DEBUG, CONTAINER, RATIS | Purge gap between the last purged commit index and the current index, when the leader decides to purge its log. |
hdds.container.ratis.log.queue.byte-limit | 4GB | OZONE, DEBUG, CONTAINER, RATIS | Byte limit for Ratis Log Worker queue. |
hdds.container.ratis.log.queue.num-elements | 1024 | OZONE, DEBUG, CONTAINER, RATIS | Limit for the number of operations in Ratis Log Worker. |
hdds.container.ratis.num.container.op.executors | 10 | OZONE, RATIS, PERFORMANCE | Number of executors that will be used by Ratis to execute container ops (10 by default). |
hdds.container.ratis.num.write.chunk.threads.per.volume | 10 | OZONE, RATIS, PERFORMANCE | Maximum number of threads in the thread pool that the Datanode will use for writing replicated chunks. This applies per configured storage location (10 threads per disk by default). |
hdds.container.ratis.rpc.type | GRPC | OZONE, RATIS, MANAGEMENT | Ratis supports different kinds of transports like netty, GRPC, Hadoop RPC etc. This picks one of those for this cluster. |
hdds.container.ratis.segment.preallocated.size | 4MB | OZONE, RATIS, PERFORMANCE | The pre-allocated file size for raft segment used by Apache Ratis on datanodes. (4 MB by default) |
hdds.container.ratis.segment.size | 64MB | OZONE, RATIS, PERFORMANCE | The size of the raft segment file used by Apache Ratis on datanodes. (64 MB by default) |
hdds.container.ratis.server.port | 9856 | OZONE, CONTAINER, PIPELINE, RATIS, MANAGEMENT | The ipc port number of container for server-server communication. |
hdds.container.ratis.statemachine.max.pending.apply-transactions | 100000 | OZONE, CONTAINER, RATIS | Maximum number of pending apply transactions in a data pipeline. The default value is kept same as default snapshot threshold hdds.ratis.snapshot.threshold. |
hdds.container.ratis.statemachine.write.wait.interval | 10m | OZONE, DATANODE | Timeout for the write path for container blocks. |
hdds.container.ratis.statemachinedata.sync.retries | | OZONE, DEBUG, CONTAINER, RATIS | Number of times the WriteStateMachineData op will be tried before failing. If the value is not configured, it will default to (hdds.ratis.rpc.slowness.timeout / hdds.container.ratis.statemachinedata.sync.timeout), which means that WriteStateMachineData will be retried on every sync timeout until the configured slowness timeout is hit, after which the StateMachine will close down the pipeline. If this value is set to -1, retries continue indefinitely. This might not be desirable, since a WriteStateMachineData op that cannot complete for a long time due to persistent failure could block the Ratis write pipeline. |
hdds.container.ratis.statemachinedata.sync.timeout | 10s | OZONE, DEBUG, CONTAINER, RATIS | Timeout for StateMachine data writes by Ratis. |
hdds.container.replication.compression | NO_COMPRESSION | OZONE, HDDS, DATANODE | Compression algorithm used for closed container replication. Possible choices include NO_COMPRESSION, GZIP, SNAPPY, LZ4, ZSTD. |
hdds.container.report.interval | 60m | OZONE, CONTAINER, MANAGEMENT | Time interval of the datanode to send container report. Each datanode periodically send container report to SCM. Unit could be defined with postfix (ns,ms,s,m,h,d) |
hdds.container.scrub.data.scan.interval | 7d | STORAGE | Minimum time interval between two iterations of container data scanning. If an iteration takes less time than this, the scanner will wait before starting the next iteration. Unit could be defined with postfix (ns,ms,s,m,h,d). |
hdds.container.scrub.dev.data.scan.enabled | true | STORAGE | Can be used to disable the background container data scanner for developer testing purposes. |
hdds.container.scrub.dev.metadata.scan.enabled | true | STORAGE | Can be used to disable the background container metadata scanner for developer testing purposes. |
hdds.container.scrub.enabled | true | STORAGE | Config parameter to enable all container scanners. |
hdds.container.scrub.metadata.scan.interval | 3h | STORAGE | Config parameter define time interval between two metadata scans by container scanner. Unit could be defined with postfix (ns,ms,s,m,h,d). |
hdds.container.scrub.min.gap | 15m | DATANODE | The minimum gap between two successive scans of the same container. Unit could be defined with postfix (ns,ms,s,m,h,d). |
hdds.container.scrub.on.demand.volume.bytes.per.second | 5242880 | STORAGE | Config parameter to throttle I/O bandwidth used by the on-demand container scanner per volume. |
hdds.container.scrub.volume.bytes.per.second | 5242880 | STORAGE | Config parameter to throttle I/O bandwidth used by scanner per volume. |
hdds.container.token.enabled | false | OZONE, HDDS, SECURITY, TOKEN | True if container tokens are enabled, else false. |
hdds.datanode.block.delete.command.worker.interval | 2s | DATANODE | The interval between DeleteCmdWorker execution of delete commands. |
hdds.datanode.block.delete.max.lock.wait.timeout | 100ms | DATANODE, DELETION | Timeout for the thread used to process the delete block command to wait for the container lock. |
hdds.datanode.block.delete.queue.limit | 5 | DATANODE | The maximum number of block delete commands queued on a datanode. This configuration is also used by the SCM to control whether to send delete commands to the DN. If the DN has more commands waiting in the queue than this value, the SCM will not send any new block delete commands until the DN has processed some commands and the queue length is reduced. |
hdds.datanode.block.delete.threads.max | 5 | DATANODE | The maximum number of threads used to handle delete blocks on a datanode |
hdds.datanode.block.deleting.limit.per.interval | 20000 | SCM, DELETION, DATANODE | Number of blocks to be deleted in an interval. |
hdds.datanode.block.deleting.max.lock.holding.time | 1s | DATANODE, DELETION | This configuration controls the maximum time that the block deleting service can hold the lock during the deletion of blocks. Once this configured time period is reached, the service will release and re-acquire the lock. This is not a hard limit as the time check only occurs after the completion of each transaction, which means the actual execution time may exceed this limit. Unit could be defined with postfix (ns,ms,s,m,h,d). |
hdds.datanode.block.deleting.service.interval | 60s | SCM, DELETION | Time interval of the Datanode block deleting service. The block deleting service runs on Datanode periodically and deletes blocks queued for deletion. Unit could be defined with postfix (ns,ms,s,m,h,d). |
hdds.datanode.check.empty.container.dir.on.delete | false | DATANODE | Boolean flag to decide whether to check the container directory to determine if a container is empty. |
hdds.datanode.chunk.data.validation.check | false | DATANODE | Enable safety checks such as checksum validation for Ratis calls. |
hdds.datanode.client.address | | OZONE, HDDS, MANAGEMENT | The address of the Ozone Datanode client service. It is a string in the host:port format. |
hdds.datanode.client.bind.host | 0.0.0.0 | OZONE, HDDS, MANAGEMENT | The hostname or IP address used by the Datanode client service endpoint to bind. |
hdds.datanode.client.port | 19864 | OZONE, HDDS, MANAGEMENT | The port number of the Ozone Datanode client service. |
hdds.datanode.command.queue.limit | 5000 | DATANODE | The default maximum number of commands in the queue and command type's sub-queue on a datanode |
hdds.datanode.container.checksum.lock.stripes | 127 | DATANODE | The number of lock stripes used to coordinate modifications to container checksum information. This information is only updated after a container is closed and does not affect the data read or write path. Each container in the datanode will be mapped to one lock which will only be held while its checksum information is updated. |
hdds.datanode.container.client.cache.size | 100 | DATANODE | The maximum number of clients to be cached by the datanode client manager |
hdds.datanode.container.client.cache.stale.threshold | 10000 | DATANODE | The stale threshold in ms for a client in cache. After this threshold the client is evicted from cache. |
hdds.datanode.container.close.threads.max | 3 | DATANODE | The maximum number of threads used to close containers on a datanode |
hdds.datanode.container.db.dir | | OZONE, CONTAINER, STORAGE, MANAGEMENT | Determines where the per-disk RocksDB instances will be stored. This setting is optional. If unspecified, then RocksDB instances are stored on the same disk as HDDS data. The directories should be tagged with corresponding storage types ([SSD]/[DISK]/[ARCHIVE]/[RAM_DISK]) for storage policies. The default storage type will be DISK if the directory does not have a storage type tagged explicitly. Ideally, this should be mapped to a fast disk like an SSD. |
hdds.datanode.container.delete.threads.max | 2 | DATANODE | The maximum number of threads used to delete containers on a datanode |
hdds.datanode.container.schema.v3.enabled | true | DATANODE | Enable use of container schema v3 (one RocksDB per disk). |
hdds.datanode.container.schema.v3.key.separator | \| | DATANODE | The default separator between Container ID and container meta key name. |
hdds.datanode.data.dir.permissions | 700 | | Permissions for the datanode data directories where actual file blocks are stored. The permissions can be either octal or symbolic. If not set, the default value of 700 will be used. |
hdds.datanode.db.config.path | | OZONE, CONTAINER, STORAGE | Path to an ini configuration file for RocksDB on the datanode component. |
hdds.datanode.delete.container.timeout | 60s | DATANODE | If a delete container request spends more than this time waiting on the container lock or performing pre checks, the command will be skipped and SCM will resend it automatically. This avoids commands running for a very long time without SCM being informed of the progress. |
hdds.datanode.df.refresh.period | 5m | DATANODE | Disk space usage information will be refreshed with the specified period following the completion of the last check. |
hdds.datanode.dir | | OZONE, CONTAINER, STORAGE, MANAGEMENT | Determines where on the local filesystem HDDS data will be stored. The directories should be tagged with corresponding storage types ([SSD]/[DISK]/[ARCHIVE]/[RAM_DISK]) for storage policies. The default storage type will be DISK if the directory does not have a storage type tagged explicitly. |
hdds.datanode.dir.du.reserved | | OZONE, CONTAINER, STORAGE, MANAGEMENT | Reserved space in bytes per volume. Always leave this much space free for non-dfs use. For example, /dir1:100B, /dir2:200MB means dir1 reserves 100 bytes and dir2 reserves 200 MB. |
hdds.datanode.dir.du.reserved.percent | | OZONE, CONTAINER, STORAGE, MANAGEMENT | Percentage of volume that should be reserved. This space is left free for other usage. The value should be between 0 and 1. For example, 0.1 means 10% of volume space will be reserved. |
hdds.datanode.disk.balancer.container.choosing.policy | org.apache.hadoop.ozone.container.diskbalancer.policy.DefaultContainerChoosingPolicy | DISKBALANCER | The policy for selecting source/destination volumes and containers to move for disk balancing. |
hdds.datanode.disk.balancer.enabled | false | OZONE, DATANODE, DISKBALANCER | If this property is set to true, then the Disk Balancer feature is enabled on Datanodes, and users can use this service. By default, this is disabled. |
hdds.datanode.disk.balancer.info.dir | | DISKBALANCER | The path where the datanode DiskBalancer's conf is to be written. If this property is not defined, Ozone will fall back to the metadata directory instead. |
hdds.datanode.disk.balancer.max.disk.throughputInMBPerSec | 10 | DISKBALANCER | The maximum balancing throughput in MB per second. |
hdds.datanode.disk.balancer.parallel.thread | 5 | DISKBALANCER | The maximum number of parallel balancing threads. |
hdds.datanode.disk.balancer.replica.deletion.delay | 5m | DATANODE, DISKBALANCER | The delay after a container is successfully moved from source volume to destination volume before the source container replica is deleted. Unit could be defined with postfix (ns,ms,s,m,h,d). |
hdds.datanode.disk.balancer.service.interval | 60s | DATANODE, DISKBALANCER | Time interval of the Datanode DiskBalancer service. The Datanode will check the service periodically and update the config and running status for DiskBalancer service. Unit could be defined with postfix (ns,ms,s,m,h,d). |
hdds.datanode.disk.balancer.service.timeout | 300s | DATANODE, DISKBALANCER | Timeout for the Datanode DiskBalancer service. Unit could be defined with postfix (ns,ms,s,m,h,d). |
hdds.datanode.disk.balancer.should.run.default | false | DATANODE, DISKBALANCER | If DiskBalancer fails to get information from diskbalancer.info, it will choose this value to decide if this service should be running. |
hdds.datanode.disk.balancer.stop.after.disk.even | true | DISKBALANCER | If true, the DiskBalancer will automatically stop once disks are balanced. |
hdds.datanode.disk.balancer.volume.density.threshold.percent | 10 | DISKBALANCER | Threshold is a percentage in the range of 0 to 100. A datanode is considered balanced if, for each volume, the utilization of the volume (used space to capacity ratio) differs from the utilization of the datanode (used space to capacity ratio of the entire datanode) by no more than the threshold. |
hdds.datanode.disk.check.io.failures.tolerated | 1 | DATANODE | The number of IO tests out of the last hdds.datanode.disk.check.io.test.count test run that are allowed to fail before the volume is marked as failed. |
hdds.datanode.disk.check.io.file.size | 100B | DATANODE | The size of the temporary file that will be synced to the disk and read back to assess its health. The contents of the file will be stored in memory during the duration of the check. |
hdds.datanode.disk.check.io.test.count | 3 | DATANODE | The number of IO tests required to determine if a disk has failed. Each disk check does one IO test. The volume will be failed if more than hdds.datanode.disk.check.io.failures.tolerated out of the last hdds.datanode.disk.check.io.test.count runs failed. Set to 0 to disable disk IO checks. |
hdds.datanode.disk.check.min.gap | 10m | DATANODE | The minimum gap between two successive checks of the same Datanode volume. Unit could be defined with postfix (ns,ms,s,m,h,d). |
hdds.datanode.disk.check.timeout | 10m | DATANODE | Maximum allowed time for a disk check to complete. If the check does not complete within this time interval then the disk is declared as failed. Unit could be defined with postfix (ns,ms,s,m,h,d). |
hdds.datanode.dns.interface | default | OZONE, DATANODE | The name of the Network Interface from which a Datanode should report its IP address. e.g. eth2. This setting may be required for some multi-homed nodes where the Datanodes are assigned multiple hostnames and it is desirable for the Datanodes to use a non-default hostname. |
hdds.datanode.dns.nameserver | default | OZONE, DATANODE | The host name or IP address of the name server (DNS) which a Datanode should use to determine its own host name. |
hdds.datanode.du.factory.classname | | DATANODE | The fully qualified name of the factory class that creates objects for providing disk space usage information. It should implement the SpaceUsageCheckFactory interface. |
hdds.datanode.du.refresh.period | 1h | DATANODE | Disk space usage information will be refreshed with the specified period following the completion of the last check. |
hdds.datanode.failed.data.volumes.tolerated | -1 | DATANODE | The number of data volumes that are allowed to fail before a datanode stops offering service. Setting this to -1 means unlimited, but at least one good volume must remain. |
hdds.datanode.failed.db.volumes.tolerated | -1 | DATANODE | The number of db volumes that are allowed to fail before a datanode stops offering service. Setting this to -1 means unlimited, but at least one good volume must remain. |
hdds.datanode.failed.metadata.volumes.tolerated | -1 | DATANODE | The number of metadata volumes that are allowed to fail before a datanode stops offering service. Setting this to -1 means unlimited, but at least one good volume must remain. |
hdds.datanode.handler.count | 10 | OZONE, HDDS, MANAGEMENT | The number of RPC handler threads for Datanode client service endpoints. |
hdds.datanode.hostname | | OZONE, DATANODE | Optional. The hostname for the Datanode containing this configuration file. Will be different for each machine. Defaults to the current hostname. |
hdds.datanode.http-address | 0.0.0.0:9882 | HDDS, MANAGEMENT | The address and the base port where the Datanode web ui will listen on. If the port is 0 then the server will start on a free port. |
hdds.datanode.http-bind-host | 0.0.0.0 | HDDS, MANAGEMENT | The actual address the Datanode web server will bind to. If this optional address is set, it overrides only the hostname portion of hdds.datanode.http-address. |
hdds.datanode.http.auth.kerberos.keytab | /etc/security/keytabs/HTTP.keytab | HDDS, SECURITY, MANAGEMENT, KERBEROS | The kerberos keytab file for datanode http server |
hdds.datanode.http.auth.kerberos.principal | HTTP/_HOST@REALM | HDDS, SECURITY, MANAGEMENT, KERBEROS | The kerberos principal for the datanode http server. |
hdds.datanode.http.auth.type | simple | DATANODE, SECURITY, KERBEROS | simple or kerberos. If kerberos is set, SPNEGO will be used for http authentication. |
hdds.datanode.http.enabled | true | HDDS, MANAGEMENT | Property to enable or disable Datanode web ui. |
hdds.datanode.https-address | 0.0.0.0:9883 | HDDS, MANAGEMENT, SECURITY | The address and the base port where the Datanode web UI will listen on using HTTPS. If the port is 0 then the server will start on a free port. |
hdds.datanode.https-bind-host | 0.0.0.0 | HDDS, MANAGEMENT, SECURITY | The actual address the Datanode web server will bind to using HTTPS. If this optional address is set, it overrides only the hostname portion of hdds.datanode.http-address. |
hdds.datanode.kerberos.keytab.file | | OZONE, DATANODE | The keytab file used by each Datanode daemon to login as its service principal. The principal name is configured with hdds.datanode.kerberos.principal. |
hdds.datanode.kerberos.principal | dn/_HOST@REALM | OZONE, SECURITY, KERBEROS, DATANODE | The Datanode service principal. This is typically set to dn/_HOST@REALM.TLD. Each Datanode will substitute _HOST with its own fully qualified hostname at startup. The _HOST placeholder allows using the same configuration setting on all Datanodes. |
hdds.datanode.metadata.rocksdb.cache.size | 1GB | OZONE, DATANODE, MANAGEMENT | Size of the block metadata cache shared among RocksDB instances on each datanode. All containers on a datanode will share this cache. |
hdds.datanode.periodic.disk.check.interval.minutes | 60 | DATANODE | Periodic disk check run interval in minutes. |
hdds.datanode.plugins | | | Comma-separated list of HDDS datanode plug-ins to be activated when the HDDS service starts as part of the datanode. |
hdds.datanode.ratis.server.request.timeout | 2m | OZONE, DATANODE | Timeout for the request submitted directly to Ratis in datanode. |
hdds.datanode.read.chunk.threads.per.volume | 10 | DATANODE | Number of threads per volume that Datanode will use for reading replicated chunks. |
hdds.datanode.read.threadpool | 10 | OZONE, HDDS, PERFORMANCE | The number of threads in RPC server reading from the socket for Datanode client service endpoints. This config overrides Hadoop configuration "ipc.server.read.threadpool.size" for HddsDatanodeClientProtocolServer. The default value is 10. |
hdds.datanode.recovering.container.scrubbing.service.interval | 1m | SCM, DELETION | Time interval of the stale recovering container scrubbing service. The recovering container scrubbing service runs on Datanodes periodically and deletes stale recovering containers. Unit could be defined with postfix (ns,ms,s,m,h,d). |
hdds.datanode.replication.outofservice.limit.factor | 2.0 | DATANODE, SCM | Decommissioning and maintenance nodes can handle more replication commands than in-service nodes due to reduced load. This multiplier determines the increased queue capacity and executor pool size. |
hdds.datanode.replication.port | 9886 | DATANODE, MANAGEMENT | Port used for the server2server replication server |
hdds.datanode.replication.queue.limit | 4096 | DATANODE | The maximum number of queued requests for container replication |
hdds.datanode.replication.streams.limit | 10 | DATANODE | The maximum number of replication commands a single datanode can execute simultaneously |
hdds.datanode.replication.work.dir | | DATANODE | This configuration is deprecated. A temporary subdirectory under each hdds.datanode.dir will be used during container replication between datanodes to save the downloaded container (in compressed format). |
hdds.datanode.rocksdb.auto-compaction-small-sst-file | true | DATANODE | Auto compact small SST files (rocksdb.auto-compaction-small-sst-file-size-threshold) when count exceeds (rocksdb.auto-compaction-small-sst-file-num-threshold) |
hdds.datanode.rocksdb.auto-compaction-small-sst-file-num-threshold | 512 | DATANODE | Auto compaction will happen if the number of small SST files exceeds this threshold. |
hdds.datanode.rocksdb.auto-compaction-small-sst-file-size-threshold | 1MB | DATANODE | SST files smaller than this configuration will be auto compacted. |
hdds.datanode.rocksdb.auto-compaction-small-sst-file.interval.minutes | 120 | DATANODE | Auto compact small SST files interval in minutes. |
hdds.datanode.rocksdb.auto-compaction-small-sst-file.threads | 1 | DATANODE | Auto compact small SST files threads. |
hdds.datanode.rocksdb.delete-obsolete-files-period | 1h | DATANODE | Periodicity when obsolete files get deleted. Default is 1h. |
hdds.datanode.rocksdb.log.level | INFO | DATANODE | The user log level of RocksDB (DEBUG/INFO/WARN/ERROR/FATAL). |
hdds.datanode.rocksdb.log.max-file-num | 64 | DATANODE | The maximum number of user log files to keep for each RocksDB. |
hdds.datanode.rocksdb.log.max-file-size | 32MB | DATANODE | The maximum size of each user log file of RocksDB. 0 means no size limit. |
hdds.datanode.rocksdb.max-open-files | 1024 | DATANODE | The total number of files that a RocksDB can open. |
hdds.datanode.slow.op.warning.threshold | 500ms | OZONE, DATANODE, PERFORMANCE | Thresholds for printing slow-operation audit logs. |
hdds.datanode.storage.utilization.critical.threshold | 0.95 | OZONE, SCM, MANAGEMENT | If a datanode's overall storage utilization exceeds this value, the datanode will be marked as out of space. |
hdds.datanode.storage.utilization.warning.threshold | 0.75 | OZONE, SCM, MANAGEMENT | If a datanode's overall storage utilization exceeds this value, a warning will be logged while processing the nodeReport in SCM. |
hdds.datanode.use.datanode.hostname | false | OZONE, DATANODE | Whether Datanodes should use Datanode hostnames when connecting to other Datanodes for data transfer. |
hdds.datanode.volume.choosing.policy | org.apache.hadoop.ozone.container.common.volume.CapacityVolumeChoosingPolicy | OZONE, CONTAINER, STORAGE, MANAGEMENT | The class name of the policy for choosing volumes in the list of directories. Defaults to org.apache.hadoop.ozone.container.common.volume.CapacityVolumeChoosingPolicy. This volume choosing policy randomly chooses two volumes with remaining space and then picks the one with lower utilization. |
hdds.datanode.volume.min.free.space | -1 | OZONE, CONTAINER, STORAGE, MANAGEMENT | This determines the free space threshold used for closing containers. When the difference between volume capacity and used space reaches this number, containers that reside on this volume will be closed and no new containers will be allocated on this volume. The max of min.free.space and min.free.space.percent will be used as the final value. |
hdds.datanode.volume.min.free.space.percent | 0.02 | OZONE, CONTAINER, STORAGE, MANAGEMENT | This determines the free space percentage used for closing containers. When the difference between volume capacity and used space reaches (free.space.percent of volume capacity), containers that reside on this volume will be closed and no new containers will be allocated on this volume. The max of min.free.space and min.free.space.percent will be used as the final value. |
hdds.datanode.wait.on.all.followers | false | DATANODE | Defines whether the leader datanode will wait for both followers to catch up before removing the stateMachineData from the cache. |
hdds.db.profile | DISK | OZONE, OM, PERFORMANCE | This property allows the user to pick a configuration that tunes the RocksDB settings for the hardware it is running on. Currently, SSD and DISK are the available profile options. |
hdds.grpc.tls.enabled | false | OZONE, HDDS, SECURITY, TLS | If HDDS GRPC server TLS is enabled. |
hdds.grpc.tls.provider | OPENSSL | OZONE, HDDS, SECURITY, TLS, CRYPTO_COMPLIANCE | HDDS GRPC server TLS provider. |
hdds.heartbeat.initial-interval | 2s | OZONE, MANAGEMENT | Heartbeat interval used during Datanode initialization for SCM. |
hdds.heartbeat.interval | 30s | OZONE, MANAGEMENT | The heartbeat interval from a datanode to SCM. Note that it is 30 seconds rather than 3, since most datanodes heartbeat via Ratis heartbeats. If a client is not able to talk to a datanode, it will notify OM/SCM eventually, so a 30 second heartbeat works. This assumes that the replication strategy used is Ratis; if not, this value should be set to something smaller, like 3 seconds. ozone.scm.pipeline.close.timeout should also be adjusted accordingly if the default value for this config is not used. |
hdds.heartbeat.recon.initial-interval | 60s | OZONE, MANAGEMENT, RECON | Heartbeat interval used during Datanode initialization for Recon. |
hdds.heartbeat.recon.interval | 60s | OZONE, MANAGEMENT, RECON | The heartbeat interval from a Datanode to Recon. |
hdds.key.algo | RSA | SCM, HDDS, X509, SECURITY, CRYPTO_COMPLIANCE | SCM CA key algorithm. |
hdds.key.dir.name | keys | SCM, HDDS, X509, SECURITY | Directory to store public/private key for SCM CA. This is relative to the ozone/hdds metadata dir. |
hdds.key.len | 2048 | SCM, HDDS, X509, SECURITY, CRYPTO_COMPLIANCE | SCM CA key length. This is an algorithm-specific metric, such as modulus length, specified in number of bits. |
hdds.metadata.dir | | X509, SECURITY | Absolute path to HDDS metadata dir. |
hdds.metrics.percentiles.intervals | | OZONE, DATANODE | Comma-delimited set of integers denoting the desired rollover intervals (in seconds) for percentile latency metrics on the Datanode. By default, percentile latency metrics are disabled. |
hdds.metrics.session-id | | OZONE, HDDS | The user-specified session identifier. The default is the empty string. The session identifier is used to tag metric data that is reported to some performance metrics system via the org.apache.hadoop.metrics API. The session identifier is intended, in particular, for use by Hadoop-On-Demand (HOD), which allocates a virtual Hadoop cluster dynamically and transiently. HOD will set the session identifier by modifying the mapred-site.xml file before starting the cluster. When not running under HOD, this identifier is expected to remain set to the empty string. |
hdds.node.report.interval | 60000ms | OZONE, CONTAINER, MANAGEMENT | Time interval of the datanode to send the node report. Each datanode periodically sends a node report to SCM. Unit could be defined with postfix (ns,ms,s,m,h,d). |
hdds.pipeline.action.max.limit | 20 | DATANODE | Maximum number of Pipeline Actions sent by the datanode to SCM in a single heartbeat. |
hdds.pipeline.report.interval | 60000ms | OZONE, PIPELINE, MANAGEMENT | Time interval of the datanode to send the pipeline report. Each datanode periodically sends a pipeline report to SCM. Unit could be defined with postfix (ns,ms,s,m,h,d). |
hdds.priv.key.file.name | private.pem | X509, SECURITY | Name of file which stores private key generated for SCM CA. |
hdds.profiler.endpoint.enabled | false | OZONE, MANAGEMENT | Enable /prof java profiler servlet page on HTTP server. |
hdds.prometheus.endpoint.enabled | true | OZONE, MANAGEMENT | Enable prometheus compatible metric page on the HTTP servers. |
hdds.prometheus.endpoint.token | | SECURITY, MANAGEMENT | Allowed authorization token for the prometheus servlet endpoint. This will disable SPNEGO based authentication on the endpoint. |
hdds.public.key.file.name | public.pem | X509, SECURITY | Name of file which stores public key generated for SCM CA. |
hdds.raft.server.rpc.first-election.timeout | | OZONE, RATIS, MANAGEMENT | Minimum timeout for the first Ratis leader election. If not configured, falls back to hdds.ratis.leader.election.minimum.timeout.duration. |
hdds.ratis.client.exponential.backoff.base.sleep | 4s | OZONE, CLIENT, PERFORMANCE | Specifies the base sleep for the exponential backoff retry policy. With the default base sleep of 4s, the sleep duration for the ith retry is min(4 * pow(2, i), max_sleep) * r, where r is a random number in the range [0.5, 1.5). |
hdds.ratis.client.exponential.backoff.max.retries | 2147483647 | OZONE, CLIENT, PERFORMANCE | Client's max retry value for the exponential backoff policy. |
hdds.ratis.client.exponential.backoff.max.sleep | 40s | OZONE, CLIENT, PERFORMANCE | The sleep duration obtained from the exponential backoff policy is limited by the configured max sleep. Refer to hdds.ratis.client.exponential.backoff.base.sleep for further details. |
hdds.ratis.client.multilinear.random.retry.policy | 5s, 5, 10s, 5, 15s, 5, 20s, 5, 25s, 5, 60s, 10 | OZONE, CLIENT, PERFORMANCE | Specifies multilinear random retry policy to be used by ratis client. e.g. given pairs of number of retries and sleep time (n0, t0), (n1, t1), ..., for the first n0 retries sleep duration is t0 on average, the following n1 retries sleep duration is t1 on average, and so on. |
hdds.ratis.client.request.watch.timeout | 3m | OZONE, CLIENT, PERFORMANCE | Timeout for ratis client watch request. |
hdds.ratis.client.request.watch.type | ALL_COMMITTED | OZONE, CLIENT, PERFORMANCE | Desired replication level when Ozone client's Raft client calls watch(), ALL_COMMITTED or MAJORITY_COMMITTED. MAJORITY_COMMITTED increases write performance by reducing watch() latency when an Ozone datanode is slow in a pipeline, at the cost of potential read latency increasing due to read retries to different datanodes. |
hdds.ratis.client.request.write.timeout | 5m | OZONE, CLIENT, PERFORMANCE | Timeout for ratis client write request. |
hdds.ratis.client.retry.policy | org.apache.hadoop.hdds.ratis.retrypolicy.RequestTypeDependentRetryPolicyCreator | OZONE, CLIENT, PERFORMANCE | The class name of the policy for retry. |
hdds.ratis.client.retrylimited.max.retries | 180 | OZONE, CLIENT, PERFORMANCE | Number of retries for ratis client request. |
hdds.ratis.client.retrylimited.retry.interval | 1s | OZONE, CLIENT, PERFORMANCE | Interval between successive retries for a ratis client request. |
hdds.ratis.leader.election.minimum.timeout.duration | 5s | OZONE, RATIS, MANAGEMENT | The minimum timeout duration for ratis leader election. Default is 5s. |
hdds.ratis.raft.client.async.outstanding-requests.max | 32 | OZONE, CLIENT, PERFORMANCE | Controls the maximum number of outstanding async requests that can be handled by the Standalone as well as Ratis client. |
hdds.ratis.raft.client.rpc.request.timeout | 60s | OZONE, CLIENT, PERFORMANCE | The timeout duration for ratis client request (except for watch request). It should be set greater than leader election timeout in Ratis. |
hdds.ratis.raft.client.rpc.watch.request.timeout | 180s | OZONE, CLIENT, PERFORMANCE | The timeout duration for ratis client watch request. Timeout for the watch API in Ratis client to acknowledge a particular request getting replayed to all servers. It is highly recommended for the timeout duration to be strictly longer than Ratis server watch timeout (hdds.ratis.raft.server.watch.timeout) |
hdds.ratis.raft.grpc.flow.control.window | 5MB | OZONE, CLIENT, PERFORMANCE | This parameter tells how much data the grpc client can send to the grpc server without receiving any ack (WINDOW_UPDATE) packet from the server. This parameter should be set in accordance with the chunk size. For example, if the chunk size is 4MB, taking some header size into consideration, this can be set to 5MB or greater. Tune this parameter accordingly, as setting it to a value smaller than the chunk size degrades Ozone client performance. |
hdds.ratis.raft.server.datastream.client.pool.size | 10 | OZONE, DATANODE, RATIS, DATASTREAM | Maximum number of client proxy in NettyServerStreamRpc for datastream write. |
hdds.ratis.raft.server.datastream.request.threads | 20 | OZONE, DATANODE, RATIS, DATASTREAM | Maximum number of threads in the thread pool for datastream request. |
hdds.ratis.raft.server.delete.ratis.log.directory | true | OZONE, DATANODE, RATIS | Flag to indicate whether the ratis log directory will be cleaned up during pipeline removal. |
hdds.ratis.raft.server.leaderelection.pre-vote | true | OZONE, DATANODE, RATIS | Flag to enable/disable ratis election pre-vote. |
hdds.ratis.raft.server.log.appender.wait-time.min | 0us | OZONE, DATANODE, RATIS, PERFORMANCE | The minimum wait time between two appendEntries calls. In some error conditions, the leader may keep retrying appendEntries. If it happens, increasing this value to, say, 5us (microseconds) can help avoid the leader being too busy retrying. |
hdds.ratis.raft.server.notification.no-leader.timeout | 300s | OZONE, DATANODE, RATIS | Timeout duration after which the StateMachine gets notified that a leader has not been elected for a long time and the server changes its role to Candidate. |
hdds.ratis.raft.server.rpc.request.timeout | 60s | OZONE, DATANODE, RATIS | The timeout duration of the ratis write request on Ratis Server. |
hdds.ratis.raft.server.rpc.slowness.timeout | 300s | OZONE, DATANODE, RATIS | Timeout duration after which stateMachine will be notified that follower is slow. StateMachine will close down the pipeline. |
hdds.ratis.raft.server.watch.timeout | 30s | OZONE, DATANODE, RATIS | The timeout duration for watch request on Ratis Server. Timeout for the watch request in Ratis server to acknowledge a particular request is replayed to all servers. It is highly recommended for the timeout duration to be strictly shorter than Ratis client watch timeout (hdds.ratis.raft.client.rpc.watch.request.timeout). |
hdds.ratis.raft.server.write.element-limit | 1024 | OZONE, DATANODE, RATIS, PERFORMANCE | Maximum number of pending requests after which the leader starts rejecting requests from client. |
hdds.ratis.server.num.snapshots.retained | 5 | STORAGE | Config parameter to specify number of old snapshots retained at the Ratis leader. |
hdds.ratis.server.retry-cache.timeout.duration | 600000ms | OZONE, RATIS, MANAGEMENT | Retry Cache entry timeout for ratis server. |
hdds.ratis.snapshot.threshold | 100000 | OZONE, CONTAINER, RATIS | Number of transactions after which a ratis snapshot should be taken. |
hdds.scm.block.deleting.service.interval | 60s | SCM, DELETION | Time interval of the SCM block deleting service. The block deleting service runs on SCM periodically and deletes blocks queued for deletion. Unit could be defined with postfix (ns,ms,s,m,h,d). |
hdds.scm.block.deletion.per-interval.max | 500000 | SCM, DELETION | Maximum number of blocks which SCM processes during an interval. The block number is counted at the replica level. If SCM has 100000 blocks which need to be deleted and the configuration is 5000, then it would only send 5000 blocks for deletion to the datanodes. |
hdds.scm.block.deletion.txn.dn.commit.map.limit | 5000000 | SCM | This value indicates the size of the transactionToDNsCommitMap after which SCM will skip one round of the block deleting interval. |
hdds.scm.ec.pipeline.choose.policy.impl | org.apache.hadoop.hdds.scm.pipeline.choose.algorithms.RandomPipelineChoosePolicy | SCM, PIPELINE | Sets the policy for choosing an EC pipeline. The value should be the full name of a class which implements org.apache.hadoop.hdds.scm.PipelineChoosePolicy. The class decides which pipeline will be used when selecting an EC Pipeline. If not set, org.apache.hadoop.hdds.scm.pipeline.choose.algorithms.RandomPipelineChoosePolicy will be used as the default value. One of the following values can be used: (1) org.apache.hadoop.hdds.scm.pipeline.choose.algorithms.RandomPipelineChoosePolicy : chooses a pipeline randomly. (2) org.apache.hadoop.hdds.scm.pipeline.choose.algorithms.HealthyPipelineChoosePolicy : chooses a healthy pipeline randomly. (3) org.apache.hadoop.hdds.scm.pipeline.choose.algorithms.CapacityPipelineChoosePolicy : chooses the pipeline with lower utilization from two random pipelines. Note that the random choose method will be executed twice in this policy. (4) org.apache.hadoop.hdds.scm.pipeline.choose.algorithms.RoundRobinPipelineChoosePolicy : chooses a pipeline in a round robin fashion. Intended for troubleshooting and testing purposes only. |
hdds.scm.http.auth.kerberos.keytab | | SECURITY | The keytab file used by the SCM http server to login as its service principal. |
hdds.scm.http.auth.kerberos.principal | | SECURITY | This Kerberos principal is used when communicating to the HTTP server of SCM. The protocol used is SPNEGO. |
hdds.scm.http.auth.type | simple | OM, SECURITY, KERBEROS | simple or kerberos. If kerberos is set, SPNEGO will be used for http authentication. |
hdds.scm.kerberos.keytab.file | /etc/security/keytabs/SCM.keytab | SCM, SECURITY, KERBEROS | The keytab file used by SCM daemon to login as its service principal. |
hdds.scm.kerberos.principal | SCM/_HOST@REALM | SCM, SECURITY, KERBEROS | The SCM service principal. e.g. scm/_HOST@REALM.COM |
hdds.scm.pipeline.choose.policy.impl | org.apache.hadoop.hdds.scm.pipeline.choose.algorithms.RandomPipelineChoosePolicy | SCM, PIPELINE | Sets the policy for choosing a pipeline for a Ratis container. The value should be the full name of a class which implements org.apache.hadoop.hdds.scm.PipelineChoosePolicy. The class decides which pipeline will be used to find or allocate Ratis containers. If not set, org.apache.hadoop.hdds.scm.pipeline.choose.algorithms.RandomPipelineChoosePolicy will be used as the default value. One of the following values can be used: (1) org.apache.hadoop.hdds.scm.pipeline.choose.algorithms.RandomPipelineChoosePolicy : chooses a pipeline randomly. (2) org.apache.hadoop.hdds.scm.pipeline.choose.algorithms.HealthyPipelineChoosePolicy : chooses a healthy pipeline randomly. (3) org.apache.hadoop.hdds.scm.pipeline.choose.algorithms.CapacityPipelineChoosePolicy : chooses the pipeline with lower utilization from two random pipelines. Note that the random choose method will be executed twice in this policy. (4) org.apache.hadoop.hdds.scm.pipeline.choose.algorithms.RoundRobinPipelineChoosePolicy : chooses a pipeline in a round robin fashion. Intended for troubleshooting and testing purposes only. |
hdds.scm.replication.container.sample.limit | 100 | SCM | The number of containers to sample in each state per iteration of the replication manager. This is useful for debugging when Recon is not available. The samples are included in the ReplicationManagerReport for each lifecycle and health state. |
hdds.scm.replication.datanode.delete.container.limit | 40 | SCM, DATANODE | A limit to restrict the total number of delete container commands queued on a datanode. Note this is intended to be a temporary config until we have a more dynamic way of limiting load |
hdds.scm.replication.datanode.reconstruction.weight | 3 | SCM, DATANODE | When counting the number of replication commands on a datanode, the number of reconstruction commands is multiplied by this weight to ensure reconstruction commands use more of the capacity, as they are more expensive to process. |
hdds.scm.replication.datanode.replication.limit | 20 | SCM, DATANODE | A limit to restrict the total number of replication and reconstruction commands queued on a datanode. Note this is intended to be a temporary config until we have a more dynamic way of limiting load. |
hdds.scm.replication.event.timeout | 12m | SCM, OZONE | Timeout for the container replication/deletion commands sent to datanodes. After this timeout the command will be retried. |
hdds.scm.replication.event.timeout.datanode.offset | 6m | SCM, OZONE | The amount of time to subtract from hdds.scm.replication.event.timeout to give a deadline on the datanodes which is less than the SCM timeout. This ensures the datanodes will not process a command after SCM believes it should have expired. |
hdds.scm.replication.inflight.limit.factor | 0.75 | SCM | The overall replication task limit on a cluster is the number of healthy nodes times the datanode.replication.limit. This factor, which should be between zero and 1, scales that limit down to reduce the overall number of replicas pending creation on the cluster. A setting of zero disables global limit checking. A setting of 1 also effectively disables it, by making the limit equal to the above equation. However, if there are many decommissioning nodes on the cluster, the decommissioning nodes will have a higher than normal limit, so a setting of 1 may still provide some limit in extreme circumstances. |
hdds.scm.replication.maintenance.remaining.redundancy | 1 | SCM, OZONE | The number of redundant containers in a group which must be available for a node to enter maintenance. If putting a node into maintenance reduces the redundancy below this value, the node will remain in the ENTERING_MAINTENANCE state until a new replica is created. For Ratis containers, the default value of 1 ensures at least two replicas are online, meaning 1 more can be lost without data becoming unavailable. Any EC container will have at least dataNum + 1 replicas online, allowing the loss of 1 more replica before data becomes unavailable. Currently only EC containers use this setting; Ratis containers use hdds.scm.replication.maintenance.replica.minimum. For EC, if nodes are in maintenance, it is likely reconstruction reads will be required if some of the data replicas are offline. This is seamless to the client, but will affect read performance. |
hdds.scm.replication.maintenance.replica.minimum | 2 | SCM, OZONE | The minimum number of container replicas which must be available for a node to enter maintenance. If putting a node into maintenance reduces the available replicas for any container below this level, the node will remain in the entering maintenance state until a new replica is created. |
hdds.scm.replication.over.replicated.interval | 30s | SCM, OZONE | How frequently to check if there is work to process on the over replicated queue. |
hdds.scm.replication.push | true | SCM, DATANODE | If false, replication happens by asking the target to pull from source nodes. If true, the source node is asked to push to the target node. |
hdds.scm.replication.quasi.closed.stuck.best.origin.copies | 3 | SCM | For quasi-closed stuck containers with multiple diverged origins, the number of replicas to maintain for the origin with the highest bcsId among healthy replicas. This origin is considered the 'best' copy and receives extra fault-tolerance. If multiple origins share the same highest bcsId, all of them receive this count. |
hdds.scm.replication.quasi.closed.stuck.other.origin.copies | 2 | SCM | For quasi-closed stuck containers with multiple diverged origins, the number of replicas to maintain for each origin that does not have the highest block commit sequence ID (BCSID). These replicas are kept to preserve data integrity across diverged copies. |
hdds.scm.replication.thread.interval | 300s | SCM, OZONE | There is a replication monitor thread running inside SCM which takes care of replicating the containers in the cluster. This property is used to configure the interval in which that thread runs. |
hdds.scm.replication.under.replicated.interval | 30s | SCM, OZONE | How frequently to check if there is work to process on the under replicated queue. |
hdds.scm.safemode.atleast.one.node.reported.pipeline.pct | 0.90 | HDDS, SCM, OPERATION | Percentage of pipelines, where at least one datanode is reported in the pipeline. |
hdds.scm.safemode.enabled | true | HDDS, SCM, OPERATION | Boolean value to enable or disable SCM safe mode. |
hdds.scm.safemode.healthy.pipeline.pct | 0.10 | HDDS, SCM, OPERATION | Percentage of healthy pipelines, where all 3 datanodes are reported in the pipeline. |
hdds.scm.safemode.log.interval | 1m | HDDS, SCM, OPERATION | Interval at which SCM logs safemode status while SCM is in safemode. Default is 1 minute. |
hdds.scm.safemode.min.datanode | 3 | HDDS, SCM, OPERATION | Minimum DataNodes which should be registered to get SCM out of safe mode. |
hdds.scm.safemode.pipeline.creation | true | HDDS, SCM, OPERATION | Boolean value to enable background pipeline creation in SCM safe mode. |
hdds.scm.safemode.threshold.pct | 0.99 | HDDS, SCM, OPERATION | % of containers which should have at least one reported replica before SCM comes out of safe mode. |
hdds.scm.unknown-container.action | WARN | SCM, MANAGEMENT | The action taken by SCM to process unknown containers reported by Datanodes. The default action just logs a container-not-found warning; the other available action is DELETE, under which these unknown containers will be deleted. |
hdds.scm.wait.time.after.safemode.exit | 5m | HDDS, SCM, OPERATION | After exiting safemode, wait for configured interval of time to start replication monitor and cleanup activities of unhealthy pipelines. |
hdds.scmclient.failover.max.retry | 15 | OZONE, SCM, CLIENT | Max retry count for SCM Client when failover happens. |
hdds.scmclient.failover.retry.interval | 2s | OZONE, SCM, CLIENT | SCM Client timeout on waiting for the next connection retry to other SCM IP. The default value is set to 2 seconds. |
hdds.scmclient.max.retry.timeout | 10m | OZONE, SCM, CLIENT | Max retry timeout for SCM Client |
hdds.scmclient.rpc.timeout | 15m | OZONE, SCM, CLIENT | RpcClient timeout on waiting for the response from SCM. The default value is set to 15 minutes. If ipc.client.ping is set to true and this rpc-timeout is greater than the value of ipc.ping.interval, the effective value of the rpc-timeout is rounded up to multiple of ipc.ping.interval. |
hdds.secret.key.algorithm | HmacSHA256 | SCM, SECURITY, CRYPTO_COMPLIANCE | The algorithm that SCM uses to generate symmetric secret keys. A valid algorithm is the one supported by KeyGenerator, as described at https://docs.oracle.com/javase/8/docs/technotes/guides/security/StandardNames.html#KeyGenerator. |
hdds.secret.key.expiry.duration | 9d | SCM, SECURITY | The duration for which symmetric secret keys issued by SCM are valid. The secret key is used to sign delegation tokens issued by OM, so the secret key must be valid for at least (ozone.manager.delegation.token.max-lifetime + hdds.secret.key.rotate.duration + ozone.manager.delegation.remover.scan.interval) to guarantee that delegation tokens can be verified by OM. Considering the default values of the three properties mentioned and rounding up to days, this property's default value, in combination with hdds.secret.key.rotate.duration=1d, results in 9 secret keys (for the last 9 days) being kept valid at any point in time. If any of the ozone.manager.delegation.token.max-lifetime, hdds.secret.key.rotate.duration or ozone.manager.delegation.remover.scan.interval values is changed, this property should be checked and updated accordingly if necessary. |
hdds.secret.key.file.name | secret_keys.json | SCM, SECURITY | Name of file which stores symmetric secret keys for token signatures. |
hdds.secret.key.rotate.check.duration | 10m | SCM, SECURITY | The interval at which SCM periodically checks whether it's time to generate new symmetric secret keys. This config has an impact on the practical correctness of secret key expiry and rotation period. For example, if hdds.secret.key.rotate.duration=1d and hdds.secret.key.rotate.check.duration=10m, the actual key rotation will happen every 1d +/- 10m. |
hdds.secret.key.rotate.duration | 1d | SCM, SECURITY | The interval at which SCM periodically generates new symmetric secret keys. |
hdds.security.client.datanode.container.protocol.acl | * | SECURITY | Comma separated list of users and groups allowed to access client datanode container protocol. |
hdds.security.client.datanode.disk.balancer.protocol.acl | * | SECURITY | Comma separated list of users and groups allowed to access disk balancer protocol. |
hdds.security.client.scm.block.protocol.acl | * | SECURITY | Comma separated list of users and groups allowed to access client scm block protocol. |
hdds.security.client.scm.certificate.protocol.acl | * | SECURITY | Comma separated list of users and groups allowed to access client scm certificate protocol. |
hdds.security.client.scm.container.protocol.acl | * | SECURITY | Comma separated list of users and groups allowed to access client scm container protocol. |
hdds.security.client.scm.secretkey.datanode.protocol.acl | * | SECURITY | Comma separated list of users and groups allowed to access client scm secret key protocol for datanodes. |
hdds.security.client.scm.secretkey.om.protocol.acl | * | SECURITY | Comma separated list of users and groups allowed to access client scm secret key protocol for om. |
hdds.security.client.scm.secretkey.scm.protocol.acl | * | SECURITY | Comma separated list of users and groups allowed to access client scm secret key protocol for scm. |
hdds.security.provider | BC | OZONE, HDDS, X509, SECURITY, CRYPTO_COMPLIANCE | The main security provider used for various cryptographic algorithms. |
hdds.x509.ca.rotation.ack.timeout | PT15M | OZONE, HDDS, SECURITY | Max time that the SCM leader will wait for the rotation preparation acks before it considers the rotation failed. Default is 15 minutes. |
hdds.x509.ca.rotation.check.interval | P1D | OZONE, HDDS, SECURITY | Check interval of whether internal root certificate is going to expire and needs to start rotation or not. Default is 1 day. The property value should be less than the value of property hdds.x509.renew.grace.duration. |
hdds.x509.ca.rotation.enabled | false | OZONE, HDDS, SECURITY | Whether auto root CA and sub CA certificate rotation is enabled or not. Default is disabled. |
hdds.x509.ca.rotation.time-of-day | 02:00:00 | OZONE, HDDS, SECURITY | Time of day to start the rotation. Default 02:00 AM to avoid impacting daily workload. The supported format is 'hh:mm:ss', representing hour, minute, and second. |
hdds.x509.default.duration | P365D | OZONE, HDDS, SECURITY | Default duration for which x509 certificates issued by SCM are valid. The formats accepted are based on the ISO-8601 duration format PnDTnHnMn.nS |
hdds.x509.dir.name | certs | OZONE, HDDS, SECURITY | X509 certificate directory name. |
hdds.x509.expired.certificate.check.interval | P1D | | Interval to use for removing expired certificates. A background task to remove expired certificates from the scm metadata store is scheduled to run at the rate this configuration option specifies. |
hdds.x509.file.name | certificate.crt | OZONE, HDDS, SECURITY | Certificate file name. |
hdds.x509.max.duration | P1865D | OZONE, HDDS, SECURITY | Max time for which certificate issued by SCM CA are valid. This duration is used for self-signed root cert and scm sub-ca certs issued by root ca. The formats accepted are based on the ISO-8601 duration format PnDTnHnMn.nS |
hdds.x509.renew.grace.duration | P28D | OZONE, HDDS, SECURITY | Duration of the grace period within which a certificate should be renewed before the current one expires. Default is 28 days. |
hdds.x509.rootca.certificate.file | | | Path to an external CA certificate. The file format is expected to be pem. This certificate is used when initializing SCM to create a root certificate authority. By default, a self-signed certificate is generated instead. Note that this certificate is only used for Ozone's internal communication, and it does not affect the certificates used for HTTPS protocol at WebUIs as they can be configured separately. |
hdds.x509.rootca.certificate.polling.interval | PT2h | | Interval to use for polling in certificate clients for a new root ca certificate. Every time the specified time duration elapses, the clients send a request to the SCMs to see if a new root ca certificate was generated. Once there is a change, the system automatically adds the new root ca to the clients' trust stores and requests a new certificate to be signed. |
hdds.x509.rootca.private.key.file | | | Path to an external private key. The file format is expected to be pem. This private key is later used when initializing SCM to sign certificates as the root certificate authority. When not specified, a private and public key pair is generated instead. These keys are only used for Ozone's internal communication, and they do not affect the HTTPS protocol at WebUIs as it can be configured separately. |
hdds.x509.rootca.public.key.file | | | Path to an external public key. The file format is expected to be pem. This public key is later used when initializing SCM to sign certificates as the root certificate authority. When only the private key is specified, the public key is read from the external certificate. Note that this is only used for Ozone's internal communication, and it does not affect the HTTPS protocol at WebUIs as they can be configured separately. |
hdds.x509.signature.algorithm | SHA256withRSA | OZONE, HDDS, SECURITY, CRYPTO_COMPLIANCE | X509 certificate signature algorithm. |
hdds.xframe.enabled | true | OZONE, HDDS | If true, enables protection against clickjacking by returning the X-FRAME-OPTIONS header with the value SAMEORIGIN. Clickjacking protection prevents an attacker from using transparent or opaque layers to trick a user into clicking on a button or link on another page. |
hdds.xframe.value | SAMEORIGIN | OZONE, HDDS | This configuration value allows the user to specify the value for X-FRAME-OPTIONS. The possible values for this field are DENY, SAMEORIGIN and ALLOW-FROM. Any other value will throw an exception when Datanodes are starting up. |
httpfs.access.mode | read-write | Sets the access mode for HttpFS. If access is not allowed, FORBIDDEN (403) is returned. Valid access modes are: read-write (full access); write-only (PUT, POST and DELETE have full access; GET only allows GETFILESTATUS and LISTSTATUS); read-only (GET has full access; PUT, POST and DELETE are forbidden). | |
httpfs.buffer.size | 4096 | The buffer size used by a read/write request when streaming data from/to HDFS. | |
httpfs.delegation.token.manager.max.lifetime | 604800 | HttpFS delegation token maximum lifetime, default 7 days, in seconds | |
httpfs.delegation.token.manager.renewal.interval | 86400 | HttpFS delegation token renewal interval, default 1 day, in seconds. | |
httpfs.delegation.token.manager.update.interval | 86400 | HttpFS delegation token update interval, default 1 day, in seconds. | |
httpfs.hadoop.authentication.kerberos.keytab | ${user.home}/httpfs.keytab | The Kerberos keytab file with the credentials for the Kerberos principal used by httpfs to connect to the HDFS Namenode. | |
httpfs.hadoop.authentication.kerberos.principal | ${user.name}/${httpfs.hostname}@${kerberos.realm} | The Kerberos principal used by httpfs to connect to the HDFS Namenode. | |
httpfs.hadoop.authentication.type | simple | Defines the authentication mechanism used by httpfs to connect to the HDFS Namenode. Valid values are 'simple' and 'kerberos'. | |
httpfs.hadoop.filesystem.cache.purge.frequency | 60 | Frequency, in seconds, at which the idle filesystem purging daemon runs. | |
httpfs.hadoop.filesystem.cache.purge.timeout | 60 | Timeout, in seconds, for an idle filesystem to be purged. | |
httpfs.hostname | ${httpfs.http.hostname} | Property used to synthesize the HTTP Kerberos principal used by httpfs. This property is only used to resolve other properties within this configuration file. | |
httpfs.http.administrators | ACL for the admins; this configuration is used to control who can access the default servlets for the HttpFS server. The value should be a comma-separated list of users and groups. The user list comes first and is separated by a space followed by the group list, e.g. "user1,user2 group1,group2". Both users and groups are optional, so "user1", " group1", "", "user1 group1", "user1,user2 group1,group2" are all valid (note the leading space in " group1"). '*' grants access to all users and groups, e.g. '*', '* ' and ' *' are all valid. | ||
httpfs.http.hostname | 0.0.0.0 | The bind host for HttpFS REST API. | |
httpfs.http.port | 14000 | The HTTP port for HttpFS REST API. | |
httpfs.services | org.apache.ozone.lib.service.instrumentation.InstrumentationService, org.apache.ozone.lib.service.scheduler.SchedulerService, org.apache.ozone.lib.service.security.GroupsService, org.apache.ozone.lib.service.hadoop.FileSystemAccessService | Services used by the httpfs server. | |
httpfs.ssl.enabled | false | Whether SSL is enabled. Default is false, i.e. disabled. | |
kerberos.realm | LOCALHOST | Kerberos realm, used only if Kerberos authentication is used between the clients and httpfs or between HttpFS and HDFS. This property is only used to resolve other properties within this configuration file. | |
net.topology.node.switch.mapping.impl | org.apache.hadoop.net.ScriptBasedMapping | OZONE, SCM | The default implementation of the DNSToSwitchMapping. It invokes a script specified in net.topology.script.file.name to resolve node names. If the value for net.topology.script.file.name is not set, the default value of DEFAULT_RACK is returned for all node names. |
ozone.UnsafeByteOperations.enabled | true | OZONE, PERFORMANCE, CLIENT | Specifies whether to use the unsafe or the safe buffer for byteString copies. |
ozone.acl.authorizer.class | org.apache.hadoop.ozone.security.acl.OzoneAccessAuthorizer | OZONE, SECURITY, ACL | Acl authorizer for Ozone. |
ozone.acl.enabled | false | OZONE, SECURITY, ACL | Key to enable/disable Ozone ACLs. |
ozone.administrators | OZONE, SECURITY | Comma-delimited list of Ozone administrator users. If not set, only the user who launches an Ozone service will be the admin user. This property must be set if Ozone services are started by different users. Otherwise, the RPC layer will reject calls from other servers which are started by users not in the list. | |
ozone.administrators.groups | OZONE, SECURITY | Comma-delimited list of Ozone administrator groups. This is the list of groups who can access admin-only information from Ozone. It is enough to either have the name defined in ozone.administrators or be directly or indirectly in a group defined in this property. | |
ozone.audit.log.debug.cmd.list.dnaudit | DATANODE | A comma separated list of Datanode commands that are written to the DN audit logs only if the audit log level is debug. Ex: "CREATE_CONTAINER,READ_CONTAINER,UPDATE_CONTAINER". | |
ozone.audit.log.debug.cmd.list.omaudit | OM | A comma separated list of OzoneManager commands that are written to the OzoneManager audit logs only if the audit log level is debug. Ex: "ALLOCATE_BLOCK,ALLOCATE_KEY,COMMIT_KEY". | |
ozone.audit.log.debug.cmd.list.scmaudit | SCM | A comma separated list of SCM commands that are written to the SCM audit logs only if the audit log level is debug. Ex: "GET_VERSION,REGISTER,SEND_HEARTBEAT". | |
ozone.authorization.enabled | true | OZONE, SECURITY, AUTHORIZATION | Master switch to enable/disable authorization checks in Ozone (admin privilege checks and ACL checks). This property only takes effect when ozone.security.enabled is true. When true: admin privilege checks are always performed, and object ACL checks are controlled by ozone.acl.enabled. When false: no authorization checks are performed. Default is true. |
ozone.block.deleting.service.interval | 1m | OZONE, PERFORMANCE, SCM | Time interval of the block deleting service. The block deleting service runs on each datanode periodically and deletes blocks queued for deletion. Unit could be defined with postfix (ns,ms,s,m,h,d) |
ozone.block.deleting.service.timeout | 300000ms | OZONE, PERFORMANCE, SCM | Timeout value for the block deleting service. If this is set greater than 0, the service will stop waiting for block deletion to complete after this time. This setting supports multiple time unit suffixes as described in dfs.heartbeat.interval. If no suffix is specified, then milliseconds is assumed. |
ozone.block.deleting.service.workers | 10 | OZONE, PERFORMANCE, SCM | Number of worker threads of the block deleting service. This configuration should be set greater than 0. |
ozone.chunk.read.buffer.default.size | 1MB | OZONE, SCM, CONTAINER, PERFORMANCE | The default read buffer size during read chunk operations when checksum is disabled. Chunk data will be cached in buffers of this capacity. For chunk data with checksum, the read buffer size will be the same as the number of bytes per checksum (ozone.client.bytes.per.checksum) corresponding to the chunk. |
ozone.chunk.read.mapped.buffer.max.count | 0 | OZONE, SCM, CONTAINER, PERFORMANCE | The default max count of memory mapped buffers allowed for a DN. Default 0 means no mapped buffers allowed for data read. |
ozone.chunk.read.mapped.buffer.threshold | 32KB | OZONE, SCM, CONTAINER, PERFORMANCE | The default read threshold to use memory mapped buffers. |
ozone.client.bucket.replication.config.refresh.time.ms | 30000 | OZONE | Default time period to refresh the bucket replication config in o3fs clients. Until the bucket replication config is refreshed, the client will continue to use the existing replication config, irrespective of whether the bucket replication config was updated at the OM or not. |
ozone.client.bytes.per.checksum | 16KB | CLIENT, CRYPTO_COMPLIANCE | A checksum will be computed for each chunk of this many bytes and stored sequentially. The minimum value for this config is 8KB. |
ozone.client.checksum.combine.mode | COMPOSITE_CRC | CLIENT | The combined checksum type [MD5MD5CRC / COMPOSITE_CRC] determines which algorithm is used to compute the file checksum. COMPOSITE_CRC calculates the combined CRC of the whole file, where the lower-level chunk/block checksums are combined into a file-level checksum. MD5MD5CRC calculates the MD5 of the MD5 of the checksums of individual chunks. Default checksum type is COMPOSITE_CRC. |
ozone.client.checksum.type | CRC32 | CLIENT, CRYPTO_COMPLIANCE | The checksum type [NONE/ CRC32/ CRC32C/ SHA256/ MD5] determines which algorithm would be used to compute checksum for chunk data. Default checksum type is CRC32. |
ozone.client.connection.timeout | 5000ms | OZONE, PERFORMANCE, CLIENT | Connection timeout for Ozone client in milliseconds. |
ozone.client.datastream.buffer.flush.size | 16MB | CLIENT | The boundary at which putBlock is executed |
ozone.client.datastream.min.packet.size | 1MB | CLIENT | The maximum size of the ByteBuffer (used via ratis streaming) |
ozone.client.datastream.pipeline.mode | true | CLIENT | Streaming write supports both pipeline mode (datanode1->datanode2->datanode3) and star mode (datanode1->datanode2, datanode1->datanode3). By default, pipeline mode is used. |
ozone.client.datastream.sync.size | 0B | CLIENT | The minimum size of written data before forcing the datanodes in the pipeline to flush the pending data to underlying storage. If set to zero or negative, the client will not force the datanodes to flush. |
ozone.client.datastream.window.size | 64MB | CLIENT | Maximum size of the BufferList (used for retries) per BlockDataStreamOutput instance. |
ozone.client.ec.grpc.retries.enabled | true | CLIENT | Enable gRPC client retries for EC. |
ozone.client.ec.grpc.retries.max | 3 | CLIENT | The maximum number of attempts the gRPC client makes before failover. |
ozone.client.ec.grpc.write.timeout | 30s | OZONE, CLIENT, MANAGEMENT | Timeout for ozone ec grpc client during write. |
ozone.client.ec.reconstruct.stripe.read.pool.limit | 30 | CLIENT | Maximum thread pool size for reading available EC chunks in parallel to reconstruct a whole stripe. |
ozone.client.ec.reconstruct.stripe.write.pool.limit | 30 | CLIENT | Maximum thread pool size for writing available EC chunks in parallel to reconstruct a whole stripe. |
ozone.client.ec.stripe.queue.size | 2 | CLIENT | The maximum number of EC stripes that can be buffered in the client before flushing to datanodes. |
ozone.client.elastic.byte.buffer.pool.max.size | 16GB | OZONE, CLIENT | The maximum total size of buffers that can be cached in the client-side ByteBufferPool. This pool is used heavily during EC read and write operations. Setting a limit prevents unbounded memory growth in long-lived rpc clients like the S3 Gateway. Once this limit is reached, used buffers are not put back to the pool and will be garbage collected. |
ozone.client.exclude.nodes.expiry.time | 600000 | CLIENT | Time after which an excluded node is reconsidered for writes. If the value is zero, the node is excluded for the life of the client. |
ozone.client.failover.max.attempts | 500 | Expert only. Ozone RpcClient attempts talking to each OzoneManager ipc.client.connect.max.retries (default = 10) number of times before failing over to another OzoneManager, if available. This parameter represents the number of times per request the client will failover before giving up. This value is kept high so that client does not give up trying to connect to OMs easily. | |
ozone.client.follower.read.default.consistency | LINEARIZABLE_ALLOW_FOLLOWER | The default consistency when the client enables follower read. Currently, the supported follower read consistency levels are LINEARIZABLE_ALLOW_FOLLOWER and LOCAL_LEASE. The default value is LINEARIZABLE_ALLOW_FOLLOWER to preserve the same strong consistency behavior when switching from leader-only read to follower read. | |
ozone.client.follower.read.enabled | false | Enable client to read from OM followers. If false, all client requests are sent to the OM leader. | |
ozone.client.fs.default.bucket.layout | FILE_SYSTEM_OPTIMIZED | OZONE, CLIENT | Default bucket layout value used when buckets are created using OFS. Supported values are LEGACY and FILE_SYSTEM_OPTIMIZED. FILE_SYSTEM_OPTIMIZED: This layout allows the bucket to support atomic rename/delete operations and also allows interoperability between S3 and FS APIs. Keys written via S3 API with a "/" delimiter will create intermediate directories. |
ozone.client.hbase.enhancements.allowed | false | CLIENT | When set to false, client-side HBase enhancement-related Ozone (experimental) features are disabled (not allowed to be enabled) regardless of whether those configs are set. Here is the list of configs and values overridden when this config is set to false: 1. ozone.fs.hsync.enabled = false 2. ozone.client.incremental.chunk.list = false 3. ozone.client.stream.putblock.piggybacking = false 4. ozone.client.key.write.concurrency = 1 A warning message will be printed if any of the above configs are overridden by this. |
ozone.client.incremental.chunk.list | false | CLIENT | Client PutBlock request can choose incremental chunk list rather than full chunk list to optimize performance. Critical to HBase. EC does not support this feature. Can be enabled only when ozone.client.hbase.enhancements.allowed = true |
ozone.client.key.latest.version.location | true | OZONE, CLIENT | Ozone client gets the latest version location. |
ozone.client.key.provider.cache.expiry | 10d | OZONE, CLIENT, SECURITY | Ozone client security key provider cache expiration time. |
ozone.client.key.write.concurrency | 1 | CLIENT | Maximum concurrent writes allowed on each key. Defaults to 1 which matches the behavior before HDDS-9844. For unlimited write concurrency, set this to -1 or any negative integer value. Any value other than 1 is effective only when ozone.client.hbase.enhancements.allowed = true |
ozone.client.leader.read.default.consistency | DEFAULT | The default consistency when the client disables follower read. Currently, the supported leader read consistency levels are DEFAULT and LINEARIZABLE_LEADER_ONLY. The default value is DEFAULT for backward-compatibility reasons; it is mostly strongly consistent. | |
ozone.client.list.cache | 1000 | OZONE, PERFORMANCE | Configuration property to configure the cache size of client list calls. |
ozone.client.max.ec.stripe.write.retries | 10 | CLIENT | When an EC stripe write fails, the client requests allocation of a new block group and writes the failed stripe into the new block group. If the same stripe failure continues in the newly acquired block group, it retries by requesting allocation of yet another block group. This configuration limits the number of such retries. By default the number of retries is 10. |
ozone.client.max.retries | 5 | CLIENT | Maximum number of retries by Ozone Client on encountering exception while writing a key |
ozone.client.read.max.retries | 3 | CLIENT | Maximum number of retries by Ozone Client on encountering connectivity exception when reading a key. |
ozone.client.read.retry.interval | 1 | CLIENT | Indicates the time duration in seconds a client will wait before retrying a read key request on encountering a connectivity exception from Datanodes. By default the interval is 1 second. |
ozone.client.read.timeout | 30s | OZONE, CLIENT, MANAGEMENT | Timeout for ozone grpc client during read. |
ozone.client.retry.interval | 0 | CLIENT | Indicates the time duration a client will wait before retrying a write key request on encountering an exception. By default there is no wait |
ozone.client.server-defaults.validity.period.ms | 3600000 | OZONE, CLIENT, SECURITY | The number of milliseconds after which cached server defaults are updated. By default this parameter is set to 1 hour. Supports multiple time unit suffixes (case insensitive). If no time unit is specified then milliseconds is assumed. |
ozone.client.socket.timeout | 5000ms | OZONE, CLIENT | Socket timeout for Ozone client. Unit could be defined with postfix (ns,ms,s,m,h,d) |
ozone.client.stream.buffer.flush.delay | true | CLIENT | If true (the default), a call to flush() sends the buffer to the datanode only when the data in the current buffer is greater than ozone.client.stream.buffer.size. You can turn this off by setting this configuration to false. |
ozone.client.stream.buffer.flush.size | 16MB | CLIENT | Size which determines at what buffer position a partial flush will be initiated during write. It should be a multiple of ozone.client.stream.buffer.size |
ozone.client.stream.buffer.increment | 0B | CLIENT | The buffer (defined by ozone.client.stream.buffer.size) will be incremented in steps of this size. If zero, the full buffer will be created at once. Setting it to a value between 0 and ozone.client.stream.buffer.size can reduce memory usage for very small keys, but has a performance overhead. |
ozone.client.stream.buffer.max.size | 32MB | CLIENT | Size which determines at what buffer position the write call is blocked until acknowledgement of the first partial flush by all servers. |
ozone.client.stream.buffer.size | 4MB | CLIENT | The size of chunks the client will send to the server |
ozone.client.stream.putblock.piggybacking | false | CLIENT | Allow PutBlock to be piggybacked in WriteChunk requests if the chunk is small. Can be enabled only when ozone.client.hbase.enhancements.allowed = true |
ozone.client.stream.read.pre-read-size | 33554432 | CLIENT | Extra bytes to prefetch during streaming reads. |
ozone.client.stream.read.response-data-size | 1048576 | CLIENT | Chunk size of streaming read responses from datanodes. |
ozone.client.stream.read.timeout | 10s | CLIENT | Timeout for receiving streaming read responses. |
ozone.client.stream.readblock.enable | false | CLIENT | Allow ReadBlock to stream all the readChunk in one request. |
ozone.client.verify.checksum | true | CLIENT | Whether the Ozone client verifies the checksum of each checksum-blocksize chunk of data. |
ozone.client.wait.between.retries.millis | 2000 | Expert only. The time to wait, in milliseconds, between retry attempts to contact OM. Wait time increases linearly if same OM is retried again. If retrying on multiple OMs proxies in round robin fashion, the wait time is introduced after all the OM proxies have been attempted once. | |
ozone.container.cache.lock.stripes | 1024 | PERFORMANCE, CONTAINER, STORAGE | Container DB open is an exclusive operation. We use a stripe lock to guarantee that different threads can open different container DBs concurrently, while for one container DB, only one thread can open it at the same time. This setting controls the lock stripes. |
ozone.container.cache.size | 1024 | PERFORMANCE, CONTAINER, STORAGE | The open container is cached on the data node side. We maintain an LRU cache for caching the recently used containers. This setting controls the size of that cache. |
ozone.csi.default-volume-size | 1000000000 | STORAGE | The default size of created volumes (if not specified). |
ozone.csi.mount.command | goofys --endpoint %s %s %s | STORAGE | This is the mount command used to publish a volume. The %s placeholders will be replaced by the s3gAddress, volumeId and target path. |
ozone.csi.owner | STORAGE | This is the username used to create the requested storage. It is used as a Hadoop username, and the generated Ozone volume is used to store all the buckets. WARNING: It can be a security hole to use CSI in secure environments, as ALL users can request the mount of a specific bucket via the CSI interface. | |
ozone.csi.s3g.address | http://localhost:9878 | STORAGE | The address of S3 Gateway endpoint. |
ozone.csi.socket | /var/lib/csi.sock | STORAGE | The socket where all the CSI services will listen (file name). |
ozone.default.bucket.layout | OZONE, MANAGEMENT | Default bucket layout used by Ozone Manager during bucket creation when a client does not specify the bucket layout option. Supported values are OBJECT_STORE and FILE_SYSTEM_OPTIMIZED. OBJECT_STORE: This layout allows the bucket to behave as a pure object store and will not allow interoperability between S3 and FS APIs. FILE_SYSTEM_OPTIMIZED: This layout allows the bucket to support atomic rename/delete operations and also allows interoperability between S3 and FS APIs. Keys written via S3 API with a "/" delimiter will create intermediate directories. | |
ozone.directory.deleting.service.interval | 1m | OZONE, PERFORMANCE, OM | Time interval of the directory deleting service. It runs on OM periodically and cleans up orphan directories and their sub-trees. For every orphan directory it deletes the sub-path tree structure (dirs/files) and sends sub-files to the KeyDeletingService to delete their blocks. Unit could be defined with postfix (ns,ms,s,m,h,d) |
ozone.filesystem.snapshot.enabled | true | OZONE, OM | Enables Ozone filesystem snapshot feature if set to true on the OM side. Disables it otherwise. |
ozone.freon.http-address | 0.0.0.0:9884 | OZONE, MANAGEMENT | The address and the base port where the FREON web ui will listen on. If the port is 0 then the server will start on a free port. |
ozone.freon.http-bind-host | 0.0.0.0 | OZONE, MANAGEMENT | The actual address the Freon web server will bind to. If this optional address is set, it overrides only the hostname portion of ozone.freon.http-address. |
ozone.freon.http.auth.kerberos.keytab | /etc/security/keytabs/HTTP.keytab | SECURITY | Keytab used by Freon. |
ozone.freon.http.auth.kerberos.principal | HTTP/_HOST@REALM | SECURITY | Security principal used by freon. |
ozone.freon.http.auth.type | simple | FREON, SECURITY | simple or kerberos. If kerberos is set, SPNEGO will be used for http authentication. |
ozone.freon.http.enabled | true | OZONE, MANAGEMENT | Property to enable or disable FREON web ui. |
ozone.freon.https-address | 0.0.0.0:9885 | OZONE, MANAGEMENT | The address and the base port where the Freon web server will listen on using HTTPS. If the port is 0 then the server will start on a free port. |
ozone.freon.https-bind-host | 0.0.0.0 | OZONE, MANAGEMENT | The actual address the Freon web server will bind to using HTTPS. If this optional address is set, it overrides only the hostname portion of ozone.freon.https-address. |
ozone.fs.datastream.auto.threshold | 4MB | OZONE, DATANODE | A threshold to auto select datastream to write files in OzoneFileSystem. |
ozone.fs.datastream.enabled | false | OZONE, DATANODE | To enable/disable filesystem write via ratis streaming. |
ozone.fs.hsync.enabled | false | OZONE, CLIENT, OM | Enable hsync/hflush on the Ozone Manager and/or client side. Disabled by default. Can be enabled only when ozone.hbase.enhancements.allowed = true |
ozone.fs.iterate.batch-size | 100 | OZONE, OZONEFS | Batch size of delete operations when iterating with BasicOzoneFileSystem. |
ozone.fs.listing.page.size | 1024 | OZONE, CLIENT | Listing page size used by the client for the number of items in the output of fs-related sub-commands. Set this config value responsibly to avoid high resource usage. The maximum allowed value is 5000 for optimum performance. |
ozone.fs.listing.page.size.max | 5000 | OZONE, OM | Maximum listing page size enforced by the server for listing items in the output of fs-related sub-commands. Set this config value responsibly to avoid high resource usage. The maximum allowed value is 5000 for optimum performance. |
ozone.hbase.enhancements.allowed | false | OZONE, OM | When set to false, server-side HBase enhancement-related Ozone (experimental) features are disabled (not allowed to be enabled) regardless of whether those configs are set. Here is the list of configs and values overridden when this config is set to false: 1. ozone.fs.hsync.enabled = false A warning message will be printed if any of the above configs are overridden by this. |
ozone.http.basedir | OZONE, OM, SCM, MANAGEMENT | The base dir for the HTTP Jetty server to extract contents. If this property is not configured, by default Jetty will create a directory inside ${ozone.metadata.dirs}/webserver. In production environments it is strongly suggested to instruct Jetty to use a different parent directory by setting this property to the name of the desired parent directory. The value of the property will be used to set the Jetty context attribute 'org.eclipse.jetty.webapp.basetempdir'. The directory named by this property must exist and be writable. | |
ozone.http.filter.initializers | OZONE, SECURITY, KERBEROS | Set to org.apache.hadoop.security.AuthenticationFilterInitializer to enable Kerberos authentication for Ozone HTTP web consoles using the SPNEGO protocol. When this property is set, ozone.security.http.kerberos.enabled should be set to true. | |
ozone.http.policy | HTTP_ONLY | OZONE, SECURITY, MANAGEMENT | Decides whether HTTPS (SSL) is supported on Ozone. This configures the HTTP endpoint for Ozone daemons. The following values are supported: HTTP_ONLY: service is provided only on http; HTTPS_ONLY: service is provided only on https; HTTP_AND_HTTPS: service is provided both on http and https. |
ozone.https.client.keystore.resource | ssl-client.xml | OZONE, SECURITY, MANAGEMENT | Resource file from which ssl client keystore information will be extracted |
ozone.https.client.need-auth | false | OZONE, SECURITY, MANAGEMENT | Whether SSL client certificate authentication is required |
ozone.https.server.keystore.resource | ssl-server.xml | OZONE, SECURITY, MANAGEMENT | Resource file from which ssl server keystore information will be extracted |
ozone.key.deleting.limit.per.task | 50000 | OM, PERFORMANCE | The maximum number of keys to be scanned by the key deleting service per time interval in OM. Those keys are sent to the deleted-key metadata and generate transactions in SCM for the next async deletion between SCM and DataNode. |
ozone.key.preallocation.max.blocks | 64 | OZONE, OM, PERFORMANCE | While allocating blocks from OM, this configuration limits the maximum number of blocks being allocated. It ensures that the allocated block response does not exceed the RPC payload limit. If the client needs more space for the write, separate block allocation requests will be made. |
ozone.manager.delegation.remover.scan.interval | 3600000 | Time interval after which the Ozone secret manager scans for expired delegation tokens. | |
ozone.manager.delegation.token.max-lifetime | 7d | Default maximum time interval after which an Ozone delegation token will not be renewed. A delegation token is signed and verified using a secret key which has a maximum lifetime of hdds.secret.key.expiry.duration. To guarantee that the delegation token can be properly loaded, verified, and renewed during its lifetime, (ozone.manager.delegation.token.max-lifetime + hdds.secret.key.rotate.duration + ozone.manager.delegation.remover.scan.interval) must not be greater than hdds.secret.key.expiry.duration. If any of ozone.manager.delegation.token.max-lifetime, hdds.secret.key.expiry.duration, hdds.secret.key.rotate.duration or ozone.manager.delegation.remover.scan.interval is changed, the above constraint must be checked and values adjusted accordingly if necessary. | |
ozone.manager.delegation.token.renew-interval | 1d | Default time interval after which ozone delegation token will require renewal before any further use. | |
ozone.metadata.dirs | OZONE, OM, SCM, CONTAINER, STORAGE, REQUIRED | This setting is the fallback location for SCM, OM, Recon and DataNodes to store their metadata. This setting may be used only in test/PoC clusters to simplify configuration. For production clusters or any time you care about performance, it is recommended that ozone.om.db.dirs, ozone.scm.db.dirs and hdds.container.ratis.datanode.storage.dir be configured separately. | |
ozone.metadata.dirs.permissions | 700 | Permissions for the metadata directories of the fallback location where SCM, OM, Recon and DataNodes store their metadata. The permissions have to be octal or symbolic. This is the fallback used in case the default permissions for OM, SCM, Recon and Datanode are not set. | |
ozone.metastore.rocksdb.cf.write.buffer.size | 128MB | OZONE, OM, SCM, STORAGE, PERFORMANCE | The write buffer (memtable) size for each column family of the rocksdb store. Check the rocksdb documentation for more details. |
ozone.metastore.rocksdb.statistics | OFF | OZONE, OM, SCM, STORAGE, PERFORMANCE | The statistics level of the rocksdb store. If you use any value from org.rocksdb.StatsLevel (e.g. ALL or EXCEPT_DETAILED_TIMERS), the rocksdb statistics will be exposed over a JMX bean with the chosen setting. Set it to OFF to not initialize rocksdb statistics at all. Please note that collecting statistics could incur a 5-10% performance penalty. Check the rocksdb documentation for more details. |
ozone.network.flexible.fqdn.resolution.enabled | false | OZONE, SCM, OM | SCM and OM hosts will be able to resolve themselves based on their host names instead of the FQDN. This is useful for deploying to Kubernetes environments, during the initial launch when [pod_name].[service_name] is not resolvable yet because of the probe. |
ozone.network.jvm.address.cache.enabled | true | OZONE, SCM, OM, DATANODE | Whether to enable the JVM network address cache. In environments such as Kubernetes, the IPs of SCM, OM and datanode instances can change; disabling this cache helps to quickly resolve the FQDNs to the new IPs. |
ozone.network.topology.aware.read | true | OZONE, PERFORMANCE | Whether to enable topology aware read to improve the read performance. |
ozone.om.address | 0.0.0.0:9862 | OM, REQUIRED | The address of the Ozone OM service. This allows clients to discover the address of the OM. When HA mode is enabled, append the service ID and node ID to each OM property. For example: ozone.om.address.service1.om1 |
ozone.om.admin.protocol.max.retries | 20 | OM, MANAGEMENT | Expert only. The maximum number of retries for Ozone Manager Admin protocol on each OM. |
ozone.om.admin.protocol.wait.between.retries | 1000 | OM, MANAGEMENT | Expert only. The time to wait, in milliseconds, between retry attempts for Ozone Manager Admin protocol. |
ozone.om.allow.leader.skip.linearizable.read | false | OM, PERFORMANCE, HA | Allow the leader to handle requests directly, without checking leadership for every request. |
ozone.om.client.rpc.timeout | 15m | OZONE, OM, CLIENT | RpcClient timeout on waiting for the response from OzoneManager. The default value is set to 15 minutes. If ipc.client.ping is set to true and this rpc-timeout is greater than the value of ipc.ping.interval, the effective value of the rpc-timeout is rounded up to multiple of ipc.ping.interval. |
ozone.om.client.trash.core.pool.size | 5 | OZONE, OM, CLIENT | Total number of threads in pool for the Trash Emptier |
ozone.om.compaction.service.columnfamilies | keyTable,fileTable,directoryTable,deletedTable,deletedDirectoryTable,multipartInfoTable | OZONE, OM, PERFORMANCE | A comma-separated list (no spaces) of all the column families that are compacted by the compaction service. If this is empty, no column families are compacted. |
ozone.om.compaction.service.enabled | false | OZONE, OM, PERFORMANCE | Enable or disable a background job that periodically compacts rocksdb tables flagged for compaction. |
ozone.om.compaction.service.run.interval | 6h | OZONE, OM, PERFORMANCE | A background job that periodically compacts rocksdb tables flagged for compaction. Unit could be defined with postfix (ns,ms,s,m,h,d) |
ozone.om.compaction.service.timeout | 10m | OZONE, OM, PERFORMANCE | Timeout value for the compaction service. If this is set greater than 0, the service will stop waiting for compaction completion after this time. Unit could be defined with postfix (ns,ms,s,m,h,d) |
ozone.om.container.location.cache.size | 100000 | OZONE, OM | The size of the container locations cache in Ozone Manager. This cache allows Ozone Manager to populate block locations in key-read responses without calling SCM, thus increasing Ozone Manager read performance. |
ozone.om.container.location.cache.ttl | 360m | OZONE, OM | The time to live for container location cache in Ozone. |
ozone.om.db.checkpoint.use.inode.based.transfer | true | OZONE, OM | Denotes if the OM bootstrap inode based transfer implementation is set as default. |
ozone.om.db.dirs | OZONE, OM, STORAGE, PERFORMANCE | Directory where the OzoneManager stores its metadata. This should be specified as a single directory. If the directory does not exist then the OM will attempt to create it. If undefined, then the OM will log a warning and fallback to ozone.metadata.dirs. This fallback approach is not recommended for production environments. | |
ozone.om.db.dirs.permissions | 700 | Permissions for the metadata directories for Ozone Manager. The permissions have to be octal or symbolic. If the default permissions are not set then the default value of 700 will be used. | |
ozone.om.db.max.open.files | -1 | OZONE, OM | Max number of open files that OM RocksDB will open simultaneously. Essentially sets the max_open_files config for the active OM RocksDB instance. This will limit the total number of files opened by the OM db. Default is -1, which is unlimited and gives max performance. If you are certain that your ulimit will always be bigger than the number of files in the database, set max_open_files to -1; otherwise set it to a value less than or equal to the ulimit. |
ozone.om.decommissioned.nodes.EXAMPLEOMSERVICEID | OM, HA | Comma-separated list of OM node Ids which have been decommissioned. OMs present in this list will not be included in the OM HA ring. | |
ozone.om.delta.update.data.size.max.limit | 1024MB | OM, MANAGEMENT | Recon periodically fetches a limited set of delta updates from OM starting from a given sequence number. Depending on the sequence number passed, the OM DB delta update may span a large number of log files, and each log batch may be large when Ozone clients write and update frequently. To avoid heap memory growth, this config (default 1 GB) limits the size of the prepared DB updates object. |
ozone.om.edekcacheloader.initial.delay.ms | 3000 | When KeyProvider is configured, the time delayed until the first attempt to warm up edek cache on OM start up. | |
ozone.om.edekcacheloader.interval.ms | 1000 | When KeyProvider is configured, the interval for warming up the edek cache on OM start up. All edeks will be loaded from KMS into the provider cache. The edek cache loader will keep trying to warm up the cache until it succeeds or OM leaves the active state. | |
ozone.om.edekcacheloader.max-retries | 10 | When KeyProvider is configured, the maximum number of retries allowed to warm up the edek cache if no key loads successfully on OM start up. | |
ozone.om.enable.filesystem.paths | false | OM, OZONE | If true, key names will be interpreted as file system paths: '/' will be treated as a special character, and paths will be normalized and must follow Unix filesystem path naming conventions. This flag is helpful when objects created by S3G need to be accessed using OFS/O3Fs. If false, the default behavior of Key/MPU create requests applies: key paths are not normalized, intermediate directories are not created, and no checks for filesystem semantics are performed. |
ozone.om.enable.ofs.shared.tmp.dir | false | OZONE, OM | Enable shared ofs tmp directory ofs://tmp. Allows a root tmp directory with sticky-bit behaviour. |
ozone.om.follower.read.local.lease.enabled | false | OM, PERFORMANCE, HA, RATIS | Whether the local lease for follower reads is enabled. If enabled, a follower OM decides whether to return local data directly based on log lag and time. |
ozone.om.follower.read.local.lease.log.limit | 10000 | OM, PERFORMANCE, HA, RATIS | If the log lag between the leader OM and a follower OM is larger than this number, the follower OM is not up-to-date. Set this to -1 to allow infinite lag. |
ozone.om.follower.read.local.lease.time.ms | 5000 | OM, PERFORMANCE, HA, RATIS | If the lag time in milliseconds between the leader OM and a follower OM is larger than this number, the follower OM is not up-to-date. By default, it is set to the Ratis RPC timeout value. Set this to -1 to allow infinite lag. |
ozone.om.fs.snapshot.max.limit | 10000 | OZONE, OM, MANAGEMENT | The maximum number of filesystem snapshots allowed in an Ozone Manager. |
ozone.om.group.rights | READ, LIST | OM, SECURITY | Default group permissions set for an object in OzoneManager. |
ozone.om.grpc.bossgroup.size | 8 | OZONE, OM, S3GATEWAY | OM grpc server netty boss event group size. |
ozone.om.grpc.maximum.response.length | 134217728 | OZONE, OM, S3GATEWAY | OM/S3GATEWAY OMRequest, OMResponse over grpc max message length (bytes). |
ozone.om.grpc.port | 8981 | MANAGEMENT | Port used for the GrpcOmTransport OzoneManagerServiceGrpc server |
ozone.om.grpc.read.thread.num | 32 | OZONE, OM, S3GATEWAY | OM grpc server read thread pool core thread size. |
ozone.om.grpc.workergroup.size | 32 | OZONE, OM, S3GATEWAY | OM grpc server netty worker event group size. |
ozone.om.ha.raft.server.log.appender.wait-time.min | 0ms | OZONE, OM, RATIS, PERFORMANCE | Minimum wait time between two appendEntries calls. |
ozone.om.ha.raft.server.read.leader.lease.enabled | false | OZONE, OM, RATIS, PERFORMANCE | Whether the leader lease is enabled on the Ratis leader. |
ozone.om.ha.raft.server.read.option | DEFAULT | OZONE, OM, RATIS, PERFORMANCE | Select the Ratis server read option. Possible values are: DEFAULT - Directly query statemachine (non-linearizable). Only the leader can serve read requests. LINEARIZABLE - Use ReadIndex (see Raft Paper section 6.4) to maintain linearizability. Both the leader and the followers can serve read requests. |
ozone.om.ha.raft.server.retrycache.expirytime | 300s | OZONE, OM, RATIS | The timeout duration of the retry cache. |
ozone.om.handler.count.key | 100 | OM, PERFORMANCE | The number of RPC handler threads for OM service endpoints. |
ozone.om.hierarchical.resource.locks.hard.limit | 10000 | Maximum number of lock objects that could be present in the pool. | |
ozone.om.hierarchical.resource.locks.soft.limit | 1024 | Soft limit for number of lock objects that could be idle in the pool. | |
ozone.om.http-address | 0.0.0.0:9874 | OM, MANAGEMENT | The address and the base port where the OM web UI will listen on. If the port is 0, then the server will start on a free port. However, it is best to specify a well-known port, so it is easy to connect and see the OM management UI. When HA mode is enabled, append the service ID and node ID to each OM property. For example: ozone.om.http-address.service1.om1 |
ozone.om.http-bind-host | 0.0.0.0 | OM, MANAGEMENT | The actual address the OM web server will bind to. If this optional address is set, it overrides only the hostname portion of ozone.om.http-address. When HA mode is enabled, append the service ID and node ID to each OM property. For example: ozone.om.http-bind-host.service1.om1 |
ozone.om.http.auth.kerberos.keytab | /etc/security/keytabs/HTTP.keytab | OZONE, SECURITY, KERBEROS | The keytab file used by OM http server to login as its service principal if SPNEGO is enabled for om http server. |
ozone.om.http.auth.kerberos.principal | HTTP/_HOST@REALM | OZONE, SECURITY, KERBEROS | Ozone Manager http server service principal if SPNEGO is enabled for om http server. |
ozone.om.http.auth.type | simple | OM, SECURITY, KERBEROS | simple or kerberos. If kerberos is set, SPNEGO will be used for http authentication. |
ozone.om.http.enabled | true | OM, MANAGEMENT | Property to enable or disable OM web user interface. |
ozone.om.https-address | 0.0.0.0:9875 | OM, MANAGEMENT, SECURITY | The address and the base port where the OM web UI will listen on using HTTPS. If the port is 0 then the server will start on a free port. When HA mode is enabled, append the service ID and node ID to each OM property. For example: ozone.om.https-address.service1.om1 |
ozone.om.https-bind-host | 0.0.0.0 | OM, MANAGEMENT, SECURITY | The actual address the OM web server will bind to using HTTPS. If this optional address is set, it overrides only the hostname portion of ozone.om.https-address. When HA mode is enabled, append the service ID and node ID to each OM property. For example: ozone.om.https-bind-host.service1.om1 |
ozone.om.internal.service.id | OM, HA | Service ID of the Ozone Manager. If this is not set, fall back to ozone.om.service.ids to find the service ID it belongs to. | |
ozone.om.kerberos.keytab.file | /etc/security/keytabs/OM.keytab | OZONE, SECURITY, KERBEROS | The keytab file used by OzoneManager daemon to login as its service principal. The principal name is configured with ozone.om.kerberos.principal. |
ozone.om.kerberos.principal | OM/_HOST@REALM | OZONE, SECURITY, KERBEROS | The OzoneManager service principal. Ex om/_HOST@REALM.COM |
ozone.om.kerberos.principal.pattern | * | A client-side RegEx that can be configured to control allowed realms to authenticate with (useful in cross-realm env.) | |
ozone.om.key.path.lock.enabled | false | OZONE, OM | Defaults to false. If true, the fine-grained KEY_PATH_LOCK functionality is enabled. If false, it is disabled. |
ozone.om.keyname.character.check.enabled | false | OM, OZONE | If true, then enable to check if the key name contains illegal characters when creating/renaming key. For the definition of illegal characters, follow the rules in Amazon S3's object key naming guide. |
ozone.om.leader.election.minimum.timeout.duration | 5s | OZONE, OM, RATIS, MANAGEMENT, DEPRECATED | DEPRECATED. Leader election timeout uses ratis rpc timeout which can be set via ozone.om.ratis.minimum.timeout. |
ozone.om.lease.hard.limit | 7d | OZONE, OM, PERFORMANCE | Controls how long an open hsync key is considered as active. Specifically, if a hsync key has been open longer than the value of this config entry, that open hsync key is considered as expired (e.g. due to client crash). Unit could be defined with postfix (ns,ms,s,m,h,d) |
ozone.om.lease.soft.limit | 60s | OZONE, OM | Hsync soft limit lease period. |
ozone.om.lock.fair | false | If this is true, the Ozone Manager lock will be used in Fair mode, which will schedule threads in the order received/queued. If this is false, uses non-fair ordering. See java.util.concurrent.locks.ReentrantReadWriteLock for more information on fair/non-fair locks. | |
ozone.om.max.buckets | 100000 | OZONE, OM | Maximum number of buckets across all volumes. |
ozone.om.multitenancy.enabled | false | OZONE, OM | Enable S3 Multi-Tenancy. If disabled, all S3 multi-tenancy requests are rejected. |
ozone.om.multitenancy.ranger.sync.interval | 10m | OZONE, OM | Determines how often the Multi-Tenancy Ranger background sync thread service should run. The background thread periodically checks Ranger policies and roles created by the Multi-Tenancy feature, and overwrites them if obvious discrepancies are detected. Value should be set with a unit suffix (ns,ms,s,m,h,d) |
ozone.om.multitenancy.ranger.sync.timeout | 10s | OZONE, OM | The timeout for each Multi-Tenancy Ranger background sync thread run. If the timeout has been reached, a warning message will be logged. |
ozone.om.namespace.s3.strict | true | OZONE, OM | The Ozone namespace follows S3 naming rules by default. However, this parameter allows the namespace to support non-S3-compatible characters. |
ozone.om.network.topology.refresh.duration | 1h | SCM, OZONE, OM | The duration at which we periodically fetch the updated network topology cluster tree from SCM. |
ozone.om.node.id | OM, HA | The ID of this OM node. If the OM node ID is not configured it is determined automatically by matching the local node's address with the configured address. If node ID is not deterministic from the configuration, then it is set to default node id - om1. | |
ozone.om.nodes.EXAMPLEOMSERVICEID | OM, HA | Comma-separated list of OM node Ids for a given OM service ID (eg. EXAMPLEOMSERVICEID). The OM service ID should be the value (one of the values if there are multiple) set for the parameter ozone.om.service.ids. Decommissioned nodes (represented by node Ids in ozone.om.decommissioned.nodes config list) will be ignored and not included in the OM HA setup even if added to this list. Unique identifiers for each OM Node, delimited by commas. This will be used by OzoneManagers in HA setup to determine all the OzoneManagers belonging to the same OMservice in the cluster. For example, if you used “omService1” as the OM service ID previously, and you wanted to use “om1”, “om2” and "om3" as the individual IDs of the OzoneManagers, you would configure a property ozone.om.nodes.omService1, and its value "om1,om2,om3". | |
ozone.om.object.creation.ignore.client.acls | false | OM, SECURITY | Ignore ACLs sent by client to OzoneManager during volume/bucket/key creation. |
ozone.om.open.key.cleanup.limit.per.task | 1000 | OZONE, OM, PERFORMANCE | The maximum number of open keys to be identified as expired and marked for deletion by one run of the open key cleanup service on the OM. This property is used to throttle the actual number of open key deletions on the OM. |
ozone.om.open.key.cleanup.service.interval | 24h | OZONE, OM, PERFORMANCE | A background job that periodically checks open key entries and marks expired open keys for deletion. This entry controls the interval of this cleanup check. Unit could be defined with postfix (ns,ms,s,m,h,d) |
ozone.om.open.key.cleanup.service.timeout | 300s | OZONE, OM, PERFORMANCE | A timeout value of open key cleanup service. If this is set greater than 0, the service will stop waiting for the open key deletion completion after this time. If timeout happens to a large proportion of open key deletions, this value needs to be increased or ozone.om.open.key.cleanup.limit.per.task should be decreased. Unit could be defined with postfix (ns,ms,s,m,h,d) |
ozone.om.open.key.expire.threshold | 7d | OZONE, OM, PERFORMANCE | Controls how long an open key operation is considered active. Specifically, if a key has been open longer than the value of this config entry, that open key is considered as expired (e.g. due to client crash). Unit could be defined with postfix (ns,ms,s,m,h,d) |
ozone.om.open.mpu.cleanup.service.interval | 24h | OZONE, OM, PERFORMANCE | A background job that periodically checks inactive multipart uploads and sends multipart upload abort requests for them. This entry controls the interval of this cleanup check. Unit could be defined with postfix (ns,ms,s,m,h,d) |
ozone.om.open.mpu.cleanup.service.timeout | 300s | OZONE, OM, PERFORMANCE | A timeout value of multipart upload cleanup service. If this is set greater than 0, the service will stop waiting for the multipart info abort completion after this time. If timeout happens to a large proportion of multipart aborts, this value needs to be increased or ozone.om.open.key.cleanup.limit.per.task should be decreased. Unit could be defined with postfix (ns,ms,s,m,h,d) |
ozone.om.open.mpu.expire.threshold | 30d | OZONE, OM, PERFORMANCE | Controls how long multipart upload is considered active. Specifically, if a multipart info has been ongoing longer than the value of this config entry, that multipart info is considered as expired (e.g. due to client crash). Unit could be defined with postfix (ns,ms,s,m,h,d) |
ozone.om.open.mpu.parts.cleanup.limit.per.task | 1000 | OZONE, OM, PERFORMANCE | The maximum number of parts to be cleaned up per task, rounded up to the nearest whole expired multipart upload. This property is used to approximately throttle the number of MPU parts sent to the OM. |
ozone.om.ratis.log.appender.queue.byte-limit | 32MB | OZONE, DEBUG, OM, RATIS | Byte limit for Raft's Log Worker queue. |
ozone.om.ratis.log.appender.queue.num-elements | 1024 | OZONE, DEBUG, OM, RATIS | Number of operations pending with Raft's Log Worker. |
ozone.om.ratis.log.purge.gap | 1000000 | OZONE, OM, RATIS | The minimum gap between log indices for Raft server to purge its log segments after taking snapshot. |
ozone.om.ratis.log.purge.preservation.log.num | 0 | OZONE, OM, RATIS | The number of latest Raft logs to not be purged after taking snapshot. |
ozone.om.ratis.log.purge.upto.snapshot.index | true | OZONE, OM, RATIS | Enable/disable Raft server to purge its log up to the snapshot index after taking snapshot. |
ozone.om.ratis.minimum.timeout | 5s | OZONE, OM, RATIS, MANAGEMENT | The minimum timeout duration for OM's Ratis server rpc. |
ozone.om.ratis.port | 9872 | OZONE, OM, RATIS | The port number of the OzoneManager's Ratis server. |
ozone.om.ratis.rpc.type | GRPC | OZONE, OM, RATIS, MANAGEMENT | Ratis supports different kinds of transports like netty, GRPC, Hadoop RPC etc. This picks one of those for this cluster. |
ozone.om.ratis.segment.preallocated.size | 4MB | OZONE, OM, RATIS, PERFORMANCE | The size of the buffer which is preallocated for raft segment used by Apache Ratis on OM. (4 MB by default) |
ozone.om.ratis.segment.size | 64MB | OZONE, OM, RATIS, PERFORMANCE | The size of the raft segment used by Apache Ratis on OM. (64 MB by default) |
ozone.om.ratis.server.close.threshold | 60s | OZONE, OM, RATIS | Raft Server will close if a JVM pause lasts longer than the threshold. |
ozone.om.ratis.server.failure.timeout.duration | 120s | OZONE, OM, RATIS, MANAGEMENT | The timeout duration for ratis server failure detection. Once the threshold has been reached, the ratis state machine will be informed about the failure in the ratis ring. |
ozone.om.ratis.server.leaderelection.pre-vote | true | OZONE, OM, RATIS, MANAGEMENT | Enable/disable OM HA leader election pre-vote phase. |
ozone.om.ratis.server.pending.write.element-limit | 4096 | OZONE, DEBUG, OM, RATIS | Maximum number of pending write requests. |
ozone.om.ratis.server.request.timeout | 3s | OZONE, OM, RATIS, MANAGEMENT | The timeout duration for OM's ratis server request. |
ozone.om.ratis.server.retry.cache.timeout | 600000ms | OZONE, OM, RATIS, MANAGEMENT | Retry Cache entry timeout for OM's ratis server. |
ozone.om.ratis.snapshot.dir | OZONE, OM, STORAGE, MANAGEMENT, RATIS | This directory is used for storing OM's snapshot related files like the ratisSnapshotIndex and DB checkpoint from leader OM. If undefined, OM snapshot dir will fallback to ozone.metadata.dirs. This fallback approach is not recommended for production environments. | |
ozone.om.ratis.snapshot.max.total.sst.size | 10737418240 | OZONE, OM, RATIS | Max size of SST files in OM Ratis Snapshot tarball. |
ozone.om.ratis.storage.dir | OZONE, OM, STORAGE, MANAGEMENT, RATIS | This directory is used for storing OM's Ratis metadata like logs. Ideally, this should be mapped to a fast disk like an SSD. If undefined, the OM ratis storage dir will fall back to ozone.metadata.dirs and a warning will be logged. This fallback approach is not recommended for production environments. | |
ozone.om.read.threadpool | 10 | OM, PERFORMANCE | The number of threads in RPC server reading from the socket for OM service endpoints. This config overrides Hadoop configuration "ipc.server.read.threadpool.size" for Ozone Manager. |
ozone.om.s3.grpc.server_enabled | true | OZONE, OM, S3GATEWAY | Property to enable or disable Ozone Manager gRPC endpoint for clients. Right now, it is used by S3 Gateway only. |
ozone.om.save.metrics.interval | 5m | OZONE, OM | Time interval used to store the omMetrics in to a file. Background thread periodically stores the OM metrics in to a file. Unit could be defined with postfix (ns,ms,s,m,h,d) |
ozone.om.security.admin.protocol.acl | * | SECURITY | Comma separated list of users and groups allowed to access ozone manager admin protocol. |
ozone.om.security.client.protocol.acl | * | SECURITY | Comma separated list of users and groups allowed to access client ozone manager protocol. |
ozone.om.server.list.max.size | 1000 | OM, OZONE | Configures the maximum server-side response size for list calls on the OM. |
ozone.om.service.ids | OM, HA | Comma-separated list of OM service IDs. This property allows the client to figure out the quorum of OzoneManager addresses. | |
ozone.om.snapshot.cache.cleanup.service.run.interval | 1m | OZONE, OM | Interval at which snapshot cache clean up will run. Uses millisecond by default when no time unit is specified. |
ozone.om.snapshot.cache.max.size | 10 | OZONE, OM | Size of the OM Snapshot LRU cache. This is a soft limit of open OM Snapshot RocksDB instances that will be held. The actual number of cached instance could exceed this limit if more than this number of snapshot instances are still in-use by snapDiff or other tasks. |
ozone.om.snapshot.checkpoint.dir.creation.poll.timeout | 20s | OZONE, PERFORMANCE, OM | Max poll timeout for snapshot dir exists check performed before loading a snapshot in cache. Unit defaults to millisecond if a unit is not specified. |
ozone.om.snapshot.compact.non.snapshot.diff.tables | false | OZONE, OM, PERFORMANCE | Enable or disable compaction of tables that are not tracked by snapshot diff when their snapshots are evicted from cache. |
ozone.om.snapshot.compaction.dag.max.time.allowed | 30d | OZONE, OM | Maximum time a snapshot is allowed to be in compaction DAG before it gets pruned out by pruning daemon. Uses millisecond by default when no time unit is specified. |
ozone.om.snapshot.compaction.dag.prune.daemon.run.interval | 10m | OZONE, OM | Interval at which compaction DAG pruning daemon thread is running to remove older snapshots with compaction history from compaction DAG. Uses millisecond by default when no time unit is specified. |
ozone.om.snapshot.db.max.open.files | 100 | OZONE, OM | Max number of open files for each snapshot db present in the snapshot cache. Essentially sets max_open_files config for RocksDB instances opened for Ozone snapshots. This will limit the total number of files opened by a snapshot db thereby limiting the total number of open file handles by snapshot dbs. Max total number of open handles = (snapshot cache size * max open files) |
ozone.om.snapshot.diff.cleanup.service.run.interval | 1m | OZONE, OM | Interval at which snapshot diff clean up service will run. Uses millisecond by default when no time unit is specified. |
ozone.om.snapshot.diff.cleanup.service.timeout | 5m | OZONE, OM | Timeout for snapshot diff clean up service. Uses millisecond by default when no time unit is specified. |
ozone.om.snapshot.diff.db.dir | OZONE, OM | Directory where the OzoneManager stores the snapshot diff related data. This should be specified as a single directory. If the directory does not exist then the OM will attempt to create it. If undefined, then the OM will log a warning and fallback to ozone.metadata.dirs. This fallback approach is not recommended for production environments. | |
ozone.om.snapshot.diff.disable.native.libs | false | OZONE, OM | Flag to perform snapshot diff without using native libs (can be slow). |
ozone.om.snapshot.diff.job.default.wait.time | 1m | OZONE, OM | Default wait time returned to client to wait before retrying snap diff request. Uses millisecond by default when no time unit is specified. |
ozone.om.snapshot.diff.job.report.persistent.time | 7d | OZONE, OM | Maximum time a successful snapshot diff job and its report will be persisted. Uses millisecond by default when no time unit is specified. |
ozone.om.snapshot.diff.max.allowed.keys.changed.per.job | 10000000 | OZONE, OM | Max numbers of keys changed allowed for a snapshot diff job. |
ozone.om.snapshot.diff.max.jobs.purge.per.task | 100 | OZONE, OM | Maximum number of snapshot diff jobs to be purged per snapDiff clean up run. |
ozone.om.snapshot.diff.max.page.size | 1000 | OZONE, OM | Maximum number of entries to be returned in a single page of snap diff report. |
ozone.om.snapshot.diff.thread.pool.size | 10 | OZONE, OM | Maximum number of concurrent snapshot diff jobs allowed. |
ozone.om.snapshot.directory.metrics.update.interval | 5m | OZONE, OM | Time interval used to update the space consumption stats of the Ozone Manager snapshot directories. Background thread periodically calculates and updates these stats. Unit could be defined with postfix (ns,ms,s,m,h,d) |
ozone.om.snapshot.force.full.diff | false | OZONE, OM | Flag to always perform full snapshot diff (can be slow) without using the optimised compaction DAG. |
ozone.om.snapshot.load.native.lib | true | OZONE, OM | Load native library for performing optimized snapshot diff. |
ozone.om.snapshot.local.data.manager.service.interval | 5m | Interval for cleaning up orphan snapshot local data versions corresponding to snapshots. | |
ozone.om.snapshot.provider.connection.timeout | 5000s | OZONE, OM, HA, MANAGEMENT | Connection timeout for HTTP call made by OM Snapshot Provider to request OM snapshot from OM Leader. |
ozone.om.snapshot.provider.request.timeout | 300000ms | OZONE, OM, HA, MANAGEMENT | Connection request timeout for HTTP call made by OM Snapshot Provider to request OM snapshot from OM Leader. |
ozone.om.snapshot.provider.socket.timeout | 5000s | OZONE, OM, HA, MANAGEMENT | Socket timeout for HTTP call made by OM Snapshot Provider to request OM snapshot from OM Leader. |
ozone.om.snapshot.prune.compaction.backup.batch.size | 2000 | OZONE, OM | Number of SST files in the compaction backup directory to prune per batch, every ozone.om.snapshot.compaction.dag.prune.daemon.run.interval. |
ozone.om.snapshot.rocksdb.metrics.enabled | false | OZONE, OM | Enable/disable collecting RocksDBStore metrics for snapshot DBs. Disabled by default. |
ozone.om.transport.class | org.apache.hadoop.ozone.om.protocolPB.Hadoop3OmTransportFactory | OM, MANAGEMENT | Property to determine the transport protocol for the client to Ozone Manager channel. |
ozone.om.unflushed.transaction.max.count | 10000 | OZONE, OM | Unflushed transactions are requests that have been applied to the OM state machine but not yet flushed to the OM RocksDB. When the OM faces high concurrency pressure and flushing is not fast enough, too many pending requests held in memory can lead to long OM GC pauses, which slow down flushing further. Flushing may be slow when, for example: (1) RocksDB is on an HDD, which has poorer IO performance than an SSD; (2) a big compaction is happening internally in RocksDB and a RocksDB write stall occurs; (3) a long GC occurs, possibly caused by other factors. This property limits the maximum count of unflushed transactions, so that the maximum memory occupied by them is bounded. |
ozone.om.upgrade.finalization.ratis.based.timeout | 30s | OM, UPGRADE | Maximum time to wait for a slow follower to be finalized through a Ratis snapshot. This is an advanced config, and needs to be changed only under a special circumstance when the leader OM has purged the finalize request from its logs, and a follower OM was down during upgrade finalization. Default is 30s. |
ozone.om.upgrade.quota.recalculate.enabled | true | OZONE, OM | Triggers quota recalculation when upgrading to the layout version QUOTA. During the upgrade, re-calculation of quota used will block write operations to existing buckets until the operation is completed. |
ozone.om.user.max.volume | 1024 | OM, MANAGEMENT | The maximum number of volumes a user can have on a cluster. Increasing or decreasing this number has no real impact on the Ozone cluster; it is defined only for operational purposes. Only an administrator can create a volume; once a volume is created, there are no restrictions on the number of buckets, or keys inside each bucket, a user can create. |
ozone.om.user.rights | ALL | OM, SECURITY | Default user permissions set for an object in OzoneManager. |
ozone.om.volume.listall.allowed | true | OM, MANAGEMENT | Allows everyone to list all volumes when set to true. Defaults to true. When set to false, non-admin users can only list the volumes they have access to. Admins can always list all volumes. Note that this config only applies to OzoneNativeAuthorizer. For other authorizers, admin needs to set policies accordingly to allow all volume listing e.g. for Ranger, a new policy with special volume "/" can be added to allow group public LIST access. |
ozone.path.deleting.limit.per.task | 20000 | OZONE, PERFORMANCE, OM | The maximum number of paths (dirs/files) to be deleted by the directory deleting service per time interval. |
ozone.readonly.administrators | Ozone read-only admin users, delimited by commas. If set, users in this list are allowed to perform read operations, skipping checkAccess. | | |
ozone.readonly.administrators.groups | Ozone read-only admin groups, delimited by commas. If set, members of groups in this list are allowed to perform read operations, skipping checkAccess. | | |
ozone.recon.address | RECON, MANAGEMENT | RPC address of Recon Server. If not set, datanodes will not configure Recon Server. | |
ozone.recon.administrators | RECON, SECURITY | Recon administrator users delimited by a comma. This is the list of users who can access admin only information from recon. Users defined in ozone.administrators will always be able to access all recon information regardless of this setting. | |
ozone.recon.administrators.groups | RECON, SECURITY | Recon administrator groups delimited by a comma. This is the list of groups who can access admin only information from recon. It is enough to either have the name defined in ozone.recon.administrators or be directly or indirectly in a group defined in this property. | |
ozone.recon.containerkey.flush.db.max.threshold | 150000 | OZONE, RECON, PERFORMANCE | Maximum threshold number of entries to hold in memory for Container Key Mapper task in hashmap before flushing to recon rocks DB containerKeyTable |
ozone.recon.db.dir | OZONE, RECON, STORAGE, PERFORMANCE | Directory where the Recon Server stores its metadata. This should be specified as a single directory. If the directory does not exist then the Recon will attempt to create it. If undefined, then the Recon will log a warning and fallback to ozone.metadata.dirs. This fallback approach is not recommended for production environments. | |
ozone.recon.db.dirs.permissions | 700 | Permissions for the metadata directories for Recon. The permissions can either be octal or symbolic. If the default permissions are not set then the default value of 700 will be used. | |
ozone.recon.dn.metrics.collection.minimum.api.delay | 30s | OZONE, RECON, DN | Minimum delay in API to start a new task for Jmx collection. It behaves like a rate limiter to avoid unnecessary task creation. |
ozone.recon.dn.metrics.collection.timeout | 10m | OZONE, RECON, DN | Maximum time taken for the api to complete. If it exceeds pending tasks will be cancelled. |
ozone.recon.filesizecount.flush.db.max.threshold | 200000 | OZONE, RECON, PERFORMANCE | Maximum threshold number of entries to hold in memory for File Size Count task in hashmap before flushing to recon derby DB |
ozone.recon.heatmap.enable | false | OZONE, RECON | To enable/disable recon heatmap feature. Along with this config, user must also provide the implementation of "org.apache.hadoop.ozone.recon.heatmap.IHeatMapProvider" interface and configure in "ozone.recon.heatmap.provider" configuration. |
ozone.recon.heatmap.provider | OZONE, RECON | Fully qualified heatmap provider implementation class name. If this value is not set, then HeatMap feature will be disabled and not exposed in Recon UI. Please refer Ozone doc for more details regarding the implementation of "org.apache.hadoop.ozone.recon.heatmap.IHeatMapProvider" interface. | |
ozone.recon.http-address | 0.0.0.0:9888 | RECON, MANAGEMENT | The address and the base port where the Recon web UI will listen on. If the port is 0, then the server will start on a free port. However, it is best to specify a well-known port, so it is easy to connect and see the Recon management UI. |
ozone.recon.http-bind-host | 0.0.0.0 | RECON, MANAGEMENT | The actual address the Recon server will bind to. If this optional address is set, it overrides only the hostname portion of ozone.recon.http-address. |
ozone.recon.http.auth.kerberos.keytab | /etc/security/keytabs/HTTP.keytab | RECON, SECURITY, KERBEROS | The keytab file for HTTP Kerberos authentication in Recon. |
ozone.recon.http.auth.kerberos.principal | HTTP/_HOST@REALM | RECON, SECURITY, KERBEROS | The server principal used by Ozone Recon server. This is typically set to HTTP/_HOST@REALM.TLD The SPNEGO server principal begins with the prefix HTTP/ by convention. |
ozone.recon.http.auth.type | simple | RECON, SECURITY, KERBEROS | simple or kerberos. If kerberos is set, SPNEGO will be used for http authentication. |
ozone.recon.http.enabled | true | RECON, MANAGEMENT | Property to enable or disable Recon web user interface. |
ozone.recon.https-address | 0.0.0.0:9889 | RECON, MANAGEMENT, SECURITY | The address and the base port where the Recon web UI will listen on using HTTPS. If the port is 0 then the server will start on a free port. |
ozone.recon.https-bind-host | 0.0.0.0 | RECON, MANAGEMENT, SECURITY | The actual address the Recon web server will bind to using HTTPS. If this optional address is set, it overrides only the hostname portion of ozone.recon.https-address. |
ozone.recon.kerberos.keytab.file | | SECURITY, RECON, OZONE | The keytab file used by Recon daemon to login as its service principal.
ozone.recon.kerberos.principal | | SECURITY, RECON, OZONE | This Kerberos principal is used by the Recon service.
ozone.recon.nssummary.flush.db.max.threshold | 150000 | OZONE, RECON, PERFORMANCE | Maximum number of entries the NSSummary task holds in an in-memory hashmap before flushing to the Recon RocksDB namespaceSummaryTable.
ozone.recon.om.connection.request.timeout | 5000 | OZONE, RECON, OM | Connection request timeout in milliseconds for HTTP call made by Recon to request OM DB snapshot. |
ozone.recon.om.connection.timeout | 5s | OZONE, RECON, OM | Connection timeout in milliseconds for the HTTP call made by Recon to request the OM snapshot.
ozone.recon.om.db.dir | | OZONE, RECON, STORAGE | Directory where the Recon Server stores its OM snapshot DB. This should be specified as a single directory. If the directory does not exist then Recon will attempt to create it. If undefined, then Recon will log a warning and fall back to ozone.metadata.dirs. This fallback approach is not recommended for production environments.
ozone.recon.om.event.buffer.capacity | 20000 | OZONE, RECON, OM, PERFORMANCE | Maximum capacity of the event buffer used by Recon to queue OM delta updates during task reinitialization. When tasks are being reprocessed on staging DB, this buffer holds incoming delta updates to prevent blocking the OM sync process. If the buffer overflows, task reinitialization will be triggered. |
ozone.recon.om.snapshot.task.flush.param | false | OZONE, RECON, OM | Request to flush the OM DB before taking checkpoint snapshot. |
ozone.recon.om.snapshot.task.initial.delay | 1m | OZONE, RECON, OM | Initial delay in MINUTES by Recon to request OM DB Snapshot. |
ozone.recon.om.snapshot.task.interval.delay | 5s | OZONE, RECON, OM | Interval in SECONDS by Recon to request OM DB Snapshot. |
ozone.recon.om.socket.timeout | 5s | OZONE, RECON, OM | Socket timeout in milliseconds for HTTP call made by Recon to request OM snapshot. |
ozone.recon.scm.connection.request.timeout | 5s | OZONE, RECON, SCM | Connection request timeout in milliseconds for HTTP call made by Recon to request SCM DB snapshot. |
ozone.recon.scm.connection.timeout | 5s | OZONE, RECON, SCM | Connection timeout in milliseconds for the HTTP call made by Recon to request the SCM snapshot.
ozone.recon.scm.container.threshold | 100 | OZONE, RECON, SCM | Threshold value for the difference in number of containers in SCM and RECON. |
ozone.recon.scm.snapshot.enabled | true | OZONE, RECON, SCM | If enabled, SCM DB Snapshot is taken by Recon. |
ozone.recon.scm.snapshot.task.initial.delay | 1m | OZONE, MANAGEMENT, RECON | Initial delay in MINUTES by Recon to request SCM DB Snapshot. |
ozone.recon.scm.snapshot.task.interval.delay | 24h | OZONE, MANAGEMENT, RECON | Interval at which Recon requests an SCM DB snapshot.
ozone.recon.scmclient.failover.max.retry | 3 | OZONE, RECON, SCM | Max retry count for SCM Client when failover happens. |
ozone.recon.scmclient.max.retry.timeout | 6s | OZONE, RECON, SCM | Max retry timeout for SCM Client when Recon connects to SCM. This config is used to dynamically compute the max retry count for SCM Client when failover happens. Check the SCMClientConfig class getRetryCount method. |
ozone.recon.scmclient.rpc.timeout | 1m | OZONE, RECON, SCM | RpcClient timeout on waiting for the response from SCM when Recon connects to SCM. |
ozone.recon.security.client.datanode.container.protocol.acl | * | SECURITY, RECON, OZONE | Comma-separated ACLs (users, groups) allowing clients to access the datanode container protocol.
ozone.recon.sql.db.auto.commit | true | STORAGE, RECON, OZONE | Sets the Ozone Recon database connection property of auto-commit to true/false. |
ozone.recon.sql.db.conn.idle.max.age | 3600s | STORAGE, RECON, OZONE | Sets maximum time to live for idle connection in seconds. |
ozone.recon.sql.db.conn.idle.test | SELECT 1 | STORAGE, RECON, OZONE | The query to send to the DB to maintain keep-alives and test for dead connections. |
ozone.recon.sql.db.conn.idle.test.period | 60s | STORAGE, RECON, OZONE | The interval in seconds at which idle connections are tested using the keep-alive query.
ozone.recon.sql.db.conn.max.active | 5 | STORAGE, RECON, OZONE | The max active connections to the SQL database. |
ozone.recon.sql.db.conn.max.age | 1800s | STORAGE, RECON, OZONE | Sets maximum time a connection can be active in seconds. |
ozone.recon.sql.db.conn.timeout | 30000ms | STORAGE, RECON, OZONE | Sets time in milliseconds before call to getConnection is timed out. |
ozone.recon.sql.db.driver | org.apache.derby.jdbc.EmbeddedDriver | STORAGE, RECON, OZONE | Recon SQL DB driver class. Defaults to Derby. |
ozone.recon.sql.db.jdbc.url | jdbc:derby:${ozone.recon.db.dir}/ozone_recon_derby.db | STORAGE, RECON, OZONE | Ozone Recon SQL database jdbc url. |
ozone.recon.sql.db.jooq.dialect | DERBY | STORAGE, RECON, OZONE | Recon internally uses Jooq to talk to its SQL DB. By default, we support Derby and Sqlite out of the box. Please refer to https://www.jooq.org/javadoc/latest/org.jooq/org/jooq/SQLDialect.html to specify a different dialect.
ozone.recon.sql.db.password | | STORAGE, RECON, OZONE | Ozone Recon SQL database password.
ozone.recon.sql.db.username | | STORAGE, RECON, OZONE | Ozone Recon SQL database username.
ozone.recon.task.containercounttask.interval | 60s | RECON, OZONE | The time interval to wait between each run of the container count task.
ozone.recon.task.missingcontainer.interval | 300s | RECON, OZONE | The time interval of the periodic check for unhealthy containers in the cluster as reported by Datanodes. |
ozone.recon.task.pipelinesync.interval | 300s | RECON, OZONE | The time interval of periodic sync of pipeline state from SCM to Recon. |
ozone.recon.task.reprocess.max.iterators | 5 | OZONE, RECON, PERFORMANCE | Maximum number of iterator threads to use for parallel table iteration during reprocess |
ozone.recon.task.reprocess.max.keys.in.memory | 2000 | OZONE, RECON, PERFORMANCE | Maximum number of keys to batch in memory before handing to worker threads during parallel reprocess |
ozone.recon.task.reprocess.max.workers | 20 | OZONE, RECON, PERFORMANCE | Maximum number of worker threads to use for parallel table processing during reprocess |
ozone.recon.task.safemode.wait.threshold | 300s | RECON, OZONE | The time interval to wait for starting the container health task and pipeline sync task before Recon exits safe or warmup mode.
ozone.recon.task.thread.count | 1 | OZONE, RECON | The number of Recon Tasks that are waiting on updates from OM. |
ozone.replication.allowed-configs | ^((STANDALONE|RATIS)/(ONE|THREE))|(EC/(3-2|6-3|10-4)-(512|1024|2048|4096)k)$ | STORAGE | Regular expression to restrict enabled replication schemes |
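The allowed-configs value above is an anchored regular expression. As a quick sketch (Python assumed; the sample strings are my own, not from the Ozone docs), this shows which replication config strings the default pattern accepts:

```python
import re

# Pattern copied verbatim from the ozone.replication.allowed-configs default above.
ALLOWED = re.compile(
    r"^((STANDALONE|RATIS)/(ONE|THREE))|(EC/(3-2|6-3|10-4)-(512|1024|2048|4096)k)$"
)

def is_allowed(config: str) -> bool:
    """Return True if the replication config string matches the allow-list."""
    return ALLOWED.match(config) is not None

for sample in ["RATIS/THREE", "STANDALONE/ONE", "EC/3-2-1024k",
               "EC/2-1-1024k", "RATIS/TWO"]:
    print(sample, is_allowed(sample))
```

Note that an EC config such as EC/2-1-1024k is rejected because only the 3-2, 6-3, and 10-4 data-parity schemes appear in the default pattern.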
ozone.rest.client.http.connection.max | 100 | OZONE, CLIENT | This defines the overall connection limit for the connection pool used in RestClient. |
ozone.rest.client.http.connection.per-route.max | 20 | OZONE, CLIENT | This defines the connection limit per one HTTP route/host. Total max connection is limited by ozone.rest.client.http.connection.max property. |
ozone.s3.administrators | | OZONE, SECURITY | S3 administrator users delimited by a comma. This is the list of users who can access admin-only information from S3. If this property is empty then ozone.administrators will be able to access all S3 information regardless of this setting.
ozone.s3.administrators.groups | | OZONE, SECURITY | S3 administrator groups delimited by a comma. This is the list of groups who can access admin-only information from S3. It is enough to either have the name defined in ozone.s3.administrators or be directly or indirectly in a group defined in this property.
ozone.s3g.client.buffer.size | 4MB | OZONE, S3GATEWAY | The size of the buffer used for reading blocks (4MB by default).
ozone.s3g.default.bucket.layout | OBJECT_STORE | OZONE, S3GATEWAY | The bucket layout that will be used when buckets are created through the S3 API. |
ozone.s3g.domain.name | | OZONE, S3GATEWAY | List of Ozone S3Gateway domain names. If multiple domain names are provided, they should be comma-separated. This parameter is only required when the virtual host style pattern is used.
ozone.s3g.http-address | 0.0.0.0:9878 | OZONE, S3GATEWAY | The address and the base port where the Ozone S3Gateway Server will listen on. |
ozone.s3g.http-bind-host | 0.0.0.0 | OZONE, S3GATEWAY | The actual address the HTTP server will bind to. If this optional address is set, it overrides only the hostname portion of ozone.s3g.http-address. This is useful for making the Ozone S3Gateway HTTP server listen on all interfaces by setting it to 0.0.0.0. |
ozone.s3g.http.auth.kerberos.keytab | /etc/security/keytabs/HTTP.keytab | OZONE, S3GATEWAY, SECURITY, KERBEROS | The keytab file used by the S3Gateway server to login as its service principal. |
ozone.s3g.http.auth.kerberos.principal | HTTP/_HOST@REALM | OZONE, S3GATEWAY, SECURITY, KERBEROS | The server principal used by Ozone S3Gateway server. This is typically set to HTTP/_HOST@REALM.TLD The SPNEGO server principal begins with the prefix HTTP/ by convention. |
ozone.s3g.http.auth.type | simple | S3GATEWAY, SECURITY, KERBEROS | simple or kerberos. If kerberos is set, SPNEGO will be used for http authentication. |
ozone.s3g.http.enabled | true | OZONE, S3GATEWAY | The boolean which enables the Ozone S3Gateway server.
ozone.s3g.https-address | 0.0.0.0:9879 | OZONE, S3GATEWAY | Ozone S3Gateway HTTPS server address and port. |
ozone.s3g.https-bind-host | | OZONE, S3GATEWAY | The actual address the HTTPS server will bind to. If this optional address is set, it overrides only the hostname portion of ozone.s3g.https-address. This is useful for making the Ozone S3Gateway HTTPS server listen on all interfaces by setting it to 0.0.0.0.
ozone.s3g.kerberos.keytab.file | /etc/security/keytabs/s3g.keytab | OZONE, SECURITY, KERBEROS, S3GATEWAY | The keytab file used by S3Gateway daemon to login as its service principal. The principal name is configured with ozone.s3g.kerberos.principal. |
ozone.s3g.kerberos.principal | s3g/_HOST@REALM | OZONE, SECURITY, KERBEROS, S3GATEWAY | The S3Gateway service principal. Ex: s3g/_HOST@REALM.COM |
ozone.s3g.list-keys.shallow.enabled | true | OZONE, S3GATEWAY | If this is true, calls to the S3Gateway list interface with the delimiter '/' parameter are optimized, especially when there are a large number of keys.
ozone.s3g.list.max.keys.limit | 1000 | | Maximum number of keys returned by the S3 ListObjects/ListObjectsV2 API. The AWS default is 1000. It can be overridden per deployment in ozone-site.xml.
ozone.s3g.metrics.percentiles.intervals.seconds | 60 | S3GATEWAY, PERFORMANCE | Specifies the interval in seconds for the rollover of MutableQuantiles metrics. Setting this interval equal to the metrics sampling time ensures more detailed metrics. |
ozone.s3g.secret.http.auth.type | kerberos | S3GATEWAY, SECURITY, KERBEROS | simple or kerberos. If kerberos is set, Kerberos SPNEGO will be used for http authentication.
ozone.s3g.secret.http.enabled | false | OZONE, S3GATEWAY | The boolean which enables the Ozone S3Gateway Secret endpoint. |
ozone.s3g.volume.name | s3v | OZONE, S3GATEWAY | The volume name to access through the s3gateway. |
ozone.s3g.webadmin.http-address | 0.0.0.0:19878 | OZONE, S3GATEWAY | The address and port where Ozone S3Gateway serves web content. |
ozone.s3g.webadmin.http-bind-host | 0.0.0.0 | OZONE, S3GATEWAY | The actual address the HTTP server will bind to. If this optional address is set, it overrides only the hostname portion of ozone.s3g.webadmin.http-address. This is useful for making the Ozone S3Gateway HTTP server listen on all interfaces by setting it to 0.0.0.0. |
ozone.s3g.webadmin.http.enabled | true | OZONE, S3GATEWAY | This option can be used to disable the web server which serves additional content in Ozone S3 Gateway. |
ozone.s3g.webadmin.https-address | 0.0.0.0:19879 | OZONE, S3GATEWAY | Ozone S3Gateway content server's HTTPS address and port. |
ozone.s3g.webadmin.https-bind-host | | OZONE, S3GATEWAY | The actual address the HTTPS server will bind to. If this optional address is set, it overrides only the hostname portion of ozone.s3g.webadmin.https-address. This is useful for making the Ozone S3Gateway HTTPS server listen on all interfaces by setting it to 0.0.0.0.
ozone.scm.block.client.address | | OZONE, SCM | The address of the Ozone SCM block client service. If not defined, the value of ozone.scm.client.address is used.
ozone.scm.block.client.bind.host | 0.0.0.0 | OZONE, SCM | The hostname or IP address used by the SCM block client endpoint to bind. |
ozone.scm.block.client.port | 9863 | OZONE, SCM | The port number of the Ozone SCM block client service. |
ozone.scm.block.deletion.per.dn.distribution.factor | 8 | OZONE, SCM | Factor that determines the number of delete blocks sent to each datanode in every interval. If the total number of DNs is 100 and hdds.scm.block.deletion.per-interval.max is 500000, then at most 500000/(100/8) = 40000 blocks will be sent to each DN in every interval.
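The per-DN distribution math above can be sketched as simple arithmetic. This is a hypothetical helper following the formula in the description, not Ozone's actual code:

```python
def blocks_per_datanode(per_interval_max: int, datanode_count: int,
                        distribution_factor: int = 8) -> int:
    # Max blocks sent to each DN per interval = per_interval_max / (DN count / factor),
    # per the description of ozone.scm.block.deletion.per.dn.distribution.factor.
    return int(per_interval_max / (datanode_count / distribution_factor))

# Worked example from the description: 100 DNs, 500000 blocks per interval, factor 8.
print(blocks_per_datanode(500_000, 100))  # 40000
```

Doubling the cluster to 200 DNs under the same settings halves the per-DN cap to 20000.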
ozone.scm.block.handler.count.key | 100 | OZONE, MANAGEMENT, PERFORMANCE | Used to set the number of RPC handlers when accessing blocks. The default value is 100. |
ozone.scm.block.read.threadpool | 10 | OZONE, MANAGEMENT, PERFORMANCE | The number of threads in RPC server reading from the socket when accessing blocks. This config overrides Hadoop configuration "ipc.server.read.threadpool.size" for SCMBlockProtocolServer. The default value is 10. |
ozone.scm.block.size | 256MB | OZONE, SCM | The default size of an SCM block. This maps to the default Ozone block size.
ozone.scm.ca.list.retry.interval | 10s | OZONE, SCM, OM, DATANODE | SCM client wait duration between each retry to get the SCM CA list. OM/Datanode obtain the CA list during startup and wait until the CA list size matches the SCM node count plus one (the additional certificate is the root CA certificate). If the received CA list size does not match the expected count, this is the duration to wait before making the next attempt to get the CA list.
ozone.scm.chunk.size | 4MB | OZONE, SCM, CONTAINER, PERFORMANCE | The chunk size for reading/writing chunk operations in bytes. The chunk size defaults to 4MB. If the value configured is more than the maximum size (32MB), it will be reset to the maximum size (32MB). This maps to the network packet sizes and file write operations in the client to datanode protocol. When tuning this parameter, flow control window parameter should be tuned accordingly. Refer to hdds.ratis.raft.grpc.flow.control.window for more information. |
ozone.scm.client.address | | OZONE, SCM, REQUIRED | The address of the Ozone SCM client service. This is a required setting. It is a string in the host:port format. The port number is optional and defaults to 9860.
ozone.scm.client.bind.host | 0.0.0.0 | OZONE, SCM, MANAGEMENT | The hostname or IP address used by the SCM client endpoint to bind. This setting is used by the SCM only and never used by clients. The setting can be useful in multi-homed setups to restrict the availability of the SCM client service to a specific interface. The default is appropriate for most clusters. |
ozone.scm.client.handler.count.key | 100 | OZONE, MANAGEMENT, PERFORMANCE | Used to set the number of RPC handlers used by Client to access SCM. The default value is 100. |
ozone.scm.client.port | 9860 | OZONE, SCM, MANAGEMENT | The port number of the Ozone SCM client service. |
ozone.scm.client.read.threadpool | 10 | OZONE, MANAGEMENT, PERFORMANCE | The number of threads in RPC server reading from the socket used by Client to access SCM. This config overrides Hadoop configuration "ipc.server.read.threadpool.size" for SCMClientProtocolServer. The default value is 10. |
ozone.scm.close.container.wait.duration | 150s | SCM, OZONE, RECON | Wait duration before the close container command is sent to the DN.
ozone.scm.container.layout | FILE_PER_BLOCK | OZONE, SCM, CONTAINER, PERFORMANCE | Container layout defines how chunks, blocks and containers are stored on disk. Each chunk is stored separately with FILE_PER_CHUNK. All chunks of a block are stored in the same file with FILE_PER_BLOCK. The default is FILE_PER_BLOCK. |
ozone.scm.container.list.max.count | 4096 | OZONE, SCM, CONTAINER | The maximum number of container records that can be included in the response to a ListContainer request.
ozone.scm.container.lock.stripes | 512 | OZONE, SCM, PERFORMANCE, MANAGEMENT | The number of stripes created for the container state manager lock. |
ozone.scm.container.placement.ec.impl | org.apache.hadoop.hdds.scm.container.placement.algorithms.SCMContainerPlacementRackScatter | OZONE, MANAGEMENT | The full name of class which implements org.apache.hadoop.hdds.scm.PlacementPolicy. The class decides which datanode will be used to host the container replica in EC mode. If not set, org.apache.hadoop.hdds.scm.container.placement.algorithms.SCMContainerPlacementRackScatter will be used as default value. |
ozone.scm.container.placement.impl | org.apache.hadoop.hdds.scm.container.placement.algorithms.SCMContainerPlacementRackAware | OZONE, MANAGEMENT | The full name of class which implements org.apache.hadoop.hdds.scm.PlacementPolicy. The class decides which datanode will be used to host the container replica. If not set, org.apache.hadoop.hdds.scm.container.placement.algorithms.SCMContainerPlacementRackAware will be used as default value. |
ozone.scm.container.size | 5GB | OZONE, PERFORMANCE, MANAGEMENT | Default container size used by Ozone. There are two considerations when picking this number: the speed at which a container can be replicated, determined by the network speed, and the amount of metadata each container generates. Selecting a large value creates less SCM metadata, but recovery time will be longer. 5GB maps to quick replication times in gigabit networks while still balancing the amount of metadata.
ozone.scm.datanode.address | | OZONE, MANAGEMENT | The address of the Ozone SCM service used for internal communication between the DataNodes and the SCM. It is a string in the host:port format. The port number is optional and defaults to 9861. This setting is optional. If unspecified then the hostname portion is picked from the ozone.scm.client.address setting and the default service port of 9861 is chosen.
ozone.scm.datanode.admin.monitor.interval | 30s | SCM | This sets how frequently the datanode admin monitor runs to check for nodes added to the admin workflow or removed from it. The progress of decommissioning and entering maintenance nodes is also checked to see if they have completed. |
ozone.scm.datanode.admin.monitor.logging.limit | 1000 | SCM | When a node is checked for decommission or maintenance, this setting controls how many degraded containers are logged on each pass. The limit is applied separately for each type of container, ie under-replicated and unhealthy will each have their own limit. |
ozone.scm.datanode.bind.host | | OZONE, MANAGEMENT | The hostname or IP address used by the SCM service endpoint to bind.
ozone.scm.datanode.disallow.same.peers | false | OZONE, SCM, PIPELINE | Disallows same set of datanodes to participate in multiple pipelines when set to true. Default is set to false. |
ozone.scm.datanode.handler.count.key | 100 | OZONE, MANAGEMENT, PERFORMANCE | Used to set the number of RPC handlers used by DataNode to access SCM. The default value is 100. |
ozone.scm.datanode.id.dir | | OZONE, MANAGEMENT | The path that datanodes will use to store the datanode ID. If this value is not set, then datanode ID is created under the metadata directory.
ozone.scm.datanode.pipeline.limit | 2 | OZONE, SCM, PIPELINE | Maximum number of pipelines a datanode can be engaged in. Setting the value to 0 means the pipeline limit per DN will be determined by the number of metadata volumes reported per DN.
ozone.scm.datanode.port | 9861 | OZONE, MANAGEMENT | The port number of the Ozone SCM service. |
ozone.scm.datanode.ratis.volume.free-space.min | 1GB | OZONE, DATANODE | Minimum amount of storage space required for each ratis volume on a datanode to hold a new pipeline. Datanodes with all its ratis volumes with space under this value will not be allocated a pipeline or container replica. |
ozone.scm.datanode.read.threadpool | 10 | OZONE, MANAGEMENT, PERFORMANCE | The number of threads in RPC server reading from the socket used by DataNode to access SCM. This config overrides Hadoop configuration "ipc.server.read.threadpool.size" for SCMDatanodeProtocolServer. The default value is 10. |
ozone.scm.db.dirs | | OZONE, SCM, STORAGE, PERFORMANCE | Directory where the StorageContainerManager stores its metadata. This should be specified as a single directory. If the directory does not exist then the SCM will attempt to create it. If undefined, then the SCM will log a warning and fall back to ozone.metadata.dirs. This fallback approach is not recommended for production environments.
ozone.scm.db.dirs.permissions | 700 | | Permissions for the metadata directories for Storage Container Manager. The permissions can either be octal or symbolic. If the default permissions are not set then the default value of 700 will be used.
ozone.scm.dead.node.interval | 10m | OZONE, MANAGEMENT | The interval between heartbeats before a node is tagged as dead. |
ozone.scm.default.service.id | | OZONE, SCM, HA | Service ID of the SCM. If this is not set, fall back to ozone.scm.service.ids to find the service ID it belongs to.
ozone.scm.ec.pipeline.minimum | 5 | STORAGE | The minimum number of pipelines to have open for each Erasure Coding configuration |
ozone.scm.ec.pipeline.per.volume.factor | 1 | SCM | TODO |
ozone.scm.event.ContainerReport.thread.pool.size | 10 | OZONE, SCM | Thread pool size configured to process container reports. |
ozone.scm.expired.container.replica.op.scrub.interval | 5m | OZONE, SCM, CONTAINER | SCM schedules a fixed interval job using the configured interval to scrub expired container replica operation. |
ozone.scm.grpc.port | 9895 | OZONE, SCM, HA, RATIS | The port number of the SCM's grpc server. |
ozone.scm.ha.dbtransactionbuffer.flush.interval | 60s | SCM, OZONE | Wait duration for flush of buffered transaction. |
ozone.scm.ha.grpc.deadline.interval | 30m | SCM, OZONE, HA, RATIS | Deadline for SCM DB checkpoint interval. |
ozone.scm.ha.raft.server.log.appender.wait-time.min | 0ms | OZONE, SCM, RATIS, PERFORMANCE | Minimum wait time between two appendEntries calls. |
ozone.scm.ha.raft.server.rpc.first-election.timeout | | SCM, OZONE, HA, RATIS | Ratis timeout for the first election of a leader. If not configured, fall back to ozone.scm.ha.ratis.leader.election.timeout.
ozone.scm.ha.ratis.leader.election.timeout | 5s | SCM, OZONE, HA, RATIS | The minimum timeout duration for SCM Ratis leader election.
ozone.scm.ha.ratis.leader.ready.check.interval | 2s | SCM, OZONE, HA, RATIS | The interval between ratis server performing a leader readiness check. |
ozone.scm.ha.ratis.leader.ready.wait.timeout | 60s | SCM, OZONE, HA, RATIS | The minimum timeout duration for waiting for leader readiness. |
ozone.scm.ha.ratis.log.appender.queue.byte-limit | 32MB | SCM, OZONE, HA, RATIS | Byte limit for Raft's Log Worker queue. |
ozone.scm.ha.ratis.log.appender.queue.num-elements | 1024 | SCM, OZONE, HA, RATIS | Number of operation pending with Raft's Log Worker. |
ozone.scm.ha.ratis.log.purge.enabled | false | SCM, OZONE, HA, RATIS | Whether to enable Raft log purging.
ozone.scm.ha.ratis.log.purge.gap | 1000000 | SCM, OZONE, HA, RATIS | The minimum gap between log indices for Raft server to purge its log segments after taking snapshot. |
ozone.scm.ha.ratis.request.timeout | 30s | SCM, OZONE, HA, RATIS | The timeout duration for SCM's Ratis server RPC. |
ozone.scm.ha.ratis.rpc.type | GRPC | SCM, OZONE, HA, RATIS | Ratis supports different kinds of transports like netty, GRPC, Hadoop RPC etc. This picks one of those for this cluster. |
ozone.scm.ha.ratis.segment.preallocated.size | 4MB | SCM, OZONE, HA, RATIS | The size of the buffer which is preallocated for raft segment used by Apache Ratis on SCM. (4 MB by default) |
ozone.scm.ha.ratis.segment.size | 64MB | SCM, OZONE, HA, RATIS | The size of the raft segment used by Apache Ratis on SCM. (64 MB by default) |
ozone.scm.ha.ratis.server.failure.timeout.duration | 120s | SCM, OZONE, HA, RATIS | The timeout duration for ratis server failure detection, once the threshold has reached, the ratis state machine will be informed about the failure in the ratis ring. |
ozone.scm.ha.ratis.server.leaderelection.pre-vote | true | SCM, OZONE, HA, RATIS | Enable/disable SCM HA leader election pre-vote phase. |
ozone.scm.ha.ratis.server.retry.cache.timeout | 60s | SCM, OZONE, HA, RATIS | Retry Cache entry timeout for SCM's Ratis server. |
ozone.scm.ha.ratis.server.snapshot.creation.gap | 1024 | SCM, OZONE | Raft snapshot gap index after which snapshot can be taken. |
ozone.scm.ha.ratis.snapshot.dir | | SCM, OZONE, HA, RATIS | The ratis snapshot dir location.
ozone.scm.ha.ratis.snapshot.threshold | 1000 | SCM, OZONE, HA, RATIS | The threshold to trigger a Ratis taking snapshot operation for SCM. |
ozone.scm.ha.ratis.storage.dir | | OZONE, SCM, HA, RATIS | Storage directory used by SCM to write Ratis logs.
ozone.scm.handler.count.key | 100 | OZONE, MANAGEMENT, PERFORMANCE | The number of RPC handler threads for each SCM service endpoint. The default is appropriate for small clusters (tens of nodes). Set a value that is appropriate for the cluster size. Generally, HDFS recommends that the RPC handler count be set to 20 * log2(Cluster Size) with an upper limit of 200. However, Ozone SCM will not have the same amount of traffic as the HDFS Namenode, so a value much smaller than that will work well too. To specify handlers for individual RPC servers, set the following configuration properties instead: SCMClientProtocolServer: 'ozone.scm.client.handler.count.key'; SCMBlockProtocolServer: 'ozone.scm.block.handler.count.key'; SCMDatanodeProtocolServer: 'ozone.scm.datanode.handler.count.key'.
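The HDFS sizing rule quoted above (20 * log2(cluster size), capped at 200) can be sketched as a hypothetical helper; rounding up to a whole thread count is my assumption:

```python
import math

def suggested_handler_count(cluster_size: int, cap: int = 200) -> int:
    """HDFS rule of thumb: 20 * log2(cluster size), with an upper limit (default 200)."""
    return min(cap, math.ceil(20 * math.log2(cluster_size)))

for n in (16, 100, 1000):
    print(n, suggested_handler_count(n))
```

For a 100-node cluster this yields 133 handlers; as the description notes, SCM usually needs far fewer than the HDFS Namenode.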
ozone.scm.heartbeat.log.warn.interval.count | 10 | OZONE, MANAGEMENT | Defines how frequently we will log the missing of a heartbeat to SCM. For example in the default case, we will write a warning message for each ten consecutive heartbeats that we miss to SCM. This helps in reducing clutter in a datanode log, but the trade-off is that logs will have less of this statement.
ozone.scm.heartbeat.rpc-retry-count | 15 | OZONE, MANAGEMENT | Retry count for the RPC from Datanode to SCM. The rpc-retry-interval is 1s by default. Make sure rpc-retry-count * (rpc-timeout + rpc-retry-interval) is less than hdds.heartbeat.interval. |
ozone.scm.heartbeat.rpc-retry-interval | 1s | OZONE, MANAGEMENT | Retry interval for the RPC from Datanode to SCM. Make sure rpc-retry-count * (rpc-timeout + rpc-retry-interval) is less than hdds.heartbeat.interval. |
ozone.scm.heartbeat.rpc-timeout | 5s | OZONE, MANAGEMENT | Timeout value for the RPC from Datanode to SCM. |
ozone.scm.heartbeat.thread.interval | 3s | OZONE, MANAGEMENT | When a heartbeat from a datanode arrives at SCM, it is queued for processing with the timestamp of when it arrived. A heartbeat processing thread inside SCM runs at a specified interval; this value controls how frequently that thread runs. There are assumptions built into SCM, such as that this value should allow the heartbeat processing thread to run at least three times more frequently than heartbeats and at least five times more frequently than stale node detection. If you specify a wrong value, SCM will gracefully refuse to run. For more info, look at the node manager tests in SCM. In short, you don't need to change this.
ozone.scm.http-address | 0.0.0.0:9876 | OZONE, MANAGEMENT | The address and the base port where the SCM web ui will listen on. If the port is 0 then the server will start on a free port. |
ozone.scm.http-bind-host | 0.0.0.0 | OZONE, MANAGEMENT | The actual address the SCM web server will bind to. If this optional address is set, it overrides only the hostname portion of ozone.scm.http-address. |
ozone.scm.http.enabled | true | OZONE, MANAGEMENT | Property to enable or disable SCM web ui. |
ozone.scm.https-address | 0.0.0.0:9877 | OZONE, MANAGEMENT | The address and the base port where the SCM web UI will listen on using HTTPS. If the port is 0 then the server will start on a free port. |
ozone.scm.https-bind-host | 0.0.0.0 | OZONE, MANAGEMENT | The actual address the SCM web server will bind to using HTTPS. If this optional address is set, it overrides only the hostname portion of ozone.scm.https-address. |
ozone.scm.info.wait.duration | 10m | OZONE, SCM, OM | Maximum amount of duration OM/SCM waits to get Scm Info/Scm signed cert during OzoneManager init/SCM bootstrap. |
ozone.scm.keyvalue.container.deletion-choosing.policy | org.apache.hadoop.ozone.container.common.impl.TopNOrderedContainerDeletionChoosingPolicy | OZONE, MANAGEMENT | The policy used for choosing desired keyvalue containers for block deletion. The datanode selects some containers to process block deletion in a certain interval defined by ozone.block.deleting.service.interval. The number of containers to process in each interval is defined by ozone.block.deleting.container.limit.per.interval. This property configures the policy applied while selecting containers. Two policies are supported: RandomContainerDeletionChoosingPolicy and TopNOrderedContainerDeletionChoosingPolicy. org.apache.hadoop.ozone.container.common.impl.RandomContainerDeletionChoosingPolicy implements a simple random policy that returns a random list of containers. org.apache.hadoop.ozone.container.common.impl.TopNOrderedContainerDeletionChoosingPolicy implements a policy that chooses the top N containers in descending order of pending deletion block count.
ozone.scm.names | | OZONE, REQUIRED | The value of this property is a comma-separated list of entries, each of the form DNS, DNS:PORT, IP address, or IP:PORT, e.g. scm1, scm2:8020, 7.7.7.7:7777. This property allows datanodes to discover where SCM is, so that datanodes can send heartbeats to SCM.
ozone.scm.network.topology.schema.file | network-topology-default.xml | OZONE, MANAGEMENT | The schema file defines the ozone network topology. We currently support xml(default) and yaml format. Refer to the samples in the topology awareness document for xml and yaml topology definition samples. |
ozone.scm.node.id | OZONE, SCM, HA | The ID of this SCM node. If the SCM node ID is not configured it is determined automatically by matching the local node's address with the configured address. If node ID is not deterministic from the configuration, then it is set to the scmId from the SCM version file. | |
ozone.scm.nodes.EXAMPLESCMSERVICEID | OZONE, SCM, HA | Comma-separated list of SCM node IDs for a given SCM service ID (e.g. EXAMPLESCMSERVICEID). The SCM service ID should be one of the values set for the parameter ozone.scm.service.ids. These unique identifiers, delimited by commas, are used by SCMs in an HA setup to determine all the SCMs belonging to the same SCM service in the cluster. For example, if you used "scmService1" as the SCM service ID previously, and you wanted to use "scm1", "scm2" and "scm3" as the individual IDs of the SCMs, you would configure a property ozone.scm.nodes.scmService1 with the value "scm1,scm2,scm3". | |
ozone.scm.pipeline.allocated.timeout | 5m | OZONE, SCM, PIPELINE | Timeout for a pipeline to stay in the ALLOCATED stage. When a pipeline is created, it should move to the OPEN stage once a pipeline report is successfully received by SCM. If a pipeline stays in ALLOCATED longer than this period, it is scrubbed so that a new pipeline can be created. |
ozone.scm.pipeline.creation.auto.factor.one | true | OZONE, SCM, PIPELINE | If enabled, SCM will auto create RATIS factor ONE pipeline. |
ozone.scm.pipeline.creation.interval | 120s | OZONE, SCM, PIPELINE | SCM schedules a fixed interval job using the configured interval to create pipelines. |
ozone.scm.pipeline.destroy.timeout | 66s | OZONE, SCM, PIPELINE | Once a pipeline is closed, SCM waits for this configured duration before destroying the pipeline. |
ozone.scm.pipeline.leader-choose.policy | org.apache.hadoop.hdds.scm.pipeline.leader.choose.algorithms.MinLeaderCountChoosePolicy | OZONE, SCM, PIPELINE | The policy used for choosing the desired leader for pipeline creation. Two policies are currently supported: DefaultLeaderChoosePolicy and MinLeaderCountChoosePolicy. org.apache.hadoop.hdds.scm.pipeline.leader.choose.algorithms.DefaultLeaderChoosePolicy implements a policy that chooses a leader without considering priority. org.apache.hadoop.hdds.scm.pipeline.leader.choose.algorithms.MinLeaderCountChoosePolicy implements a policy that chooses the datanode with the fewest existing leaders. In the future, policies may be added that consider: 1. resources, so that the datanode with the most abundant CPU and memory can be made the leader; 2. topology, so that the datanode nearest to the client can be made the leader. |
ozone.scm.pipeline.owner.container.count | 3 | OZONE, SCM, PIPELINE | Number of containers per owner per disk in a pipeline. |
ozone.scm.pipeline.per.metadata.disk | 2 | OZONE, SCM, PIPELINE | Number of pipelines to be created per raft log disk. |
ozone.scm.pipeline.scrub.interval | 5m | OZONE, SCM, PIPELINE | SCM schedules a fixed interval job using the configured interval to scrub pipelines. |
ozone.scm.primordial.node.id | OZONE, SCM, HA | Optional config. If set, scm --init takes effect only on the specified node, and the scm --bootstrap command is ignored there; similarly, scm --init is ignored on the non-primordial SCM nodes. The value can be either the hostname or the node ID of any of the SCM nodes. With this config set, applications/admins can safely execute init and bootstrap commands on all SCM instances. If a cluster is upgraded from non-Ratis-based to Ratis-based SCM, scm --init needs to be re-run on the primary node to switch from non-Ratis-based to Ratis-based SCM. | |
ozone.scm.ratis.pipeline.limit | 0 | OZONE, SCM, PIPELINE | Upper limit for how many pipelines can be OPEN in SCM. The default of 0 means there is no limit; otherwise, this value is the maximum number of OPEN pipelines. |
ozone.scm.ratis.port | 9894 | OZONE, SCM, HA, RATIS | The port number of the SCM's Ratis server. |
ozone.scm.security.handler.count.key | 2 | OZONE, HDDS, SECURITY | Threads configured for SCMSecurityProtocolServer. |
ozone.scm.security.read.threadpool | 1 | OZONE, HDDS, SECURITY, PERFORMANCE | The number of threads in RPC server reading from the socket when performing security related operations with SCM. This config overrides Hadoop configuration "ipc.server.read.threadpool.size" for SCMSecurityProtocolServer. The default value is 1. |
ozone.scm.security.service.address | OZONE, HDDS, SECURITY | Address of SCMSecurityProtocolServer. | |
ozone.scm.security.service.bind.host | 0.0.0.0 | OZONE, HDDS, SECURITY | SCM security server host. |
ozone.scm.security.service.port | 9961 | OZONE, HDDS, SECURITY | SCM security server port. |
ozone.scm.sequence.id.batch.size | 1000 | OZONE, SCM | SCM allocates sequence id in a batch way. This property determines how many ids will be allocated in a single batch. |
ozone.scm.service.ids | OZONE, SCM, HA | Comma-separated list of SCM service IDs. This property allows the client to figure out the quorum of SCM addresses. | |
ozone.scm.skip.bootstrap.validation | false | OZONE, SCM, HA | Optional config. When set to true, the clusterId validation against the leader SCM is skipped during bootstrap. |
ozone.scm.stale.node.interval | 5m | OZONE, MANAGEMENT | The interval for stale node flagging. Please see ozone.scm.heartbeat.thread.interval before changing this value. |
ozone.security.crypto.compliance.mode | unrestricted | OZONE, SECURITY, HDDS, CRYPTO_COMPLIANCE | Determines the security compliance mode to load, which enables filtering cryptographic configuration options according to the specified compliance mode. |
ozone.security.enabled | false | OZONE, SECURITY, KERBEROS | True if security is enabled for ozone. When this property is true, hadoop.security.authentication should be Kerberos. |
ozone.security.http.kerberos.enabled | false | OZONE, SECURITY, KERBEROS | True if Kerberos authentication for Ozone HTTP web consoles is enabled using the SPNEGO protocol. When this property is true, hadoop.security.authentication should be Kerberos and ozone.security.enabled should be set to true. |
ozone.security.reconfigure.protocol.acl | * | SECURITY | Comma separated list of users and groups allowed to access reconfigure protocol. |
ozone.server.default.replication | 3 | OZONE | Default replication value. The actual replication can be specified when writing the key; the default is used if no replication is specified when creating the key and no default replication is set on the bucket. Supported values for RATIS: 1, 3. For EC (Erasure Coding), the supported format is {ECCodec}-{DataBlocks}-{ParityBlocks}-{ChunkSize}, where ECCodec is the codec for encoding a stripe (supported values: XOR, RS (Reed-Solomon)), DataBlocks is the number of data blocks in a stripe, ParityBlocks is the number of parity blocks in a stripe, and ChunkSize is the chunk size in bytes, e.g. 1024k, 2048k. Supported combinations of {DataBlocks}-{ParityBlocks}: 3-2, 6-3, 10-4. |
ozone.server.default.replication.type | RATIS | OZONE | Default replication type to be used while writing a key into Ozone. The type can be specified when writing the key; the default is used when none is specified at key creation and no default value is set on the bucket. Supported values: RATIS, EC. |
ozone.service.shutdown.timeout | 60s | OZONE, OM, SCM, DATANODE, RECON, S3GATEWAY | Timeout to wait for each shutdown operation to complete. If a hook takes longer than this time, it will be interrupted so the service can shut down. This allows the service shutdown to recover from a blocked operation. The minimum timeout is 1 second; if a hook is configured with a smaller timeout, 1 second is used instead. |
ozone.snapshot.deep.cleaning.enabled | false | OZONE, PERFORMANCE, OM | Flag to enable/disable snapshot deep cleaning. |
ozone.snapshot.defrag.limit.per.task | 1 | OZONE, PERFORMANCE, OM | The maximum number of snapshots that would be defragmented in each task run of snapshot defragmentation service. |
ozone.snapshot.defrag.service.interval | -1 | OZONE, PERFORMANCE, OM | Task interval of snapshot defragmentation service. |
ozone.snapshot.defrag.service.timeout | 300s | OZONE, PERFORMANCE, OM | Timeout value of a run of snapshot defragmentation service. |
ozone.snapshot.deleting.limit.per.task | 10 | OZONE, PERFORMANCE, OM | The maximum number of snapshots that would be reclaimed by Snapshot Deleting Service per run. |
ozone.snapshot.deleting.service.interval | 30s | OZONE, PERFORMANCE, OM | The time interval between successive SnapshotDeletingService runs. |
ozone.snapshot.deleting.service.timeout | 300s | OZONE, PERFORMANCE, OM | Timeout value for SnapshotDeletingService. |
ozone.snapshot.directory.service.interval | 24h | OZONE, PERFORMANCE, OM, DEPRECATED | DEPRECATED. The time interval between successive SnapshotDirectoryCleaningService runs. |
ozone.snapshot.directory.service.timeout | 300s | OZONE, PERFORMANCE, OM, DEPRECATED | DEPRECATED. Timeout value for SnapshotDirectoryCleaningService. |
ozone.snapshot.filtering.limit.per.task | 2 | OZONE, PERFORMANCE, OM | The maximum number of snapshots to be filtered by the SST filtering service per time interval. |
ozone.snapshot.filtering.service.interval | 1m | OZONE, PERFORMANCE, OM | Time interval of the SST File filtering service from Snapshot. |
ozone.snapshot.key.deleting.limit.per.task | 20000 | OM, PERFORMANCE | The maximum number of deleted keys to be scanned by Snapshot Deleting Service per snapshot run. |
ozone.sst.filtering.service.timeout | 300000ms | OZONE, PERFORMANCE, OM | Timeout value for the SST filtering service. |
ozone.tracing.enabled | false | OZONE, HDDS | If true, tracing is initialized and spans may be exported (subject to sampling). |
ozone.tracing.endpoint | OZONE, HDDS | OTLP gRPC receiver endpoint URL. | |
ozone.tracing.sampler | -1 | OZONE, HDDS | Root trace sampling ratio (0.0 to 1.0). |
ozone.tracing.span.sampling | OZONE, HDDS | Optional per-span sampling: comma-separated spanName:rate entries. | |
ozone.volume.io.percentiles.intervals.seconds | 60 | OZONE, DATANODE | This setting specifies the interval (in seconds) for monitoring percentile performance metrics. It helps in tracking the read and write performance of DataNodes in real-time, allowing for better identification and analysis of performance issues. |
ozone.xceiver.client.metrics.percentiles.intervals.seconds | 60 | XCEIVER, PERFORMANCE | Specifies the interval in seconds for the rollover of XceiverClient MutableQuantiles metrics. Setting this interval equal to the metrics sampling time ensures more detailed metrics. |
recon.om.delta.update.lag.threshold | 0 | OZONE, RECON | At every Recon-OM sync, Recon starts fetching OM DB updates, and it continues fetching from OM until the lag between the OM DB WAL sequence number and Recon's OM DB snapshot WAL sequence number is less than this threshold. |
recon.om.delta.update.limit | 50000 | OZONE, RECON | The maximum number of delta updates Recon fetches from OM at a time. The actual fetched data might be larger than this limit. |
scm.container.client.idle.threshold | 10s | OZONE, PERFORMANCE | In the standalone pipelines, the SCM clients use netty to communicate with the container. It also uses connection pooling to reduce client side overheads. This allows a connection to stay idle for a while before the connection is closed. |
scm.container.client.max.size | 256 | OZONE, PERFORMANCE | Controls the maximum number of connections that are cached via client connection pooling. If the number of connections exceeds this count, then the oldest idle connection is evicted. |
ssl.server.keystore.keypassword | OZONE, SECURITY, MANAGEMENT | Keystore key password for HTTPS SSL configuration | |
ssl.server.keystore.location | OZONE, SECURITY, MANAGEMENT | Keystore location for HTTPS SSL configuration | |
ssl.server.keystore.password | OZONE, SECURITY, MANAGEMENT | Keystore password for HTTPS SSL configuration | |
ssl.server.keystore.type | jks | OZONE, SECURITY, CRYPTO_COMPLIANCE | The keystore type for HTTP Servers used in ozone. |
ssl.server.truststore.location | OZONE, SECURITY, MANAGEMENT | Truststore location for HTTPS SSL configuration | |
ssl.server.truststore.password | OZONE, SECURITY, MANAGEMENT | Truststore password for HTTPS SSL configuration | |
ssl.server.truststore.type | jks | OZONE, SECURITY, CRYPTO_COMPLIANCE | The truststore type for HTTP Servers used in ozone. |
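The SCM HA and replication keys above are normally set together in `ozone-site.xml`. The fragment below is a minimal sketch of how they might fit together; the service ID `scmService1`, node IDs `scm1`/`scm2`/`scm3`, hostnames `host1.example.com` through `host3.example.com`, and the chosen EC scheme `RS-6-3-1024k` are illustrative placeholders, not values prescribed by this table.

```xml
<!-- Hypothetical ozone-site.xml fragment: a three-node SCM HA setup with an
     EC default replication. All IDs and hostnames below are placeholders. -->
<configuration>
  <!-- ozone.scm.service.ids: the SCM service this cluster exposes. -->
  <property>
    <name>ozone.scm.service.ids</name>
    <value>scmService1</value>
  </property>
  <!-- ozone.scm.nodes.<serviceId>: node IDs belonging to that service. -->
  <property>
    <name>ozone.scm.nodes.scmService1</name>
    <value>scm1,scm2,scm3</value>
  </property>
  <!-- ozone.scm.names: comma-separated SCM addresses for datanode discovery. -->
  <property>
    <name>ozone.scm.names</name>
    <value>host1.example.com,host2.example.com,host3.example.com</value>
  </property>
  <!-- ozone.scm.primordial.node.id: only this node honors scm --init. -->
  <property>
    <name>ozone.scm.primordial.node.id</name>
    <value>scm1</value>
  </property>
  <!-- Default replication: EC with the RS codec, 6 data + 3 parity blocks,
       1024k chunks ({ECCodec}-{DataBlocks}-{ParityBlocks}-{ChunkSize}). -->
  <property>
    <name>ozone.server.default.replication.type</name>
    <value>EC</value>
  </property>
  <property>
    <name>ozone.server.default.replication</name>
    <value>RS-6-3-1024k</value>
  </property>
</configuration>
```

With a fragment like this in place, `scm --init` would run only on the primordial node (`scm1`), while `scm --bootstrap` brings up the remaining nodes of the same service.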