Version: Next

Configuration Key Appendix

This page provides a comprehensive overview of the configuration keys available in Ozone.

| Name | Default Value | Tags | Description |
|------|---------------|------|-------------|
| fs.trash.classname | org.apache.hadoop.fs.ozone.OzoneTrashPolicy | OZONE, OZONEFS, CLIENT | Trash policy to be used. |
| hadoop.hdds.db.rocksdb.WAL_size_limit_MB | 0MB | OM, SCM, DATANODE | The total size limit of WAL log files. Once the total log file size exceeds this limit, the earliest files will be deleted. Default 0 means no limit. |
| hadoop.hdds.db.rocksdb.WAL_ttl_seconds | 1200 | OM, SCM, DATANODE | The lifetime of WAL log files. Default 1200 seconds. |
| hadoop.hdds.db.rocksdb.keep.log.file.num | 10 | OM, SCM, DATANODE | Maximum number of RocksDB application log files. |
| hadoop.hdds.db.rocksdb.logging.enabled | false | OM, SCM, DATANODE | Enable/disable RocksDB logging for OM. |
| hadoop.hdds.db.rocksdb.logging.level | INFO | OM, SCM, DATANODE | OM RocksDB logging level (INFO/DEBUG/WARN/ERROR/FATAL). |
| hadoop.hdds.db.rocksdb.max.log.file.size | 100MB | OM, SCM, DATANODE | Maximum size of each RocksDB application log file. |
| hadoop.hdds.db.rocksdb.writeoption.sync | false | OM, SCM, DATANODE | Enable/disable the sync write option. If true, a write is considered complete once it is flushed to persistent storage. If false, writes are flushed asynchronously. |
| hadoop.http.authentication.kerberos.keytab | ${user.home}/httpfs.keytab | | The Kerberos keytab file with the credentials for the HTTP Kerberos principal used by HttpFS in the HTTP endpoint. httpfs.authentication.kerberos.keytab is deprecated; use hadoop.http.authentication.kerberos.keytab instead. |
| hadoop.http.authentication.kerberos.principal | HTTP/${httpfs.hostname}@${kerberos.realm} | | The HTTP Kerberos principal used by HttpFS in the HTTP endpoint. The HTTP Kerberos principal MUST start with 'HTTP/' per the Kerberos HTTP SPNEGO specification. httpfs.authentication.kerberos.principal is deprecated; use hadoop.http.authentication.kerberos.principal instead. |
| hadoop.http.authentication.signature.secret.file | ${httpfs.config.dir}/httpfs-signature.secret | | File containing the secret used to sign HttpFS hadoop-auth cookies. This file should be readable only by the system user running the HttpFS service. If multiple HttpFS servers are used in a load-balancer/round-robin fashion, they should share the secret file. If the secret file specified here does not exist, a random secret is generated at startup. httpfs.authentication.signature.secret.file is deprecated; use hadoop.http.authentication.signature.secret.file instead. |
| hadoop.http.authentication.type | simple | | Defines the authentication mechanism used by HttpFS for its HTTP clients. Valid values are 'simple' or 'kerberos'. With 'simple', HTTP clients must specify the username with the 'user.name' query string parameter. With 'kerberos', HTTP clients must use HTTP SPNEGO or delegation tokens. httpfs.authentication.type is deprecated; use hadoop.http.authentication.type instead. |
| hadoop.http.idle_timeout.ms | 60000 | OZONE, PERFORMANCE, S3GATEWAY | OM/SCM/DN/S3GATEWAY server connection timeout in milliseconds. |
| hadoop.http.max.request.header.size | 65536 | | The maximum HTTP request header size. |
| hadoop.http.max.response.header.size | 65536 | | The maximum HTTP response header size. |
| hadoop.http.max.threads | 1000 | | The maximum number of threads. |
| hadoop.http.temp.dir | ${hadoop.tmp.dir}/httpfs | | HttpFS temp directory. |
| hdds.block.token.enabled | false | OZONE, HDDS, SECURITY, TOKEN | True if block tokens are enabled, else false. |
| hdds.block.token.expiry.time | 1d | OZONE, HDDS, SECURITY, TOKEN | Default expiry time of a block token. This setting supports the same time unit suffixes as dfs.heartbeat.interval. If no suffix is specified, milliseconds are assumed. |
| hdds.command.status.report.interval | 30s | OZONE, DATANODE, MANAGEMENT | Time interval at which the datanode sends the status of commands executed since the last report. The unit can be given with a postfix (ns, ms, s, m, h, d). |
| hdds.container.action.max.limit | 20 | DATANODE | Maximum number of Container Actions sent by the datanode to SCM in a single heartbeat. |
| hdds.container.balancer.balancing.iteration.interval | 70m | BALANCER | The interval between iterations of Container Balancer. |
| hdds.container.balancer.datanodes.involved.max.percentage.per.iteration | 20 | BALANCER | Maximum percentage of healthy, in-service datanodes that can be involved in balancing in one iteration. |
| hdds.container.balancer.exclude.containers | | BALANCER | List of container IDs to exclude from balancing. For example "1, 4, 5" or "1,4,5". |
| hdds.container.balancer.exclude.datanodes | | BALANCER | A comma-separated list of Datanode hostnames or IP addresses. The Datanodes in this list are excluded from balancing. Empty by default. |
| hdds.container.balancer.include.containers | | BALANCER | List of container IDs to include in balancing. Only these containers will be balanced. For example "1, 4, 5" or "1,4,5". |
| hdds.container.balancer.include.datanodes | | BALANCER | A comma-separated list of Datanode hostnames or IP addresses. Only the Datanodes in this list are balanced. Empty by default, and applicable only if non-empty. |
| hdds.container.balancer.iterations | 10 | BALANCER | The number of iterations that Container Balancer will run for. |
| hdds.container.balancer.move.networkTopology.enable | false | BALANCER | Whether to take network topology into account when selecting a target for a source. False by default. |
| hdds.container.balancer.move.replication.timeout | 50m | BALANCER | The amount of time allowed for a single container's replication from source to target as part of a container move. For example, if hdds.container.balancer.move.timeout is 65 minutes, then 50 of those 65 minutes are the deadline for replication to complete. |
| hdds.container.balancer.move.timeout | 65m | BALANCER | The amount of time allowed for a single container to move from source to target. |
| hdds.container.balancer.size.entering.target.max | 26GB | BALANCER | The maximum size that can enter a target datanode in each iteration while balancing. This is the sum of data from multiple sources. The value must be greater than the configured (or default) ozone.scm.container.size. |
| hdds.container.balancer.size.leaving.source.max | 26GB | BALANCER | The maximum size that can leave a source datanode in each iteration while balancing. This is the sum of data moving to multiple targets. The value must be greater than the configured (or default) ozone.scm.container.size. |
| hdds.container.balancer.size.moved.max.per.iteration | 500GB | BALANCER | The maximum size of data in bytes moved by Container Balancer in one iteration. |
| hdds.container.balancer.trigger.du.before.move.enable | false | BALANCER | Whether to send a command to all healthy, in-service datanodes to run du immediately before starting a balancing iteration. Note that running du is very time consuming, especially when a datanode's disk usage is very high. |
| hdds.container.balancer.utilization.threshold | 10 | BALANCER | A percentage in the range 0 to 100. A cluster is considered balanced if, for each datanode, the utilization of the datanode (used space to capacity ratio) differs from the utilization of the cluster (used space to capacity ratio of the entire cluster) by no more than the threshold. |
| hdds.container.checksum.verification.enabled | true | OZONE, DATANODE | Enable/disable checksum verification of containers. |
| hdds.container.chunk.write.sync | false | OZONE, CONTAINER, MANAGEMENT | Determines whether chunk writes in the container happen as sync I/O or buffered I/O operations. |
| hdds.container.close.threshold | 0.9f | OZONE, DATANODE | The threshold used for closing a container. When the container's used percentage reaches this threshold, the container is closed. The value should be a positive, non-zero percentage in float notation (X.Yf), with 1.0f meaning 100%. |
| hdds.container.ipc.port | 9859 | OZONE, CONTAINER, MANAGEMENT | The IPC port number of the container. |
| hdds.container.ipc.random.port | false | OZONE, DEBUG, CONTAINER | Allocates a random free port for the Ozone container. This is used only while running unit tests. |
| hdds.container.ratis.admin.port | 9857 | OZONE, CONTAINER, PIPELINE, RATIS, MANAGEMENT | The IPC port number of the container for admin requests. |
| hdds.container.ratis.datanode.storage.dir | | OZONE, CONTAINER, STORAGE, MANAGEMENT, RATIS | The directory used for storing Ratis metadata such as logs. If this is not set, the default metadata directory is used and a warning is logged. Ideally, this should be mapped to a fast disk like an SSD. |
| hdds.container.ratis.datastream.enabled | false | OZONE, CONTAINER, RATIS, DATASTREAM | Specifies whether to enable the container data stream. |
| hdds.container.ratis.datastream.port | 9855 | OZONE, CONTAINER, RATIS, DATASTREAM | The datastream port number of the container. |
| hdds.container.ratis.datastream.random.port | false | OZONE, CONTAINER, RATIS, DATASTREAM | Allocates a random free port for the Ozone container datastream. This is used only while running unit tests. |
| hdds.container.ratis.enabled | false | OZONE, MANAGEMENT, PIPELINE, RATIS | Ozone supports different kinds of replication pipelines; Ratis is one of the replication pipelines supported by Ozone. |
| hdds.container.ratis.ipc.port | 9858 | OZONE, CONTAINER, PIPELINE, RATIS | The IPC port number of the container for clients. |
| hdds.container.ratis.ipc.random.port | false | OZONE, DEBUG | Allocates a random free Ratis port for the container. This is used only while running unit tests. |
| hdds.container.ratis.leader.pending.bytes.limit | 1GB | OZONE, RATIS, PERFORMANCE | Limit on the total bytes of pending requests, after which the leader starts rejecting requests from clients. |
| hdds.container.ratis.log.appender.queue.byte-limit | 32MB | OZONE, DEBUG, CONTAINER, RATIS | Byte limit for the Ratis leader's log appender queue. |
| hdds.container.ratis.log.appender.queue.num-elements | 1024 | OZONE, DEBUG, CONTAINER, RATIS | Limit on the number of append entries in the Ratis leader's log appender queue. |
| hdds.container.ratis.log.purge.gap | 1000000 | OZONE, DEBUG, CONTAINER, RATIS | Purge gap between the last purged commit index and the current index, when the leader decides to purge its log. |
| hdds.container.ratis.log.queue.byte-limit | 4GB | OZONE, DEBUG, CONTAINER, RATIS | Byte limit for the Ratis Log Worker queue. |
| hdds.container.ratis.log.queue.num-elements | 1024 | OZONE, DEBUG, CONTAINER, RATIS | Limit on the number of operations in the Ratis Log Worker queue. |
| hdds.container.ratis.num.container.op.executors | 10 | OZONE, RATIS, PERFORMANCE | Number of executors used by Ratis to execute container ops (10 by default). |
| hdds.container.ratis.num.write.chunk.threads.per.volume | 10 | OZONE, RATIS, PERFORMANCE | Maximum number of threads in the thread pool that the Datanode uses for writing replicated chunks. This is per configured location (10 threads per disk by default). |
| hdds.container.ratis.rpc.type | GRPC | OZONE, RATIS, MANAGEMENT | Ratis supports different transports such as Netty, GRPC, and Hadoop RPC. This picks one of them for this cluster. |
| hdds.container.ratis.segment.preallocated.size | 4MB | OZONE, RATIS, PERFORMANCE | The pre-allocated file size for Raft segments used by Apache Ratis on datanodes (4 MB by default). |
| hdds.container.ratis.segment.size | 64MB | OZONE, RATIS, PERFORMANCE | The size of the Raft segment file used by Apache Ratis on datanodes (64 MB by default). |
| hdds.container.ratis.server.port | 9856 | OZONE, CONTAINER, PIPELINE, RATIS, MANAGEMENT | The IPC port number of the container for server-to-server communication. |
| hdds.container.ratis.statemachine.max.pending.apply-transactions | 100000 | OZONE, CONTAINER, RATIS | Maximum number of pending apply transactions in a data pipeline. The default value matches the default snapshot threshold hdds.ratis.snapshot.threshold. |
| hdds.container.ratis.statemachine.write.wait.interval | 10m | OZONE, DATANODE | Timeout for the write path for container blocks. |
| hdds.container.ratis.statemachinedata.sync.retries | | OZONE, DEBUG, CONTAINER, RATIS | Number of times the WriteStateMachineData op is retried before failing. If not configured, it defaults to (hdds.ratis.rpc.slowness.timeout / hdds.container.ratis.statemachinedata.sync.timeout), meaning WriteStateMachineData is retried on every sync timeout until the configured slowness timeout is hit, after which the StateMachine closes down the pipeline. If set to -1, it retries indefinitely. This might not be desirable, since a persistent failure could leave WriteStateMachineData incomplete for a long time and block the Ratis write pipeline. |
| hdds.container.ratis.statemachinedata.sync.timeout | 10s | OZONE, DEBUG, CONTAINER, RATIS | Timeout for StateMachine data writes by Ratis. |
| hdds.container.replication.compression | NO_COMPRESSION | OZONE, HDDS, DATANODE | Compression algorithm used for closed container replication. Possible choices include NO_COMPRESSION, GZIP, SNAPPY, LZ4, ZSTD. |
| hdds.container.report.interval | 60m | OZONE, CONTAINER, MANAGEMENT | Time interval at which the datanode sends container reports. Each datanode periodically sends container reports to SCM. The unit can be given with a postfix (ns, ms, s, m, h, d). |
| hdds.container.scrub.data.scan.interval | 7d | STORAGE | Minimum time interval between two iterations of container data scanning. If an iteration takes less time than this, the scanner waits before starting the next iteration. The unit can be given with a postfix (ns, ms, s, m, h, d). |
| hdds.container.scrub.dev.data.scan.enabled | true | STORAGE | Can be used to disable the background container data scanner for developer testing purposes. |
| hdds.container.scrub.dev.metadata.scan.enabled | true | STORAGE | Can be used to disable the background container metadata scanner for developer testing purposes. |
| hdds.container.scrub.enabled | true | STORAGE | Config parameter to enable all container scanners. |
| hdds.container.scrub.metadata.scan.interval | 3h | STORAGE | Time interval between two metadata scans by the container scanner. The unit can be given with a postfix (ns, ms, s, m, h, d). |
| hdds.container.scrub.min.gap | 15m | DATANODE | The minimum gap between two successive scans of the same container. The unit can be given with a postfix (ns, ms, s, m, h, d). |
| hdds.container.scrub.on.demand.volume.bytes.per.second | 5242880 | STORAGE | Throttles the I/O bandwidth used by the on-demand container scanner per volume. |
| hdds.container.scrub.volume.bytes.per.second | 5242880 | STORAGE | Throttles the I/O bandwidth used by the scanner per volume. |
| hdds.container.token.enabled | false | OZONE, HDDS, SECURITY, TOKEN | True if container tokens are enabled, else false. |
| hdds.datanode.block.delete.command.worker.interval | 2s | DATANODE | The interval between DeleteCmdWorker executions of delete commands. |
| hdds.datanode.block.delete.max.lock.wait.timeout | 100ms | DATANODE, DELETION | Timeout for the thread processing a delete block command to wait for the container lock. |
| hdds.datanode.block.delete.queue.limit | 5 | DATANODE | The maximum number of block delete commands queued on a datanode. This configuration is also used by SCM to decide whether to send delete commands to the DN: if the DN has more commands waiting in the queue than this value, SCM will not send any new block delete commands until the DN has processed some commands and the queue length is reduced. |
| hdds.datanode.block.delete.threads.max | 5 | DATANODE | The maximum number of threads used to handle delete blocks on a datanode. |
| hdds.datanode.block.deleting.limit.per.interval | 20000 | SCM, DELETION, DATANODE | Number of blocks to be deleted in an interval. |
| hdds.datanode.block.deleting.max.lock.holding.time | 1s | DATANODE, DELETION | Controls the maximum time that the block deleting service can hold the lock during the deletion of blocks. Once this period is reached, the service releases and re-acquires the lock. This is not a hard limit, as the time check only occurs after the completion of each transaction, so the actual execution time may exceed this limit. The unit can be given with a postfix (ns, ms, s, m, h, d). |
| hdds.datanode.block.deleting.service.interval | 60s | SCM, DELETION | Time interval of the Datanode block deleting service. The block deleting service runs on the Datanode periodically and deletes blocks queued for deletion. The unit can be given with a postfix (ns, ms, s, m, h, d). |
| hdds.datanode.check.empty.container.dir.on.delete | false | DATANODE | Boolean flag deciding whether to check the container directory to determine whether a container is empty. |
| hdds.datanode.chunk.data.validation.check | false | DATANODE | Enable safety checks such as checksum validation for Ratis calls. |
| hdds.datanode.client.address | | OZONE, HDDS, MANAGEMENT | The address of the Ozone Datanode client service, as a string in host:port format. |
| hdds.datanode.client.bind.host | 0.0.0.0 | OZONE, HDDS, MANAGEMENT | The hostname or IP address the Datanode client service endpoint binds to. |
| hdds.datanode.client.port | 19864 | OZONE, HDDS, MANAGEMENT | The port number of the Ozone Datanode client service. |
| hdds.datanode.command.queue.limit | 5000 | DATANODE | The default maximum number of commands in the queue and each command type's sub-queue on a datanode. |
| hdds.datanode.container.checksum.lock.stripes | 127 | DATANODE | The number of lock stripes used to coordinate modifications to container checksum information. This information is only updated after a container is closed and does not affect the data read or write path. Each container in the datanode is mapped to one lock, which is only held while its checksum information is updated. |
| hdds.datanode.container.client.cache.size | 100 | DATANODE | The maximum number of clients cached by the datanode client manager. |
| hdds.datanode.container.client.cache.stale.threshold | 10000 | DATANODE | The stale threshold in ms for a client in the cache, after which the client is evicted. |
| hdds.datanode.container.close.threads.max | 3 | DATANODE | The maximum number of threads used to close containers on a datanode. |
| hdds.datanode.container.db.dir | | OZONE, CONTAINER, STORAGE, MANAGEMENT | Determines where the per-disk RocksDB instances are stored. This setting is optional; if unspecified, RocksDB instances are stored on the same disks as HDDS data. The directories should be tagged with the corresponding storage types ([SSD]/[DISK]/[ARCHIVE]/[RAM_DISK]) for storage policies. The default storage type is DISK if a directory is not tagged explicitly. Ideally, this should be mapped to a fast disk like an SSD. |
| hdds.datanode.container.delete.threads.max | 2 | DATANODE | The maximum number of threads used to delete containers on a datanode. |
| hdds.datanode.container.schema.v3.enabled | true | DATANODE | Enable use of container schema v3 (one RocksDB per disk). |
| hdds.datanode.container.schema.v3.key.separator | \| | DATANODE | The default separator between container ID and container meta key name. |
| hdds.datanode.data.dir.permissions | 700 | | Permissions for the datanode data directories where the actual file blocks are stored. The permissions can be either octal or symbolic. If not set, the default value of 700 is used. |
| hdds.datanode.db.config.path | | OZONE, CONTAINER, STORAGE | Path to an ini configuration file for RocksDB on the datanode component. |
| hdds.datanode.delete.container.timeout | 60s | DATANODE | If a delete container request spends more than this time waiting on the container lock or performing pre-checks, the command is skipped and SCM resends it automatically. This avoids commands running for a very long time without SCM being informed of the progress. |
| hdds.datanode.df.refresh.period | 5m | DATANODE | Disk space usage information is refreshed with the specified period following the completion of the last check. |
| hdds.datanode.dir | | OZONE, CONTAINER, STORAGE, MANAGEMENT | Determines where on the local filesystem HDDS data is stored. The directories should be tagged with the corresponding storage types ([SSD]/[DISK]/[ARCHIVE]/[RAM_DISK]) for storage policies. The default storage type is DISK if a directory is not tagged explicitly. |
| hdds.datanode.dir.du.reserved | | OZONE, CONTAINER, STORAGE, MANAGEMENT | Reserved space in bytes per volume. Always leave this much space free for non-dfs use. For example, /dir1:100B, /dir2:200MB means dir1 reserves 100 bytes and dir2 reserves 200 MB. |
| hdds.datanode.dir.du.reserved.percent | | OZONE, CONTAINER, STORAGE, MANAGEMENT | Percentage of the volume that should be reserved. This space is left free for other usage. The value should be between 0 and 1; for example, 0.1 means 10% of the volume space will be reserved. |
| hdds.datanode.disk.balancer.container.choosing.policy | org.apache.hadoop.ozone.container.diskbalancer.policy.DefaultContainerChoosingPolicy | DISKBALANCER | The policy for selecting source/destination volumes and containers to move for disk balancing. |
| hdds.datanode.disk.balancer.enabled | false | OZONE, DATANODE, DISKBALANCER | If set to true, the Disk Balancer feature is enabled on Datanodes and users can use this service. Disabled by default. |
| hdds.datanode.disk.balancer.info.dir | | DISKBALANCER | The path where the datanode DiskBalancer's conf is written. If this property is not defined, Ozone falls back to the metadata directory. |
| hdds.datanode.disk.balancer.max.disk.throughputInMBPerSec | 10 | DISKBALANCER | The maximum balancing speed. |
| hdds.datanode.disk.balancer.parallel.thread | 5 | DISKBALANCER | The maximum parallel balancing thread count. |
| hdds.datanode.disk.balancer.replica.deletion.delay | 5m | DATANODE, DISKBALANCER | The delay after a container is successfully moved from the source volume to the destination volume before the source container replica is deleted. The unit can be given with a postfix (ns, ms, s, m, h, d). |
| hdds.datanode.disk.balancer.service.interval | 60s | DATANODE, DISKBALANCER | Time interval of the Datanode DiskBalancer service. The Datanode checks the service periodically and updates the config and running status of the DiskBalancer service. The unit can be given with a postfix (ns, ms, s, m, h, d). |
| hdds.datanode.disk.balancer.service.timeout | 300s | DATANODE, DISKBALANCER | Timeout for the Datanode DiskBalancer service. The unit can be given with a postfix (ns, ms, s, m, h, d). |
| hdds.datanode.disk.balancer.should.run.default | false | DATANODE, DISKBALANCER | If DiskBalancer fails to get information from diskbalancer.info, it uses this value to decide whether the service should be running. |
| hdds.datanode.disk.balancer.stop.after.disk.even | true | DISKBALANCER | If true, the DiskBalancer automatically stops once the disks are balanced. |
| hdds.datanode.disk.balancer.volume.density.threshold.percent | 10 | DISKBALANCER | A percentage in the range 0 to 100. A datanode is considered balanced if, for each volume, the utilization of the volume (used space to capacity ratio) differs from the utilization of the datanode (used space to capacity ratio of the entire datanode) by no more than the threshold. |
| hdds.datanode.disk.check.io.failures.tolerated | 1 | DATANODE | The number of IO tests out of the last hdds.datanode.disk.check.io.test.count test runs that are allowed to fail before the volume is marked as failed. |
| hdds.datanode.disk.check.io.file.size | 100B | DATANODE | The size of the temporary file that is synced to disk and read back to assess the disk's health. The contents of the file are kept in memory for the duration of the check. |
| hdds.datanode.disk.check.io.test.count | 3 | DATANODE | The number of IO tests required to determine if a disk has failed. Each disk check does one IO test. The volume is failed if more than hdds.datanode.disk.check.io.failures.tolerated out of the last hdds.datanode.disk.check.io.test.count runs failed. Set to 0 to disable disk IO checks. |
| hdds.datanode.disk.check.min.gap | 10m | DATANODE | The minimum gap between two successive checks of the same Datanode volume. The unit can be given with a postfix (ns, ms, s, m, h, d). |
| hdds.datanode.disk.check.timeout | 10m | DATANODE | Maximum allowed time for a disk check to complete. If the check does not complete within this interval, the disk is declared failed. The unit can be given with a postfix (ns, ms, s, m, h, d). |
| hdds.datanode.dns.interface | default | OZONE, DATANODE | The name of the network interface from which a Datanode should report its IP address, e.g. eth2. This setting may be required for some multi-homed nodes where the Datanodes are assigned multiple hostnames and it is desirable for the Datanodes to use a non-default hostname. |
| hdds.datanode.dns.nameserver | default | OZONE, DATANODE | The host name or IP address of the name server (DNS) which a Datanode should use to determine its own host name. |
| hdds.datanode.du.factory.classname | | DATANODE | The fully qualified name of the factory class that creates objects for providing disk space usage information. It should implement the SpaceUsageCheckFactory interface. |
| hdds.datanode.du.refresh.period | 1h | DATANODE | Disk space usage information is refreshed with the specified period following the completion of the last check. |
| hdds.datanode.failed.data.volumes.tolerated | -1 | DATANODE | The number of data volumes that are allowed to fail before a datanode stops offering service. Setting this to -1 means unlimited, but at least one good volume must remain. |
| hdds.datanode.failed.db.volumes.tolerated | -1 | DATANODE | The number of db volumes that are allowed to fail before a datanode stops offering service. Setting this to -1 means unlimited, but at least one good volume must remain. |
| hdds.datanode.failed.metadata.volumes.tolerated | -1 | DATANODE | The number of metadata volumes that are allowed to fail before a datanode stops offering service. Setting this to -1 means unlimited, but at least one good volume must remain. |
| hdds.datanode.handler.count | 10 | OZONE, HDDS, MANAGEMENT | The number of RPC handler threads for Datanode client service endpoints. |
| hdds.datanode.hostname | | OZONE, DATANODE | Optional. The hostname for the Datanode containing this configuration file. Will be different for each machine. Defaults to the current hostname. |
| hdds.datanode.http-address | 0.0.0.0:9882 | HDDS, MANAGEMENT | The address and base port the Datanode web UI listens on. If the port is 0, the server starts on a free port. |
| hdds.datanode.http-bind-host | 0.0.0.0 | HDDS, MANAGEMENT | The actual address the Datanode web server binds to. If this optional address is set, it overrides only the hostname portion of hdds.datanode.http-address. |
| hdds.datanode.http.auth.kerberos.keytab | /etc/security/keytabs/HTTP.keytab | HDDS, SECURITY, MANAGEMENT, KERBEROS | The Kerberos keytab file for the datanode http server. |
| hdds.datanode.http.auth.kerberos.principal | HTTP/_HOST@REALM | HDDS, SECURITY, MANAGEMENT, KERBEROS | The Kerberos principal for the datanode http server. |
| hdds.datanode.http.auth.type | simple | DATANODE, SECURITY, KERBEROS | simple or kerberos. If kerberos is set, SPNEGO is used for HTTP authentication. |
| hdds.datanode.http.enabled | true | HDDS, MANAGEMENT | Property to enable or disable the Datanode web UI. |
| hdds.datanode.https-address | 0.0.0.0:9883 | HDDS, MANAGEMENT, SECURITY | The address and base port the Datanode web UI listens on using HTTPS. If the port is 0, the server starts on a free port. |
| hdds.datanode.https-bind-host | 0.0.0.0 | HDDS, MANAGEMENT, SECURITY | The actual address the Datanode web server binds to using HTTPS. If this optional address is set, it overrides only the hostname portion of hdds.datanode.http-address. |
| hdds.datanode.kerberos.keytab.file | | OZONE, DATANODE | The keytab file used by each Datanode daemon to log in as its service principal. The principal name is configured with hdds.datanode.kerberos.principal. |
| hdds.datanode.kerberos.principal | dn/_HOST@REALM | OZONE, SECURITY, KERBEROS, DATANODE | The Datanode service principal, typically set to dn/_HOST@REALM.TLD. Each Datanode substitutes _HOST with its own fully qualified hostname at startup. The _HOST placeholder allows using the same configuration setting on all Datanodes. |
| hdds.datanode.metadata.rocksdb.cache.size | 1GB | OZONE, DATANODE, MANAGEMENT | Size of the block metadata cache shared among RocksDB instances on each datanode. All containers on a datanode share this cache. |
| hdds.datanode.periodic.disk.check.interval.minutes | 60 | DATANODE | Periodic disk check run interval in minutes. |
| hdds.datanode.plugins | | | Comma-separated list of HDDS datanode plug-ins to be activated when the HDDS service starts as part of the datanode. |
hdds.datanode.ratis.server.request.timeout2mOZONE, DATANODETimeout for the request submitted directly to Ratis in datanode.
hdds.datanode.read.chunk.threads.per.volume10DATANODENumber of threads per volume that Datanode will use for reading replicated chunks.
hdds.datanode.read.threadpool10OZONE, HDDS, PERFORMANCEThe number of threads in RPC server reading from the socket for Datanode client service endpoints. This config overrides Hadoop configuration "ipc.server.read.threadpool.size" for HddsDatanodeClientProtocolServer. The default value is 10.
hdds.datanode.recovering.container.scrubbing.service.interval1mSCM, DELETIONTime interval of the stale recovering container scrubbing service. The recovering container scrubbing service runs on Datanode periodically and deletes stale recovering container Unit could be defined with postfix (ns,ms,s,m,h,d).
hdds.datanode.replication.outofservice.limit.factor2.0DATANODE, SCMDecommissioning and maintenance nodes can handle morereplication commands than in-service nodes due to reduced load. This multiplier determines the increased queue capacity and executor pool size.
hdds.datanode.replication.port9886DATANODE, MANAGEMENTPort used for the server2server replication server
hdds.datanode.replication.queue.limit4096DATANODEThe maximum number of queued requests for container replication
hdds.datanode.replication.streams.limit10DATANODEThe maximum number of replication commands a single datanode can execute simultaneously
hdds.datanode.replication.work.dirDATANODEThis configuration is deprecated. Temporary sub directory under each hdds.datanode.dir will be used during the container replication between datanodes to save the downloaded container(in compressed format).
hdds.datanode.rocksdb.auto-compaction-small-sst-filetrueDATANODEAuto compact small SST files (rocksdb.auto-compaction-small-sst-file-size-threshold) when count exceeds (rocksdb.auto-compaction-small-sst-file-num-threshold)
hdds.datanode.rocksdb.auto-compaction-small-sst-file-num-threshold512DATANODEAuto compaction will happen if the number of small SST files exceeds this threshold.
hdds.datanode.rocksdb.auto-compaction-small-sst-file-size-threshold1MBDATANODESST files smaller than this configuration will be auto compacted.
hdds.datanode.rocksdb.auto-compaction-small-sst-file.interval.minutes120DATANODEAuto compact small SST files interval in minutes.
hdds.datanode.rocksdb.auto-compaction-small-sst-file.threads1DATANODEAuto compact small SST files threads.
hdds.datanode.rocksdb.delete-obsolete-files-period1hDATANODEPeriodicity when obsolete files get deleted. Default is 1h.
hdds.datanode.rocksdb.log.levelINFODATANODEThe user log level of RocksDB(DEBUG/INFO/WARN/ERROR/FATAL))
hdds.datanode.rocksdb.log.max-file-num64DATANODEThe max user log file number to keep for each RocksDB
hdds.datanode.rocksdb.log.max-file-size32MBDATANODEThe max size of each user log file of RocksDB. O means no size limit.
hdds.datanode.rocksdb.max-open-files1024DATANODEThe total number of files that a RocksDB can open.
hdds.datanode.slow.op.warning.threshold500msOZONE, DATANODE, PERFORMANCEThresholds for printing slow-operation audit logs.
hdds.datanode.storage.utilization.critical.threshold0.95OZONE, SCM, MANAGEMENTIf a datanode overall storage utilization exceeds more than this value, the datanode will be marked out of space.
hdds.datanode.storage.utilization.warning.threshold0.75OZONE, SCM, MANAGEMENTIf a datanode overall storage utilization exceeds more than this value, a warning will be logged while processing the nodeReport in SCM.
hdds.datanode.use.datanode.hostnamefalseOZONE, DATANODEWhether Datanodes should use Datanode hostnames when connecting to other Datanodes for data transfer.
hdds.datanode.volume.choosing.policyorg.apache.hadoop.ozone.container.common.volume.CapacityVolumeChoosingPolicyOZONE, CONTAINER, STORAGE, MANAGEMENTThe class name of the policy for choosing volumes in the list of directories. Defaults to org.apache.hadoop.ozone.container.common.volume.CapacityVolumeChoosingPolicy. This volume choosing policy randomly chooses two volumes with remaining space and then picks the one with lower utilization.
hdds.datanode.volume.min.free.space-1OZONE, CONTAINER, STORAGE, MANAGEMENTThis determines the free space to be reserved for closing containers. When the difference between volume capacity and used space reaches this number, containers that reside on this volume will be closed and no new containers will be allocated on this volume. The max of min.free.space and min.free.space.percent will be used as the final value.
hdds.datanode.volume.min.free.space.percent0.02OZONE, CONTAINER, STORAGE, MANAGEMENTThis determines the free space percentage to be reserved for closing containers. When the difference between volume capacity and used space reaches (free.space.percent of volume capacity), containers that reside on this volume will be closed and no new containers will be allocated on this volume. The max of min.free.space and min.free.space.percent will be used as the final value.
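As a sketch (values illustrative, not recommendations), the two free-space thresholds above can be set together in ozone-site.xml; the larger of the absolute and percentage-based values takes effect:

```xml
<!-- Illustrative ozone-site.xml fragment: on a 10 TB volume, 0.02 of
     capacity is 200 GB, which exceeds the 100 GB absolute value below,
     so the percentage-based threshold would be the effective limit. -->
<property>
  <name>hdds.datanode.volume.min.free.space</name>
  <value>100GB</value>
</property>
<property>
  <name>hdds.datanode.volume.min.free.space.percent</name>
  <value>0.02</value>
</property>
```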
hdds.datanode.wait.on.all.followersfalseDATANODEDefines whether the leader datanode will wait for both followers to catch up before removing the stateMachineData from the cache.
hdds.db.profileDISKOZONE, OM, PERFORMANCEThis property allows user to pick a configuration that tunes the RocksDB settings for the hardware it is running on. Right now, we have SSD and DISK as profile options.
hdds.grpc.tls.enabledfalseOZONE, HDDS, SECURITY, TLSIf HDDS GRPC server TLS is enabled.
hdds.grpc.tls.providerOPENSSLOZONE, HDDS, SECURITY, TLS, CRYPTO_COMPLIANCEHDDS GRPC server TLS provider.
hdds.heartbeat.initial-interval2sOZONE, MANAGEMENTHeartbeat interval used during Datanode initialization for SCM.
hdds.heartbeat.interval30sOZONE, MANAGEMENTThe heartbeat interval from a data node to SCM. Yes, it is not 3 but 30 seconds, since most data nodes will be heartbeating via Ratis heartbeats. If a client is not able to talk to a data node, it will notify OM/SCM eventually. So a 30 second heartbeat seems to work. This assumes that the replication strategy used is Ratis; if not, this value should be set to something smaller, like 3 seconds. ozone.scm.pipeline.close.timeout should also be adjusted accordingly if the default value for this config is not used.
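Following the guidance above, a non-Ratis deployment might shorten the interval; a hedged ozone-site.xml sketch (the value is illustrative):

```xml
<!-- Illustrative only: shorten the datanode-to-SCM heartbeat when Ratis
     replication is not used, as the description above suggests. -->
<property>
  <name>hdds.heartbeat.interval</name>
  <value>3s</value>
</property>
```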
hdds.heartbeat.recon.initial-interval60sOZONE, MANAGEMENT, RECONHeartbeat interval used during Datanode initialization for Recon.
hdds.heartbeat.recon.interval60sOZONE, MANAGEMENT, RECONThe heartbeat interval from a Datanode to Recon.
hdds.key.algoRSASCM, HDDS, X509, SECURITY, CRYPTO_COMPLIANCESCM CA key algorithm.
hdds.key.dir.namekeysSCM, HDDS, X509, SECURITYDirectory to store the public/private key for SCM CA. This is relative to the ozone/hdds metadata dir.
hdds.key.len2048SCM, HDDS, X509, SECURITY, CRYPTO_COMPLIANCESCM CA key length. This is an algorithm-specific metric, such as modulus length, specified in number of bits.
hdds.metadata.dirX509, SECURITYAbsolute path to HDDS metadata dir.
hdds.metrics.percentiles.intervalsOZONE, DATANODEComma-delimited set of integers denoting the desired rollover intervals (in seconds) for percentile latency metrics on the Datanode. By default, percentile latency metrics are disabled.
hdds.metrics.session-idOZONE, HDDSGet the user-specified session identifier. The default is the empty string. The session identifier is used to tag metric data that is reported to some performance metrics system via the org.apache.hadoop.metrics API. The session identifier is intended, in particular, for use by Hadoop-On-Demand (HOD) which allocates a virtual Hadoop cluster dynamically and transiently. HOD will set the session identifier by modifying the mapred-site.xml file before starting the cluster. When not running under HOD, this identifier is expected to remain set to the empty string.
hdds.node.report.interval60000msOZONE, CONTAINER, MANAGEMENTTime interval for the datanode to send its node report. Each datanode periodically sends a node report to SCM. The unit can be defined with a postfix (ns,ms,s,m,h,d)
hdds.pipeline.action.max.limit20DATANODEMaximum number of Pipeline Actions sent by the datanode to SCM in a single heartbeat.
hdds.pipeline.report.interval60000msOZONE, PIPELINE, MANAGEMENTTime interval for the datanode to send its pipeline report. Each datanode periodically sends a pipeline report to SCM. The unit can be defined with a postfix (ns,ms,s,m,h,d)
hdds.priv.key.file.nameprivate.pemX509, SECURITYName of file which stores private key generated for SCM CA.
hdds.profiler.endpoint.enabledfalseOZONE, MANAGEMENTEnable /prof java profiler servlet page on HTTP server.
hdds.prometheus.endpoint.enabledtrueOZONE, MANAGEMENTEnable prometheus compatible metric page on the HTTP servers.
hdds.prometheus.endpoint.tokenSECURITY, MANAGEMENTAllowed authorization token while using prometheus servlet endpoint. This will disable SPNEGO based authentication on the endpoint.
hdds.public.key.file.namepublic.pemX509, SECURITYName of file which stores public key generated for SCM CA.
hdds.raft.server.rpc.first-election.timeoutOZONE, RATIS, MANAGEMENTMinimum timeout for the first election of a Ratis leader. If not configured, falls back to hdds.ratis.leader.election.minimum.timeout.duration.
hdds.ratis.client.exponential.backoff.base.sleep4sOZONE, CLIENT, PERFORMANCESpecifies the base sleep for the exponential backoff retry policy. With the default base sleep of 4s, the sleep duration for the ith retry is min(4 * pow(2, i), max_sleep) * r, where r is a random number in the range [0.5, 1.5).
hdds.ratis.client.exponential.backoff.max.retries2147483647OZONE, CLIENT, PERFORMANCEClient's max retry value for the exponential backoff policy.
hdds.ratis.client.exponential.backoff.max.sleep40sOZONE, CLIENT, PERFORMANCEThe sleep duration obtained from the exponential backoff policy is limited by the configured max sleep. Refer to hdds.ratis.client.exponential.backoff.base.sleep for further details.
hdds.ratis.client.multilinear.random.retry.policy5s, 5, 10s, 5, 15s, 5, 20s, 5, 25s, 5, 60s, 10OZONE, CLIENT, PERFORMANCESpecifies multilinear random retry policy to be used by ratis client. e.g. given pairs of number of retries and sleep time (n0, t0), (n1, t1), ..., for the first n0 retries sleep duration is t0 on average, the following n1 retries sleep duration is t1 on average, and so on.
hdds.ratis.client.request.watch.timeout3mOZONE, CLIENT, PERFORMANCETimeout for ratis client watch request.
hdds.ratis.client.request.watch.typeALL_COMMITTEDOZONE, CLIENT, PERFORMANCEDesired replication level when Ozone client's Raft client calls watch(), ALL_COMMITTED or MAJORITY_COMMITTED. MAJORITY_COMMITTED increases write performance by reducing watch() latency when an Ozone datanode is slow in a pipeline, at the cost of potential read latency increasing due to read retries to different datanodes.
hdds.ratis.client.request.write.timeout5mOZONE, CLIENT, PERFORMANCETimeout for ratis client write request.
hdds.ratis.client.retry.policyorg.apache.hadoop.hdds.ratis.retrypolicy.RequestTypeDependentRetryPolicyCreatorOZONE, CLIENT, PERFORMANCEThe class name of the policy for retry.
hdds.ratis.client.retrylimited.max.retries180OZONE, CLIENT, PERFORMANCENumber of retries for ratis client request.
hdds.ratis.client.retrylimited.retry.interval1sOZONE, CLIENT, PERFORMANCEInterval between successive retries for a ratis client request.
hdds.ratis.leader.election.minimum.timeout.duration5sOZONE, RATIS, MANAGEMENTThe minimum timeout duration for ratis leader election. Default is 5s.
hdds.ratis.raft.client.async.outstanding-requests.max32OZONE, CLIENT, PERFORMANCEControls the maximum number of outstanding async requests that can be handled by the Standalone as well as Ratis client.
hdds.ratis.raft.client.rpc.request.timeout60sOZONE, CLIENT, PERFORMANCEThe timeout duration for ratis client request (except for watch request). It should be set greater than leader election timeout in Ratis.
hdds.ratis.raft.client.rpc.watch.request.timeout180sOZONE, CLIENT, PERFORMANCEThe timeout duration for ratis client watch request. Timeout for the watch API in Ratis client to acknowledge a particular request getting replayed to all servers. It is highly recommended for the timeout duration to be strictly longer than Ratis server watch timeout (hdds.ratis.raft.server.watch.timeout)
hdds.ratis.raft.grpc.flow.control.window5MBOZONE, CLIENT, PERFORMANCEThis parameter tells how much data the grpc client can send to the grpc server without receiving any ack (WINDOW_UPDATE) packet from the server. This parameter should be set in accordance with the chunk size. Example: if the chunk size is 4MB, taking some header size into consideration, this can be set to 5MB or greater. Tune this parameter accordingly, as a value smaller than the chunk size degrades ozone client performance.
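The example above can be written as an ozone-site.xml fragment (the 4MB chunk-size scenario is illustrative):

```xml
<!-- Illustrative: with a 4MB chunk size, leave headroom for headers by
     keeping the gRPC flow-control window at 5MB or larger. -->
<property>
  <name>hdds.ratis.raft.grpc.flow.control.window</name>
  <value>5MB</value>
</property>
```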
hdds.ratis.raft.server.datastream.client.pool.size10OZONE, DATANODE, RATIS, DATASTREAMMaximum number of client proxy in NettyServerStreamRpc for datastream write.
hdds.ratis.raft.server.datastream.request.threads20OZONE, DATANODE, RATIS, DATASTREAMMaximum number of threads in the thread pool for datastream request.
hdds.ratis.raft.server.delete.ratis.log.directorytrueOZONE, DATANODE, RATISFlag to indicate whether the ratis log directory will be cleaned up during pipeline removal.
hdds.ratis.raft.server.leaderelection.pre-votetrueOZONE, DATANODE, RATISFlag to enable/disable ratis election pre-vote.
hdds.ratis.raft.server.log.appender.wait-time.min0usOZONE, DATANODE, RATIS, PERFORMANCEThe minimum wait time between two appendEntries calls. In some error conditions, the leader may keep retrying appendEntries. If it happens, increasing this value to, say, 5us (microseconds) can help avoid the leader being too busy retrying.
hdds.ratis.raft.server.notification.no-leader.timeout300sOZONE, DATANODE, RATISTimeout duration after which the StateMachine gets notified that a leader has not been elected for a long time and the server changes its role to Candidate.
hdds.ratis.raft.server.rpc.request.timeout60sOZONE, DATANODE, RATISThe timeout duration of the ratis write request on Ratis Server.
hdds.ratis.raft.server.rpc.slowness.timeout300sOZONE, DATANODE, RATISTimeout duration after which stateMachine will be notified that follower is slow. StateMachine will close down the pipeline.
hdds.ratis.raft.server.watch.timeout30sOZONE, DATANODE, RATISThe timeout duration for watch request on Ratis Server. Timeout for the watch request in Ratis server to acknowledge a particular request is replayed to all servers. It is highly recommended for the timeout duration to be strictly shorter than Ratis client watch timeout (hdds.ratis.raft.client.rpc.watch.request.timeout).
hdds.ratis.raft.server.write.element-limit1024OZONE, DATANODE, RATIS, PERFORMANCEMaximum number of pending requests after which the leader starts rejecting requests from client.
hdds.ratis.server.num.snapshots.retained5STORAGEConfig parameter to specify number of old snapshots retained at the Ratis leader.
hdds.ratis.server.retry-cache.timeout.duration600000msOZONE, RATIS, MANAGEMENTRetry Cache entry timeout for ratis server.
hdds.ratis.snapshot.threshold100000OZONE, CONTAINER, RATISNumber of transactions after which a ratis snapshot should be taken.
hdds.scm.block.deleting.service.interval60sSCM, DELETIONTime interval of the scm block deleting service. The block deleting service runs on SCM periodically and deletes blocks queued for deletion. The unit can be defined with a postfix (ns,ms,s,m,h,d).
hdds.scm.block.deletion.per-interval.max500000SCM, DELETIONMaximum number of blocks which SCM processes during an interval. The block count is at the replica level. If SCM has 100000 blocks which need to be deleted and the configuration is 5000, then it would only send 5000 blocks for deletion to the datanodes.
hdds.scm.block.deletion.txn.dn.commit.map.limit5000000SCMThis value indicates the size of the transactionToDNsCommitMap after which we will skip one round of scm block deleting interval.
hdds.scm.ec.pipeline.choose.policy.implorg.apache.hadoop.hdds.scm.pipeline.choose.algorithms.RandomPipelineChoosePolicySCM, PIPELINESets the policy for choosing an EC pipeline. The value should be the full name of a class which implements org.apache.hadoop.hdds.scm.PipelineChoosePolicy. The class decides which pipeline will be used when selecting an EC Pipeline. If not set, org.apache.hadoop.hdds.scm.pipeline.choose.algorithms.RandomPipelineChoosePolicy will be used as default value. One of the following values can be used: (1) org.apache.hadoop.hdds.scm.pipeline.choose.algorithms.RandomPipelineChoosePolicy : chooses a pipeline randomly. (2) org.apache.hadoop.hdds.scm.pipeline.choose.algorithms.HealthyPipelineChoosePolicy : chooses a healthy pipeline randomly. (3) org.apache.hadoop.hdds.scm.pipeline.choose.algorithms.CapacityPipelineChoosePolicy : chooses the pipeline with lower utilization from two random pipelines. Note that the random choose method will be executed twice in this policy. (4) org.apache.hadoop.hdds.scm.pipeline.choose.algorithms.RoundRobinPipelineChoosePolicy : chooses a pipeline in a round robin fashion. Intended for troubleshooting and testing purposes only.
hdds.scm.http.auth.kerberos.keytabSECURITYThe keytab file used by SCM http server to login as its service principal.
hdds.scm.http.auth.kerberos.principalSECURITYThis Kerberos principal is used when communicating to the HTTP server of SCM. The protocol used is SPNEGO.
hdds.scm.http.auth.typesimpleOM, SECURITY, KERBEROSsimple or kerberos. If kerberos is set, SPNEGO will be used for http authentication.
hdds.scm.kerberos.keytab.file/etc/security/keytabs/SCM.keytabSCM, SECURITY, KERBEROSThe keytab file used by SCM daemon to login as its service principal.
hdds.scm.kerberos.principalSCM/_HOST@REALMSCM, SECURITY, KERBEROSThe SCM service principal. e.g. scm/_HOST@REALM.COM
hdds.scm.pipeline.choose.policy.implorg.apache.hadoop.hdds.scm.pipeline.choose.algorithms.RandomPipelineChoosePolicySCM, PIPELINESets the policy for choosing a pipeline for a Ratis container. The value should be the full name of a class which implements org.apache.hadoop.hdds.scm.PipelineChoosePolicy. The class decides which pipeline will be used to find or allocate Ratis containers. If not set, org.apache.hadoop.hdds.scm.pipeline.choose.algorithms.RandomPipelineChoosePolicy will be used as default value. One of the following values can be used: (1) org.apache.hadoop.hdds.scm.pipeline.choose.algorithms.RandomPipelineChoosePolicy : chooses a pipeline randomly. (2) org.apache.hadoop.hdds.scm.pipeline.choose.algorithms.HealthyPipelineChoosePolicy : chooses a healthy pipeline randomly. (3) org.apache.hadoop.hdds.scm.pipeline.choose.algorithms.CapacityPipelineChoosePolicy : chooses the pipeline with lower utilization from two random pipelines. Note that the random choose method will be executed twice in this policy. (4) org.apache.hadoop.hdds.scm.pipeline.choose.algorithms.RoundRobinPipelineChoosePolicy : chooses a pipeline in a round robin fashion. Intended for troubleshooting and testing purposes only.
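For instance, selecting one of the listed policies is a single property override in ozone-site.xml (the capacity-aware choice here is illustrative, not a recommendation):

```xml
<!-- Illustrative: pick the capacity-aware policy for Ratis pipelines. -->
<property>
  <name>hdds.scm.pipeline.choose.policy.impl</name>
  <value>org.apache.hadoop.hdds.scm.pipeline.choose.algorithms.CapacityPipelineChoosePolicy</value>
</property>
```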
hdds.scm.replication.container.sample.limit100SCMThe number of containers to sample in each state per iteration of the replication manager. This is useful for debugging when Recon is not available. The samples are included in the ReplicationManagerReport for each lifecycle and health state.
hdds.scm.replication.datanode.delete.container.limit40SCM, DATANODEA limit to restrict the total number of delete container commands queued on a datanode. Note this is intended to be a temporary config until we have a more dynamic way of limiting load
hdds.scm.replication.datanode.reconstruction.weight3SCM, DATANODEWhen counting the number of replication commands on a datanode, the number of reconstruction commands is multiplied by this weight to ensure reconstruction commands use more of the capacity, as they are more expensive to process.
hdds.scm.replication.datanode.replication.limit20SCM, DATANODEA limit to restrict the total number of replication and reconstruction commands queued on a datanode. Note this is intended to be a temporary config until we have a more dynamic way of limiting load.
hdds.scm.replication.event.timeout12mSCM, OZONETimeout for the container replication/deletion commands sent to datanodes. After this timeout the command will be retried.
hdds.scm.replication.event.timeout.datanode.offset6mSCM, OZONEThe amount of time to subtract from hdds.scm.replication.event.timeout to give a deadline on the datanodes which is less than the SCM timeout. This ensures the datanodes will not process a command after SCM believes it should have expired.
hdds.scm.replication.inflight.limit.factor0.75SCMThe overall replication task limit on a cluster is the number of healthy nodes times datanode.replication.limit. This factor, which should be between zero and 1, scales that limit down to reduce the overall number of replicas pending creation on the cluster. A setting of zero disables global limit checking. A setting of 1 effectively disables the scaling, by making the limit equal to the above equation. However, if there are many decommissioning nodes on the cluster, the decommissioning nodes will have a higher than normal limit, so a setting of 1 may still provide some limit in extreme circumstances.
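The scaling above is simple arithmetic; a sketch with hypothetical cluster numbers:

```xml
<!-- Illustrative arithmetic: with 100 healthy nodes and
     hdds.scm.replication.datanode.replication.limit=20, the raw cluster
     limit is 100 * 20 = 2000 tasks; a factor of 0.75 scales it to 1500. -->
<property>
  <name>hdds.scm.replication.inflight.limit.factor</name>
  <value>0.75</value>
</property>
```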
hdds.scm.replication.maintenance.remaining.redundancy1SCM, OZONEThe number of redundant containers in a group which must be available for a node to enter maintenance. If putting a node into maintenance reduces the redundancy below this value, the node will remain in the ENTERING_MAINTENANCE state until a new replica is created. For Ratis containers, the default value of 1 ensures at least two replicas are online, meaning 1 more can be lost without data becoming unavailable. For any EC container it will have at least dataNum + 1 online, allowing the loss of 1 more replica before data becomes unavailable. Currently only EC containers use this setting. Ratis containers use hdds.scm.replication.maintenance.replica.minimum. For EC, if nodes are in maintenance, it is likely reconstruction reads will be required if some of the data replicas are offline. This is seamless to the client, but will affect read performance.
hdds.scm.replication.maintenance.replica.minimum2SCM, OZONEThe minimum number of container replicas which must be available for a node to enter maintenance. If putting a node into maintenance reduces the available replicas for any container below this level, the node will remain in the entering maintenance state until a new replica is created.
hdds.scm.replication.over.replicated.interval30sSCM, OZONEHow frequently to check if there is work to process on the over-replicated queue
hdds.scm.replication.pushtrueSCM, DATANODEIf false, replication happens by asking the target to pull from source nodes. If true, the source node is asked to push to the target node.
hdds.scm.replication.quasi.closed.stuck.best.origin.copies3SCMFor quasi-closed stuck containers with multiple diverged origins, the number of replicas to maintain for the origin with the highest bcsId among healthy replicas. This origin is considered the 'best' copy and receives extra fault-tolerance. If multiple origins share the same highest bcsId, all of them receive this count.
hdds.scm.replication.quasi.closed.stuck.other.origin.copies2SCMFor quasi-closed stuck containers with multiple diverged origins, the number of replicas to maintain for each origin that does not have the highest block commit sequence ID (BCSID). These replicas are kept to preserve data integrity across diverged copies.
hdds.scm.replication.thread.interval300sSCM, OZONEThere is a replication monitor thread running inside SCM which takes care of replicating the containers in the cluster. This property is used to configure the interval in which that thread runs.
hdds.scm.replication.under.replicated.interval30sSCM, OZONEHow frequently to check if there is work to process on the under-replicated queue
hdds.scm.safemode.atleast.one.node.reported.pipeline.pct0.90HDDS, SCM, OPERATIONPercentage of pipelines, where at least one datanode is reported in the pipeline.
hdds.scm.safemode.enabledtrueHDDS, SCM, OPERATIONBoolean value to enable or disable SCM safe mode.
hdds.scm.safemode.healthy.pipeline.pct0.10HDDS, SCM, OPERATIONPercentage of healthy pipelines, where all 3 datanodes are reported in the pipeline.
hdds.scm.safemode.log.interval1mHDDS, SCM, OPERATIONInterval at which SCM logs safemode status while SCM is in safemode. Default is 1 minute.
hdds.scm.safemode.min.datanode3HDDS, SCM, OPERATIONMinimum DataNodes which should be registered to get SCM out of safe mode.
hdds.scm.safemode.pipeline.creationtrueHDDS, SCM, OPERATIONBoolean value to enable background pipeline creation in SCM safe mode.
hdds.scm.safemode.threshold.pct0.99HDDS, SCM, OPERATION% of containers which should have at least one reported replica before SCM comes out of safe mode.
hdds.scm.unknown-container.actionWARNSCM, MANAGEMENTThe action taken by SCM to process unknown containers reported by Datanodes. The default action, WARN, just logs a container-not-found warning; the other available action is DELETE, which deletes these unknown containers.
hdds.scm.wait.time.after.safemode.exit5mHDDS, SCM, OPERATIONAfter exiting safemode, wait for configured interval of time to start replication monitor and cleanup activities of unhealthy pipelines.
hdds.scmclient.failover.max.retry15OZONE, SCM, CLIENTMax retry count for SCM Client when failover happens.
hdds.scmclient.failover.retry.interval2sOZONE, SCM, CLIENTSCM Client timeout on waiting for the next connection retry to other SCM IP. The default value is set to 2 seconds.
hdds.scmclient.max.retry.timeout10mOZONE, SCM, CLIENTMax retry timeout for SCM Client
hdds.scmclient.rpc.timeout15mOZONE, SCM, CLIENTRpcClient timeout on waiting for the response from SCM. The default value is set to 15 minutes. If ipc.client.ping is set to true and this rpc-timeout is greater than the value of ipc.ping.interval, the effective value of the rpc-timeout is rounded up to multiple of ipc.ping.interval.
hdds.secret.key.algorithmHmacSHA256SCM, SECURITY, CRYPTO_COMPLIANCEThe algorithm that SCM uses to generate symmetric secret keys. A valid algorithm is the one supported by KeyGenerator, as described at https://docs.oracle.com/javase/8/docs/technotes/guides/security/StandardNames.html#KeyGenerator.
hdds.secret.key.expiry.duration9dSCM, SECURITYThe duration for which symmetric secret keys issued by SCM are valid. The secret key is used to sign delegation tokens issued by OM, so the secret key must be valid for at least (ozone.manager.delegation.token.max-lifetime + hdds.secret.key.rotate.duration + ozone.manager.delegation.remover.scan.interval) to guarantee that delegation tokens can be verified by OM. Considering the default values of the three properties mentioned and rounding up to days, this property's default value, in combination with hdds.secret.key.rotate.duration=1d, results in 9 secret keys (for the last 9 days) being kept valid at any point in time. If any of ozone.manager.delegation.token.max-lifetime, hdds.secret.key.rotate.duration or ozone.manager.delegation.remover.scan.interval is changed, this property should be checked and updated accordingly if necessary.
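The dependency described above can be sanity-checked with simple arithmetic; a hedged sketch assuming a 7d token max-lifetime and a 1h remover scan interval (both assumptions here, not values confirmed by this page):

```xml
<!-- Illustrative: 7d (token max-lifetime) + 1d (rotate duration)
     + 1h (remover scan interval) = 8d 1h; rounding up to whole days
     gives the 9d default below. -->
<property>
  <name>hdds.secret.key.expiry.duration</name>
  <value>9d</value>
</property>
<property>
  <name>hdds.secret.key.rotate.duration</name>
  <value>1d</value>
</property>
```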
hdds.secret.key.file.namesecret_keys.jsonSCM, SECURITYName of file which stores symmetric secret keys for token signatures.
hdds.secret.key.rotate.check.duration10mSCM, SECURITYThe interval at which SCM periodically checks whether it's time to generate new symmetric secret keys. This config has an impact on the practical correctness of secret key expiry and rotation period. For example, if hdds.secret.key.rotate.duration=1d and hdds.secret.key.rotate.check.duration=10m, the actual key rotation will happen every 1d +/- 10m.
hdds.secret.key.rotate.duration1dSCM, SECURITYThe interval at which SCM periodically generates a new symmetric secret key.
hdds.security.client.datanode.container.protocol.acl*SECURITYComma separated list of users and groups allowed to access client datanode container protocol.
hdds.security.client.datanode.disk.balancer.protocol.acl*SECURITYComma separated list of users and groups allowed to access disk balancer protocol.
hdds.security.client.scm.block.protocol.acl*SECURITYComma separated list of users and groups allowed to access client scm block protocol.
hdds.security.client.scm.certificate.protocol.acl*SECURITYComma separated list of users and groups allowed to access client scm certificate protocol.
hdds.security.client.scm.container.protocol.acl*SECURITYComma separated list of users and groups allowed to access client scm container protocol.
hdds.security.client.scm.secretkey.datanode.protocol.acl*SECURITYComma separated list of users and groups allowed to access client scm secret key protocol for datanodes.
hdds.security.client.scm.secretkey.om.protocol.acl*SECURITYComma separated list of users and groups allowed to access client scm secret key protocol for om.
hdds.security.client.scm.secretkey.scm.protocol.acl*SECURITYComma separated list of users and groups allowed to access client scm secret key protocol for scm.
hdds.security.providerBCOZONE, HDDS, X509, SECURITY, CRYPTO_COMPLIANCEThe main security provider used for various cryptographic algorithms.
hdds.x509.ca.rotation.ack.timeoutPT15MOZONE, HDDS, SECURITYMax time that SCM leader will wait for the rotation preparation acks before it believes the rotation is failed. Default is 15 minutes.
hdds.x509.ca.rotation.check.intervalP1DOZONE, HDDS, SECURITYCheck interval of whether internal root certificate is going to expire and needs to start rotation or not. Default is 1 day. The property value should be less than the value of property hdds.x509.renew.grace.duration.
hdds.x509.ca.rotation.enabledfalseOZONE, HDDS, SECURITYWhether auto root CA and sub CA certificate rotation is enabled or not. Default is disabled.
hdds.x509.ca.rotation.time-of-day02:00:00OZONE, HDDS, SECURITYTime of day to start the rotation. Default 02:00 AM to avoid impacting daily workload. The supported format is 'hh:mm:ss', representing hour, minute, and second.
hdds.x509.default.durationP365DOZONE, HDDS, SECURITYDefault duration for which x509 certificates issued by SCM are valid. The formats accepted are based on the ISO-8601 duration format PnDTnHnMn.nS
hdds.x509.dir.namecertsOZONE, HDDS, SECURITYX509 certificate directory name.
hdds.x509.expired.certificate.check.intervalP1DInterval to use for removing expired certificates. A background task to remove expired certificates from the scm metadata store is scheduled to run at the rate this configuration option specifies.
hdds.x509.file.namecertificate.crtOZONE, HDDS, SECURITYCertificate file name.
hdds.x509.max.durationP1865DOZONE, HDDS, SECURITYMax time for which certificate issued by SCM CA are valid. This duration is used for self-signed root cert and scm sub-ca certs issued by root ca. The formats accepted are based on the ISO-8601 duration format PnDTnHnMn.nS
hdds.x509.renew.grace.durationP28DOZONE, HDDS, SECURITYDuration of the grace period within which a certificate should be renewed before the current one expires. Default is 28 days.
hdds.x509.rootca.certificate.filePath to an external CA certificate. The file format is expected to be pem. This certificate is used when initializing SCM to create a root certificate authority. By default, a self-signed certificate is generated instead. Note that this certificate is only used for Ozone's internal communication, and it does not affect the certificates used for HTTPS protocol at WebUIs as they can be configured separately.
hdds.x509.rootca.certificate.polling.intervalPT2hInterval to use for polling in certificate clients for a new root ca certificate. Every time the specified time duration elapses, the clients send a request to the SCMs to see if a new root ca certificate was generated. Once there is a change, the system automatically adds the new root ca to the clients' trust stores and requests a new certificate to be signed.
hdds.x509.rootca.private.key.filePath to an external private key. The file format is expected to be pem. This private key is later used when initializing SCM to sign certificates as the root certificate authority. When not specified, a private and public key are generated instead. These keys are only used for Ozone's internal communication, and they do not affect the HTTPS protocol at WebUIs as it can be configured separately.
hdds.x509.rootca.public.key.filePath to an external public key. The file format is expected to be pem. This public key is later used when initializing SCM to sign certificates as the root certificate authority. When only the private key is specified the public key is read from the external certificate. Note that this is only used for Ozone's internal communication, and it does not affect the HTTPS protocol at WebUIs as they can be configured separately.
hdds.x509.signature.algorithmSHA256withRSAOZONE, HDDS, SECURITY, CRYPTO_COMPLIANCEX509 certificate signature algorithm.
hdds.xframe.enabledtrueOZONE, HDDSIf true, then enables protection against clickjacking by returning X_FRAME_OPTIONS header value set to SAMEORIGIN. Clickjacking protection prevents an attacker from using transparent or opaque layers to trick a user into clicking on a button or link on another page.
hdds.xframe.valueSAMEORIGINOZONE, HDDSThis configuration value allows the user to specify the value for X-FRAME-OPTIONS. The possible values for this field are DENY, SAMEORIGIN and ALLOW-FROM. Any other value will throw an exception when Datanodes are starting up.
httpfs.access.moderead-writeSets the access mode for HttpFS. If access is not allowed, FORBIDDEN (403) is returned. Valid access modes are: read-write (full access); write-only (PUT, POST and DELETE have full access; GET only allows GETFILESTATUS and LISTSTATUS); read-only (GET has full access; PUT, POST and DELETE are FORBIDDEN).
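As a sketch, restricting an HttpFS instance to read traffic is one property (assuming it is set in the HttpFS server's configuration file, e.g. httpfs-site.xml; the file name is an assumption here):

```xml
<!-- Illustrative: serve read traffic only; PUT, POST and DELETE
     requests are answered with 403 FORBIDDEN. -->
<property>
  <name>httpfs.access.mode</name>
  <value>read-only</value>
</property>
```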
httpfs.buffer.size4096The buffer size used by a read/write request when streaming data from/to HDFS.
httpfs.delegation.token.manager.max.lifetime604800HttpFS delegation token maximum lifetime, default 7 days, in seconds
httpfs.delegation.token.manager.renewal.interval86400HttpFS delegation token renewal interval, default 1 day, in seconds.
httpfs.delegation.token.manager.update.interval86400HttpFS delegation token update interval, default 1 day, in seconds.
httpfs.hadoop.authentication.kerberos.keytab${user.home}/httpfs.keytabThe Kerberos keytab file with the credentials for the Kerberos principal used by httpfs to connect to the HDFS Namenode.
httpfs.hadoop.authentication.kerberos.principal${user.name}/${httpfs.hostname}@${kerberos.realm}The Kerberos principal used by httpfs to connect to the HDFS Namenode.
httpfs.hadoop.authentication.typesimpleDefines the authentication mechanism used by httpfs to connect to the HDFS Namenode. Valid values are 'simple' and 'kerberos'.
httpfs.hadoop.filesystem.cache.purge.frequency60Frequency, in seconds, at which the idle filesystem purging daemon runs.
httpfs.hadoop.filesystem.cache.purge.timeout60Timeout, in seconds, for an idle filesystem to be purged.
httpfs.hostname${httpfs.http.hostname}Property used to synthesize the HTTP Kerberos principal used by httpfs. This property is only used to resolve other properties within this configuration file.
httpfs.http.administratorsACL for the admins; this configuration is used to control who can access the default servlets for the HttpFS server. The value should be a comma separated list of users and groups. The user list comes first and is separated by a space followed by the group list, e.g. "user1,user2 group1,group2". Both users and groups are optional, so "user1", " group1", "", "user1 group1", "user1,user2 group1,group2" are all valid (note the leading space in " group1"). '*' grants access to all users and groups, e.g. '*', '* ' and ' *' are all valid.
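The "users first, then a space, then groups" format above can be sketched as follows (the user and group names are hypothetical):

```xml
<!-- Illustrative: users alice and bob plus members of group ozadmins
     may access the default servlets; all names are hypothetical. -->
<property>
  <name>httpfs.http.administrators</name>
  <value>alice,bob ozadmins</value>
</property>
```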
httpfs.http.hostname0.0.0.0The bind host for HttpFS REST API.
httpfs.http.port14000The HTTP port for HttpFS REST API.
httpfs.servicesorg.apache.ozone.lib.service.instrumentation.InstrumentationService, org.apache.ozone.lib.service.scheduler.SchedulerService, org.apache.ozone.lib.service.security.GroupsService, org.apache.ozone.lib.service.hadoop.FileSystemAccessServiceServices used by the httpfs server.
httpfs.ssl.enabledfalseWhether SSL is enabled. Default is false, i.e. disabled.
kerberos.realmLOCALHOSTKerberos realm, used only if Kerberos authentication is used between the clients and httpfs or between HttpFS and HDFS. This property is only used to resolve other properties within this configuration file.
net.topology.node.switch.mapping.implorg.apache.hadoop.net.ScriptBasedMappingOZONE, SCMThe default implementation of the DNSToSwitchMapping. It invokes a script specified in net.topology.script.file.name to resolve node names. If the value for net.topology.script.file.name is not set, the default value of DEFAULT_RACK is returned for all node names.
ozone.UnsafeByteOperations.enabledtrueOZONE, PERFORMANCE, CLIENTSpecifies whether to use an unsafe or safe buffer for ByteString copies.
ozone.acl.authorizer.classorg.apache.hadoop.ozone.security.acl.OzoneAccessAuthorizerOZONE, SECURITY, ACLAcl authorizer for Ozone.
ozone.acl.enabledfalseOZONE, SECURITY, ACLKey to enable/disable ozone acls.
ozone.administratorsOZONE, SECURITYOzone administrator users delimited by the comma. If not set, only the user who launches an ozone service will be the admin user. This property must be set if ozone services are started by different users. Otherwise, the RPC layer will reject calls from other servers which are started by users not in the list.
ozone.administrators.groupsOZONE, SECURITYOzone administrator groups delimited by the comma. This is the list of groups who can access admin only information from ozone. It is enough to either have the name defined in ozone.administrators or be directly or indirectly in a group defined in this property.
ozone.audit.log.debug.cmd.list.dnauditDATANODEA comma separated list of Datanode commands that are written to the DN audit logs only if the audit log level is debug. Ex: "CREATE_CONTAINER,READ_CONTAINER,UPDATE_CONTAINER".
ozone.audit.log.debug.cmd.list.omauditOMA comma separated list of OzoneManager commands that are written to the OzoneManager audit logs only if the audit log level is debug. Ex: "ALLOCATE_BLOCK,ALLOCATE_KEY,COMMIT_KEY".
ozone.audit.log.debug.cmd.list.scmauditSCMA comma separated list of SCM commands that are written to the SCM audit logs only if the audit log level is debug. Ex: "GET_VERSION,REGISTER,SEND_HEARTBEAT".
ozone.authorization.enabledtrueOZONE, SECURITY, AUTHORIZATIONMaster switch to enable/disable authorization checks in Ozone (admin privilege checks and ACL checks). This property only takes effect when ozone.security.enabled is true. When true: admin privilege checks are always performed, and object ACL checks are controlled by ozone.acl.enabled. When false: no authorization checks are performed. Default is true.
ozone.block.deleting.service.interval1mOZONE, PERFORMANCE, SCMTime interval of the block deleting service. The block deleting service runs on each datanode periodically and deletes blocks queued for deletion. Unit could be defined with postfix (ns,ms,s,m,h,d)
ozone.block.deleting.service.timeout300000msOZONE, PERFORMANCE, SCMA timeout value of block deletion service. If this is set greater than 0, the service will stop waiting for the block deleting completion after this time. This setting supports multiple time unit suffixes as described in dfs.heartbeat.interval. If no suffix is specified, then milliseconds is assumed.
ozone.block.deleting.service.workers10OZONE, PERFORMANCE, SCMNumber of worker threads used by the block deletion service. This configuration should be set to greater than 0.
ozone.chunk.read.buffer.default.size1MBOZONE, SCM, CONTAINER, PERFORMANCEThe default read buffer size during read chunk operations when checksum is disabled. Chunk data will be cached in buffers of this capacity. For chunk data with checksum, the read buffer size will be the same as the number of bytes per checksum (ozone.client.bytes.per.checksum) corresponding to the chunk.
ozone.chunk.read.mapped.buffer.max.count0OZONE, SCM, CONTAINER, PERFORMANCEThe default max count of memory mapped buffers allowed for a DN. Default 0 means no mapped buffers allowed for data read.
ozone.chunk.read.mapped.buffer.threshold32KBOZONE, SCM, CONTAINER, PERFORMANCEThe default read threshold to use memory mapped buffers.
ozone.client.bucket.replication.config.refresh.time.ms30000OZONEDefault time period to refresh the bucket replication config in o3fs clients. Until the bucket replication config is refreshed, the client will continue to use the existing replication config, regardless of whether the bucket replication config was updated at the OM.
ozone.client.bytes.per.checksum16KBCLIENT, CRYPTO_COMPLIANCEA checksum is computed for each block of this many bytes and stored sequentially. The minimum value for this config is 8KB.
ozone.client.checksum.combine.modeCOMPOSITE_CRCCLIENTThe combined checksum type [MD5MD5CRC / COMPOSITE_CRC] determines which algorithm is used to compute the file checksum. COMPOSITE_CRC calculates the combined CRC of the whole file, where the lower-level chunk/block checksums are combined into a file-level checksum. MD5MD5CRC calculates the MD5 of the MD5s of the checksums of individual chunks. The default checksum type is COMPOSITE_CRC.
ozone.client.checksum.typeCRC32CLIENT, CRYPTO_COMPLIANCEThe checksum type [NONE/ CRC32/ CRC32C/ SHA256/ MD5] determines which algorithm would be used to compute checksum for chunk data. Default checksum type is CRC32.
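The three checksum-related client keys above work together; a minimal ozone-site.xml sketch (values are illustrative, not recommendations):

```xml
<!-- Illustrative fragment: client checksum settings. -->
<property>
  <name>ozone.client.checksum.type</name>
  <value>CRC32C</value>
</property>
<property>
  <!-- Must be at least 8KB per the ozone.client.bytes.per.checksum entry. -->
  <name>ozone.client.bytes.per.checksum</name>
  <value>16KB</value>
</property>
<property>
  <name>ozone.client.checksum.combine.mode</name>
  <value>COMPOSITE_CRC</value>
</property>
```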
ozone.client.connection.timeout5000msOZONE, PERFORMANCE, CLIENTConnection timeout for Ozone client in milliseconds.
ozone.client.datastream.buffer.flush.size16MBCLIENTThe boundary at which putBlock is executed
ozone.client.datastream.min.packet.size1MBCLIENTThe maximum size of the ByteBuffer (used via ratis streaming)
ozone.client.datastream.pipeline.modetrueCLIENTStreaming write supports both pipeline mode (datanode1->datanode2->datanode3) and star mode (datanode1->datanode2, datanode1->datanode3). By default, pipeline mode is used.
ozone.client.datastream.sync.size0BCLIENTThe minimum size of written data before forcing the datanodes in the pipeline to flush the pending data to underlying storage. If set to zero or negative, the client will not force the datanodes to flush.
ozone.client.datastream.window.size64MBCLIENTMaximum size of the BufferList (used for retries) per BlockDataStreamOutput instance.
ozone.client.ec.grpc.retries.enabledtrueCLIENTWhether to enable gRPC client retries for EC.
ozone.client.ec.grpc.retries.max3CLIENTThe maximum number of attempts the gRPC client makes before failing over.
ozone.client.ec.grpc.write.timeout30sOZONE, CLIENT, MANAGEMENTTimeout for ozone ec grpc client during write.
ozone.client.ec.reconstruct.stripe.read.pool.limit30CLIENTMaximum thread pool size for reading available EC chunks in parallel to reconstruct the whole stripe.
ozone.client.ec.reconstruct.stripe.write.pool.limit30CLIENTMaximum thread pool size for writing available EC chunks in parallel to reconstruct the whole stripe.
ozone.client.ec.stripe.queue.size2CLIENTThe maximum number of EC stripes that can be buffered in the client before being flushed to datanodes.
ozone.client.elastic.byte.buffer.pool.max.size16GBOZONE, CLIENTThe maximum total size of buffers that can be cached in the client-side ByteBufferPool. This pool is used heavily during EC read and write operations. Setting a limit prevents unbounded memory growth in long-lived rpc clients like the S3 Gateway. Once this limit is reached, used buffers are not put back to the pool and will be garbage collected.
ozone.client.exclude.nodes.expiry.time600000CLIENTTime after which an excluded node is reconsidered for writes. If the value is zero, the node is excluded for the life of the client
ozone.client.failover.max.attempts500Expert only. Ozone RpcClient attempts talking to each OzoneManager ipc.client.connect.max.retries (default = 10) number of times before failing over to another OzoneManager, if available. This parameter represents the number of times per request the client will failover before giving up. This value is kept high so that client does not give up trying to connect to OMs easily.
ozone.client.follower.read.default.consistencyLINEARIZABLE_ALLOW_FOLLOWERThe default consistency when the client enables follower read. Currently, the supported follower read consistency levels are LINEARIZABLE_ALLOW_FOLLOWER and LOCAL_LEASE. The default value is LINEARIZABLE_ALLOW_FOLLOWER, which preserves the same strong consistency behavior when switching from leader-only read to follower read.
ozone.client.follower.read.enabledfalseEnable client to read from OM followers. If false, all client requests are sent to the OM leader.
ozone.client.fs.default.bucket.layoutFILE_SYSTEM_OPTIMIZEDOZONE, CLIENTDefault bucket layout value used when buckets are created using OFS. Supported values are LEGACY and FILE_SYSTEM_OPTIMIZED. FILE_SYSTEM_OPTIMIZED: This layout allows the bucket to support atomic rename/delete operations and also allows interoperability between S3 and FS APIs. Keys written via S3 API with a "/" delimiter will create intermediate directories.
ozone.client.hbase.enhancements.allowedfalseCLIENTWhen set to false, client-side HBase enhancement-related Ozone (experimental) features are disabled (not allowed to be enabled) regardless of whether those configs are set. Here is the list of configs and values overridden when this config is set to false: 1. ozone.fs.hsync.enabled = false 2. ozone.client.incremental.chunk.list = false 3. ozone.client.stream.putblock.piggybacking = false 4. ozone.client.key.write.concurrency = 1 A warning message will be printed if any of the above configs are overridden by this.
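The entry above acts as a client-side master switch for the listed experimental keys; opting in might look like the following ozone-site.xml sketch (note that, per the ozone.fs.hsync.enabled entry below, hsync additionally requires the server-side ozone.hbase.enhancements.allowed switch):

```xml
<!-- Illustrative fragment: enabling the experimental client-side HBase
     enhancements. If the master switch is false, the dependent keys below
     are forced back to their safe defaults regardless of what is set. -->
<property>
  <name>ozone.client.hbase.enhancements.allowed</name>
  <value>true</value>
</property>
<property>
  <name>ozone.fs.hsync.enabled</name>
  <value>true</value>
</property>
<property>
  <name>ozone.client.incremental.chunk.list</name>
  <value>true</value>
</property>
<property>
  <name>ozone.client.stream.putblock.piggybacking</name>
  <value>true</value>
</property>
```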
ozone.client.incremental.chunk.listfalseCLIENTClient PutBlock request can choose incremental chunk list rather than full chunk list to optimize performance. Critical to HBase. EC does not support this feature. Can be enabled only when ozone.client.hbase.enhancements.allowed = true
ozone.client.key.latest.version.locationtrueOZONE, CLIENTOzone client gets the latest version location.
ozone.client.key.provider.cache.expiry10dOZONE, CLIENT, SECURITYOzone client security key provider cache expiration time.
ozone.client.key.write.concurrency1CLIENTMaximum concurrent writes allowed on each key. Defaults to 1 which matches the behavior before HDDS-9844. For unlimited write concurrency, set this to -1 or any negative integer value. Any value other than 1 is effective only when ozone.client.hbase.enhancements.allowed = true
ozone.client.leader.read.default.consistencyDEFAULTThe default consistency when the client disables follower read. Currently, the supported leader read consistency levels are DEFAULT and LINEARIZABLE_LEADER_ONLY. The default value is DEFAULT for backward compatibility; it is mostly strongly consistent.
ozone.client.list.cache1000OZONE, PERFORMANCEConfiguration property to configure the cache size of client list calls.
ozone.client.max.ec.stripe.write.retries10CLIENTWhen an EC stripe write fails, the client requests allocation of a new block group and writes the failed stripe into it. If the same stripe also fails in the newly acquired block group, the client retries by requesting yet another block group. This configuration limits the number of such retries. By default, the number of retries is 10.
ozone.client.max.retries5CLIENTMaximum number of retries by Ozone Client on encountering exception while writing a key
ozone.client.read.max.retries3CLIENTMaximum number of retries by Ozone Client on encountering connectivity exception when reading a key.
ozone.client.read.retry.interval1CLIENTIndicates the time duration in seconds a client will wait before retrying a read key request on encountering a connectivity exception from Datanodes. By default the interval is 1 second
ozone.client.read.timeout30sOZONE, CLIENT, MANAGEMENTTimeout for ozone grpc client during read.
ozone.client.retry.interval0CLIENTIndicates the time duration a client will wait before retrying a write key request on encountering an exception. By default there is no wait
ozone.client.server-defaults.validity.period.ms3600000OZONE, CLIENT, SECURITYThe amount of milliseconds after which cached server defaults are updated. By default this parameter is set to 1 hour. Supports multiple time unit suffixes (case insensitive). If no time unit is specified, then milliseconds is assumed.
ozone.client.socket.timeout5000msOZONE, CLIENTSocket timeout for Ozone client. Unit could be defined with postfix (ns,ms,s,m,h,d)
ozone.client.stream.buffer.flush.delaytrueCLIENTDefault is true: when flush() is called, the client checks whether the data in the current buffer is greater than ozone.client.stream.buffer.size; if so, the buffer is sent to the datanode. This behavior can be turned off by setting this configuration to false.
ozone.client.stream.buffer.flush.size16MBCLIENTSize which determines at what buffer position a partial flush will be initiated during write. It should be a multiple of ozone.client.stream.buffer.size
ozone.client.stream.buffer.increment0BCLIENTThe buffer (defined by ozone.client.stream.buffer.size) will be grown in increments of this size. If zero, the full buffer is created at once. Setting it to a value between 0 and ozone.client.stream.buffer.size can reduce memory usage for very small keys, but has a performance overhead.
ozone.client.stream.buffer.max.size32MBCLIENTSize which determines the buffer position at which write calls are blocked until the first partial flush is acknowledged by all servers.
ozone.client.stream.buffer.size4MBCLIENTThe size of chunks the client will send to the server
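The stream buffer keys above are interrelated: flush.size should be a multiple of buffer.size, and max.size bounds unacknowledged data. A consistent sketch using the documented defaults:

```xml
<!-- Illustrative fragment: stream buffer sizing. -->
<property>
  <name>ozone.client.stream.buffer.size</name>
  <value>4MB</value>
</property>
<property>
  <!-- 16MB = 4 x buffer.size, satisfying the "multiple of" requirement. -->
  <name>ozone.client.stream.buffer.flush.size</name>
  <value>16MB</value>
</property>
<property>
  <!-- Writes block once this much data awaits acknowledgement. -->
  <name>ozone.client.stream.buffer.max.size</name>
  <value>32MB</value>
</property>
```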
ozone.client.stream.putblock.piggybackingfalseCLIENTAllow PutBlock to be piggybacked in WriteChunk requests if the chunk is small. Can be enabled only when ozone.client.hbase.enhancements.allowed = true
ozone.client.stream.read.pre-read-size33554432CLIENTExtra bytes to prefetch during streaming reads.
ozone.client.stream.read.response-data-size1048576CLIENTChunk size of streaming read responses from datanodes.
ozone.client.stream.read.timeout10sCLIENTTimeout for receiving streaming read responses.
ozone.client.stream.readblock.enablefalseCLIENTAllow ReadBlock to stream all the readChunk in one request.
ozone.client.verify.checksumtrueCLIENTWhether the Ozone client verifies the checksum of each checksum-sized block of data.
ozone.client.wait.between.retries.millis2000Expert only. The time to wait, in milliseconds, between retry attempts to contact OM. Wait time increases linearly if same OM is retried again. If retrying on multiple OMs proxies in round robin fashion, the wait time is introduced after all the OM proxies have been attempted once.
ozone.container.cache.lock.stripes1024PERFORMANCE, CONTAINER, STORAGEContainer DB open is an exclusive operation. We use a stripe lock to guarantee that different threads can open different container DBs concurrently, while for one container DB, only one thread can open it at the same time. This setting controls the lock stripes.
ozone.container.cache.size1024PERFORMANCE, CONTAINER, STORAGEThe open container is cached on the data node side. We maintain an LRU cache for caching the recently used containers. This setting controls the size of that cache.
ozone.csi.default-volume-size1000000000STORAGEThe default size of created volumes (if not specified).
ozone.csi.mount.commandgoofys --endpoint %s %s %sSTORAGEThe mount command used to publish a volume. The %s placeholders will be replaced by the s3gAddress, volumeId, and target path.
ozone.csi.ownerSTORAGEThe username used to create the requested storage. Used as a Hadoop username, and the generated Ozone volume is used to store all the buckets. WARNING: Using CSI in a secure environment can be a security hole, as ALL users can request the mount of a specific bucket via the CSI interface.
ozone.csi.s3g.addresshttp://localhost:9878STORAGEThe address of S3 Gateway endpoint.
ozone.csi.socket/var/lib/csi.sockSTORAGEThe socket where all the CSI services will listen (file name).
ozone.default.bucket.layoutOZONE, MANAGEMENTDefault bucket layout used by Ozone Manager during bucket creation when a client does not specify the bucket layout option. Supported values are OBJECT_STORE and FILE_SYSTEM_OPTIMIZED. OBJECT_STORE: This layout allows the bucket to behave as a pure object store and will not allow interoperability between S3 and FS APIs. FILE_SYSTEM_OPTIMIZED: This layout allows the bucket to support atomic rename/delete operations and also allows interoperability between S3 and FS APIs. Keys written via S3 API with a "/" delimiter will create intermediate directories.
ozone.directory.deleting.service.interval1mOZONE, PERFORMANCE, OMTime interval of the directory deleting service. It runs on the OM periodically and cleans up orphan directories and their sub-trees. For every orphan directory it deletes the sub-path tree structure (dirs/files) and sends sub-files to KeyDeletingService to delete their blocks. Unit could be defined with postfix (ns,ms,s,m,h,d)
ozone.filesystem.snapshot.enabledtrueOZONE, OMEnables Ozone filesystem snapshot feature if set to true on the OM side. Disables it otherwise.
ozone.freon.http-address0.0.0.0:9884OZONE, MANAGEMENTThe address and the base port where the FREON web ui will listen on. If the port is 0 then the server will start on a free port.
ozone.freon.http-bind-host0.0.0.0OZONE, MANAGEMENTThe actual address the Freon web server will bind to. If this optional address is set, it overrides only the hostname portion of ozone.freon.http-address.
ozone.freon.http.auth.kerberos.keytab/etc/security/keytabs/HTTP.keytabSECURITYKeytab used by Freon.
ozone.freon.http.auth.kerberos.principalHTTP/_HOST@REALMSECURITYSecurity principal used by freon.
ozone.freon.http.auth.typesimpleFREON, SECURITYsimple or kerberos. If kerberos is set, SPNEGO will be used for http authentication.
ozone.freon.http.enabledtrueOZONE, MANAGEMENTProperty to enable or disable FREON web ui.
ozone.freon.https-address0.0.0.0:9885OZONE, MANAGEMENTThe address and the base port where the Freon web server will listen on using HTTPS. If the port is 0 then the server will start on a free port.
ozone.freon.https-bind-host0.0.0.0OZONE, MANAGEMENTThe actual address the Freon web server will bind to using HTTPS. If this optional address is set, it overrides only the hostname portion of ozone.freon.https-address.
ozone.fs.datastream.auto.threshold4MBOZONE, DATANODEA threshold to auto select datastream to write files in OzoneFileSystem.
ozone.fs.datastream.enabledfalseOZONE, DATANODETo enable/disable filesystem write via ratis streaming.
ozone.fs.hsync.enabledfalseOZONE, CLIENT, OMEnable hsync/hflush on the Ozone Manager and/or client side. Disabled by default. Can be enabled only when ozone.hbase.enhancements.allowed = true
ozone.fs.iterate.batch-size100OZONE, OZONEFSIterate batch size of delete when use BasicOzoneFileSystem.
ozone.fs.listing.page.size1024OZONE, CLIENTListing page size used by the client for the number of items listed in the output of fs-related sub-commands. Please set this config value responsibly to avoid high resource usage. The maximum value is restricted to 5000 for optimum performance.
ozone.fs.listing.page.size.max5000OZONE, OMMaximum listing page size enforced by the server for listing items in the output of fs-related sub-commands. Please set this config value responsibly to avoid high resource usage. The maximum value is restricted to 5000 for optimum performance.
ozone.hbase.enhancements.allowedfalseOZONE, OMWhen set to false, server-side HBase enhancement-related Ozone (experimental) features are disabled (not allowed to be enabled) regardless of whether those configs are set. Here is the list of configs and values overridden when this config is set to false: 1. ozone.fs.hsync.enabled = false A warning message will be printed if any of the above configs are overridden by this.
ozone.http.basedirOZONE, OM, SCM, MANAGEMENTThe base dir for the HTTP Jetty server to extract contents into. If this property is not configured, by default Jetty will create a directory under ${ozone.metadata.dirs}/webserver. In production environments, it is strongly suggested to instruct Jetty to use a different parent directory by setting this property to the name of the desired parent directory. The value of the property will be used to set the Jetty context attribute 'org.eclipse.jetty.webapp.basetempdir'. The directory named by this property must exist and be writable.
ozone.http.filter.initializersOZONE, SECURITY, KERBEROSSet to org.apache.hadoop.security.AuthenticationFilterInitializer to enable Kerberos authentication for Ozone HTTP web consoles using the SPNEGO protocol. When this property is set, ozone.security.http.kerberos.enabled should be set to true.
ozone.http.policyHTTP_ONLYOZONE, SECURITY, MANAGEMENTDecides if HTTPS (SSL) is supported on Ozone. This configures the HTTP endpoint for Ozone daemons. The following values are supported: HTTP_ONLY: service is provided only on HTTP; HTTPS_ONLY: service is provided only on HTTPS; HTTP_AND_HTTPS: service is provided on both HTTP and HTTPS.
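Switching the web endpoints to HTTPS combines this key with the keystore resource key documented below; a minimal sketch (the ssl-server.xml contents themselves are configured separately):

```xml
<!-- Illustrative fragment: serving Ozone web UIs over both HTTP and HTTPS. -->
<property>
  <name>ozone.http.policy</name>
  <value>HTTP_AND_HTTPS</value>
</property>
<property>
  <!-- Resource file holding the server keystore information. -->
  <name>ozone.https.server.keystore.resource</name>
  <value>ssl-server.xml</value>
</property>
```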
ozone.https.client.keystore.resourcessl-client.xmlOZONE, SECURITY, MANAGEMENTResource file from which ssl client keystore information will be extracted
ozone.https.client.need-authfalseOZONE, SECURITY, MANAGEMENTWhether SSL client certificate authentication is required
ozone.https.server.keystore.resourcessl-server.xmlOZONE, SECURITY, MANAGEMENTResource file from which ssl server keystore information will be extracted
ozone.key.deleting.limit.per.task50000OM, PERFORMANCEThe maximum number of keys to be scanned by the key deleting service per time interval in OM. Those keys are sent for metadata deletion and generate transactions in SCM for the next async deletion between SCM and DataNode.
ozone.key.preallocation.max.blocks64OZONE, OM, PERFORMANCEWhile allocating blocks from OM, this configuration limits the maximum number of blocks being allocated. It ensures that the allocated-block response does not exceed the RPC payload limit. If the client needs more space for the write, separate block allocation requests will be made.
ozone.manager.delegation.remover.scan.interval3600000Time interval after which the Ozone secret manager scans for expired delegation tokens.
ozone.manager.delegation.token.max-lifetime7dThe default maximum time interval after which an Ozone delegation token will no longer be renewed. Delegation tokens are signed and verified using a secret key, which has a maximum lifetime of hdds.secret.key.expiry.duration. To guarantee that a delegation token can be properly loaded, verified, and renewed during its lifetime, (ozone.manager.delegation.token.max-lifetime + hdds.secret.key.rotate.duration + ozone.manager.delegation.remover.scan.interval) must not be greater than hdds.secret.key.expiry.duration. If any of ozone.manager.delegation.token.max-lifetime, hdds.secret.key.expiry.duration, hdds.secret.key.rotate.duration, or ozone.manager.delegation.remover.scan.interval is changed, the above constraint must be checked and the values adjusted accordingly if necessary.
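The lifetime constraint in the entry above can be checked with a quick worked example; the rotate and expiry values below are illustrative placeholders, not documented defaults:

```xml
<!-- Illustrative check of the delegation token lifetime constraint:
       ozone.manager.delegation.token.max-lifetime    = 7d   (default)
       hdds.secret.key.rotate.duration                = 1d   (example value)
       ozone.manager.delegation.remover.scan.interval = 1h   (default, 3600000 ms)
     Sum: 7d + 1d + 1h = 8d 1h, which must not exceed
       hdds.secret.key.expiry.duration                = 9d   (example value) -> OK -->
<property>
  <name>ozone.manager.delegation.token.max-lifetime</name>
  <value>7d</value>
</property>
<property>
  <name>hdds.secret.key.expiry.duration</name>
  <value>9d</value>
</property>
```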
ozone.manager.delegation.token.renew-interval1dDefault time interval after which ozone delegation token will require renewal before any further use.
ozone.metadata.dirsOZONE, OM, SCM, CONTAINER, STORAGE, REQUIREDThis setting is the fallback location for SCM, OM, Recon and DataNodes to store their metadata. This setting may be used only in test/PoC clusters to simplify configuration. For production clusters or any time you care about performance, it is recommended that ozone.om.db.dirs, ozone.scm.db.dirs and hdds.container.ratis.datanode.storage.dir be configured separately.
ozone.metadata.dirs.permissions700Permissions for the metadata directories for fallback location for SCM, OM, Recon and DataNodes to store their metadata. The permissions have to be octal or symbolic. This is the fallback used in case the default permissions for OM,SCM,Recon,Datanode are not set.
ozone.metastore.rocksdb.cf.write.buffer.size128MBOZONE, OM, SCM, STORAGE, PERFORMANCEThe write buffer (memtable) size for each column family of the rocksdb store. Check the rocksdb documentation for more details.
ozone.metastore.rocksdb.statisticsOFFOZONE, OM, SCM, STORAGE, PERFORMANCEThe statistics level of the RocksDB store. If you use any value from org.rocksdb.StatsLevel (e.g. ALL or EXCEPT_DETAILED_TIMERS), the RocksDB statistics will be exposed over a JMX bean with the chosen setting. Set it to OFF to not initialize RocksDB statistics at all. Please note that collecting statistics can incur a 5-10% performance penalty. Check the RocksDB documentation for more details.
ozone.network.flexible.fqdn.resolution.enabledfalseOZONE, SCM, OMSCM and OM hosts will be able to resolve themselves based on their host name instead of FQDN. This is useful when deploying to Kubernetes environments, during initial startup when [pod_name].[service_name] is not yet resolvable because of the probe.
ozone.network.jvm.address.cache.enabledtrueOZONE, SCM, OM, DATANODEWhether to enable the JVM network address cache. In environments such as Kubernetes, the IPs of SCM, OM, and datanode instances can change. Disabling this cache helps to quickly resolve FQDNs to the new IPs.
ozone.network.topology.aware.readtrueOZONE, PERFORMANCEWhether to enable topology aware read to improve the read performance.
ozone.om.address0.0.0.0:9862OM, REQUIREDThe address of the Ozone OM service. This allows clients to discover the address of the OM. When HA mode is enabled, append the service ID and node ID to each OM property. For example: ozone.om.address.service1.om1
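The HA naming convention described in the entry above (appending service ID and node ID) can be sketched as follows; the service ID "service1", node IDs "om1".."om3", and hostnames are hypothetical, and ozone.om.service.ids / ozone.om.nodes.* are assumed companion keys from the OM HA setup:

```xml
<!-- Illustrative OM HA fragment with one service ID and three node IDs. -->
<property>
  <name>ozone.om.service.ids</name>
  <value>service1</value>
</property>
<property>
  <name>ozone.om.nodes.service1</name>
  <value>om1,om2,om3</value>
</property>
<property>
  <name>ozone.om.address.service1.om1</name>
  <value>om1.example.com:9862</value>
</property>
<property>
  <name>ozone.om.address.service1.om2</name>
  <value>om2.example.com:9862</value>
</property>
<property>
  <name>ozone.om.address.service1.om3</name>
  <value>om3.example.com:9862</value>
</property>
```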
ozone.om.admin.protocol.max.retries20OM, MANAGEMENTExpert only. The maximum number of retries for Ozone Manager Admin protocol on each OM.
ozone.om.admin.protocol.wait.between.retries1000OM, MANAGEMENTExpert only. The time to wait, in milliseconds, between retry attempts for Ozone Manager Admin protocol.
ozone.om.allow.leader.skip.linearizable.readfalseOM, PERFORMANCE, HAAllow the leader to handle requests directly, without checking leadership for every request.
ozone.om.client.rpc.timeout15mOZONE, OM, CLIENTRpcClient timeout on waiting for the response from OzoneManager. The default value is set to 15 minutes. If ipc.client.ping is set to true and this rpc-timeout is greater than the value of ipc.ping.interval, the effective value of the rpc-timeout is rounded up to multiple of ipc.ping.interval.
ozone.om.client.trash.core.pool.size5OZONE, OM, CLIENTTotal number of threads in pool for the Trash Emptier
ozone.om.compaction.service.columnfamilieskeyTable,fileTable,directoryTable,deletedTable,deletedDirectoryTable,multipartInfoTableOZONE, OM, PERFORMANCEA comma separated, no spaces list of all the column families that are compacted by the compaction service. If this is empty, no column families are compacted.
ozone.om.compaction.service.enabledfalseOZONE, OM, PERFORMANCEEnable or disable a background job that periodically compacts rocksdb tables flagged for compaction.
ozone.om.compaction.service.run.interval6hOZONE, OM, PERFORMANCEA background job that periodically compacts rocksdb tables flagged for compaction. Unit could be defined with postfix (ns,ms,s,m,h,d)
ozone.om.compaction.service.timeout10mOZONE, OM, PERFORMANCEA timeout value of compaction service. If this is set greater than 0, the service will stop waiting for compaction completion after this time. Unit could be defined with postfix (ns,ms,s,m,h,d)
ozone.om.container.location.cache.size100000OZONE, OMThe size of the container locations cache in Ozone Manager. This cache allows Ozone Manager to populate block locations in key-read responses without calling SCM, thus increasing Ozone Manager read performance.
ozone.om.container.location.cache.ttl360mOZONE, OMThe time to live for container location cache in Ozone.
ozone.om.db.checkpoint.use.inode.based.transfertrueOZONE, OMDenotes whether the inode-based transfer implementation is used by default for OM bootstrap.
ozone.om.db.dirsOZONE, OM, STORAGE, PERFORMANCEDirectory where the OzoneManager stores its metadata. This should be specified as a single directory. If the directory does not exist then the OM will attempt to create it. If undefined, then the OM will log a warning and fallback to ozone.metadata.dirs. This fallback approach is not recommended for production environments.
ozone.om.db.dirs.permissions700Permissions for the metadata directories for Ozone Manager. The permissions have to be octal or symbolic. If the default permissions are not set then the default value of 700 will be used.
ozone.om.db.max.open.files-1OZONE, OMThe maximum number of files that the OM RocksDB will open simultaneously. Essentially sets the max_open_files config for the active OM RocksDB instance, limiting the total number of files opened by the OM DB. The default is -1, which is unlimited and gives maximum performance. If you are certain that your ulimit will always be bigger than the number of files in the database, keep max_open_files at -1; otherwise set it to a value less than or equal to the ulimit.
ozone.om.decommissioned.nodes.EXAMPLEOMSERVICEIDOM, HAComma-separated list of OM node Ids which have been decommissioned. OMs present in this list will not be included in the OM HA ring.
ozone.om.delta.update.data.size.max.limit1024MBOM, MANAGEMENTRecon periodically gets limited delta updates from OM starting from a sequence number. Based on the sequence number passed, an OM DB delta update may span a large number of log files, and each log batch may be huge depending on how frequently Ozone clients write and update data. To avoid heap memory growth, this config is used as a limiting factor (default 1 GB) while preparing the DB updates object.
ozone.om.edekcacheloader.initial.delay.ms3000When a KeyProvider is configured, the delay before the first attempt to warm up the EDEK cache on OM startup.
ozone.om.edekcacheloader.interval.ms1000When a KeyProvider is configured, the interval for warming up the EDEK cache on OM startup. All EDEKs will be loaded from KMS into the provider cache. The EDEK cache loader will keep trying to warm up the cache until it succeeds or the OM leaves the active state.
ozone.om.edekcacheloader.max-retries10When a KeyProvider is configured, the maximum number of retries allowed when attempting to warm up the EDEK cache if no key succeeds on OM startup.
ozone.om.enable.filesystem.pathsfalseOM, OZONEIf true, key names will be interpreted as file system paths: '/' is treated as a special character, and paths are normalized and must follow Unix filesystem path naming conventions. This flag is helpful when objects created by S3G need to be accessed using OFS/O3Fs. If false, it falls back to the default behavior of Key/MPU create requests, where key paths are not normalized, intermediate directories are not created, and no file checks are performed to enforce filesystem semantics.
ozone.om.enable.ofs.shared.tmp.dirfalseOZONE, OMEnable shared ofs tmp directory ofs://tmp. Allows a root tmp directory with sticky-bit behaviour.
ozone.om.follower.read.local.lease.enabledfalseOM, PERFORMANCE, HA, RATISWhether the local lease for follower read is enabled. If enabled, a follower OM decides whether to return local data directly based on log lag and time lag.
ozone.om.follower.read.local.lease.log.limit10000OM, PERFORMANCE, HA, RATISIf the log lag between the leader OM and a follower OM is larger than this number, the follower OM is considered not up-to-date. Set this to -1 to allow infinite lag.
ozone.om.follower.read.local.lease.time.ms5000OM, PERFORMANCE, HA, RATISIf the time lag, in milliseconds, between the leader OM and a follower OM is larger than this number, the follower OM is considered not up-to-date. By default, it is set to the Ratis RPC timeout value. Set this to -1 to allow infinite lag.
ozone.om.fs.snapshot.max.limit10000OZONE, OM, MANAGEMENTThe maximum number of filesystem snapshots allowed in an Ozone Manager.
ozone.om.group.rightsREAD, LISTOM, SECURITYDefault group permissions set for an object in OzoneManager.
ozone.om.grpc.bossgroup.size8OZONE, OM, S3GATEWAYOM grpc server netty boss event group size.
ozone.om.grpc.maximum.response.length134217728OZONE, OM, S3GATEWAYOM/S3GATEWAY OMRequest, OMResponse over grpc max message length (bytes).
ozone.om.grpc.port8981MANAGEMENTPort used for the GrpcOmTransport OzoneManagerServiceGrpc server
ozone.om.grpc.read.thread.num32OZONE, OM, S3GATEWAYOM grpc server read thread pool core thread size.
ozone.om.grpc.workergroup.size32OZONE, OM, S3GATEWAYOM grpc server netty worker event group size.
ozone.om.ha.raft.server.log.appender.wait-time.min0msOZONE, OM, RATIS, PERFORMANCEMinimum wait time between two appendEntries calls.
ozone.om.ha.raft.server.read.leader.lease.enabledfalseOZONE, OM, RATIS, PERFORMANCEWhether to enable the leader lease on the Ratis leader.
ozone.om.ha.raft.server.read.optionDEFAULTOZONE, OM, RATIS, PERFORMANCESelect the Ratis server read option. Possible values are: DEFAULT - Directly query statemachine (non-linearizable). Only the leader can serve read requests. LINEARIZABLE - Use ReadIndex (see Raft Paper section 6.4) to maintain linearizability. Both the leader and the followers can serve read requests.
ozone.om.ha.raft.server.retrycache.expirytime300sOZONE, OM, RATISThe timeout duration of the retry cache.
ozone.om.handler.count.key100OM, PERFORMANCEThe number of RPC handler threads for OM service endpoints.
ozone.om.hierarchical.resource.locks.hard.limit10000Maximum number of lock objects that could be present in the pool.
ozone.om.hierarchical.resource.locks.soft.limit1024Soft limit for number of lock objects that could be idle in the pool.
ozone.om.http-address0.0.0.0:9874OM, MANAGEMENTThe address and the base port where the OM web UI will listen on. If the port is 0, then the server will start on a free port. However, it is best to specify a well-known port, so it is easy to connect and see the OM management UI. When HA mode is enabled, append the service ID and node ID to each OM property. For example: ozone.om.http-address.service1.om1
ozone.om.http-bind-host0.0.0.0OM, MANAGEMENTThe actual address the OM web server will bind to. If this optional address is set, it overrides only the hostname portion of ozone.om.http-address. When HA mode is enabled, append the service ID and node ID to each OM property. For example: ozone.om.http-bind-host.service1.om1
ozone.om.http.auth.kerberos.keytab/etc/security/keytabs/HTTP.keytabOZONE, SECURITY, KERBEROSThe keytab file used by OM http server to login as its service principal if SPNEGO is enabled for om http server.
ozone.om.http.auth.kerberos.principalHTTP/_HOST@REALMOZONE, SECURITY, KERBEROSOzone Manager http server service principal if SPNEGO is enabled for om http server.
ozone.om.http.auth.typesimpleOM, SECURITY, KERBEROSsimple or kerberos. If kerberos is set, SPNEGO will be used for http authentication.
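For example, the three keys above combine to enable SPNEGO for the OM web endpoints; the principal realm and keytab path below are placeholders for your environment:

```xml
<!-- Illustrative ozone-site.xml fragment: Kerberos/SPNEGO for the OM HTTP server. -->
<property>
  <name>ozone.om.http.auth.type</name>
  <value>kerberos</value>
</property>
<property>
  <name>ozone.om.http.auth.kerberos.principal</name>
  <value>HTTP/_HOST@EXAMPLE.COM</value>
</property>
<property>
  <name>ozone.om.http.auth.kerberos.keytab</name>
  <value>/etc/security/keytabs/HTTP.keytab</value>
</property>
```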
ozone.om.http.enabledtrueOM, MANAGEMENTProperty to enable or disable OM web user interface.
ozone.om.https-address0.0.0.0:9875OM, MANAGEMENT, SECURITYThe address and the base port where the OM web UI will listen on using HTTPS. If the port is 0 then the server will start on a free port. When HA mode is enabled, append the service ID and node ID to each OM property. For example: ozone.om.https-address.service1.om1
ozone.om.https-bind-host0.0.0.0OM, MANAGEMENT, SECURITYThe actual address the OM web server will bind to using HTTPS. If this optional address is set, it overrides only the hostname portion of ozone.om.https-address. When HA mode is enabled, append the service ID and node ID to each OM property. For example: ozone.om.https-bind-host.service1.om1
ozone.om.internal.service.idOM, HAService ID of the Ozone Manager. If this is not set, fall back to ozone.om.service.ids to find the service ID it belongs to.
ozone.om.kerberos.keytab.file/etc/security/keytabs/OM.keytabOZONE, SECURITY, KERBEROSThe keytab file used by OzoneManager daemon to login as its service principal. The principal name is configured with ozone.om.kerberos.principal.
ozone.om.kerberos.principalOM/_HOST@REALMOZONE, SECURITY, KERBEROSThe OzoneManager service principal. Ex om/_HOST@REALM.COM
ozone.om.kerberos.principal.pattern*A client-side RegEx that can be configured to control allowed realms to authenticate with (useful in cross-realm env.)
ozone.om.key.path.lock.enabledfalseOZONE, OMDefaults to false. If true, the fine-grained KEY_PATH_LOCK functionality is enabled. If false, it is disabled.
ozone.om.keyname.character.check.enabledfalseOM, OZONEIf true, checks whether the key name contains illegal characters when creating/renaming a key. For the definition of illegal characters, follow the rules in Amazon S3's object key naming guide.
ozone.om.leader.election.minimum.timeout.duration5sOZONE, OM, RATIS, MANAGEMENT, DEPRECATEDDEPRECATED. Leader election timeout uses ratis rpc timeout which can be set via ozone.om.ratis.minimum.timeout.
ozone.om.lease.hard.limit7dOZONE, OM, PERFORMANCEControls how long an open hsync key is considered as active. Specifically, if a hsync key has been open longer than the value of this config entry, that open hsync key is considered as expired (e.g. due to client crash). Unit could be defined with postfix (ns,ms,s,m,h,d)
ozone.om.lease.soft.limit60sOZONE, OMHsync soft limit lease period.
ozone.om.lock.fairfalseIf this is true, the Ozone Manager lock will be used in Fair mode, which will schedule threads in the order received/queued. If this is false, uses non-fair ordering. See java.util.concurrent.locks.ReentrantReadWriteLock for more information on fair/non-fair locks.
ozone.om.max.buckets100000OZONE, OMMaximum number of buckets across all volumes.
ozone.om.multitenancy.enabledfalseOZONE, OMEnable S3 Multi-Tenancy. If disabled, all S3 multi-tenancy requests are rejected.
ozone.om.multitenancy.ranger.sync.interval10mOZONE, OMDetermines how often the Multi-Tenancy Ranger background sync thread service should run. The background thread periodically checks Ranger policies and roles created by the Multi-Tenancy feature and overwrites them if obvious discrepancies are detected. Value should be set with a unit suffix (ns,ms,s,m,h,d)
ozone.om.multitenancy.ranger.sync.timeout10sOZONE, OMThe timeout for each Multi-Tenancy Ranger background sync thread run. If the timeout has been reached, a warning message will be logged.
ozone.om.namespace.s3.stricttrueOZONE, OMThe Ozone namespace follows the S3 naming rules by default. However, this parameter allows the namespace to support non-S3-compatible characters.
ozone.om.network.topology.refresh.duration1hSCM, OZONE, OMThe interval at which the updated network topology cluster tree is periodically fetched from SCM.
ozone.om.node.idOM, HAThe ID of this OM node. If the OM node ID is not configured it is determined automatically by matching the local node's address with the configured address. If node ID is not deterministic from the configuration, then it is set to default node id - om1.
ozone.om.nodes.EXAMPLEOMSERVICEIDOM, HAComma-separated list of OM node IDs for a given OM service ID (e.g. EXAMPLEOMSERVICEID). The OM service ID should be the value (one of the values, if there are multiple) set for the parameter ozone.om.service.ids. Decommissioned nodes (represented by node IDs in the ozone.om.decommissioned.nodes config list) will be ignored and not included in the OM HA setup even if added to this list. These unique identifiers are used by OzoneManagers in an HA setup to determine all the OzoneManagers belonging to the same OM service in the cluster. For example, if you used "omService1" as the OM service ID and wanted to use "om1", "om2" and "om3" as the individual IDs of the OzoneManagers, you would configure the property ozone.om.nodes.omService1 with the value "om1,om2,om3".
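Putting ozone.om.service.ids, ozone.om.nodes and the per-node address keys together, a minimal three-node OM HA sketch could look like the fragment below; hostnames are placeholders, and ozone.om.address (the RPC address key, not listed in this excerpt) is assumed to follow the same serviceId.nodeId suffix pattern described above:

```xml
<!-- Illustrative ozone-site.xml fragment: three-node OM HA. -->
<property>
  <name>ozone.om.service.ids</name>
  <value>omService1</value>
</property>
<property>
  <name>ozone.om.nodes.omService1</name>
  <value>om1,om2,om3</value>
</property>
<!-- One address per node, suffixed with <serviceId>.<nodeId>. -->
<property>
  <name>ozone.om.address.omService1.om1</name>
  <value>om1.example.com</value>
</property>
<property>
  <name>ozone.om.address.omService1.om2</name>
  <value>om2.example.com</value>
</property>
<property>
  <name>ozone.om.address.omService1.om3</name>
  <value>om3.example.com</value>
</property>
```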
ozone.om.object.creation.ignore.client.aclsfalseOM, SECURITYIgnore ACLs sent by client to OzoneManager during volume/bucket/key creation.
ozone.om.open.key.cleanup.limit.per.task1000OZONE, OM, PERFORMANCEThe maximum number of open keys to be identified as expired and marked for deletion by one run of the open key cleanup service on the OM. This property is used to throttle the actual number of open key deletions on the OM.
ozone.om.open.key.cleanup.service.interval24hOZONE, OM, PERFORMANCEA background job that periodically checks open key entries and marks expired open keys for deletion. This entry controls the interval of this cleanup check. Unit could be defined with postfix (ns,ms,s,m,h,d)
ozone.om.open.key.cleanup.service.timeout300sOZONE, OM, PERFORMANCETimeout value of the open key cleanup service. If this is set greater than 0, the service will stop waiting for open key deletion to complete after this time. If a large proportion of open key deletions time out, this value needs to be increased or ozone.om.open.key.cleanup.limit.per.task should be decreased. Unit could be defined with postfix (ns,ms,s,m,h,d)
ozone.om.open.key.expire.threshold7dOZONE, OM, PERFORMANCEControls how long an open key operation is considered active. Specifically, if a key has been open longer than the value of this config entry, that open key is considered as expired (e.g. due to client crash). Unit could be defined with postfix (ns,ms,s,m,h,d)
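As the timeout description above suggests, the open-key cleanup keys are tuned together: if runs frequently time out, either raise the timeout or lower the per-task limit. A hypothetical tuning fragment (values are examples only, not recommendations):

```xml
<!-- Illustrative ozone-site.xml fragment: open-key cleanup tuning. -->
<property>
  <name>ozone.om.open.key.cleanup.service.interval</name>
  <value>12h</value>
</property>
<property>
  <!-- Smaller batches per run reduce the chance of hitting the service timeout. -->
  <name>ozone.om.open.key.cleanup.limit.per.task</name>
  <value>500</value>
</property>
<property>
  <name>ozone.om.open.key.cleanup.service.timeout</name>
  <value>600s</value>
</property>
```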
ozone.om.open.mpu.cleanup.service.interval24hOZONE, OM, PERFORMANCEA background job periodically checks for inactive multipart uploads and sends multipart upload abort requests for them. This entry controls the interval of this cleanup check. Unit could be defined with postfix (ns,ms,s,m,h,d)
ozone.om.open.mpu.cleanup.service.timeout300sOZONE, OM, PERFORMANCETimeout value of the multipart upload cleanup service. If this is set greater than 0, the service will stop waiting for multipart upload aborts to complete after this time. If a large proportion of multipart aborts time out, this value needs to be increased or ozone.om.open.key.cleanup.limit.per.task should be decreased. Unit could be defined with postfix (ns,ms,s,m,h,d)
ozone.om.open.mpu.expire.threshold30dOZONE, OM, PERFORMANCEControls how long multipart upload is considered active. Specifically, if a multipart info has been ongoing longer than the value of this config entry, that multipart info is considered as expired (e.g. due to client crash). Unit could be defined with postfix (ns,ms,s,m,h,d)
ozone.om.open.mpu.parts.cleanup.limit.per.task1000OZONE, OM, PERFORMANCEThe maximum number of parts to clean up per run, rounded up to the nearest whole expired multipart upload. This property is used to approximately throttle the number of MPU parts sent to the OM.
ozone.om.ratis.log.appender.queue.byte-limit32MBOZONE, DEBUG, OM, RATISByte limit for Raft's Log Worker queue.
ozone.om.ratis.log.appender.queue.num-elements1024OZONE, DEBUG, OM, RATISNumber of operation pending with Raft's Log Worker.
ozone.om.ratis.log.purge.gap1000000OZONE, OM, RATISThe minimum gap between log indices for Raft server to purge its log segments after taking snapshot.
ozone.om.ratis.log.purge.preservation.log.num0OZONE, OM, RATISThe number of latest Raft logs to not be purged after taking snapshot.
ozone.om.ratis.log.purge.upto.snapshot.indextrueOZONE, OM, RATISEnable/disable Raft server to purge its log up to the snapshot index after taking snapshot.
ozone.om.ratis.minimum.timeout5sOZONE, OM, RATIS, MANAGEMENTThe minimum timeout duration for OM's Ratis server rpc.
ozone.om.ratis.port9872OZONE, OM, RATISThe port number of the OzoneManager's Ratis server.
ozone.om.ratis.rpc.typeGRPCOZONE, OM, RATIS, MANAGEMENTRatis supports different kinds of transports like netty, GRPC, Hadoop RPC etc. This picks one of those for this cluster.
ozone.om.ratis.segment.preallocated.size4MBOZONE, OM, RATIS, PERFORMANCEThe size of the buffer which is preallocated for raft segment used by Apache Ratis on OM. (4 MB by default)
ozone.om.ratis.segment.size64MBOZONE, OM, RATIS, PERFORMANCEThe size of the raft segment used by Apache Ratis on OM. (64 MB by default)
ozone.om.ratis.server.close.threshold60sOZONE, OM, RATISThe Raft server will close if a JVM pause lasts longer than this threshold.
ozone.om.ratis.server.failure.timeout.duration120sOZONE, OM, RATIS, MANAGEMENTThe timeout duration for Ratis server failure detection. Once the threshold has been reached, the Ratis state machine will be informed about the failure in the Ratis ring.
ozone.om.ratis.server.leaderelection.pre-votetrueOZONE, OM, RATIS, MANAGEMENTEnable/disable OM HA leader election pre-vote phase.
ozone.om.ratis.server.pending.write.element-limit4096OZONE, DEBUG, OM, RATISMaximum number of pending write requests.
ozone.om.ratis.server.request.timeout3sOZONE, OM, RATIS, MANAGEMENTThe timeout duration for OM's Ratis server requests.
ozone.om.ratis.server.retry.cache.timeout600000msOZONE, OM, RATIS, MANAGEMENTRetry Cache entry timeout for OM's ratis server.
ozone.om.ratis.snapshot.dirOZONE, OM, STORAGE, MANAGEMENT, RATISThis directory is used for storing OM's snapshot related files like the ratisSnapshotIndex and DB checkpoint from leader OM. If undefined, OM snapshot dir will fallback to ozone.metadata.dirs. This fallback approach is not recommended for production environments.
ozone.om.ratis.snapshot.max.total.sst.size10737418240OZONE, OM, RATISMax size of SST files in OM Ratis Snapshot tarball.
ozone.om.ratis.storage.dirOZONE, OM, STORAGE, MANAGEMENT, RATISThis directory is used for storing OM's Ratis metadata, such as logs. If undefined, the OM Ratis storage dir will fall back to ozone.metadata.dirs and a warning will be logged. This fallback approach is not recommended for production environments. Ideally, this should be mapped to a fast disk like an SSD.
ozone.om.read.threadpool10OM, PERFORMANCEThe number of threads in RPC server reading from the socket for OM service endpoints. This config overrides Hadoop configuration "ipc.server.read.threadpool.size" for Ozone Manager.
ozone.om.s3.grpc.server_enabledtrueOZONE, OM, S3GATEWAYProperty to enable or disable Ozone Manager gRPC endpoint for clients. Right now, it is used by S3 Gateway only.
ozone.om.save.metrics.interval5mOZONE, OMTime interval at which a background thread periodically stores the OM metrics into a file. Unit could be defined with postfix (ns,ms,s,m,h,d)
ozone.om.security.admin.protocol.acl*SECURITYComma separated list of users and groups allowed to access ozone manager admin protocol.
ozone.om.security.client.protocol.acl*SECURITYComma separated list of users and groups allowed to access client ozone manager protocol.
ozone.om.server.list.max.size1000OM, OZONEConfigures the maximum server-side response size for list calls on the OM.
ozone.om.service.idsOM, HAComma-separated list of OM service IDs. This property allows the client to figure out the quorum of OzoneManager addresses.
ozone.om.snapshot.cache.cleanup.service.run.interval1mOZONE, OMInterval at which snapshot cache clean up will run. Uses millisecond by default when no time unit is specified.
ozone.om.snapshot.cache.max.size10OZONE, OMSize of the OM Snapshot LRU cache. This is a soft limit of open OM Snapshot RocksDB instances that will be held. The actual number of cached instance could exceed this limit if more than this number of snapshot instances are still in-use by snapDiff or other tasks.
ozone.om.snapshot.checkpoint.dir.creation.poll.timeout20sOZONE, PERFORMANCE, OMMax poll timeout for snapshot dir exists check performed before loading a snapshot in cache. Unit defaults to millisecond if a unit is not specified.
ozone.om.snapshot.compact.non.snapshot.diff.tablesfalseOZONE, OM, PERFORMANCEEnable or disable compaction of tables that are not tracked by snapshot diff when their snapshots are evicted from cache.
ozone.om.snapshot.compaction.dag.max.time.allowed30dOZONE, OMMaximum time a snapshot is allowed to be in compaction DAG before it gets pruned out by pruning daemon. Uses millisecond by default when no time unit is specified.
ozone.om.snapshot.compaction.dag.prune.daemon.run.interval10mOZONE, OMInterval at which compaction DAG pruning daemon thread is running to remove older snapshots with compaction history from compaction DAG. Uses millisecond by default when no time unit is specified.
ozone.om.snapshot.db.max.open.files100OZONE, OMMax number of open files for each snapshot db present in the snapshot cache. Essentially sets max_open_files config for RocksDB instances opened for Ozone snapshots. This will limit the total number of files opened by a snapshot db thereby limiting the total number of open file handles by snapshot dbs. Max total number of open handles = (snapshot cache size * max open files)
ozone.om.snapshot.diff.cleanup.service.run.interval1mOZONE, OMInterval at which snapshot diff clean up service will run. Uses millisecond by default when no time unit is specified.
ozone.om.snapshot.diff.cleanup.service.timeout5mOZONE, OMTimeout for snapshot diff clean up service. Uses millisecond by default when no time unit is specified.
ozone.om.snapshot.diff.db.dirOZONE, OMDirectory where the OzoneManager stores the snapshot diff related data. This should be specified as a single directory. If the directory does not exist then the OM will attempt to create it. If undefined, then the OM will log a warning and fallback to ozone.metadata.dirs. This fallback approach is not recommended for production environments.
ozone.om.snapshot.diff.disable.native.libsfalseOZONE, OMFlag to perform snapshot diff without using native libs (can be slow).
ozone.om.snapshot.diff.job.default.wait.time1mOZONE, OMDefault wait time returned to client to wait before retrying snap diff request. Uses millisecond by default when no time unit is specified.
ozone.om.snapshot.diff.job.report.persistent.time7dOZONE, OMMaximum time a successful snapshot diff job and its report will be persisted. Uses millisecond by default when no time unit is specified.
ozone.om.snapshot.diff.max.allowed.keys.changed.per.job10000000OZONE, OMMaximum number of changed keys allowed for a snapshot diff job.
ozone.om.snapshot.diff.max.jobs.purge.per.task100OZONE, OMMaximum number of snapshot diff jobs to be purged per snapDiff clean up run.
ozone.om.snapshot.diff.max.page.size1000OZONE, OMMaximum number of entries to be returned in a single page of snap diff report.
ozone.om.snapshot.diff.thread.pool.size10OZONE, OMMaximum number of concurrent snapshot diff jobs allowed.
ozone.om.snapshot.directory.metrics.update.interval5mOZONE, OMTime interval used to update the space consumption stats of the Ozone Manager snapshot directories. Background thread periodically calculates and updates these stats. Unit could be defined with postfix (ns,ms,s,m,h,d)
ozone.om.snapshot.force.full.difffalseOZONE, OMFlag to always perform full snapshot diff (can be slow) without using the optimised compaction DAG.
ozone.om.snapshot.load.native.libtrueOZONE, OMLoad native library for performing optimized snapshot diff.
ozone.om.snapshot.local.data.manager.service.interval5mInterval for cleaning up orphan snapshot local data versions corresponding to snapshots
ozone.om.snapshot.provider.connection.timeout5000sOZONE, OM, HA, MANAGEMENTConnection timeout for HTTP call made by OM Snapshot Provider to request OM snapshot from OM Leader.
ozone.om.snapshot.provider.request.timeout300000msOZONE, OM, HA, MANAGEMENTConnection request timeout for HTTP call made by OM Snapshot Provider to request OM snapshot from OM Leader.
ozone.om.snapshot.provider.socket.timeout5000sOZONE, OM, HA, MANAGEMENTSocket timeout for HTTP call made by OM Snapshot Provider to request OM snapshot from OM Leader.
ozone.om.snapshot.prune.compaction.backup.batch.size2000OZONE, OMPrune SST files in Compaction backup directory in batches every ozone.om.snapshot.compaction.dag.prune.daemon.run.interval.
ozone.om.snapshot.rocksdb.metrics.enabledfalseOZONE, OMSkip collecting RocksDBStore metrics for Snapshotted DB.
ozone.om.transport.classorg.apache.hadoop.ozone.om.protocolPB.Hadoop3OmTransportFactoryOM, MANAGEMENTProperty to determine the transport protocol for the client to Ozone Manager channel.
ozone.om.unflushed.transaction.max.count10000OZONE, OMUnflushed transactions are requests that have been applied to the OM state machine but not yet flushed to the OM RocksDB. When the OM is under high concurrency pressure and flushing is not fast enough, too many pending requests are held in memory, leading to long OM GC pauses, which slow down flushing further. Flushing can be slow in several cases: 1. RocksDB is on an HDD, which has poorer IO performance than an SSD. 2. A big compaction is happening internally in RocksDB and a RocksDB write stall occurs. 3. Long GC pauses, possibly caused by other factors. This property limits the maximum count of unflushed transactions, so that the maximum memory occupied by unflushed transactions is bounded.
ozone.om.upgrade.finalization.ratis.based.timeout30sOM, UPGRADEMaximum time to wait for a slow follower to be finalized through a Ratis snapshot. This is an advanced config, and needs to be changed only under a special circumstance when the leader OM has purged the finalize request from its logs, and a follower OM was down during upgrade finalization. Default is 30s.
ozone.om.upgrade.quota.recalculate.enabledtrueOZONE, OMTriggers quota recalculation when upgrading to the QUOTA layout version. During the upgrade, recalculation of quota usage will block write operations to existing buckets until the operation is completed.
ozone.om.user.max.volume1024OM, MANAGEMENTThe maximum number of volumes a user can have on a cluster. Increasing or decreasing this number has no real impact on the Ozone cluster; it is defined only for operational purposes. Only an administrator can create a volume; once a volume is created, there are no restrictions on the number of buckets, or the number of keys inside each bucket, a user can create.
ozone.om.user.rightsALLOM, SECURITYDefault user permissions set for an object in OzoneManager.
ozone.om.volume.listall.allowedtrueOM, MANAGEMENTAllows everyone to list all volumes when set to true. Defaults to true. When set to false, non-admin users can only list the volumes they have access to. Admins can always list all volumes. Note that this config only applies to OzoneNativeAuthorizer. For other authorizers, admin needs to set policies accordingly to allow all volume listing e.g. for Ranger, a new policy with special volume "/" can be added to allow group public LIST access.
ozone.path.deleting.limit.per.task20000OZONE, PERFORMANCE, OMThe maximum number of paths (dirs/files) to be deleted by the directory deleting service per time interval.
ozone.readonly.administratorsOzone read-only admin users, delimited by commas. If set, this is the list of users whose read operations are allowed to skip checkAccess.
ozone.readonly.administrators.groupsOzone read-only admin groups, delimited by commas. If set, this is the list of groups whose read operations are allowed to skip checkAccess.
ozone.recon.addressRECON, MANAGEMENTRPC address of Recon Server. If not set, datanodes will not configure Recon Server.
ozone.recon.administratorsRECON, SECURITYRecon administrator users delimited by a comma. This is the list of users who can access admin only information from recon. Users defined in ozone.administrators will always be able to access all recon information regardless of this setting.
ozone.recon.administrators.groupsRECON, SECURITYRecon administrator groups delimited by a comma. This is the list of groups who can access admin only information from recon. It is enough to either have the name defined in ozone.recon.administrators or be directly or indirectly in a group defined in this property.
ozone.recon.containerkey.flush.db.max.threshold150000OZONE, RECON, PERFORMANCEMaximum threshold number of entries for the Container Key Mapper task to hold in an in-memory hashmap before flushing to the Recon RocksDB containerKeyTable.
ozone.recon.db.dirOZONE, RECON, STORAGE, PERFORMANCEDirectory where the Recon Server stores its metadata. This should be specified as a single directory. If the directory does not exist then the Recon will attempt to create it. If undefined, then the Recon will log a warning and fallback to ozone.metadata.dirs. This fallback approach is not recommended for production environments.
ozone.recon.db.dirs.permissions700Permissions for the metadata directories for Recon. The permissions can either be octal or symbolic. If the default permissions are not set then the default value of 700 will be used.
ozone.recon.dn.metrics.collection.minimum.api.delay30sOZONE, RECON, DNMinimum delay in API to start a new task for Jmx collection. It behaves like a rate limiter to avoid unnecessary task creation.
ozone.recon.dn.metrics.collection.timeout10mOZONE, RECON, DNMaximum time allowed for the API to complete. If exceeded, pending tasks will be cancelled.
ozone.recon.filesizecount.flush.db.max.threshold200000OZONE, RECON, PERFORMANCEMaximum threshold number of entries for the File Size Count task to hold in an in-memory hashmap before flushing to the Recon Derby DB.
ozone.recon.heatmap.enablefalseOZONE, RECONEnables/disables the Recon heatmap feature. Along with this config, the user must also provide an implementation of the "org.apache.hadoop.ozone.recon.heatmap.IHeatMapProvider" interface and configure it in the "ozone.recon.heatmap.provider" configuration.
ozone.recon.heatmap.providerOZONE, RECONFully qualified heatmap provider implementation class name. If this value is not set, then HeatMap feature will be disabled and not exposed in Recon UI. Please refer Ozone doc for more details regarding the implementation of "org.apache.hadoop.ozone.recon.heatmap.IHeatMapProvider" interface.
ozone.recon.http-address0.0.0.0:9888RECON, MANAGEMENTThe address and the base port where the Recon web UI will listen on. If the port is 0, then the server will start on a free port. However, it is best to specify a well-known port, so it is easy to connect and see the Recon management UI.
ozone.recon.http-bind-host0.0.0.0RECON, MANAGEMENTThe actual address the Recon server will bind to. If this optional address is set, it overrides only the hostname portion of ozone.recon.http-address.
ozone.recon.http.auth.kerberos.keytab/etc/security/keytabs/HTTP.keytabRECON, SECURITY, KERBEROSThe keytab file for HTTP Kerberos authentication in Recon.
ozone.recon.http.auth.kerberos.principalHTTP/_HOST@REALMRECON, SECURITY, KERBEROSThe server principal used by Ozone Recon server. This is typically set to HTTP/_HOST@REALM.TLD The SPNEGO server principal begins with the prefix HTTP/ by convention.
ozone.recon.http.auth.typesimpleRECON, SECURITY, KERBEROSsimple or kerberos. If kerberos is set, SPNEGO will be used for http authentication.
ozone.recon.http.enabledtrueRECON, MANAGEMENTProperty to enable or disable Recon web user interface.
ozone.recon.https-address0.0.0.0:9889RECON, MANAGEMENT, SECURITYThe address and the base port where the Recon web UI will listen on using HTTPS. If the port is 0 then the server will start on a free port.
ozone.recon.https-bind-host0.0.0.0RECON, MANAGEMENT, SECURITYThe actual address the Recon web server will bind to using HTTPS. If this optional address is set, it overrides only the hostname portion of ozone.recon.https-address.
ozone.recon.kerberos.keytab.fileSECURITY, RECON, OZONEThe keytab file used by Recon daemon to login as its service principal.
ozone.recon.kerberos.principalSECURITY, RECON, OZONEThis Kerberos principal is used by the Recon service.
ozone.recon.nssummary.flush.db.max.threshold150000OZONE, RECON, PERFORMANCEMaximum threshold number of entries for the NSSummary task to hold in an in-memory hashmap before flushing to the Recon RocksDB namespaceSummaryTable.
ozone.recon.om.connection.request.timeout5000OZONE, RECON, OMConnection request timeout in milliseconds for HTTP call made by Recon to request OM DB snapshot.
ozone.recon.om.connection.timeout5sOZONE, RECON, OMConnection timeout for HTTP call in milliseconds made by Recon to request OM snapshot.
ozone.recon.om.db.dirOZONE, RECON, STORAGEDirectory where the Recon Server stores its OM snapshot DB. This should be specified as a single directory. If the directory does not exist then the Recon will attempt to create it. If undefined, then the Recon will log a warning and fallback to ozone.metadata.dirs. This fallback approach is not recommended for production environments.
ozone.recon.om.event.buffer.capacity20000OZONE, RECON, OM, PERFORMANCEMaximum capacity of the event buffer used by Recon to queue OM delta updates during task reinitialization. When tasks are being reprocessed on staging DB, this buffer holds incoming delta updates to prevent blocking the OM sync process. If the buffer overflows, task reinitialization will be triggered.
ozone.recon.om.snapshot.task.flush.paramfalseOZONE, RECON, OMRequest to flush the OM DB before taking checkpoint snapshot.
ozone.recon.om.snapshot.task.initial.delay1mOZONE, RECON, OMInitial delay before Recon requests an OM DB snapshot.
ozone.recon.om.snapshot.task.interval.delay5sOZONE, RECON, OMInterval at which Recon requests an OM DB snapshot.
ozone.recon.om.socket.timeout5sOZONE, RECON, OMSocket timeout in milliseconds for HTTP call made by Recon to request OM snapshot.
ozone.recon.scm.connection.request.timeout5sOZONE, RECON, SCMConnection request timeout in milliseconds for HTTP call made by Recon to request SCM DB snapshot.
ozone.recon.scm.connection.timeout5sOZONE, RECON, SCMConnection timeout for HTTP call in milliseconds made by Recon to request SCM snapshot.
ozone.recon.scm.container.threshold100OZONE, RECON, SCMThreshold value for the difference in number of containers in SCM and RECON.
ozone.recon.scm.snapshot.enabledtrueOZONE, RECON, SCMIf enabled, SCM DB Snapshot is taken by Recon.
ozone.recon.scm.snapshot.task.initial.delay1mOZONE, MANAGEMENT, RECONInitial delay before Recon requests an SCM DB snapshot.
ozone.recon.scm.snapshot.task.interval.delay24hOZONE, MANAGEMENT, RECONInterval at which Recon requests an SCM DB snapshot.
ozone.recon.scmclient.failover.max.retry3OZONE, RECON, SCMMax retry count for SCM Client when failover happens.
ozone.recon.scmclient.max.retry.timeout6sOZONE, RECON, SCMMax retry timeout for SCM Client when Recon connects to SCM. This config is used to dynamically compute the max retry count for SCM Client when failover happens. Check the SCMClientConfig class getRetryCount method.
ozone.recon.scmclient.rpc.timeout1mOZONE, RECON, SCMRpcClient timeout on waiting for the response from SCM when Recon connects to SCM.
ozone.recon.security.client.datanode.container.protocol.acl*SECURITY, RECON, OZONEComma-separated ACLs (users, groups) allowing clients to access the datanode container protocol.
ozone.recon.sql.db.auto.committrueSTORAGE, RECON, OZONESets the Ozone Recon database connection property of auto-commit to true/false.
ozone.recon.sql.db.conn.idle.max.age3600sSTORAGE, RECON, OZONESets maximum time to live for idle connection in seconds.
ozone.recon.sql.db.conn.idle.testSELECT 1STORAGE, RECON, OZONEThe query to send to the DB to maintain keep-alives and test for dead connections.
ozone.recon.sql.db.conn.idle.test.period60sSTORAGE, RECON, OZONESets the interval in seconds at which idle connections are tested.
ozone.recon.sql.db.conn.max.active5STORAGE, RECON, OZONEThe max active connections to the SQL database.
ozone.recon.sql.db.conn.max.age1800sSTORAGE, RECON, OZONESets maximum time a connection can be active in seconds.
ozone.recon.sql.db.conn.timeout30000msSTORAGE, RECON, OZONESets time in milliseconds before call to getConnection is timed out.
ozone.recon.sql.db.driverorg.apache.derby.jdbc.EmbeddedDriverSTORAGE, RECON, OZONERecon SQL DB driver class. Defaults to Derby.
ozone.recon.sql.db.jdbc.urljdbc:derby:${ozone.recon.db.dir}/ozone_recon_derby.dbSTORAGE, RECON, OZONEOzone Recon SQL database jdbc url.
ozone.recon.sql.db.jooq.dialectDERBYSTORAGE, RECON, OZONERecon internally uses jOOQ to talk to its SQL DB. By default, Derby and Sqlite are supported out of the box. Please refer to https://www.jooq.org/javadoc/latest/org.jooq/org/jooq/SQLDialect.html to specify a different dialect.
ozone.recon.sql.db.passwordSTORAGE, RECON, OZONEOzone Recon SQL database password.
ozone.recon.sql.db.usernameSTORAGE, RECON, OZONEOzone Recon SQL database username.
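The Recon SQL settings above can be overridden together in ozone-site.xml. A minimal sketch switching Recon from the default embedded Derby to SQLite (the SQLite driver class, JDBC URL, and dialect name are assumptions drawn from common JDBC/jOOQ conventions, not from this table; verify against your Recon version):

```xml
<!-- Hypothetical ozone-site.xml fragment: point Recon at SQLite
     instead of the default embedded Derby. Driver class, URL, and
     dialect are assumptions; verify against your Recon version. -->
<property>
  <name>ozone.recon.sql.db.driver</name>
  <value>org.sqlite.JDBC</value>
</property>
<property>
  <name>ozone.recon.sql.db.jdbc.url</name>
  <value>jdbc:sqlite:/var/lib/ozone/recon/ozone_recon.db</value>
</property>
<property>
  <name>ozone.recon.sql.db.jooq.dialect</name>
  <value>SQLITE</value>
</property>
```

The dialect value must correspond to a constant of jOOQ's SQLDialect enum, per the ozone.recon.sql.db.jooq.dialect description above.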
ozone.recon.task.containercounttask.interval60sRECON, OZONEThe time interval to wait between runs of the container count task.
ozone.recon.task.missingcontainer.interval300sRECON, OZONEThe time interval of the periodic check for unhealthy containers in the cluster as reported by Datanodes.
ozone.recon.task.pipelinesync.interval300sRECON, OZONEThe time interval of periodic sync of pipeline state from SCM to Recon.
ozone.recon.task.reprocess.max.iterators5OZONE, RECON, PERFORMANCEMaximum number of iterator threads to use for parallel table iteration during reprocess
ozone.recon.task.reprocess.max.keys.in.memory2000OZONE, RECON, PERFORMANCEMaximum number of keys to batch in memory before handing to worker threads during parallel reprocess
ozone.recon.task.reprocess.max.workers20OZONE, RECON, PERFORMANCEMaximum number of worker threads to use for parallel table processing during reprocess
ozone.recon.task.safemode.wait.threshold300sRECON, OZONEThe maximum time to wait before starting the container health task and pipeline sync task, at which point Recon exits safe or warmup mode.
ozone.recon.task.thread.count1OZONE, RECONThe number of Recon Tasks that are waiting on updates from OM.
ozone.replication.allowed-configs^((STANDALONE|RATIS)/(ONE|THREE))|(EC/(3-2|6-3|10-4)-(512|1024|2048|4096)k)$STORAGERegular expression to restrict enabled replication schemes
ozone.rest.client.http.connection.max100OZONE, CLIENTThis defines the overall connection limit for the connection pool used in RestClient.
ozone.rest.client.http.connection.per-route.max20OZONE, CLIENTThis defines the connection limit per one HTTP route/host. Total max connection is limited by ozone.rest.client.http.connection.max property.
ozone.s3.administratorsOZONE, SECURITYS3 administrator users delimited by a comma. This is the list of users who can access admin only information from s3. If this property is empty then ozone.administrators will be able to access all s3 information regardless of this setting.
ozone.s3.administrators.groupsOZONE, SECURITYS3 administrator groups delimited by a comma. This is the list of groups who can access admin only information from S3. It is enough to either have the name defined in ozone.s3.administrators or be directly or indirectly in a group defined in this property.
ozone.s3g.client.buffer.size4MBOZONE, S3GATEWAYThe size of the buffer used for reading blocks (4 MB by default).
ozone.s3g.default.bucket.layoutOBJECT_STOREOZONE, S3GATEWAYThe bucket layout that will be used when buckets are created through the S3 API.
ozone.s3g.domain.nameOZONE, S3GATEWAYList of Ozone S3Gateway domain names. If multiple domain names are provided, they should be comma-separated. This parameter is only required when the virtual host style pattern is followed.
ozone.s3g.http-address0.0.0.0:9878OZONE, S3GATEWAYThe address and the base port where the Ozone S3Gateway Server will listen on.
ozone.s3g.http-bind-host0.0.0.0OZONE, S3GATEWAYThe actual address the HTTP server will bind to. If this optional address is set, it overrides only the hostname portion of ozone.s3g.http-address. This is useful for making the Ozone S3Gateway HTTP server listen on all interfaces by setting it to 0.0.0.0.
ozone.s3g.http.auth.kerberos.keytab/etc/security/keytabs/HTTP.keytabOZONE, S3GATEWAY, SECURITY, KERBEROSThe keytab file used by the S3Gateway server to login as its service principal.
ozone.s3g.http.auth.kerberos.principalHTTP/_HOST@REALMOZONE, S3GATEWAY, SECURITY, KERBEROSThe server principal used by Ozone S3Gateway server. This is typically set to HTTP/_HOST@REALM.TLD The SPNEGO server principal begins with the prefix HTTP/ by convention.
ozone.s3g.http.auth.typesimpleS3GATEWAY, SECURITY, KERBEROSsimple or kerberos. If kerberos is set, SPNEGO will be used for http authentication.
ozone.s3g.http.enabledtrueOZONE, S3GATEWAYThe boolean which enables the Ozone S3Gateway server.
ozone.s3g.https-address0.0.0.0:9879OZONE, S3GATEWAYOzone S3Gateway HTTPS server address and port.
ozone.s3g.https-bind-hostOZONE, S3GATEWAYThe actual address the HTTPS server will bind to. If this optional address is set, it overrides only the hostname portion of ozone.s3g.https-address. This is useful for making the Ozone S3Gateway HTTPS server listen on all interfaces by setting it to 0.0.0.0.
ozone.s3g.kerberos.keytab.file/etc/security/keytabs/s3g.keytabOZONE, SECURITY, KERBEROS, S3GATEWAYThe keytab file used by S3Gateway daemon to login as its service principal. The principal name is configured with ozone.s3g.kerberos.principal.
ozone.s3g.kerberos.principals3g/_HOST@REALMOZONE, SECURITY, KERBEROS, S3GATEWAYThe S3Gateway service principal. Ex: s3g/_HOST@REALM.COM
ozone.s3g.list-keys.shallow.enabledtrueOZONE, S3GATEWAYIf this is true, calls to the S3Gateway list interface with the delimiter '/' parameter are optimized for efficiency, especially when there are a large number of keys.
ozone.s3g.list.max.keys.limit1000Maximum number of keys returned by S3 ListObjects/ListObjectsV2 API. AWS default is 1000. Can be overridden per deployment in ozone-site.xml.
ozone.s3g.metrics.percentiles.intervals.seconds60S3GATEWAY, PERFORMANCESpecifies the interval in seconds for the rollover of MutableQuantiles metrics. Setting this interval equal to the metrics sampling time ensures more detailed metrics.
ozone.s3g.secret.http.auth.typekerberosS3GATEWAY, SECURITY, KERBEROSsimple or kerberos. If kerberos is set, Kerberos SPNEGO will be used for http authentication.
ozone.s3g.secret.http.enabledfalseOZONE, S3GATEWAYThe boolean which enables the Ozone S3Gateway Secret endpoint.
ozone.s3g.volume.names3vOZONE, S3GATEWAYThe volume name to access through the s3gateway.
ozone.s3g.webadmin.http-address0.0.0.0:19878OZONE, S3GATEWAYThe address and port where Ozone S3Gateway serves web content.
ozone.s3g.webadmin.http-bind-host0.0.0.0OZONE, S3GATEWAYThe actual address the HTTP server will bind to. If this optional address is set, it overrides only the hostname portion of ozone.s3g.webadmin.http-address. This is useful for making the Ozone S3Gateway HTTP server listen on all interfaces by setting it to 0.0.0.0.
ozone.s3g.webadmin.http.enabledtrueOZONE, S3GATEWAYThis option can be used to disable the web server which serves additional content in Ozone S3 Gateway.
ozone.s3g.webadmin.https-address0.0.0.0:19879OZONE, S3GATEWAYOzone S3Gateway content server's HTTPS address and port.
ozone.s3g.webadmin.https-bind-hostOZONE, S3GATEWAYThe actual address the HTTPS server will bind to. If this optional address is set, it overrides only the hostname portion of ozone.s3g.webadmin.https-address. This is useful for making the Ozone S3Gateway HTTPS server listen on all interfaces by setting it to 0.0.0.0.
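The S3 Gateway *-address and *-bind-host keys above interact: the address keys set the advertised host:port, while the bind-host keys override only the interface the server binds to. A minimal illustrative ozone-site.xml sketch (the hostname is hypothetical):

```xml
<!-- Illustrative ozone-site.xml fragment: advertise a specific
     hostname on port 9878 while binding on all interfaces.
     s3g.example.com is a placeholder. -->
<property>
  <name>ozone.s3g.http-address</name>
  <value>s3g.example.com:9878</value>
</property>
<property>
  <name>ozone.s3g.http-bind-host</name>
  <value>0.0.0.0</value>
</property>
```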
ozone.scm.block.client.addressOZONE, SCMThe address of the Ozone SCM block client service. If not defined value of ozone.scm.client.address is used.
ozone.scm.block.client.bind.host0.0.0.0OZONE, SCMThe hostname or IP address used by the SCM block client endpoint to bind.
ozone.scm.block.client.port9863OZONE, SCMThe port number of the Ozone SCM block client service.
ozone.scm.block.deletion.per.dn.distribution.factor8OZONE, SCMFactor controlling the number of delete-block commands sent to each datanode in every interval. If the total number of datanodes is 100 and hdds.scm.block.deletion.per-interval.max is 500000, then at most 500000/(100/8) = 40000 blocks will be sent to each datanode in every interval.
ozone.scm.block.handler.count.key100OZONE, MANAGEMENT, PERFORMANCEUsed to set the number of RPC handlers when accessing blocks. The default value is 100.
ozone.scm.block.read.threadpool10OZONE, MANAGEMENT, PERFORMANCEThe number of threads in RPC server reading from the socket when accessing blocks. This config overrides Hadoop configuration "ipc.server.read.threadpool.size" for SCMBlockProtocolServer. The default value is 10.
ozone.scm.block.size256MBOZONE, SCMThe default size of an SCM block. This maps to the default Ozone block size.
ozone.scm.ca.list.retry.interval10sOZONE, SCM, OM, DATANODEThe duration the SCM client waits between retries when fetching the SCM CA list. OM and Datanodes obtain the CA list during startup and wait for its size to match the SCM node count plus one (the additional certificate is the root CA certificate). If the received CA list size does not match the expected count, this is the duration to wait before making the next attempt to get the CA list.
ozone.scm.chunk.size4MBOZONE, SCM, CONTAINER, PERFORMANCEThe chunk size for reading/writing chunk operations in bytes. The chunk size defaults to 4MB. If the value configured is more than the maximum size (32MB), it will be reset to the maximum size (32MB). This maps to the network packet sizes and file write operations in the client to datanode protocol. When tuning this parameter, flow control window parameter should be tuned accordingly. Refer to hdds.ratis.raft.grpc.flow.control.window for more information.
ozone.scm.client.addressOZONE, SCM, REQUIREDThe address of the Ozone SCM client service. This is a required setting. It is a string in the host:port format. The port number is optional and defaults to 9860.
ozone.scm.client.bind.host0.0.0.0OZONE, SCM, MANAGEMENTThe hostname or IP address used by the SCM client endpoint to bind. This setting is used by the SCM only and never used by clients. The setting can be useful in multi-homed setups to restrict the availability of the SCM client service to a specific interface. The default is appropriate for most clusters.
ozone.scm.client.handler.count.key100OZONE, MANAGEMENT, PERFORMANCEUsed to set the number of RPC handlers used by Client to access SCM. The default value is 100.
ozone.scm.client.port9860OZONE, SCM, MANAGEMENTThe port number of the Ozone SCM client service.
ozone.scm.client.read.threadpool10OZONE, MANAGEMENT, PERFORMANCEThe number of threads in RPC server reading from the socket used by Client to access SCM. This config overrides Hadoop configuration "ipc.server.read.threadpool.size" for SCMClientProtocolServer. The default value is 10.
ozone.scm.close.container.wait.duration150sSCM, OZONE, RECONWait duration before a close-container command is sent to the datanode.
ozone.scm.container.layoutFILE_PER_BLOCKOZONE, SCM, CONTAINER, PERFORMANCEContainer layout defines how chunks, blocks and containers are stored on disk. Each chunk is stored separately with FILE_PER_CHUNK. All chunks of a block are stored in the same file with FILE_PER_BLOCK. The default is FILE_PER_BLOCK.
ozone.scm.container.list.max.count4096OZONE, SCM, CONTAINERThe maximum number of container records that can be included in the response to a ListContainer request.
ozone.scm.container.lock.stripes512OZONE, SCM, PERFORMANCE, MANAGEMENTThe number of stripes created for the container state manager lock.
ozone.scm.container.placement.ec.implorg.apache.hadoop.hdds.scm.container.placement.algorithms.SCMContainerPlacementRackScatterOZONE, MANAGEMENTThe fully qualified name of the class that implements org.apache.hadoop.hdds.scm.PlacementPolicy. The class decides which datanode will be used to host the container replica in EC mode. If not set, org.apache.hadoop.hdds.scm.container.placement.algorithms.SCMContainerPlacementRackScatter is used by default.
ozone.scm.container.placement.implorg.apache.hadoop.hdds.scm.container.placement.algorithms.SCMContainerPlacementRackAwareOZONE, MANAGEMENTThe fully qualified name of the class that implements org.apache.hadoop.hdds.scm.PlacementPolicy. The class decides which datanode will be used to host the container replica. If not set, org.apache.hadoop.hdds.scm.container.placement.algorithms.SCMContainerPlacementRackAware is used by default.
ozone.scm.container.size5GBOZONE, PERFORMANCE, MANAGEMENTDefault container size used by Ozone. There are two considerations when picking this number: the speed at which a container can be replicated, determined by network speed, and the amount of metadata each container generates. Selecting a larger size creates less SCM metadata, but recovery time will be longer. 5GB maps to quick replication times on gigabit networks while still balancing the amount of metadata.
ozone.scm.datanode.addressOZONE, MANAGEMENTThe address of the Ozone SCM service used for internal communication between the DataNodes and the SCM. It is a string in the host:port format. The port number is optional and defaults to 9861. This setting is optional. If unspecified then the hostname portion is picked from the ozone.scm.client.address setting and the default service port of 9861 is chosen.
ozone.scm.datanode.admin.monitor.interval30sSCMThis sets how frequently the datanode admin monitor runs to check for nodes added to the admin workflow or removed from it. The progress of decommissioning and entering maintenance nodes is also checked to see if they have completed.
ozone.scm.datanode.admin.monitor.logging.limit1000SCMWhen a node is checked for decommission or maintenance, this setting controls how many degraded containers are logged on each pass. The limit is applied separately for each type of container, i.e. under-replicated and unhealthy containers each have their own limit.
ozone.scm.datanode.bind.hostOZONE, MANAGEMENTThe hostname or IP address used by the SCM service endpoint to bind.
ozone.scm.datanode.disallow.same.peersfalseOZONE, SCM, PIPELINEDisallows same set of datanodes to participate in multiple pipelines when set to true. Default is set to false.
ozone.scm.datanode.handler.count.key100OZONE, MANAGEMENT, PERFORMANCEUsed to set the number of RPC handlers used by DataNode to access SCM. The default value is 100.
ozone.scm.datanode.id.dirOZONE, MANAGEMENTThe path that datanodes will use to store the datanode ID. If this value is not set, then datanode ID is created under the metadata directory.
ozone.scm.datanode.pipeline.limit2OZONE, SCM, PIPELINEThe maximum number of pipelines a datanode can be engaged in. Setting the value to 0 means the per-datanode pipeline limit is determined by the number of metadata volumes reported per datanode.
ozone.scm.datanode.port9861OZONE, MANAGEMENTThe port number of the Ozone SCM service.
ozone.scm.datanode.ratis.volume.free-space.min1GBOZONE, DATANODEMinimum amount of free storage space required on each ratis volume of a datanode to hold a new pipeline. A datanode whose ratis volumes all have free space under this value will not be allocated a pipeline or container replica.
ozone.scm.datanode.read.threadpool10OZONE, MANAGEMENT, PERFORMANCEThe number of threads in RPC server reading from the socket used by DataNode to access SCM. This config overrides Hadoop configuration "ipc.server.read.threadpool.size" for SCMDatanodeProtocolServer. The default value is 10.
ozone.scm.db.dirsOZONE, SCM, STORAGE, PERFORMANCEDirectory where the StorageContainerManager stores its metadata. This should be specified as a single directory. If the directory does not exist then the SCM will attempt to create it. If undefined, then the SCM will log a warning and fallback to ozone.metadata.dirs. This fallback approach is not recommended for production environments.
ozone.scm.db.dirs.permissions700Permissions for the metadata directories for Storage Container Manager. The permissions can either be octal or symbolic. If the default permissions are not set then the default value of 700 will be used.
ozone.scm.dead.node.interval10mOZONE, MANAGEMENTThe interval between heartbeats before a node is tagged as dead.
ozone.scm.default.service.idOZONE, SCM, HAService ID of the SCM. If this is not set fall back to ozone.scm.service.ids to find the service ID it belongs to.
ozone.scm.ec.pipeline.minimum5STORAGEThe minimum number of pipelines to have open for each Erasure Coding configuration
ozone.scm.ec.pipeline.per.volume.factor1SCMTODO
ozone.scm.event.ContainerReport.thread.pool.size10OZONE, SCMThread pool size configured to process container reports.
ozone.scm.expired.container.replica.op.scrub.interval5mOZONE, SCM, CONTAINERSCM schedules a fixed interval job using the configured interval to scrub expired container replica operation.
ozone.scm.grpc.port9895OZONE, SCM, HA, RATISThe port number of the SCM's grpc server.
ozone.scm.ha.dbtransactionbuffer.flush.interval60sSCM, OZONEWait duration for flush of buffered transaction.
ozone.scm.ha.grpc.deadline.interval30mSCM, OZONE, HA, RATISDeadline for SCM DB checkpoint interval.
ozone.scm.ha.raft.server.log.appender.wait-time.min0msOZONE, SCM, RATIS, PERFORMANCEMinimum wait time between two appendEntries calls.
ozone.scm.ha.raft.server.rpc.first-election.timeoutSCM, OZONE, HA, RATISratis timeout for the first election of a leader. If not configured, fallback to ozone.scm.ha.ratis.leader.election.timeout.
ozone.scm.ha.ratis.leader.election.timeout5sSCM, OZONE, HA, RATISThe minimum timeout duration for SCM ratis leader election.
ozone.scm.ha.ratis.leader.ready.check.interval2sSCM, OZONE, HA, RATISThe interval between ratis server performing a leader readiness check.
ozone.scm.ha.ratis.leader.ready.wait.timeout60sSCM, OZONE, HA, RATISThe minimum timeout duration for waiting for leader readiness.
ozone.scm.ha.ratis.log.appender.queue.byte-limit32MBSCM, OZONE, HA, RATISByte limit for Raft's Log Worker queue.
ozone.scm.ha.ratis.log.appender.queue.num-elements1024SCM, OZONE, HA, RATISNumber of operation pending with Raft's Log Worker.
ozone.scm.ha.ratis.log.purge.enabledfalseSCM, OZONE, HA, RATISWhether to enable Raft log purging.
ozone.scm.ha.ratis.log.purge.gap1000000SCM, OZONE, HA, RATISThe minimum gap between log indices for Raft server to purge its log segments after taking snapshot.
ozone.scm.ha.ratis.request.timeout30sSCM, OZONE, HA, RATISThe timeout duration for SCM's Ratis server RPC.
ozone.scm.ha.ratis.rpc.typeGRPCSCM, OZONE, HA, RATISRatis supports different kinds of transports like netty, GRPC, Hadoop RPC etc. This picks one of those for this cluster.
ozone.scm.ha.ratis.segment.preallocated.size4MBSCM, OZONE, HA, RATISThe size of the buffer which is preallocated for raft segment used by Apache Ratis on SCM. (4 MB by default)
ozone.scm.ha.ratis.segment.size64MBSCM, OZONE, HA, RATISThe size of the raft segment used by Apache Ratis on SCM. (64 MB by default)
ozone.scm.ha.ratis.server.failure.timeout.duration120sSCM, OZONE, HA, RATISThe timeout duration for ratis server failure detection, once the threshold has reached, the ratis state machine will be informed about the failure in the ratis ring.
ozone.scm.ha.ratis.server.leaderelection.pre-votetrueSCM, OZONE, HA, RATISEnable/disable SCM HA leader election pre-vote phase.
ozone.scm.ha.ratis.server.retry.cache.timeout60sSCM, OZONE, HA, RATISRetry Cache entry timeout for SCM's Ratis server.
ozone.scm.ha.ratis.server.snapshot.creation.gap1024SCM, OZONERaft snapshot gap index after which snapshot can be taken.
ozone.scm.ha.ratis.snapshot.dirSCM, OZONE, HA, RATISThe ratis snapshot dir location.
ozone.scm.ha.ratis.snapshot.threshold1000SCM, OZONE, HA, RATISThe threshold to trigger a Ratis taking snapshot operation for SCM.
ozone.scm.ha.ratis.storage.dirOZONE, SCM, HA, RATISStorage directory used by SCM to write Ratis logs.
ozone.scm.handler.count.key100OZONE, MANAGEMENT, PERFORMANCEThe number of RPC handler threads for each SCM service endpoint. The default is appropriate for small clusters (tens of nodes). Set a value that is appropriate for the cluster size. Generally, HDFS recommends RPC handler count is set to 20 * log2(Cluster Size) with an upper limit of 200. However, Ozone SCM will not have the same amount of traffic as HDFS Namenode, so a value much smaller than that will work well too. To specify handlers for individual RPC servers, set the following configuration properties instead: ---- RPC type ---- : ---- Configuration properties ---- SCMClientProtocolServer : 'ozone.scm.client.handler.count.key' SCMBlockProtocolServer : 'ozone.scm.block.handler.count.key' SCMDatanodeProtocolServer: 'ozone.scm.datanode.handler.count.key'
ozone.scm.heartbeat.log.warn.interval.count10OZONE, MANAGEMENTDefines how frequently we log missed heartbeats to SCM. For example, in the default case we write a warning message for every ten consecutive heartbeats to SCM that are missed. This helps reduce clutter in a datanode log, but the trade-off is that logs will contain fewer of these statements.
ozone.scm.heartbeat.rpc-retry-count15OZONE, MANAGEMENTRetry count for the RPC from Datanode to SCM. The rpc-retry-interval is 1s by default. Make sure rpc-retry-count * (rpc-timeout + rpc-retry-interval) is less than hdds.heartbeat.interval.
ozone.scm.heartbeat.rpc-retry-interval1sOZONE, MANAGEMENTRetry interval for the RPC from Datanode to SCM. Make sure rpc-retry-count * (rpc-timeout + rpc-retry-interval) is less than hdds.heartbeat.interval.
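The descriptions above give the constraint rpc-retry-count * (rpc-timeout + rpc-retry-interval) < hdds.heartbeat.interval. With the listed defaults, 15 * (5s + 1s) = 90s, so hdds.heartbeat.interval would need to exceed 90 seconds; otherwise the retry count should be lowered. A hypothetical sketch assuming a 30-second heartbeat interval:

```xml
<!-- Hypothetical ozone-site.xml fragment. Assuming a 30s heartbeat
     interval (an assumed value, not a documented default here),
     4 * (5s + 1s) = 24s keeps the retry budget below it. -->
<property>
  <name>ozone.scm.heartbeat.rpc-retry-count</name>
  <value>4</value>
</property>
```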
ozone.scm.heartbeat.rpc-timeout5sOZONE, MANAGEMENTTimeout value for the RPC from Datanode to SCM.
ozone.scm.heartbeat.thread.interval3sOZONE, MANAGEMENTWhen a heartbeat from a datanode arrives at SCM, it is queued for processing with the timestamp of when it arrived. A heartbeat processing thread inside SCM runs at a specified interval; this value controls how frequently that thread runs. There are some assumptions built into SCM, such as that this value should allow the heartbeat processing thread to run at least three times more frequently than heartbeats and at least five times more frequently than stale node detection. If you specify a wrong value, SCM will gracefully refuse to run. For more info, look at the node manager tests in SCM. In short, you don't need to change this.
ozone.scm.http-address0.0.0.0:9876OZONE, MANAGEMENTThe address and the base port where the SCM web ui will listen on. If the port is 0 then the server will start on a free port.
ozone.scm.http-bind-host0.0.0.0OZONE, MANAGEMENTThe actual address the SCM web server will bind to. If this optional address is set, it overrides only the hostname portion of ozone.scm.http-address.
ozone.scm.http.enabledtrueOZONE, MANAGEMENTProperty to enable or disable SCM web ui.
ozone.scm.https-address0.0.0.0:9877OZONE, MANAGEMENTThe address and the base port where the SCM web UI will listen on using HTTPS. If the port is 0 then the server will start on a free port.
ozone.scm.https-bind-host0.0.0.0OZONE, MANAGEMENTThe actual address the SCM web server will bind to using HTTPS. If this optional address is set, it overrides only the hostname portion of ozone.scm.https-address.
ozone.scm.info.wait.duration10mOZONE, SCM, OMMaximum amount of duration OM/SCM waits to get Scm Info/Scm signed cert during OzoneManager init/SCM bootstrap.
ozone.scm.keyvalue.container.deletion-choosing.policyorg.apache.hadoop.ozone.container.common.impl.TopNOrderedContainerDeletionChoosingPolicyOZONE, MANAGEMENTThe policy used for choosing keyvalue containers for block deletion. A datanode selects some containers to process block deletion in a certain interval defined by ozone.block.deleting.service.interval. The number of containers to process in each interval is defined by ozone.block.deleting.container.limit.per.interval. This property configures the policy applied while selecting containers. Two policies are currently supported: RandomContainerDeletionChoosingPolicy and TopNOrderedContainerDeletionChoosingPolicy. org.apache.hadoop.ozone.container.common.impl.RandomContainerDeletionChoosingPolicy implements a simple random policy that returns a random list of containers. org.apache.hadoop.ozone.container.common.impl.TopNOrderedContainerDeletionChoosingPolicy implements a policy that chooses the top N containers ordered by their number of pending deletion blocks, in descending order.
ozone.scm.namesOZONE, REQUIREDThe value of this property is a set of DNS | DNS:PORT | IP Address | IP:PORT. Written as a comma separated string. e.g. scm1, scm2:8020, 7.7.7.7:7777. This property allows datanodes to discover where SCM is, so that datanodes can send heartbeat to SCM.
ozone.scm.network.topology.schema.filenetwork-topology-default.xmlOZONE, MANAGEMENTThe schema file defines the ozone network topology. We currently support xml(default) and yaml format. Refer to the samples in the topology awareness document for xml and yaml topology definition samples.
ozone.scm.node.idOZONE, SCM, HAThe ID of this SCM node. If the SCM node ID is not configured it is determined automatically by matching the local node's address with the configured address. If node ID is not deterministic from the configuration, then it is set to the scmId from the SCM version file.
ozone.scm.nodes.EXAMPLESCMSERVICEIDOZONE, SCM, HAComma-separated list of SCM node IDs for a given SCM service ID (e.g. EXAMPLESCMSERVICEID). The SCM service ID should be one of the values set for the parameter ozone.scm.service.ids. These unique identifiers are used by SCMs in an HA setup to determine all the SCM nodes belonging to the same SCM service in the cluster. For example, if you used "scmService1" as the SCM service ID and wanted to use "scm1", "scm2" and "scm3" as the individual IDs of the SCMs, you would configure the property ozone.scm.nodes.scmService1 with the value "scm1,scm2,scm3".
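A hypothetical SCM HA layout tying ozone.scm.service.ids and ozone.scm.nodes.&lt;serviceId&gt; together. The per-node address key pattern ozone.scm.address.&lt;serviceId&gt;.&lt;nodeId&gt; is an assumption drawn from Ozone HA documentation, not from this table, and the hostnames are placeholders:

```xml
<!-- Hypothetical SCM HA fragment. Service ID "scmService1" and node
     IDs scm1..scm3 follow the example in the description above; the
     per-node address key pattern is an assumption. -->
<property>
  <name>ozone.scm.service.ids</name>
  <value>scmService1</value>
</property>
<property>
  <name>ozone.scm.nodes.scmService1</name>
  <value>scm1,scm2,scm3</value>
</property>
<property>
  <name>ozone.scm.address.scmService1.scm1</name>
  <value>scm-host-1.example.com</value>
</property>
```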
ozone.scm.pipeline.allocated.timeout5mOZONE, SCM, PIPELINETimeout for a pipeline to stay in the ALLOCATED stage. When a pipeline is created, it should reach the OPEN stage once a pipeline report is successfully received by SCM. If a pipeline stays in ALLOCATED longer than this period, it is scrubbed so that a new pipeline can be created.
ozone.scm.pipeline.creation.auto.factor.onetrueOZONE, SCM, PIPELINEIf enabled, SCM will auto create RATIS factor ONE pipeline.
ozone.scm.pipeline.creation.interval120sOZONE, SCM, PIPELINESCM schedules a fixed interval job using the configured interval to create pipelines.
ozone.scm.pipeline.destroy.timeout66sOZONE, SCM, PIPELINEOnce a pipeline is closed, SCM should wait for the above configured time before destroying a pipeline.
ozone.scm.pipeline.leader-choose.policyorg.apache.hadoop.hdds.scm.pipeline.leader.choose.algorithms.MinLeaderCountChoosePolicyOZONE, SCM, PIPELINEThe policy used for choosing the desired leader during pipeline creation. Two policies are currently supported: DefaultLeaderChoosePolicy and MinLeaderCountChoosePolicy. org.apache.hadoop.hdds.scm.pipeline.leader.choose.algorithms.DefaultLeaderChoosePolicy implements a policy that chooses a leader without considering priority. org.apache.hadoop.hdds.scm.pipeline.leader.choose.algorithms.MinLeaderCountChoosePolicy implements a policy that chooses the leader with the minimum existing leader count. In the future, policies may be added that consider: 1. resources, so the datanode with the most abundant CPU and memory can be made the leader; 2. topology, so the datanode nearest to the client can be made the leader.
ozone.scm.pipeline.owner.container.count3OZONE, SCM, PIPELINENumber of containers per owner per disk in a pipeline.
ozone.scm.pipeline.per.metadata.disk2OZONE, SCM, PIPELINENumber of pipelines to be created per raft log disk.
ozone.scm.pipeline.scrub.interval5mOZONE, SCM, PIPELINESCM schedules a fixed interval job using the configured interval to scrub pipelines.
ozone.scm.primordial.node.idOZONE, SCM, HAOptional config. If set, scm --init only takes effect on the specified node, and the scm --bootstrap command is ignored there; similarly, scm --init is ignored on the non-primordial SCM nodes. The config can be set to either the hostname or the node ID of any of the SCM nodes. With this config set, applications/admins can safely execute init and bootstrap commands on all SCM instances. If a cluster is upgraded from non-ratis to ratis-based SCM, scm --init needs to be re-run on the primary node to switch from non-ratis based SCM to ratis-based SCM.
ozone.scm.ratis.pipeline.limit0OZONE, SCM, PIPELINEUpper limit for how many pipelines can be OPEN in SCM. 0 as default means there is no limit. Otherwise, the number is the limit of max amount of pipelines which are OPEN.
ozone.scm.ratis.port9894OZONE, SCM, HA, RATISThe port number of the SCM's Ratis server.
ozone.scm.security.handler.count.key2OZONE, HDDS, SECURITYThreads configured for SCMSecurityProtocolServer.
ozone.scm.security.read.threadpool1OZONE, HDDS, SECURITY, PERFORMANCEThe number of threads in RPC server reading from the socket when performing security related operations with SCM. This config overrides Hadoop configuration "ipc.server.read.threadpool.size" for SCMSecurityProtocolServer. The default value is 1.
ozone.scm.security.service.addressOZONE, HDDS, SECURITYAddress of SCMSecurityProtocolServer.
ozone.scm.security.service.bind.host0.0.0.0OZONE, HDDS, SECURITYSCM security server host.
ozone.scm.security.service.port9961OZONE, HDDS, SECURITYSCM security server port.
ozone.scm.sequence.id.batch.size1000OZONE, SCMSCM allocates sequence id in a batch way. This property determines how many ids will be allocated in a single batch.
ozone.scm.service.idsOZONE, SCM, HAComma-separated list of SCM service IDs. This property allows the client to figure out the quorum of SCM addresses.
ozone.scm.skip.bootstrap.validationfalseOZONE, SCM, HAOptional config. When set to true, the clusterId validation against the leader SCM is skipped during bootstrap.
ozone.scm.stale.node.interval5mOZONE, MANAGEMENTThe interval for stale node flagging. Please see ozone.scm.heartbeat.thread.interval before changing this value.
ozone.security.crypto.compliance.modeunrestrictedOZONE, SECURITY, HDDS, CRYPTO_COMPLIANCEThis property selects the security compliance mode, which enables filtering of cryptographic configuration options according to the specified mode.
ozone.security.enabledfalseOZONE, SECURITY, KERBEROSTrue if security is enabled for ozone. When this property is true, hadoop.security.authentication should be Kerberos.
ozone.security.http.kerberos.enabledfalseOZONE, SECURITY, KERBEROSTrue if Kerberos authentication for Ozone HTTP web consoles is enabled using the SPNEGO protocol. When this property is true, hadoop.security.authentication should be Kerberos and ozone.security.enabled should be set to true.
ozone.security.reconfigure.protocol.acl*SECURITYComma separated list of users and groups allowed to access reconfigure protocol.
ozone.server.default.replication3OZONEDefault replication value. The actual number of replications can be specified when writing the key. The default is used if replication is not specified when creating a key and no default replication is set on the bucket. Supported values: For RATIS: 1, 3. For EC (Erasure Coding) the supported format is {ECCodec}-{DataBlocks}-{ParityBlocks}-{ChunkSize}. ECCodec: codec for encoding the stripe; supported values: XOR, RS (Reed-Solomon). DataBlocks: number of data blocks in a stripe. ParityBlocks: number of parity blocks in a stripe. ChunkSize: chunk size in bytes, e.g. 1024k, 2048k. Supported combinations of {DataBlocks}-{ParityBlocks}: 3-2, 6-3, 10-4.
ozone.server.default.replication.typeRATISOZONEDefault replication type to be used while writing a key into Ozone. The type can be specified when writing the key; the default is used when nothing is specified when creating a key and no default value is set on the bucket. Supported values: RATIS, EC.
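The two replication defaults above combine. An illustrative ozone-site.xml sketch making Reed-Solomon 3+2 EC with a 1024k chunk size the cluster-wide default, following the {ECCodec}-{DataBlocks}-{ParityBlocks}-{ChunkSize} format described above:

```xml
<!-- Illustrative ozone-site.xml fragment: default new keys to
     RS 3+2 erasure coding with a 1024k chunk size, unless the
     client or bucket specifies otherwise. -->
<property>
  <name>ozone.server.default.replication.type</name>
  <value>EC</value>
</property>
<property>
  <name>ozone.server.default.replication</name>
  <value>rs-3-2-1024k</value>
</property>
```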
ozone.service.shutdown.timeout60sOZONE, OM, SCM, DATANODE, RECON, S3GATEWAYTimeout to wait for each shutdown operation to complete. If a hook takes longer than this time to complete, it will be interrupted, and the service will shut down. This allows the service shutdown to recover from a blocked operation. The effective timeout is at least 1 second, even if a hook has been configured with a timeout of less than 1 second.
ozone.snapshot.deep.cleaning.enabledfalseOZONE, PERFORMANCE, OMFlag to enable/disable snapshot deep cleaning.
ozone.snapshot.defrag.limit.per.task1OZONE, PERFORMANCE, OMThe maximum number of snapshots that would be defragmented in each task run of snapshot defragmentation service.
ozone.snapshot.defrag.service.interval-1OZONE, PERFORMANCE, OMTask interval of snapshot defragmentation service.
ozone.snapshot.defrag.service.timeout300sOZONE, PERFORMANCE, OMTimeout value of a run of snapshot defragmentation service.
ozone.snapshot.deleting.limit.per.task10OZONE, PERFORMANCE, OMThe maximum number of snapshots that would be reclaimed by Snapshot Deleting Service per run.
ozone.snapshot.deleting.service.interval30sOZONE, PERFORMANCE, OMThe time interval between successive SnapshotDeletingService thread run.
ozone.snapshot.deleting.service.timeout300sOZONE, PERFORMANCE, OMTimeout value for SnapshotDeletingService.
ozone.snapshot.directory.service.interval24hOZONE, PERFORMANCE, OM, DEPRECATEDDEPRECATED. The time interval between successive SnapshotDirectoryCleaningService thread run.
ozone.snapshot.directory.service.timeout300sOZONE, PERFORMANCE, OM, DEPRECATEDDEPRECATED. Timeout value for SnapshotDirectoryCleaningService.
ozone.snapshot.filtering.limit.per.task2OZONE, PERFORMANCE, OMA maximum number of snapshots to be filtered by sst filtering service per time interval.
ozone.snapshot.filtering.service.interval1mOZONE, PERFORMANCE, OMTime interval of the SST File filtering service from Snapshot.
ozone.snapshot.key.deleting.limit.per.task20000OM, PERFORMANCEThe maximum number of deleted keys to be scanned by Snapshot Deleting Service per snapshot run.
ozone.sst.filtering.service.timeout300000msOZONE, PERFORMANCE, OMA timeout value of sst filtering service.
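For example, tuning the snapshot deletion knobs above might look like the following `ozone-site.xml` sketch (the values are purely illustrative, not tuning recommendations):

```xml
<!-- Illustrative values only: run the Snapshot Deleting Service every
     minute and reclaim up to 20 snapshots per run. -->
<property>
  <name>ozone.snapshot.deleting.service.interval</name>
  <value>1m</value>
</property>
<property>
  <name>ozone.snapshot.deleting.limit.per.task</name>
  <value>20</value>
</property>
```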
| Name | Default Value | Tags | Description |
|---|---|---|---|
| ozone.tracing.enabled | false | OZONE, HDDS | If true, tracing is initialized and spans may be exported (subject to sampling). |
| ozone.tracing.endpoint | | OZONE, HDDS | OTLP gRPC receiver endpoint URL. |
| ozone.tracing.sampler | -1 | OZONE, HDDS | Root trace sampling ratio (0.0 to 1.0). |
| ozone.tracing.span.sampling | | OZONE, HDDS | Optional per-span sampling: comma-separated spanName:rate entries. |
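Putting the tracing keys together, a minimal sketch enabling tracing with 10% root sampling might look like this (the endpoint URL is an assumption: `localhost:4317` is merely the conventional OTLP gRPC collector port, so substitute your own receiver):

```xml
<!-- Example only: enable tracing, export spans to an assumed local OTLP
     gRPC collector, and sample 10% of root traces. -->
<property>
  <name>ozone.tracing.enabled</name>
  <value>true</value>
</property>
<property>
  <name>ozone.tracing.endpoint</name>
  <value>http://localhost:4317</value>
</property>
<property>
  <name>ozone.tracing.sampler</name>
  <value>0.1</value>
</property>
```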
| Name | Default Value | Tags | Description |
|---|---|---|---|
| ozone.volume.io.percentiles.intervals.seconds | 60 | OZONE, DATANODE | This setting specifies the interval (in seconds) for monitoring percentile performance metrics. It helps track the read and write performance of DataNodes in real time, allowing better identification and analysis of performance issues. |
| ozone.xceiver.client.metrics.percentiles.intervals.seconds | 60 | XCEIVER, PERFORMANCE | Specifies the interval in seconds for the rollover of XceiverClient MutableQuantiles metrics. Setting this interval equal to the metrics sampling time ensures more detailed metrics. |
| recon.om.delta.update.lag.threshold | 0 | OZONE, RECON | At every Recon-OM sync, Recon starts fetching OM DB updates and continues until the lag between the OM DB WAL sequence number and the Recon OM DB snapshot WAL sequence number is less than this threshold. |
| recon.om.delta.update.limit | 50000 | OZONE, RECON | The maximum number of delta updates Recon fetches from OM in each request. The actual amount of data fetched may be larger than this limit. |
| scm.container.client.idle.threshold | 10s | OZONE, PERFORMANCE | In standalone pipelines, SCM clients use Netty to communicate with the container and connection pooling to reduce client-side overhead. This setting allows a connection to stay idle for this duration before it is closed. |
| scm.container.client.max.size | 256 | OZONE, PERFORMANCE | Controls the maximum number of connections cached via client connection pooling. If the number of connections exceeds this count, the oldest idle connection is evicted. |
| ssl.server.keystore.keypassword | | OZONE, SECURITY, MANAGEMENT | Keystore key password for HTTPS SSL configuration. |
| ssl.server.keystore.location | | OZONE, SECURITY, MANAGEMENT | Keystore location for HTTPS SSL configuration. |
| ssl.server.keystore.password | | OZONE, SECURITY, MANAGEMENT | Keystore password for HTTPS SSL configuration. |
| ssl.server.keystore.type | jks | OZONE, SECURITY, CRYPTO_COMPLIANCE | The keystore type for HTTP servers used in Ozone. |
| ssl.server.truststore.location | | OZONE, SECURITY, MANAGEMENT | Truststore location for HTTPS SSL configuration. |
| ssl.server.truststore.password | | OZONE, SECURITY, MANAGEMENT | Truststore password for HTTPS SSL configuration. |
| ssl.server.truststore.type | jks | OZONE, SECURITY, CRYPTO_COMPLIANCE | The truststore type for HTTP servers used in Ozone. |
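As a sketch of how the `ssl.server.*` keys above fit together, the fragment below wires an HTTPS keystore and truststore. The file paths and placeholder password are hypothetical examples, and in Hadoop-style deployments these keys conventionally live in `ssl-server.xml` rather than the main site file:

```xml
<!-- Example only: paths and the placeholder password are hypothetical. -->
<property>
  <name>ssl.server.keystore.type</name>
  <value>jks</value>
</property>
<property>
  <name>ssl.server.keystore.location</name>
  <value>/etc/ozone/ssl/server.jks</value>
</property>
<property>
  <name>ssl.server.keystore.password</name>
  <value>changeit</value>
</property>
<property>
  <name>ssl.server.truststore.location</name>
  <value>/etc/ozone/ssl/truststore.jks</value>
</property>
```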