Releases Archive
- HDDS-10295 | Major | Provide an “ozone repair” subcommand to update the snapshot info in transactionInfoTable
- HDDS-11258 | Blocker | [hsync] Add new OM layout version
- HDDS-11227 | Major | Use OM’s KMS from client side when connecting to a cluster and dealing with encrypted data
- HDDS-11375 | Major | DN Startup fails with Illegal configuration
- HDDS-11342 | Major | [hsync] Add a config as HBase-related features master switch
- HDDS-7593 | Major | Supporting HSync and lease recovery
- HDDS-11329 | Major | Update Ozone images to Rocky Linux-based runner
- HDDS-11705 | Critical | Snapshot operations on linked buckets should work on actual underlying bucket
- HDDS-11617 | Blocker | Update hadoop to 3.4.1
- HDDS-8101 | Major | Add FSO repair tool to ozone CLI in read-only and repair modes
- HDDS-7852 | Major | SCM Decommissioning Support
- HDDS-11753 | Blocker | Deprecate file per chunk layout from datanode code
- HDDS-12488 | Major | S3G should handle the signature calculation with trailers
- HDDS-12327 | Blocker | Restore non-HA (to HA) upgrade test
- HDDS-11754 | Blocker | Drop support for non-Ratis OM and SCM
- HDDS-12750 | Major | Move StorageTypeProto from ScmServerDatanodeHeartbeatProtocol.proto to hdds.proto
  - Proto wire/binary format: compatible (unchanged)
  - Proto text format: compatible (unchanged)
  - Java API: incompatible (changed java_outer_classname)
- HDDS-9218 | Major | S3 secret management through HTTP
- New pipeline choosing policy: CapacityPipelineChoosePolicy. This policy randomly chooses pipelines with relatively lower utilization. To use, configure hdds.scm.pipeline.choose.policy.impl to org.apache.hadoop.hdds.scm.pipeline.choose.algorithms.CapacityPipelineChoosePolicy. (HDDS-9345)
- APIs to fetch single datanode specific information, reducing data transfer from server to client. (HDDS-9648)
- Support for symmetric keys for delegation tokens. (HDDS-8829)
- A Storage Container Manager (SCM) can now be decommissioned from a set of SCM nodes. (HDDS-7852)
- Option to close all pipelines via CLI (ozone admin pipeline close --all). (HDDS-10742)
- Metrics to monitor bucket state including usage, quota, and available space. (HDDS-10476)
- Unit tests and documentation for creating keys/files with EC replication config using ofs/o3fs. (HDDS-10553)
- Support for passing Kerberos credentials in GrpcOmTransport. (HDDS-11041)
- S3 Gateway endpoints for static content and admin purposes (/prom, /logs, etc.) are now served on a separate port (default: 19878). Config keys are under ozone.s3g.webadmin. (HDDS-7307)
- Improved logging for container not found in CloseContainerCommandHandler to INFO level. (HDDS-9958)
- Container scanner (hdds.datanode.container.scrub.enabled) is now enabled by default. (HDDS-10485)
- Upgraded jgrapht to 1.4.0. (HDDS-10503)
- Bumped follow-redirects to 1.15.6 in Ozone Recon. (HDDS-10526)
- Bumped axios to 0.28.0 in Ozone Recon. (HDDS-10669)
- Bumped es5-ext to 0.10.64 in Ozone Recon. (HDDS-10673)
- Bumped ip to 1.1.9 in Ozone Recon. (HDDS-10674)
- Bumped browserify-sign to 4.2.3 in Ozone Recon. (HDDS-10676)
- Bumped plotly.js to 2.25.2 in Ozone Recon. (HDDS-10677)
- Replaced ConcurrentHashMap with HashMap protected by ReadWriteLock in NodeStateMap for potential performance improvement. (HDDS-10830)
- Replaced ConcurrentHashMap with HashMap in PipelineStateMap as access is already protected by locks. (HDDS-10971)
- Bumped express to 4.21.0 in Ozone Recon. (HDDS-11460)
- Bumped vite to 4.5.5 in Ozone Recon. (HDDS-11467)
- Improved array handling efficiency, avoiding legacy conversions and double conversions. (HDDS-11544)
- Extracted common Kubernetes definitions for HttpFS and Recon from getting-started example. (HDDS-11845)
- Reverted workaround added by HDDS-8715 for thread renaming, as the underlying Hadoop issue HDFS-13566 is fixed in the current Hadoop version. (HDDS-12470)
- Migrated Ozone Recon UI build process from react-scripts/Jest to Vite/vitest. (HDDS-11017)
- Added wrapper methods for getting/setting port details (Standalone, Ratis, Rest) in DatanodeDetails, replacing direct usage. (HDDS-117)
- Refactored OMRequest building in TrashOzoneFileSystem to reduce code duplication. (HDDS-6796)
- Switched chunk file reading in Datanode to use Netty’s ChunkedNioFile for potential performance improvement. (HDDS-7188)
- Improved multipart upload part ETag generation to use MD5 hash of content for consistency. (HDDS-9680)
- Pipeline failure now triggers an immediate heartbeat to SCM to minimize client impact. (HDDS-9823)
- Improved performance of processing IncrementalContainerReport requests from DN in Recon by batching SCM lookups and reducing client timeouts. (HDDS-9883)
- Changed Recon datanode ‘Last Heartbeat’ display to show relative time values (e.g., “2s ago”) instead of absolute timestamps. (HDDS-9933)
- SCM UI now shows cluster storage usage percentage in addition to absolute values. (HDDS-9988)
- Added functionality to freon OmMetadataGenerator (ommg) Test. (HDDS-10025)
- Improved logs for SCMDeletedBlockTransactionStatusManager. (HDDS-10029)
- OzoneManagerRatisServer.getServer() now returns the specific Ratis Division for the group. (HDDS-10036)
- Reduced buffer copying in OMRatisHelper by using ByteBuffer. (HDDS-10037)
- Consolidated and added tests for the Ratis write path for prefix ACL operations. (HDDS-10066)
- Refined SCM start-up logs for clarity and reduced noise (removed duplicate balancer config, reduced cert info verbosity). (HDDS-10271)
- Removed unnecessary sorting when excluding Datanodes during Ratis Pipeline Creation based on pipeline limits. (HDDS-10345)
- CopyObjectResponse ETag is now based on the content hash of the copied key, consistent with PutObject. (HDDS-10403)
- Avoided unnecessary creation of ChunkInfo objects in container-service code by directly accessing proto fields. (HDDS-10410)
- Prefix ACL checks now correctly resolve bucket links. (HDDS-10412)
- Refined audit logging for bucket property update operations to include quota and replication details. (HDDS-10460)
- Implemented logic to fail Datanode decommission early if the cluster doesn’t have enough nodes to maintain replication requirements. (HDDS-10462)
- Refined audit logging for bucket creation to include quota, owner, and replication details. (HDDS-10475)
- Standardized byte array to String conversion for RocksDB LiveFileMetaData using UTF-8 and StringUtils.bytes2String, removing BouncyCastle dependency. (HDDS-10744)
- Tool ozone admin find-ec-missing-padding-blocks added to detect keys affected by missing EC padding blocks (HDDS-10681). (HDDS-10751)
- Improved logging for signature verification failures in OzoneDelegationTokenSecretManager to aid debugging. (HDDS-10802)
- Implemented getHomeDirectory in OzoneFileSystem implementations to correctly return /user/<ugi user> in secure clusters, respecting impersonation. (HDDS-10905)
- Reduced client watch requests by using CommitInfoProto from NotReplicatedException (requires Ratis 3.1.0+ and config tuning). (HDDS-10932)
- Added Netty off-heap memory usage metrics to OM and SCM for better monitoring. (HDDS-11100)
- Enhanced ozone admin containerbalancer status output with richer information including start time, parameters, progress details, and involved datanodes using -v or --verbose. (HDDS-11120)
- Improved SCM WebUI display: formatted JVM properties, added DN version/UUID to list, formatted SCM HA info as a list. (HDDS-11196)
- Added statistical indicators (min, max, median, stdev) for DataNode storage usage to SCM UI/metrics. (HDDS-11206)
- Added statistics for Capacity, ScmUsed, Remaining, NonScmUsed storage space indicators. (HDDS-11252)
- Improved CLI display for OM/SCM roles with a --table option. (HDDS-11268)
- Added statistics for node status counts (Healthy, Dead, Decommissioning, EnteringMaintenance). (HDDS-11272)
- Allowed disabling OM version-specific features via internal config (e.g., atomic rewrite key). (HDDS-11378)
- Introduced schema versioning for Recon DB to handle upgrades and distinguish schema changes. (HDDS-11465)
- Added statistics for Pipeline and Container counts/states to SCM UI/metrics. (HDDS-11469)
- Improved --duration option handling in freon tests (ombg, ommg) for consistency with the -n limit. (HDDS-11494)
- Made SCMDBDefinition a singleton to reflect its immutability. (HDDS-11555)
- Simplified DBColumnFamilyDefinition by removing redundant keyType/valueType fields (relying on Codec). (HDDS-11557)
- Made ReconSCMDBDefinition a singleton. (HDDS-11589)
- Clarified OM Ratis configuration change log message to avoid confusion about peer roles. (HDDS-11623)
- Optimized OmUtils.normalizeKey to check isDebugEnabled before performing string comparison. (HDDS-11669)
- Enhanced Recon metrics for background task status (lastRunStatus, currentTaskStatus) and queue monitoring. (HDDS-11680)
- Implemented OM-side filtering for ranged GET requests for specific MPU parts to reduce network overhead. (HDDS-11699)
- Refactored S3 request unmarshalling logic to reduce code duplication. (HDDS-11739)
- Improved efficiency of BufferUtils.writeFully for ByteBuffer[] using GatheringByteChannel. (HDDS-11860)
- The ozonefs-hadoop3-client jar may be optionally relocated to a different classpath prefix by specifying the Maven property proto.shaded.prefix. (HDDS-12116)
- Changed default Replication Manager command deadline to 12 minutes (SCM) and Datanode offset to 6 minutes. (HDDS-12135)
- Improved error messages in Ozone CLI for FileSystemExceptions (e.g., NoSuchFileException, AccessDeniedException) when not in verbose mode. (HDDS-12241)
- Returned explicit QUOTA_EXCEEDED S3 error code instead of a generic 500 internal error. (HDDS-12329)
- Optimized listMultipartUploads by removing duplicate key scanning in OmMetadataManagerImpl. (HDDS-12371)
- Changed ContainerID to be a value-based class, enforcing factory methods and improving efficiency with cached proto/hash. (HDDS-12541)
- Combined containerMap and replicaMap in SCM’s ContainerStateMap into a single map for simplicity and efficiency. (HDDS-12555)
- Moved the StorageTypeProto enum from OM/SCM specific proto files to the common hdds.proto. This is a Java API incompatible change for internal protocols but wire compatible. (HDDS-12750)
- Added configuration (ozone.client.ratis.watch.type) to tune the replication level (ALL_COMMITTED or MAJORITY_COMMITTED) for client watch requests. (HDDS-2887)
- SCM StateMachine now uses the Ratis notifyLeaderReady API instead of relying solely on notifyTermIndexUpdated. (HDDS-10690)
- Refactored OM request validateAndUpdateCache methods to pass ExecutionContext instead of just TermIndex. (HDDS-11975)
- Reduced unnecessary object creation (RunningDatanodeState, EndpointTasks) during Datanode heartbeat processing when state is RUNNING. (HDDS-11083)
- Improved replication metrics consistency across Datanode commands handled by ReplicationSupervisor and those handled directly. (HDDS-11376)
- Improved logging in Container Balancer’s AbstractFindTargetGreedy to detail why potential targets are excluded. (HDDS-10198)
- Refined ozone admin containerbalancer status output for better readability and detail, including time consumption and data units (MB/GB). (HDDS-11367)
- Added Pipeline count to ozone admin datanode usageinfo output. (HDDS-11357)
- Removed redundant CommandHandler thread pool size methods (already covered by ReplicationSupervisor metrics). (HDDS-11304)
- Replaced the clusterId parameter in KeyValueHandler with initialization via setClusterId to prevent potential NPE during concurrent container creation under high load. (HDDS-11396)
- Added ozone.om.ratis.leader.election.minimum.timeout.duration.key config to OM RaftProperties for leader election timeout. (HDDS-10761)
- Added configuration (ozone.om.rocksdb.max_open_files) to set the RocksDB max_open_files option for the OM DB. (HDDS-11191)
- Standardized Datanode command metrics tracking across ReplicationSupervisor and direct command handlers. (HDDS-11444)
- Optimized Recon List Keys API by reusing calculated path prefix for consecutive keys with the same parent ID. (HDDS-11668)
- Optimized Recon List Keys API response generation by reducing object creation (avoiding OmKeyInfo) and memory buffering. (HDDS-11660)
- Optimized Recon List Keys API filtering logic by replacing predicate lambdas with simple IF statements for performance. (HDDS-11649)
- Added foundational schema upgrade action (InitialConstraintUpgradeAction) for Recon to handle constraints on existing tables (e.g., Unhealthy Containers) upon first upgrade to schema versioning. (HDDS-11615)
- Added Ozone wrapper configurations (ozone.scm.ipc.server.read.threadpool.size, ozone.hdds.datanode.ipc.server.read.threadpool.size) to increase ipc.server.read.threadpool.size for SCM and Datanode RPC servers (default 10). (HDDS-11302)
- Refactored ContainerStateMap to restrict the ContainerAttribute generic type T to Enum, removing unused ownerMap/repConfigMap. (HDDS-12532)
- Refactored the ContainerStateManager interface to remove redundant ContainerID parameters when ContainerReplica (which contains the ID) is already passed. (HDDS-12572)
- Refactored DB/Table classes to use the DB name as the thread name prefix implicitly, removing the explicit parameter. (HDDS-12590)
- Included ContainerInfo within ContainerAttribute to avoid extra map lookups in ContainerStateManager methods. (HDDS-12591)
- Enabled custom ValueCodec for TypedTable to allow performance optimizations like partial deserialization (e.g., OmKeyInfo without ACLs/locations). (HDDS-12582)
- Made ozone admin scm safemode --verbose show rule status even when SCM is not in safe mode. (HDDS-12548)
- Addressed thread safety issue in BlockOutputStream#failedServers by using a concurrent collection. (HDDS-12331)
- Added DatanodeID validation for incoming ContainerCommandRequests and on Ratis group joins to prevent operations on incorrect nodes. (HDDS-11667)
- Persisted the list of container IDs created on a Datanode to prevent recreation after volume failures, ensuring consistency for both Ratis and EC containers. (HDDS-11650)
- Added check for the rocks_tools native library in ozone checknative CLI command output. (HDDS-11347)
- Added Ozone cluster growth rate metric (based on the scm_node_manager_total_used rate) to the Grafana dashboard using PromQL. (HDDS-12168)
- Added robust error handling for Recon OM background tasks (e.g., NSSummary) to prevent data inconsistencies if Recon crashes during partial event processing. (HDDS-12062)
- LegacyReplicationManager (hdds.scm.replication.enable.legacy=true) is removed and no longer supported. (HDDS-11759)
- FILE_PER_CHUNK container layout (ozone.scm.container.layout) is deprecated. New containers cannot be created with this layout. Support will be removed in a future release. (HDDS-11753)
- Removed the LegacyReplicationManager implementation and the hdds.scm.replication.enable.legacy config property. (HDDS-11759)
- Removed the unused resultCache and getMatchingContainerIDs method from ContainerStateMap. (HDDS-12445)
- Fixed TriggerDBSyncEndpoint in Recon to be handled as an admin-only API. (HDDS-11436)
- Fixed potential NullPointerException in OzoneManagerProtocolClientSideTranslatorPB.listStatusLight when startKey is null (e.g., via s3a). (HDDS-10367)
- Addressed memory leak caused by ThreadLocal usage in OMClientRequest (OMLockDetails). (HDDS-10385)
- Fixed Container Balancer incorrectly selecting containers with 0 or negative size for moving. (HDDS-10483)
- Fixed inability to write files when Datanode chunk data validation (hdds.datanode.chunk.data.validation.check) is enabled due to buffer position issue. (HDDS-10547)
- Fixed Recon startup failure (“used space cannot be negative”) by handling Datanode reports with negative used space gracefully. (HDDS-10614)
- Fixed IOException: ParentKeyInfo ... is null in Recon Namespace Summary task by handling cases where parent info might be missing. (HDDS-10855)
- Fixed EC Reconstruction failure (IllegalArgumentException: The chunk list has X entries, but the checksum chunks has Y entries) potentially caused by out-of-order EC stripe writes leading to inaccurate chunk lists. (HDDS-10985)
- Fixed OM crash (SnapshotChainManager: Failure while loading snapshot chain) caused by SstFilteringService directly updating snapshot info DB entries, potentially corrupting the chain if OM restarts before DoubleBuffer flush. (HDDS-11068)
- Resolved ClassCastException (RepeatedOmKeyInfo to OmKeyInfo) in Recon’s FileSizeCountTask due to improper event handling in OMDBUpdatesHandler for conflicting keys across tables (e.g., file and directory with the same name). (HDDS-11187)
- Fixed ContainerSizeCountTask in Recon logging ERROR for negative-sized containers; reduced log level as these are ignored functionally. (HDDS-12227)
- Fixed duplicate key violation in Recon’s FileSizeCountTask by correctly handling the isDbTruncated flag to allow updates instead of only inserts. (HDDS-12228)
- Made OzoneClientException extend IOException. (HDDS-64)
- Fixed various S3 gateway issues including multipart upload and other improvements. (HDDS-1186)
- Fixed SCM Decommissioning issue causing InvalidStateTransitionException after recommissioning the same SCM node. (HDDS-9608)
- Fixed Recon Disk Usage page UI issues with large numbers of keys/buckets/volumes (pie chart usability, axis ticks, path overflow). (HDDS-9626)
- Fixed Ozone admin namespace CLI du command printing incorrect validation error messages for root ("/") or volume paths. (HDDS-9644)
- Fixed Recon incorrectly including out-of-service (decommissioned, maintenance) nodes when checking container health status (over/under/mis-replication). (HDDS-9645)
- Fixed potential NullPointerException in ContainerStateMap.ContainerAttribute due to race condition between update and get operations. (HDDS-9527)
- Fixed Recon potentially showing duplicate DEAD datanodes after decommission/reformat/recommission cycles. (HDDS-10409) -> Now only allows removing DEAD nodes. (HDDS-11032)
- Fixed potential memory overflow in Recon’s Container Health Task due to unbounded list growth. (HDDS-9819)
- Reduced Ozone client heap memory utilization during writes by using pooled direct buffers for chunks. (HDDS-9843)
- Fixed Pipeline.nodesInOrder using ThreadLocal, making it inaccessible to other threads after being set. (HDDS-9848)
- Switched KeyValueContainerCheck.verifyChecksum to use direct/mapped buffers instead of heap buffers. (HDDS-9941)
- Fixed TokenRenewer implementations (O3FS, OFS) not closing the created OzoneClient. Removed duplicate implementation. (HDDS-9943)
- Fixed NSSummaryAdmin CLI commands not closing created OzoneClient instances and creating multiple instances unnecessarily. (HDDS-9944)
- Fixed incorrect synchronization in RatisSnapshotInfo, potentially leading to inconsistent term/index values. Class removed as redundant to TransactionInfo. (HDDS-9984)
- Fixed Options and ReadOptions instances not being closed properly in rocksdb-checkpoint-differ. (HDDS-10001)
- Renamed ManagedSstFileReader in rocksdb-checkpoint-differ to SstFileSetReader to avoid name collision with the class in hdds-managed-rocksdb. (HDDS-10007)
- Fixed potential NullPointerException in VolumeInfoMetrics.getCommitted() if HddsVolume.committedBytes is null. (HDDS-10027)
- Refined SCM RPC handler counts to be configurable per protocol (Client, Block, Datanode) instead of a single global count. (HDDS-10088)
- Removed static dbNameToCfHandleMap from RocksDatabase, using the non-static columnFamilies map instead. (HDDS-10107)
- Fixed potential NullPointerException in OMDBCheckpointServlet lock acquisition when SstFilteringService is accessed before initialization. (HDDS-10138)
- Enabled Zero-Copy reads during container replication for improved performance. (HDDS-10144)
- Corrected metric names createOmResoonseLatencyNs and validateAndUpdateCacneLatencyNs in OMPerformanceMetrics. (HDDS-10162)
- Fixed OmMetadataManagerImpl creating a new S3Batcher instance for each S3 secret operation instead of reusing one. (HDDS-10202)
- Ensured atomic updates in StateContext#updateCommandStatus using computeIfPresent to prevent race conditions. (HDDS-10210)
- Fixed Grafana dashboards: removed UID/hostnames, included secure/unsecure ports, corrected datastore count. (HDDS-10229)
- Prevented V3 Schema DatanodeStore from creating container DBs in incorrect locations under certain initialization paths. (HDDS-10230)
- Fixed ContainerStateManager finalizing OPEN containers without a healthy pipeline on follower SCMs; moved logic to leader-only path via Ratis. (HDDS-10231)
- Improved JSON response for Deleted Directories and Open Keys Insight Endpoints in Recon for better clarity (using actual names instead of Object IDs). (HDDS-10241)
- Fixed ContainerReport admin command showing incorrect values immediately after SCM restart before Replication Manager runs. (HDDS-10272)
- Fixed pagination on the OM DB Insights page in Recon. (HDDS-10282)
- Added support for direct ByteBuffers in Checksum calculations, using reflection for Java 9+ API while maintaining Java 8 compatibility. (HDDS-10288)
- Fixed ECReconstructionCoordinator ignoring ozone-site.xml client configurations and using the default OzoneClientConfig. (HDDS-10294)
- Fixed potential orphan blocks during key overwrite operations, especially involving the deleted key table. (HDDS-10296)
- Fixed KeyManagerImpl#listKeys path normalization to correctly handle OBS/LEGACY buckets when ozone.om.enable.filesystem.paths is true. (HDDS-10319)
- Fixed metadata not being updated when overwriting existing keys via S3 PutObject. (HDDS-10324)
- Fixed SetTimes API not working with linked buckets due to missing link resolution. (HDDS-10369)
- Fixed Recon not handling pre-existing MISSING_EMPTY containers correctly (introduced in HDDS-9695), leaving them marked as missing indefinitely. (HDDS-10370)
- Fixed S3 listParts incompatibility for keys created before HDDS-9680 (missing ETag metadata) and NPE when ETag is null. (HDDS-10395)
- Restricted directory deletion in LEGACY buckets via ozone sh key delete; users must use the ozone fs interface. (HDDS-10397)
- Fixed ArrayIndexOutOfBoundsException when listing keys in OBS buckets via S3/s3a under certain conditions. (HDDS-10399)
- Fixed ozone admin CLI having hard-coded INFO log level, ignoring environment/config settings. (HDDS-10405)
- Fixed Datanode startup failure (“Illegal configuration: raft.grpc.message.size.max must be 1m larger than …”) when using latest Ratis due to default config mismatch. (HDDS-11375)
- Fixed Datanode startup failure (“checksum size setting 1024 is not in expected format”) due to incorrect type validation for hdds.ratis.raft.server.snapshot.creation.gap. (HDDS-10423)
- Fixed Grafana dashboard Prometheus endpoint configuration for Datanodes and added missing Recon endpoint. (HDDS-10433)
- Fixed Datanode Maintenance failing early incorrectly (logic refined). (HDDS-10463)
- Fixed OM potentially crashing or failing requests if the configured S3 secret storage (Vault) is unavailable. (HDDS-10469)
- Fixed audit log for key creation missing EC replication config details (parity, chunk size, codec). (HDDS-10472)
- Fixed potential NullPointerException in OmUtils.getAllOMHAAddresses if OM HA config keys are missing. (HDDS-10508)
- Fixed S3 GetObject ETag header returning “null” for objects without an ETag, causing issues with AWS SDK validation. Now omits the header if ETag is missing. (HDDS-10521)
- Fixed MessageDigest instance in S3 endpoint potentially not being reset after exceptions (e.g., client cancellation), leading to incorrect ETags on subsequent requests using the same thread. (HDDS-10587)
- Fixed issue where client might attempt Ratis streaming for keys defaulted to EC replication if bucket replication isn’t explicitly set. (HDDS-10832)
- Fixed freon read/mixed operations failing with “Key not found” if prefix is unspecified; stopped adding random prefix. Fixed misleading random prefix log in ommg. (HDDS-10845)
- Fixed Ozone CLI not respecting default ozone.om.service.id when only one service ID is configured. (HDDS-10861)
- Fixed ClosePipelineCommandHandler potentially causing GroupMismatchException by calling removeGroup before getting peer list for propagation. (HDDS-10875)
- Fixed Recon ReconContainerManager potentially throwing DuplicatedPipelineIdException when checking/adding containers due to race conditions or stale data. (HDDS-10880)
- Improved logging clarity in Recon’s ReconNodeManager regarding datanode finalization status checks during upgrades. (HDDS-10883)
- Fixed OM startup failure in single-node Docker container due to Ratis group directory mismatch when using default service ID. (HDDS-10909)
- Fixed Recon startup failing silently or logging incorrect errors in non-HA SCM scenarios due to inability to fetch SCM roles or snapshot. (HDDS-10937)
- Fixed OM decommission config (ozone.om.decommissioned.nodes) not working without service ID suffix when only one OM service ID is configured. (HDDS-10942)
- Fixed EC key read corruption potentially occurring if a container’s replica index on a DN mismatches the index expected by the client (e.g., after container move). Added validation. (HDDS-10983)
- Fixed S3 gateway potentially throwing exceptions (javax.xml.xpath.XPathExpressionException) during concurrent XML parsing (e.g., CompleteMultipartUpload, DeleteObjects). (HDDS-10777)
- Fixed NullPointerException in XceiverClientRatis.watchForCommit when updateCommitInfosMap encounters a new Datanode ID in the response after a previous timeout removed it from commitInfoMap. (HDDS-10780)
- Fixed potential OMLeaderNotReadyException after leader switch if transactions were pending in the double buffer, preventing lastNotifiedTermIndex update. (HDDS-10798)
- Fixed various HTTP server components (Recon, SCM, OM, DN) failing to start if configured with a wildcard Kerberos principal (*) due to missing kerb-core dependency. (HDDS-10803)
- Fixed S3 setBucketAcl causing UnsupportedOperationException due to attempting to modify an immutable list returned by OzoneVolume.getAcls(). (HDDS-11737)
- Fixed SCM leadership metric (SCMLeader) potentially being reset to null by HTTP server initialization after the Raft server has already determined leadership. (HDDS-11742)
- Fixed SnapshotDiffManager logging NativeLibraryNotLoadedException as ERROR even when native tools are optional; changed to WARN. (HDDS-11486)
- Fixed potential NullPointerException when checking container balancer status (ozone admin containerbalancer status) if balancer is started but not fully initialized (e.g., waiting for DU info). (HDDS-11350)
- Fixed ozone fs -rm -r prompt for volume deletion suggesting incorrect ozone sh volume delete options (-skipTrash, -id). (HDDS-11346)
- Fixed ozone sh key list -h showing duplicate options (--all, --length) due to picocli version issue (reverted). (HDDS-11446) -> Reverted picocli upgrade.
- Fixed S3 CompleteMultipartUpload returning 500 Internal Server Error instead of S3-compliant InvalidRequest error when no parts are specified in the request body. (HDDS-11457)
- Fixed multiple IOzoneAuthorizer instances potentially being created and leaked if Ratis snapshot installation fails repeatedly after stopping the metadata manager. (HDDS-11472)
- Fixed ozone sh volume delete command line parsing error for the -r option. (HDDS-11535) -> Resolved as part of HDDS-11346 fix.
- Fixed NullPointerException in OM when overwriting an empty file using multipart upload in FSO buckets (versioning disabled). (HDDS-12131)
- Fixed Replication Manager (hdds.scm.replication.thread.interval) interval configuration description to correctly state milliseconds instead of seconds. (HDDS-12144) -> Resolved by removing unsupported types.
- Fixed Grafana dashboard for Chunk read/write rates using incorrect interval variable ($__interval instead of $__rate_interval). (HDDS-12112)
- Fixed Replication Manager potentially expiring pending container deletes incorrectly instead of retrying them if the Datanode doesn’t confirm deletion within the deadline. (HDDS-12127)
- Fixed Replication Manager non-deterministically selecting replicas for deletion if preferred target nodes are overloaded, potentially deleting required replicas. (HDDS-12115)
- Fixed delete container commands potentially running indefinitely or past their deadline due to long lock waits or slow disk I/O; added lock timeout and moved ICR earlier. (HDDS-12114)
- Fixed Recon UI potentially switching from old UI to new UI automatically upon page refresh. (HDDS-12084)
- Fixed missing local refresh button in new Recon UI’s Disk Usage page to reload data for the current path without navigating back to root. (HDDS-12085)
- Fixed unnecessary parameters “Source Volume” & “Source Bucket” appearing in the metadata table for non-link buckets in the new Recon UI Disk Usage page. (HDDS-12073)
- Fixed Recon API endpoints /api/v1/volumes and /api/v1/buckets missing from Swagger documentation. (HDDS-11300)
- Fixed potential NullPointerException in Recon /api/v1/volumes and /api/v1/buckets endpoints if accessed before Recon tables are fully initialized after startup. (HDDS-11349)
- Fixed ozone freon cr (closed container replication) command failing with NPE due to metrics map lookup failure in ReplicationSupervisor. (HDDS-12040)
- Fixed incorrect display of Ozone Service ID name in Recon UI (New UI showed “OM ID”, Old UI showed “OM Service”). Corrected to “Ozone Service ID”. (HDDS-12049)
- Fixed difference in Cluster Capacity % calculation (floor vs round) and Container Pre-Allocated Size display (committed vs 0) between new and old Recon UI. (HDDS-12042)
- Fixed long path names wrapping to the next line in the new Recon UI Disk Usage page; made it scrollable instead. (HDDS-11957)
- Fixed Recon failing to update version_number in the RECON_SCHEMA_VERSION table (always -1), causing upgrade actions to run unnecessarily on fresh installs. (HDDS-11846)
- Fixed serialization error (Conflicting/ambiguous property name) in Recon’s listKeys API due to Jackson ambiguity between the key field and isKey() getter. Renamed getter. (HDDS-11848)
- Fixed potential deadlock in OM between DoubleBuffer flush thread (waiting for DeletedTable lock during snapshot checkpoint) and KeyDeletingService (holding DeletedTable lock, waiting for Ratis future). (HDDS-11124)
- Fixed OM crashing with IOException: Rocks Database is closed during SnapshotMoveDeletedKeys request processing if the snapshot was purged concurrently. (HDDS-11152)
- Fixed containers potentially stuck in DELETING state after upgrade if they were affected by HDDS-8129 (incorrect block counts) and datanodes rejected delete commands due to negative counts. Added recovery logic. (HDDS-11136)
- Fixed fs -mkdir incorrectly creating directories in OBS buckets (bypassing layout validation added in HDDS-11235). Reverted the optimization for mkdir. (HDDS-11348)
- Fixed Datanode potentially failing heartbeats or other operations due to deadlock on the StateContext#pipelineActions map under high load. Replaced with concurrent map. (HDDS-11331)
- Fixed S3 gateway returning 403 Forbidden instead of 302 Redirect for root path (/) requests containing an Authorization: Negotiate header (used by newer curl versions). (HDDS-11096)
- Fixed DELETE_TENANT request logging an unnecessary and uninformative UPDATE_VOLUME audit entry, even on failure. (HDDS-11119)
- Fixed intermittent timeout in TestBlockDeletion.testBlockDeletion potentially caused by race conditions or slow command processing. (HDDS-9962)
- Fixed “Bad file descriptor” error in TestOmSnapshotFsoWithNativeLib.testSnapshotCompactionDag when using native RocksDB tools library. (HDDS-10149)
- Fixed ManagedStatistics objects not being closed properly in OM and DN when RocksDB statistics are enabled, leading to resource leaks. (HDDS-10184)
- Fixed race condition in RocksDatabase where a close operation could occur between assertClose() check and database operation, causing JVM crash. (HDDS-9527)
- Fixed ReplicationManager metrics not being re-registered after RM restart via CLI (stop/start), causing metrics to stop reporting. (HDDS-9235)
- Fixed infinite loop in WritableRatisContainerProvider.getContainer if SCM restarts and existing pipeline nodes are not found (e.g., DNs stopped), causing log flooding. (HDDS-8982)
- Fixed SCM follower nodes logging NotLeaderException errors when processing Pipeline Reports, which is expected behavior for followers. Suppressed error logging. (HDDS-11695)
- Fixed FileNotFoundException: ... (Too many open files) and subsequent DN crashes during OM+DN decommission under heavy Freon load, likely due to excessive open file handles. (HDDS-11391) -> Addressed potential causes.
- Fixed Recon showing incorrect (zero) count for DELETED containers in cluster state summary API (/api/v1/clusterState). (HDDS-11389)
- Fixed issue where OM could fail if IOzoneAuthorizer (e.g., Ranger plugin) fails to initialize during snapshot installation and reload attempts create multiple instances, leading to heap exhaustion. (HDDS-11472)
- Fixed NoSuchUpload error when aborting multipart uploads for keys where the parent directory was missing (potentially due to FSO-related bugs or cleanup issues). (HDDS-11784)
- Fixed secure acceptance tests failing on arm64 due to keytab checksum mismatch when using keytabs generated on amd64. Regenerated keytabs for multi-arch compatibility. (HDDS-11810)
- Fixed race condition in datanode VERSION file creation where multiple threads could attempt to write using the same temporary file via AtomicFileOutputStream. (HDDS-12608)
- Fixed SCM logging an error when updating sequence ID for a CLOSED container based on a replica report with higher BCSID; changed log level and added context. (HDDS-12409)
- Disabled REST endpoint for S3 secret manipulation by username for non-admin users via S3Gateway Secret REST endpoint. (HDDS-11040)
- Snapshots
- Erasure Coding V2: offline reconstruction, container balancer support
- Recon: heat maps, more diagnose info
- Certificate auto-rotation
- SCM decommissioning
- Improved OM decommissioning
- Symmetric key-based block/container token implementation
- Quota on FSO bucket
- Erasure Coding - V1 support (Online reconstruction support)
- Container balancer
- Limit number of rocksdb instances to 1 per disk on high-density datanodes
- Ozone S3 Multi-tenancy Support
- S3 Grpc improvements for metadata path
- Filesystem Optimized and Object Store bucket layout types
- Non-rolling downgrade support to 1.1.0
- Storage Container Manager high availability for new clusters
- Filesystem prefix optimization for new clusters
- Transparent data encryption for S3 buckets
- Security enhancements
- Volume/Bucket Quota Support
- Security related enhancements
- ofs/o3fs performance improvements with Teragen and Spark
- Recon improvements
- OM HA - HA support for Ozone Manager
- Security Phase II
- Support Hadoop 2.7 & security enabled Hadoop 2.x
- Ozone OFS new Filesystem scheme
- Ozone Filesystem performance improvement
- Recon & Recon UI improvements
- Bucket link & S3 volume/bucket mapping changes
- O3FS - S3A based file system semantics
- Able to handle 1 billion objects
- Network topology awareness for block placement
- GDPR Right to Erasure
- Ozone Native ACL support
- First class K8s support
- Stability improvements
- Hadoop Delegation Tokens and Block Tokens supported for Ozone.
- Transparent Data Encryption (TDE) Support - Allows data blocks to be encrypted-at-rest.
- Kerberos support for Ozone.
- Certificate Infrastructure for Ozone - Tokens use PKI instead of shared secrets.
- Datanode to Datanode communication secured via mutual TLS.
- Ability to secure an Ozone cluster that works with Yarn, Hive, and Spark.
- Helm/Skaffold support to deploy Ozone clusters on K8s.
- Support for S3 authentication mechanisms such as the S3 v4 authentication protocol.
- S3 Gateway supports Multipart upload.
- S3A file system is tested and supported.
- Support for Tracing and Profiling for all Ozone components.
- Audit Support - including Audit Parser tools.
- Apache Ranger Support in Ozone.
Release 2.0.0 available
2025 Apr 30
Release Notes
Apache Ozone 2.0.0 adds 1708 new features, improvements and bug fixes on top of Ozone 1.4.
A new command “ozone repair update-transaction” is added to update the highest index in OM transactionInfoTable.
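A hedged usage sketch (the option names below are illustrative assumptions and are not verified against the released CLI; run the command with --help for the authoritative flags):
ozone repair update-transaction --help
ozone repair update-transaction --db <om.db path> --index <highest index>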
A new Ozone Manager layout version, HBASE_SUPPORT (7), is added. It provides the guardrail for full support of the hsync, lease recovery, and listOpenFiles APIs used by HBase.
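As with earlier OM layout versions, the HBase-related APIs are expected to remain gated until the upgrade is finalized; the usual finalization step (verify against the upgrade documentation for your version) is:
ozone admin om finalizeupgrade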
Ozone clients can now interact with multiple encrypted Ozone clusters. This improvement enables distcp to copy from one encrypted source Ozone cluster to another encrypted destination Ozone cluster.
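For example, a hedged distcp invocation between two such clusters, where ozone1 and ozone2 are placeholder OM service IDs:
hadoop distcp ofs://ozone1/vol1/bucket1/dir1 ofs://ozone2/vol1/bucket1/dir1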
Removed the predefined hdds.ratis.raft.grpc.message.size. Its default value is now determined by hdds.container.ratis.log.appender.queue.byte-limit + 1MB = 33MB.
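If the derived limit needs adjusting, tune the remaining property in ozone-site.xml; the value shown below matches the current 32MB default and is only illustrative:
<property>
  <name>hdds.container.ratis.log.appender.queue.byte-limit</name>
  <value>32MB</value>
</property>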
It is now required to toggle an extra config switch to allow HBase-related enhancements to be enabled.
Server-side (OM): Set ozone.hbase.enhancements.allowed to true. Client-side: Set ozone.client.hbase.enhancements.allowed to true.
For more details, see their respective config description.
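A minimal ozone-site.xml sketch that flips both switches described above (only the keys named in this note are shown; no other HBase-related settings are implied):
<property>
  <name>ozone.hbase.enhancements.allowed</name>
  <value>true</value>
</property>
<property>
  <name>ozone.client.hbase.enhancements.allowed</name>
  <value>true</value>
</property>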
Ozone 2.0 adds output stream hsync/hflush API support. In addition, lease recovery (recoverLease()) and setSafeMode() file system API support are added.
Provides a Rocky Linux-based convenience Ozone Docker image.
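For example, the convenience image can be pulled from Docker Hub (the repository and tag below are assumptions; check the Ozone download page for the exact image name):
docker pull apache/ozone:2.0.0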
Ozone did not support snapshots on linked buckets before this release, but a user could have inadvertently created snapshots on linked buckets anyway. When upgrading from an older version that does not support snapshots on linked buckets to a newer version that does, it is essential to ensure that there are no snapshots on linked buckets; otherwise they will linger around. Any such snapshots need to be deleted using the snapshot delete command:
ozone sh snapshot delete <vol>/<linked bucket name> <snapshot name>
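For instance, with placeholder volume, bucket, and snapshot names:
ozone sh snapshot delete vol1/linkedBucket1 snap1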
Ozone’s Hadoop dependency version was updated from 3.3.6 to 3.4.1.
Added a new command “ozone repair om fso-tree” to detect and repair broken FSO trees caused by bugs such as HDDS-7592, which can orphan data in the OM.
Usage: ozone repair om fso-tree --db <dbPath> [--repair | --r] [--volume | -v <volName>] [--bucket | -b <bucketName>] [--verbose]
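For instance, assuming the OM RocksDB lives at /data/metadata/om.db (the path is illustrative), run a read-only check first and then repair:
ozone repair om fso-tree --db /data/metadata/om.db
ozone repair om fso-tree --db /data/metadata/om.db --repair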
A Storage Container Manager can now be decommissioned from a set of SCM nodes. Check out user doc for usage and more details: https://ozone.apache.org/docs/edge/feature/decommission.html
FILE_PER_CHUNK container layout (ozone.scm.container.layout) is deprecated. Starting from Apache Ozone 2.0, users will not be able to create new FILE_PER_CHUNK containers.
The support will be removed in a future release.
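Deployments that still set FILE_PER_CHUNK explicitly should move to the remaining layout; a hedged ozone-site.xml sketch (FILE_PER_BLOCK is, to our knowledge, the default layout):
<property>
  <name>ozone.scm.container.layout</name>
  <value>FILE_PER_BLOCK</value>
</property>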
AWS Java SDK V2 2.30.0 introduced an incompatible protocol change that caused file uploads to the Ozone S3 Gateway to fail or to silently append trailer data. S3G is now updated to support AWS Java SDK V2 2.30.0 and later.
Upgrading a non-HA 1.4.1 cluster to 2.0.0 (in a non-rolling fashion) has been tested.
Ozone Manager and Storage Container Manager will always run in HA (Ratis) mode. Clusters upgrading from non-Ratis (Standalone) mode will automatically run in single node HA (Ratis) mode.
Moved StorageTypeProto from OmClientProtocol.proto and ScmServerDatanodeHeartbeatProtocol.proto to hdds.proto. As a result, the java_outer_classname changed from OzoneManagerProtocolProtos and StorageContainerDatanodeProtocolProtos to HddsProtos.
This change is compatible at the proto wire/binary and text format levels, but it is Java API incompatible because of the changed java_outer_classname.
A set of S3 REST API endpoints are available to manage S3 secrets: /secret for getting a secret. /revoke for revoking an existing secret. For more details, check out Securing S3 user document https://ozone.apache.org/docs/edge/security/securings3.html
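A hedged curl sketch against the S3 Gateway (the host, port, HTTP methods, and SPNEGO authentication via --negotiate are assumptions; confirm the exact contract in the securings3 document linked above):
kinit <kerberos user>
curl --negotiate -u : https://s3g.example.com:9878/secret
curl --negotiate -u : -X POST https://s3g.example.com:9878/revoke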
Changelog
The format is based on Keep a Changelog,
and this project adheres to Semantic Versioning.
[2.0.0] - 2025-04-04
Added
Changed
Deprecated
Removed
Fixed
Security
For more details, check out Apache Ozone 2.0.0 JIRA list
This is a generally available (GA) release. It represents a point of API stability and quality that we consider production-ready.
Generated-by: Google AI Studio + Gemini 2.5 Pro Preview 03-25, with input data from a filtered JIRA list using this prompt.
Image credit: Indiana Dunes National Lakeshore, Michigan City, Indiana, USA by Diego Delso, CC-BY-SA 3.0 / Text added to original
Release 1.4.1 available
2024 Nov 24
Apache Ozone 1.4.1 is a minor bug fix release that addresses several issues to enhance stability and reliability compared to version 1.4.0. For a detailed list of commits and resolved JIRA issues in this release, please refer to Changes between ozone-1.4.0 and ozone-1.4.1 on GitHub.
Release 1.4.0 available
2024 Jan 19
Apache Ozone 1.4.0 is released with the following features:
This is a generally available (GA) release. It represents a point of API stability and quality that we consider production-ready.
Image credit: Hot Springs Mountain Tower by Jeremy Thompson, CC BY 2.0 / Text added to original
Release 1.3.0 available
2022 Dec 18
Apache Ozone 1.3.0 is released with the following features:
This is a generally available (GA) release. It represents a point of API stability and quality that we consider production-ready.
Image credit: Grand Canyon - South Rim - Mather Point by G. Lamar, CC BY 2.0 / Text added to original
Release 1.2.1 available
2021 Dec 22
Apache Ozone 1.2.1 is released with the following features:
This is a generally available (GA) release. It represents a point of API stability and quality that we consider production-ready.
Image credit: Glacier National Park 8 by Tony Hisgett, CC BY 2.0 / Text added to original
Release 1.1.0 available
2021 Apr 17
Apache Hadoop Ozone 1.1.0 is released with the following features:
This is a generally available (GA) release. It represents a point of API stability and quality that we consider production-ready.
Image credit: diana_robinson, CC BY-NC-ND 2.0
Release 1.0.0 available
2020 Sep 2
Apache Hadoop Ozone 1.0.0 is released with the following features:
This is a generally available (GA) release. It represents a point of API stability and quality that we consider production-ready.
Image credit: Denali National Park and Preserve, Caribou and Denali, CC-BY-2.0
Release 0.5.0-beta available
2020 Mar 24
Apache Hadoop Ozone 0.5.0-beta is released with the following features:
This is a beta release and not production ready.
Release 0.4.1-alpha available
2019 Oct 13
Apache Hadoop Ozone 0.4.1-alpha is released with the following features:
This is an alpha release and not production ready.
Release 0.4.0-alpha available
2019 May 7
Apache Hadoop Ozone 0.4.0-alpha is released with the following features:
This is an alpha release and not production ready.