Release 2.0.0 available

Release Notes
Apache Ozone 2.0.0 adds 1708 new features, improvements and bug fixes on top of Ozone 1.4.
- HDDS-10295 | Major | Provide an “ozone repair” subcommand to update the snapshot info in transactionInfoTable
A new command “ozone repair update-transaction” is added to update the highest index in OM transactionInfoTable.
- HDDS-11258 | Blocker | [hsync] Add new OM layout version
A new layout version, HBASE_SUPPORT (7) is added to Ozone Manager that provides the guardrail for the full support of hsync, lease recovery and listOpenFiles APIs for HBase.
- HDDS-11227 | Major | Use OM’s KMS from client side when connecting to a cluster and dealing with encrypted data
Ozone clients can now interact with multiple encrypted Ozone clusters. This improvement enables distcp to copy from one encrypted source Ozone cluster to another encrypted destination Ozone cluster.
- HDDS-11375 | Major | DN Startup fails with Illegal configuration
Removed the predefined value of hdds.ratis.raft.grpc.message.size. Its default is now derived as hdds.container.ratis.log.appender.queue.byte-limit + 1MB = 33MB.
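The arithmetic in the HDDS-11375 note can be checked with a short sketch. The helper name `default_grpc_message_size` is hypothetical, and the 32 MB byte limit is an assumption inferred from the 33 MB total stated in the note, not a value read from Ozone's configuration files.

```python
MIB = 1024 * 1024  # one mebibyte in bytes


def default_grpc_message_size(log_appender_queue_byte_limit: int) -> int:
    """Return the derived gRPC message size: the byte limit plus 1 MB of headroom."""
    return log_appender_queue_byte_limit + MIB


# Assumed byte limit of 32 MB, which is what the 33 MB total in the note implies.
byte_limit = 32 * MIB
message_size = default_grpc_message_size(byte_limit)
print(message_size // MIB)  # 33
```

If the byte limit is tuned to another value, the derived message size would follow it with the same 1 MB margin.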
- HDDS-11342 | Major | [hsync] Add a config as HBase-related features master switch
An extra config switch must now be turned on before HBase-related enhancements can be enabled.
Server-side (OM): Set ozone.hbase.enhancements.allowed to true.
Client-side: Set ozone.client.hbase.enhancements.allowed to true.
For more details, see the respective config descriptions.
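As an illustration of the two HDDS-11342 switches, the sketch below renders Hadoop-style configuration fragments for the server and client sides. The helper `render_properties` is hypothetical, and placing these properties in ozone-site.xml is an assumption; only the two property names and the value true come from the note above.

```python
from xml.sax.saxutils import escape


def render_properties(props: dict) -> str:
    """Render a dict of settings as a Hadoop-style <configuration> fragment."""
    lines = ["<configuration>"]
    for name, value in props.items():
        lines.append("  <property>")
        lines.append(f"    <name>{escape(name)}</name>")
        lines.append(f"    <value>{escape(value)}</value>")
        lines.append("  </property>")
    lines.append("</configuration>")
    return "\n".join(lines)


# Server-side (OM) switch and client-side switch, as named in the release note.
om_site = render_properties({"ozone.hbase.enhancements.allowed": "true"})
client_site = render_properties({"ozone.client.hbase.enhancements.allowed": "true"})

print(om_site)
print(client_site)
```

Both switches are needed for the feature set to be usable end to end: one on the Ozone Manager, one on each client.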
- HDDS-7593 | Major | Supporting HSync and lease recovery
Ozone 2.0 adds support for the output stream hsync/hflush APIs. In addition, support for the lease recovery (recoverLease()) and setSafeMode() file system APIs is added.
- HDDS-11329 | Major | Update Ozone images to Rocky Linux-based runner
A Rocky Linux-based convenience Docker image for Ozone is now provided.
- HDDS-11705 | Critical | Snapshot operations on linked buckets should work on actual underlying bucket
Ozone did not support snapshots on linked buckets before this release, but a user could have inadvertently created them. Before upgrading from a version that does not support snapshots on linked buckets to one that does, make sure no such snapshots exist; otherwise they will linger after the upgrade. Delete any snapshots on linked buckets with the snapshot delete command:
ozone sh snapshot delete <vol>/<linked bucket name> <snapshot name>
Ozone’s Hadoop dependency version was updated from 3.3.6 to 3.4.1.
- HDDS-8101 | Major | Add FSO repair tool to ozone CLI in read-only and repair modes
Added a new command “ozone repair om fso-tree” to detect and repair broken FSO trees caused by bugs such as HDDS-7592, which can orphan data in the OM.
Usage:
ozone repair om fso-tree --db <dbPath> [--repair | -r] [--volume | -v <volName>] [--bucket | -b <bucketName>] [--verbose]
- HDDS-7852 | Major | SCM Decommissioning Support
A Storage Container Manager can now be decommissioned from a set of SCM nodes. Check out user doc for usage and more details: https://ozone.apache.org/docs/edge/feature/decommission.html
- HDDS-11753 | Blocker | Deprecate file per chunk layout from datanode code
FILE_PER_CHUNK container layout (ozone.scm.container.layout) is deprecated. Starting from Apache Ozone 2.0, users will not be able to create new FILE_PER_CHUNK containers.
The support will be removed in a future release.
- HDDS-12488 | Major | S3G should handle the signature calculation with trailers
AWS Java SDK V2 2.30.0 introduced an incompatible protocol change that caused file uploads to the Ozone S3 Gateway to fail or to silently append trailer data. S3G now supports AWS Java SDK V2 2.30.0 and later.
- HDDS-12327 | Blocker | Restore non-HA (to HA) upgrade test
A non-rolling upgrade of a non-HA 1.4.1 cluster to 2.0.0 is tested.
- HDDS-11754 | Blocker | Drop support for non-Ratis OM and SCM
Ozone Manager and Storage Container Manager will always run in HA (Ratis) mode. Clusters upgrading from non-Ratis (Standalone) mode will automatically run in single node HA (Ratis) mode.
- HDDS-12750 | Major | Move StorageTypeProto from ScmServerDatanodeHeartbeatProtocol.proto to hdds.proto
Moved StorageTypeProto from OmClientProtocol.proto and ScmServerDatanodeHeartbeatProtocol.proto to hdds.proto. As a result, the java_outer_classname changed from OzoneManagerProtocolProtos and StorageContainerDatanodeProtocolProtos to HddsProtos.
Compatibility of this change:
- Proto wire/binary format: compatible (unchanged)
- Proto text format: compatible (unchanged)
- Java API: incompatible (changed java_outer_classname)
- HDDS-9218 | Major | S3 secret management through HTTP
A set of S3 REST API endpoints are available to manage S3 secrets:
/secret for getting a secret.
/revoke for revoking an existing secret.
For more details, check out Securing S3 user document https://ozone.apache.org/docs/edge/security/securings3.html
Changelog
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
[2.0.0] - 2025-04-04
Added
- New pipeline choosing policy: CapacityPipelineChoosePolicy. This policy randomly chooses pipelines with relatively lower utilization. To use it, set hdds.scm.pipeline.choose.policy.impl to org.apache.hadoop.hdds.scm.pipeline.choose.algorithms.CapacityPipelineChoosePolicy. (HDDS-9345)
- APIs to fetch single datanode specific information, reducing data transfer from server to client. (HDDS-9648)
- Support for symmetric keys for delegation tokens. (HDDS-8829)
- A Storage Container Manager (SCM) can now be decommissioned from a set of SCM nodes. (HDDS-7852)
- Option to close all pipelines via CLI (ozone admin pipeline close --all). (HDDS-10742)
- Metrics to monitor bucket state including usage, quota, and available space. (HDDS-10476)
- Unit tests and documentation for creating keys/files with EC replication config using ofs/o3fs. (HDDS-10553)
- Support for passing Kerberos credentials in GrpcOmTransport. (HDDS-11041)
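The CapacityPipelineChoosePolicy entry above (HDDS-9345) describes a randomized choice that favors less utilized pipelines. One plausible way to get that behavior is to sample two candidates and keep the less utilized one. The sketch below is a toy model under that assumption, not Ozone's implementation; the pipeline names, utilization figures, and the function `choose_pipeline` are all hypothetical.

```python
import random


def choose_pipeline(utilization: dict, rng: random.Random) -> str:
    """Pick two distinct pipelines at random and return the less utilized one."""
    if not utilization:
        raise ValueError("no pipelines available")
    if len(utilization) == 1:
        return next(iter(utilization))
    first, second = rng.sample(sorted(utilization), 2)
    return first if utilization[first] <= utilization[second] else second


# Hypothetical pipelines with their fraction of used capacity.
pipelines = {"pipeline-a": 0.90, "pipeline-b": 0.35, "pipeline-c": 0.60}

rng = random.Random(42)  # seeded so repeated runs give the same picks
picks = [choose_pipeline(pipelines, rng) for _ in range(1000)]
print({name: picks.count(name) for name in pipelines})
```

Under this scheme the most utilized pipeline is never selected while alternatives exist, yet the choice stays randomized among the rest, which spreads load instead of always hitting the single emptiest pipeline.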
Changed
- S3 Gateway endpoints for static content and admin purposes (/prom, /logs, etc.) are now served on a separate port (default: 19878). Config keys are under ozone.s3g.webadmin. (HDDS-7307)
- Improved logging for container not found in CloseContainerCommandHandler to INFO level. (HDDS-9958)
- Container scanner (hdds.datanode.container.scrub.enabled) is now enabled by default. (HDDS-10485)
- Upgraded jgrapht to 1.4.0. (HDDS-10503)
- Bumped follow-redirects to 1.15.6 in Ozone Recon. (HDDS-10526)
- Bumped axios to 0.28.0 in Ozone Recon. (HDDS-10669)
- Bumped es5-ext to 0.10.64 in Ozone Recon. (HDDS-10673)
- Bumped ip to 1.1.9 in Ozone Recon. (HDDS-10674)
- Bumped browserify-sign to 4.2.3 in Ozone Recon. (HDDS-10676)
- Bumped plotly.js to 2.25.2 in Ozone Recon. (HDDS-10677)
- Replaced ConcurrentHashMap with HashMap protected by ReadWriteLock in NodeStateMap for potential performance improvement. (HDDS-10830)
- Replaced ConcurrentHashMap with HashMap in PipelineStateMap as access is already protected by locks. (HDDS-10971)
- Bumped express to 4.21.0 in Ozone Recon. (HDDS-11460)
- Bumped vite to 4.5.5 in Ozone Recon. (HDDS-11467)
- Improved array handling efficiency, avoiding legacy conversions and double conversions. (HDDS-11544)
- Extracted common Kubernetes definitions for HttpFS and Recon from getting-started example. (HDDS-11845)
- Reverted workaround added by HDDS-8715 for thread renaming, as the underlying Hadoop issue HDFS-13566 is fixed in the current Hadoop version. (HDDS-12470)
- Migrated Ozone Recon UI build process from react-scripts/Jest to Vite/vitest. (HDDS-11017)
- Added wrapper methods for getting/setting port details (Standalone, Ratis, Rest) in DatanodeDetails, replacing direct usage. (HDDS-117)
- Refactored OMRequest building in TrashOzoneFileSystem to reduce code duplication. (HDDS-6796)
- Switched chunk file reading in Datanode to use Netty’s ChunkedNioFile for potential performance improvement. (HDDS-7188)
- Improved multipart upload part ETag generation to use MD5 hash of content for consistency. (HDDS-9680)
- Pipeline failure now triggers an immediate heartbeat to SCM to minimize client impact. (HDDS-9823)
- Improved performance of processing IncrementalContainerReport requests from DN in Recon by batching SCM lookups and reducing client timeouts. (HDDS-9883)
- Changed Recon datanode ‘Last Heartbeat’ display to show relative time values (e.g., “2s ago”) instead of absolute timestamps. (HDDS-9933)
- SCM UI now shows cluster storage usage percentage in addition to absolute values. (HDDS-9988)
- Added functionality to freon OmMetadataGenerator (ommg) Test. (HDDS-10025)
- Improved logs for SCMDeletedBlockTransactionStatusManager. (HDDS-10029)
- OzoneManagerRatisServer.getServer() now returns the specific Ratis Division for the group. (HDDS-10036)
- Reduced buffer copying in OMRatisHelper by using ByteBuffer. (HDDS-10037)
- Consolidated and added tests for the Ratis write path for prefix ACL operations. (HDDS-10066)
- Refined SCM start-up logs for clarity and reduced noise (removed duplicate balancer config, reduced cert info verbosity). (HDDS-10271)
- Removed unnecessary sorting when excluding Datanodes during Ratis Pipeline Creation based on pipeline limits. (HDDS-10345)
- CopyObjectResponse ETag is now based on the content hash of the copied key, consistent with PutObject. (HDDS-10403)
- Avoided unnecessary creation of ChunkInfo objects in container-service code by directly accessing proto fields. (HDDS-10410)
- Prefix ACL checks now correctly resolve bucket links. (HDDS-10412)
- Refined audit logging for bucket property update operations to include quota and replication details. (HDDS-10460)
- Implemented logic to fail Datanode decommission early if the cluster doesn’t have enough nodes to maintain replication requirements. (HDDS-10462)
- Refined audit logging for bucket creation to include quota, owner, and replication details. (HDDS-10475)
- Standardized byte array to String conversion for RocksDB LiveFileMetaData using UTF-8 and StringUtils.bytes2String, removing BouncyCastle dependency. (HDDS-10744)
- Added the tool ozone admin find-ec-missing-padding-blocks to detect keys affected by missing EC padding blocks (HDDS-10681). (HDDS-10751)
- Improved logging for signature verification failures in OzoneDelegationTokenSecretManager to aid debugging. (HDDS-10802)
- Implemented getHomeDirectory in OzoneFileSystem implementations to correctly return /user/<ugi user> in secure clusters, respecting impersonation. (HDDS-10905)
- Reduced client watch requests by using CommitInfoProto from NotReplicatedException (requires Ratis 3.1.0+ and config tuning). (HDDS-10932)
- Added Netty off-heap memory usage metrics to OM and SCM for better monitoring. (HDDS-11100)
- Enhanced ozone admin containerbalancer status output with richer information including start time, parameters, progress details, and involved datanodes using -v or --verbose. (HDDS-11120)
- Improved SCM WebUI display: formatted JVM properties, added DN version/UUID to list, formatted SCM HA info as a list. (HDDS-11196)
- Added statistical indicators (min, max, median, stdev) for DataNode storage usage to SCM UI/metrics. (HDDS-11206)
- Added statistics for Capacity, ScmUsed, Remaining, NonScmUsed storage space indicators. (HDDS-11252)
- Improved CLI display for OM/SCM roles with a --table option. (HDDS-11268)
- Added statistics for node status counts (Healthy, Dead, Decommissioning, EnteringMaintenance). (HDDS-11272)
- Allowed disabling OM version-specific features via internal config (e.g., atomic rewrite key). (HDDS-11378)
- Introduced schema versioning for Recon DB to handle upgrades and distinguish schema changes. (HDDS-11465)
- Added statistics for Pipeline and Container counts/states to SCM UI/metrics. (HDDS-11469)
- Improved --duration option handling in freon tests (ombg, ommg) for consistency with -n limit. (HDDS-11494)
- Made SCMDBDefinition a singleton to reflect its immutability. (HDDS-11555)
- Simplified DBColumnFamilyDefinition by removing redundant keyType/valueType fields (relying on Codec). (HDDS-11557)
- Made ReconSCMDBDefinition a singleton. (HDDS-11589)
- Clarified OM Ratis configuration change log message to avoid confusion about peer roles. (HDDS-11623)
- Optimized OmUtils.normalizeKey to check isDebugEnabled before performing string comparison. (HDDS-11669)
- Enhanced Recon metrics for background task status (lastRunStatus, currentTaskStatus) and queue monitoring. (HDDS-11680)
- Implemented OM-side filtering for ranged GET requests for specific MPU parts to reduce network overhead. (HDDS-11699)
- Refactored S3 request unmarshalling logic to reduce code duplication. (HDDS-11739)
- Improved efficiency of BufferUtils.writeFully for ByteBuffer[] using GatheringByteChannel. (HDDS-11860)
- The ozonefs-hadoop3-client jar may optionally be relocated under a different prefix by specifying the Maven property proto.shaded.prefix. (HDDS-12116)
- Changed default Replication Manager command deadline to 12 minutes (SCM) and Datanode offset to 6 minutes. (HDDS-12135)
- Improved error messages in Ozone CLI for FileSystemExceptions (e.g., NoSuchFileException, AccessDeniedException) when not in verbose mode. (HDDS-12241)
- Returned explicit QUOTA_EXCEEDED S3 error code instead of a generic 500 internal error. (HDDS-12329)
- Optimized listMultipartUploads by removing duplicate key scanning in OmMetadataManagerImpl. (HDDS-12371)
- Changed ContainerID to be a value-based class, enforcing factory methods and improving efficiency with cached proto/hash. (HDDS-12541)
- Combined containerMap and replicaMap in SCM’s ContainerStateMap into a single map for simplicity and efficiency. (HDDS-12555)
- Moved StorageTypeProto enum from OM/SCM specific proto files to the common hdds.proto. This is a Java API incompatible change for internal protocols but wire compatible. (HDDS-12750)
- Added configuration (ozone.client.ratis.watch.type) to tune the replication level (ALL_COMMITTED or MAJORITY_COMMITTED) for client watch requests. (HDDS-2887)
- SCM StateMachine now uses Ratis notifyLeaderReady API instead of relying solely on notifyTermIndexUpdated. (HDDS-10690)
- Refactored OM request validateAndUpdateCache methods to pass ExecutionContext instead of just TermIndex. (HDDS-11975)
- Reduced unnecessary object creation (RunningDatanodeState, EndpointTasks) during Datanode heartbeat processing when state is RUNNING. (HDDS-11083)
- Improved replication metrics consistency across Datanode commands handled by ReplicationSupervisor and those handled directly. (HDDS-11376)
- Improved logging in Container Balancer’s AbstractFindTargetGreedy to detail why potential targets are excluded. (HDDS-10198)
- Refined ozone admin containerbalancer status output for better readability and detail, including time consumption and data units (MB/GB). (HDDS-11367)
- Added Pipeline count to ozone admin datanode usageinfo output. (HDDS-11357)
- Removed redundant CommandHandler thread pool size methods (already covered by ReplicationSupervisor metrics). (HDDS-11304)
- Replaced clusterId parameter in KeyValueHandler with initialization via setClusterId to prevent potential NPE during concurrent container creation under high load. (HDDS-11396)
- Added ozone.om.ratis.leader.election.minimum.timeout.duration.key config to OM RaftProperties for leader election timeout. (HDDS-10761)
- Added configuration (ozone.om.rocksdb.max_open_files) to set RocksDB max_open_files option for OM DB. (HDDS-11191)
- Standardized Datanode command metrics tracking across ReplicationSupervisor and direct command handlers. (HDDS-11444)
- Optimized Recon List Keys API by reusing calculated path prefix for consecutive keys with the same parent ID. (HDDS-11668)
- Optimized Recon List Keys API response generation by reducing object creation (avoiding OmKeyInfo) and memory buffering. (HDDS-11660)
- Optimized Recon List Keys API filtering logic by replacing predicate lambdas with simple IF statements for performance. (HDDS-11649)
- Added foundational schema upgrade action (InitialConstraintUpgradeAction) for Recon to handle constraints on existing tables (e.g., Unhealthy Containers) upon first upgrade to schema versioning. (HDDS-11615)
- Added Ozone wrapper configurations (ozone.scm.ipc.server.read.threadpool.size, ozone.hdds.datanode.ipc.server.read.threadpool.size) to increase ipc.server.read.threadpool.size for SCM and Datanode RPC servers (default 10). (HDDS-11302)
- Refactored ContainerStateMap to restrict ContainerAttribute generic type T to Enum, removing unused ownerMap/repConfigMap. (HDDS-12532)
- Refactored ContainerStateManager interface to remove redundant ContainerID parameters when ContainerReplica (which contains the ID) is already passed. (HDDS-12572)
- Refactored DB/Table classes to use the DB name as the thread name prefix implicitly, removing the explicit parameter. (HDDS-12590)
- Included ContainerInfo within ContainerAttribute to avoid extra map lookups in ContainerStateManager methods. (HDDS-12591)
- Enabled custom ValueCodec for TypedTable to allow performance optimizations like partial deserialization (e.g., OmKeyInfo without ACLs/locations). (HDDS-12582)
- Made ozone admin scm safemode --verbose show rule status even when SCM is not in safe mode. (HDDS-12548)
- Addressed thread safety issue in BlockOutputStream#failedServers by using a concurrent collection. (HDDS-12331)
- Added DatanodeID validation for incoming ContainerCommandRequests and on Ratis group joins to prevent operations on incorrect nodes. (HDDS-11667)
- Persisted the list of container IDs created on a Datanode to prevent recreation after volume failures, ensuring consistency for both Ratis and EC containers. (HDDS-11650)
- Added check for rocks_tools native library in ozone checknative CLI command output. (HDDS-11347)
- Added Ozone cluster growth rate metric (based on scm_node_manager_total_used rate) to Grafana dashboard using PromQL. (HDDS-12168)
- Added robust error handling for Recon OM background tasks (e.g., NSSummary) to prevent data inconsistencies if Recon crashes during partial event processing. (HDDS-12062)
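For the multipart upload ETag change listed above (HDDS-9680), the idea of deriving a part's ETag from an MD5 hash of its content can be sketched as follows. The function name `part_etag` is hypothetical, and the lowercase hex encoding is an assumption based on common S3 convention rather than something the changelog states.

```python
import hashlib


def part_etag(content: bytes) -> str:
    """Return the MD5 digest of the part content as a lowercase hex string."""
    return hashlib.md5(content).hexdigest()


# The same bytes always produce the same ETag, independent of upload time or ID.
print(part_etag(b"hello"))  # 5d41402abc4b2a76b9719d911017c592
print(part_etag(b""))       # d41d8cd98f00b204e9800998ecf8427e
```

A content-derived ETag lets a client verify an uploaded part by hashing its local copy and comparing the two values.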
Deprecated
- LegacyReplicationManager (hdds.scm.replication.enable.legacy=true) is removed and no longer supported. (HDDS-11759)
- FILE_PER_CHUNK container layout (ozone.scm.container.layout) is deprecated. New containers cannot be created with this layout. Support will be removed in a future release. (HDDS-11753)
Removed
- Removed LegacyReplicationManager implementation and the hdds.scm.replication.enable.legacy config property. (HDDS-11759)
- Removed unused resultCache and getMatchingContainerIDs method from ContainerStateMap. (HDDS-12445)
Fixed
- Fixed admin-only API handling of TriggerDBSyncEndpoint in Recon. (HDDS-11436)
- Fixed potential NullPointerException in OzoneManagerProtocolClientSideTranslatorPB.listStatusLight when startKey is null (e.g., via s3a). (HDDS-10367)
- Addressed memory leak caused by ThreadLocal usage in OMClientRequest (OMLockDetails). (HDDS-10385)
- Fixed Container Balancer incorrectly selecting containers with 0 or negative size for moving. (HDDS-10483)
- Fixed inability to write files when Datanode chunk data validation (hdds.datanode.chunk.data.validation.check) is enabled due to buffer position issue. (HDDS-10547)
- Fixed Recon startup failure (“used space cannot be negative”) by handling Datanode reports with negative used space gracefully. (HDDS-10614)
- Fixed IOException: ParentKeyInfo ... is null in Recon Namespace Summary task by handling cases where parent info might be missing. (HDDS-10855)
- Fixed EC Reconstruction failure (IllegalArgumentException: The chunk list has X entries, but the checksum chunks has Y entries) potentially caused by out-of-order EC stripe writes leading to inaccurate chunk lists. (HDDS-10985)
- Fixed OM crash (SnapshotChainManager: Failure while loading snapshot chain) caused by SstFilteringService directly updating snapshot info DB entries, potentially corrupting the chain if OM restarts before DoubleBuffer flush. (HDDS-11068)
- Resolved ClassCastException (RepeatedOmKeyInfo to OmKeyInfo) in Recon’s FileSizeCountTask due to improper event handling in OMDBUpdatesHandler for conflicting keys across tables (e.g., file and directory with the same name). (HDDS-11187)
- Fixed ContainerSizeCountTask in Recon logging ERROR for negative-sized containers; reduced log level as these are ignored functionally. (HDDS-12227)
- Fixed duplicate key violation in Recon’s FileSizeCountTask by correctly handling the isDbTruncated flag to allow updates instead of only inserts. (HDDS-12228)
- Made OzoneClientException extend IOException. (HDDS-64)
- Fixed various S3 gateway issues including multipart upload and other improvements. (HDDS-1186)
- Fixed SCM Decommissioning issue causing InvalidStateTransitionException after recommissioning the same SCM node. (HDDS-9608)
- Fixed Recon Disk Usage page UI issues with large numbers of keys/buckets/volumes (pie chart usability, axis ticks, path overflow). (HDDS-9626)
- Fixed Ozone admin namespace CLI du command printing incorrect validation error messages for root ("/") or volume paths. (HDDS-9644)
- Fixed Recon incorrectly including out-of-service (decommissioned, maintenance) nodes when checking container health status (over/under/mis-replication). (HDDS-9645)
- Fixed potential NullPointerException in ContainerStateMap.ContainerAttribute due to race condition between update and get operations. (HDDS-9527)
- Fixed Recon potentially showing duplicate DEAD datanodes after decommission/reformat/recommission cycles. (HDDS-10409) -> Now only allows removing DEAD nodes. (HDDS-11032)
- Fixed potential memory overflow in Recon’s Container Health Task due to unbounded list growth. (HDDS-9819)
- Reduced Ozone client heap memory utilization during writes by using pooled direct buffers for chunks. (HDDS-9843)
- Fixed Pipeline.nodesInOrder using ThreadLocal, making it inaccessible to other threads after being set. (HDDS-9848)
- Switched KeyValueContainerCheck.verifyChecksum to use direct/mapped buffers instead of heap buffers. (HDDS-9941)
- Fixed TokenRenewer implementations (O3FS, OFS) not closing the created OzoneClient. Removed duplicate implementation. (HDDS-9943)
- Fixed NSSummaryAdmin CLI commands not closing created OzoneClient instances and creating multiple instances unnecessarily. (HDDS-9944)
- Fixed incorrect synchronization in RatisSnapshotInfo, potentially leading to inconsistent term/index values. Class removed as redundant to TransactionInfo. (HDDS-9984)
- Fixed Options and ReadOptions instances not being closed properly in rocksdb-checkpoint-differ. (HDDS-10001)
- Renamed ManagedSstFileReader in rocksdb-checkpoint-differ to SstFileSetReader to avoid name collision with the class in hdds-managed-rocksdb. (HDDS-10007)
- Fixed potential NullPointerException in VolumeInfoMetrics.getCommitted() if HddsVolume.committedBytes is null. (HDDS-10027)
- Refined SCM RPC handler counts to be configurable per protocol (Client, Block, Datanode) instead of a single global count. (HDDS-10088)
- Removed static dbNameToCfHandleMap from RocksDatabase, using non-static columnFamilies map instead. (HDDS-10107)
- Fixed potential NullPointerException in OMDBCheckpointServlet lock acquisition when SstFilteringService is accessed before initialization. (HDDS-10138)
- Enabled Zero-Copy reads during container replication for improved performance. (HDDS-10144)
- Corrected metric names createOmResoonseLatencyNs and validateAndUpdateCacneLatencyNs in OMPerformanceMetrics. (HDDS-10162)
- Fixed OmMetadataManagerImpl creating a new S3Batcher instance for each S3 secret operation instead of reusing one. (HDDS-10202)
- Ensured atomic updates in StateContext#updateCommandStatus using computeIfPresent to prevent race conditions. (HDDS-10210)
- Fixed Grafana dashboards: removed UID/hostnames, included secure/unsecure ports, corrected datastore count. (HDDS-10229)
- Prevented V3 Schema DatanodeStore from creating container DBs in incorrect locations under certain initialization paths. (HDDS-10230)
- Fixed ContainerStateManager finalizing OPEN containers without a healthy pipeline on follower SCMs; moved logic to leader-only path via Ratis. (HDDS-10231)
- Improved JSON response for Deleted Directories and Open Keys Insight Endpoints in Recon for better clarity (using actual names instead of Object IDs). (HDDS-10241)
- Fixed ContainerReport admin command showing incorrect values immediately after SCM restart before Replication Manager runs. (HDDS-10272)
- Fixed pagination on the OM DB Insights page in Recon. (HDDS-10282)
- Added support for direct ByteBuffers in Checksum calculations, using reflection for Java 9+ API while maintaining Java 8 compatibility. (HDDS-10288)
- Fixed ECReconstructionCoordinator ignoring ozone-site.xml client configurations and using default OzoneClientConfig. (HDDS-10294)
- Fixed potential orphan blocks during key overwrite operations, especially involving the deleted key table. (HDDS-10296)
- Fixed KeyManagerImpl#listKeys path normalization to correctly handle OBS/LEGACY buckets when ozone.om.enable.filesystem.paths is true. (HDDS-10319)
- Fixed metadata not being updated when overwriting existing keys via S3 PutObject. (HDDS-10324)
- Fixed SetTimes API not working with linked buckets due to missing link resolution. (HDDS-10369)
- Fixed Recon not handling pre-existing MISSING_EMPTY containers correctly (introduced in HDDS-9695), leaving them marked as missing indefinitely. (HDDS-10370)
- Fixed S3 listParts incompatibility for keys created before HDDS-9680 (missing ETag metadata) and NPE when ETag is null. (HDDS-10395)
- Restricted directory deletion in LEGACY buckets via ozone sh key delete; users must use ozone fs interface. (HDDS-10397)
- Fixed ArrayIndexOutOfBoundsException when listing keys in OBS buckets via S3/s3a under certain conditions. (HDDS-10399)
- Fixed ozone admin CLI having hard-coded INFO log level, ignoring environment/config settings. (HDDS-10405)
- Fixed Datanode startup failure (“Illegal configuration: raft.grpc.message.size.max must be 1m larger than …”) when using latest Ratis due to default config mismatch. (HDDS-11375)
- Fixed Datanode startup failure (“checksum size setting 1024 is not in expected format”) due to incorrect type validation for hdds.ratis.raft.server.snapshot.creation.gap. (HDDS-10423)
- Fixed Grafana dashboard Prometheus endpoint configuration for Datanodes and added missing Recon endpoint. (HDDS-10433)
- Fixed Datanode Maintenance failing early incorrectly (logic refined). (HDDS-10463)
- Fixed OM potentially crashing or failing requests if the configured S3 secret storage (Vault) is unavailable. (HDDS-10469)
- Fixed audit log for key creation missing EC replication config details (parity, chunk size, codec). (HDDS-10472)
- Fixed potential NullPointerException in OmUtils.getAllOMHAAddresses if OM HA config keys are missing. (HDDS-10508)
- Fixed S3 GetObject ETag header returning "null" for objects without an ETag, causing issues with AWS SDK validation. Now omits the header if ETag is missing. (HDDS-10521)
- Fixed MessageDigest instance in S3 endpoint potentially not being reset after exceptions (e.g., client cancellation), leading to incorrect ETags on subsequent requests using the same thread. (HDDS-10587)
- Fixed issue where client might attempt Ratis streaming for keys defaulted to EC replication if bucket replication isn’t explicitly set. (HDDS-10832)
- Fixed freon read/mixed operations failing with “Key not found” if prefix is unspecified; stopped adding random prefix. Fixed misleading random prefix log in ommg. (HDDS-10845)
- Fixed Ozone CLI not respecting default ozone.om.service.id when only one service ID is configured. (HDDS-10861)
- Fixed ClosePipelineCommandHandler potentially causing GroupMismatchException by calling removeGroup before getting peer list for propagation. (HDDS-10875)
- Fixed Recon ReconContainerManager potentially throwing DuplicatedPipelineIdException when checking/adding containers due to race conditions or stale data. (HDDS-10880)
- Improved logging clarity in Recon’s ReconNodeManager regarding datanode finalization status checks during upgrades. (HDDS-10883)
- Fixed OM startup failure in single-node Docker container due to Ratis group directory mismatch when using default service ID. (HDDS-10909)
- Fixed Recon startup failing silently or logging incorrect errors in non-HA SCM scenarios due to inability to fetch SCM roles or snapshot. (HDDS-10937)
- Fixed OM decommission config (ozone.om.decommissioned.nodes) not working without service ID suffix when only one OM service ID is configured. (HDDS-10942)
- Fixed EC key read corruption potentially occurring if a container’s replica index on a DN mismatches the index expected by the client (e.g., after container move). Added validation. (HDDS-10983)
- Fixed S3 gateway potentially throwing exceptions (javax.xml.xpath.XPathExpressionException) during concurrent XML parsing (e.g., CompleteMultipartUpload, DeleteObjects). (HDDS-10777)
- Fixed NullPointerException in XceiverClientRatis.watchForCommit when updateCommitInfosMap encounters a new Datanode ID in the response after a previous timeout removed it from commitInfoMap. (HDDS-10780)
- Fixed potential OMLeaderNotReadyException after leader switch if transactions were pending in the double buffer, preventing lastNotifiedTermIndex update. (HDDS-10798)
- Fixed various HTTP server components (Recon, SCM, OM, DN) failing to start if configured with a wildcard Kerberos principal (*) due to missing kerb-core dependency. (HDDS-10803)
- Fixed S3 setBucketAcl causing UnsupportedOperationException due to attempting to modify an immutable list returned by OzoneVolume.getAcls(). (HDDS-11737)
- Fixed SCM leadership metric (SCMLeader) potentially being reset to null by HTTP server initialization after the Raft server has already determined leadership. (HDDS-11742)
- Fixed SnapshotDiffManager logging NativeLibraryNotLoadedException as ERROR even when native tools are optional; changed to WARN. (HDDS-11486)
- Fixed potential NullPointerException when checking container balancer status (ozone admin containerbalancer status) if balancer is started but not fully initialized (e.g., waiting for DU info). (HDDS-11350)
- Fixed ozone fs -rm -r prompt for volume deletion suggesting incorrect ozone sh volume delete options (-skipTrash, -id). (HDDS-11346)
- Fixed ozone sh key list -h showing duplicate options (--all, --length) due to picocli version issue (reverted). (HDDS-11446) -> Reverted picocli upgrade.
- Fixed S3 CompleteMultipartUpload returning 500 Internal Server Error instead of S3-compliant InvalidRequest error when no parts are specified in the request body. (HDDS-11457)
- Fixed multiple IOzoneAuthorizer instances potentially being created and leaked if Ratis snapshot installation fails repeatedly after stopping the metadata manager. (HDDS-11472)
- Fixed ozone sh volume delete command line parsing error for -r option. (HDDS-11535) -> Resolved as part of HDDS-11346 fix.
- Fixed NullPointerException in OM when overwriting an empty file using multipart upload in FSO buckets (versioning disabled). (HDDS-12131)
- Fixed Replication Manager (hdds.scm.replication.thread.interval) interval configuration description to correctly state milliseconds instead of seconds. (HDDS-12144) -> Resolved by removing unsupported types.
- Fixed Grafana dashboard for Chunk read/write rates using incorrect interval variable ($__interval instead of $__rate_interval). (HDDS-12112)
- Fixed Replication Manager potentially expiring pending container deletes incorrectly instead of retrying them if the Datanode doesn’t confirm deletion within the deadline. (HDDS-12127)
- Fixed Replication Manager non-deterministically selecting replicas for deletion if preferred target nodes are overloaded, potentially deleting required replicas. (HDDS-12115)
- Fixed delete container commands potentially running indefinitely or past their deadline due to long lock waits or slow disk I/O; added lock timeout and moved ICR earlier. (HDDS-12114)
- Fixed Recon UI potentially switching from old UI to new UI automatically upon page refresh. (HDDS-12084)
- Fixed missing local refresh button in new Recon UI’s Disk Usage page to reload data for the current path without navigating back to root. (HDDS-12085)
- Fixed unnecessary parameters “Source Volume” & “Source Bucket” appearing in the metadata table for non-link buckets in the new Recon UI Disk Usage page. (HDDS-12073)
- Fixed Recon API endpoints /api/v1/volumes and /api/v1/buckets missing from Swagger documentation. (HDDS-11300)
- Fixed potential NullPointerException in Recon /api/v1/volumes and /api/v1/buckets endpoints if accessed before Recon tables are fully initialized after startup. (HDDS-11349)
- Fixed ozone freon cr (closed container replication) command failing with NPE due to metrics map lookup failure in ReplicationSupervisor. (HDDS-12040)
- Fixed incorrect display of Ozone Service ID name in Recon UI (New UI showed “OM ID”, Old UI showed “OM Service”). Corrected to “Ozone Service ID”. (HDDS-12049)
- Fixed difference in Cluster Capacity % calculation (floor vs round) and Container Pre-Allocated Size display (committed vs 0) between new and old Recon UI. (HDDS-12042)
- Fixed long path names wrapping to the next line in the new Recon UI Disk Usage page; made it scrollable instead. (HDDS-11957)
- Fixed Recon failing to update version_number in the RECON_SCHEMA_VERSION table (always -1), causing upgrade actions to run unnecessarily on fresh installs. (HDDS-11846)
- Fixed serialization error (Conflicting/ambiguous property name) in Recon’s listKeys API due to Jackson ambiguity between the key field and the isKey() getter. Renamed the getter. (HDDS-11848)
- Fixed potential deadlock in OM between DoubleBuffer flush thread (waiting for DeletedTable lock during snapshot checkpoint) and KeyDeletingService (holding DeletedTable lock, waiting for Ratis future). (HDDS-11124)
- Fixed OM crashing with IOException: Rocks Database is closed during SnapshotMoveDeletedKeys request processing if the snapshot was purged concurrently. (HDDS-11152)
- Fixed containers potentially stuck in DELETING state after upgrade if they were affected by HDDS-8129 (incorrect block counts) and datanodes rejected delete commands due to negative counts. Added recovery logic. (HDDS-11136)
- Fixed fs -mkdir incorrectly creating directories in OBS buckets (bypassing layout validation added in HDDS-11235). Reverted the optimization for mkdir. (HDDS-11348)
- Fixed Datanode potentially failing heartbeats or other operations due to deadlock on the StateContext#pipelineActions map under high load. Replaced with a concurrent map. (HDDS-11331)
- Fixed S3 gateway returning 403 Forbidden instead of 302 Redirect for root path (/) requests containing an Authorization: Negotiate header (used by newer curl versions). (HDDS-11096)
- Fixed DELETE_TENANT request logging an unnecessary and uninformative UPDATE_VOLUME audit entry, even on failure. (HDDS-11119)
- Fixed intermittent timeout in TestBlockDeletion.testBlockDeletion potentially caused by race conditions or slow command processing. (HDDS-9962)
- Fixed “Bad file descriptor” error in TestOmSnapshotFsoWithNativeLib.testSnapshotCompactionDag when using native RocksDB tools library. (HDDS-10149)
- Fixed ManagedStatistics objects not being closed properly in OM and DN when RocksDB statistics are enabled, leading to resource leaks. (HDDS-10184)
- Fixed race condition in RocksDatabase where a close operation could occur between the assertClose() check and the database operation, causing a JVM crash. (HDDS-9527)
- Fixed ReplicationManager metrics not being re-registered after RM restart via CLI (stop/start), causing metrics to stop reporting. (HDDS-9235)
- Fixed infinite loop in WritableRatisContainerProvider.getContainer if SCM restarts and existing pipeline nodes are not found (e.g., DNs stopped), causing log flooding. (HDDS-8982)
- Fixed SCM follower nodes logging NotLeaderException errors when processing Pipeline Reports, which is expected behavior for followers. Suppressed error logging. (HDDS-11695)
- Fixed FileNotFoundException: ... (Too many open files) and subsequent DN crashes during OM+DN decommission under heavy Freon load, likely due to excessive open file handles; addressed potential causes. (HDDS-11391)
- Fixed Recon showing incorrect (zero) count for DELETED containers in cluster state summary API (/api/v1/clusterState). (HDDS-11389)
- Fixed issue where OM could fail if IOzoneAuthorizer (e.g., Ranger plugin) fails to initialize during snapshot installation and reload attempts create multiple instances, leading to heap exhaustion. (HDDS-11472)
- Fixed NoSuchUpload error when aborting multipart uploads for keys where the parent directory was missing (potentially due to FSO-related bugs or cleanup issues). (HDDS-11784)
- Fixed secure acceptance tests failing on arm64 due to keytab checksum mismatch when using keytabs generated on amd64. Regenerated keytabs for multi-arch compatibility. (HDDS-11810)
- Fixed race condition in datanode VERSION file creation where multiple threads could attempt to write using the same temporary file via AtomicFileOutputStream. (HDDS-12608)
- Fixed SCM logging an error when updating sequence ID for a CLOSED container based on a replica report with higher BCSID; changed log level and added context. (HDDS-12409)
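The RocksDatabase race listed above (HDDS-9527) is a check-then-act problem: a "not closed" check passes, the handle is closed by another thread, and the operation then runs against a closed handle. The following is a minimal sketch of one common way to avoid that pattern, using a read-write lock so that close cannot interleave between the check and the operation. GuardedHandle and its methods are hypothetical names for illustration only; this is not Ozone code and not necessarily how HDDS-9527 was fixed.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Supplier;

/** Hypothetical illustration of a close-safe handle; not an Ozone class. */
final class GuardedHandle {
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  private boolean closed; // guarded by lock

  /**
   * Runs op while holding the read lock. close() needs the write lock,
   * so it cannot run between the closed check and the operation.
   */
  <T> T read(Supplier<T> op) {
    lock.readLock().lock();
    try {
      if (closed) {
        throw new IllegalStateException("handle is closed");
      }
      return op.get();
    } finally {
      lock.readLock().unlock();
    }
  }

  /** Waits for in-flight reads to finish, then marks the handle closed. */
  void close() {
    lock.writeLock().lock();
    try {
      closed = true; // a real handle would release native resources here
    } finally {
      lock.writeLock().unlock();
    }
  }

  boolean isClosed() {
    lock.readLock().lock();
    try {
      return closed;
    } finally {
      lock.readLock().unlock();
    }
  }
}
```

Many readers can hold the read lock at once, so concurrent operations are not serialized; only close() is exclusive.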
Security
- Disabled REST endpoint for S3 secret manipulation by username for non-admin users via S3Gateway Secret REST endpoint. (HDDS-11040)
For more details, check out the Apache Ozone 2.0.0 JIRA list.
This is a generally available (GA) release.
It represents a point of API stability and quality that we consider production-ready.
Generated-by: Google AI Studio + Gemini 2.5 Pro Preview 03-25, with input data from a filtered JIRA list using this prompt.
Image credit: Indiana Dunes National Lakeshore, Michigan City, Indiana, USA by Diego Delso, CC-BY-SA 3.0 / Text added to original