Release 2.0.0 available

Indiana-Dunes-haiku

Release Notes

Apache Ozone 2.0.0 adds 1708 new features, improvements and bug fixes on top of Ozone 1.4.


  • HDDS-10295 | Major | Provide an “ozone repair” subcommand to update the snapshot info in transactionInfoTable

A new command “ozone repair update-transaction” is added to update the highest index in OM transactionInfoTable.


  • HDDS-11258 | Blocker | [hsync] Add new OM layout version

A new layout version, HBASE_SUPPORT (7), is added to Ozone Manager. It provides the guardrail for full support of the hsync, lease recovery and listOpenFiles APIs for HBase.


  • HDDS-11227 | Major | Use OM’s KMS from client side when connecting to a cluster and dealing with encrypted data

Ozone clients can now interact with multiple encrypted Ozone clusters. This improvement enables distcp to copy from one encrypted source Ozone cluster to another encrypted destination Ozone cluster.


  • HDDS-11375 | Major | DN Startup fails with Illegal configuration

Removed the predefined hdds.ratis.raft.grpc.message.size. Its default value is now determined by hdds.container.ratis.log.appender.queue.byte-limit + 1MB = 33MB.
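The arithmetic behind that default can be sketched as follows. This is a minimal illustration, not Ozone code: the function name is hypothetical, and the 32MB byte limit is an assumption inferred only from the 33MB result stated above.

```python
MB = 1024 * 1024  # one megabyte, in bytes


def derived_grpc_message_size(log_appender_queue_byte_limit: int) -> int:
    """Return the derived message size: the byte limit plus 1MB."""
    return log_appender_queue_byte_limit + MB


# Assumption: a 32MB byte limit, inferred from the 33MB figure in the note.
byte_limit = 32 * MB
print(derived_grpc_message_size(byte_limit) // MB)  # prints 33
```

If the byte limit is tuned to another value, the derived size follows it by the same 1MB offset.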


  • HDDS-11342 | Major | [hsync] Add a config as HBase-related features master switch

It is now required to toggle an extra config switch to allow HBase-related enhancements to be enabled.

Server-side (OM): Set ozone.hbase.enhancements.allowed to true. Client-side: Set ozone.client.hbase.enhancements.allowed to true.

For more details, see the description of each config.
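As a rough sketch, the two switches could be written out as Hadoop-style property fragments, for example for ozone-site.xml. The helper name is hypothetical and the target file is an assumption; only the two property names and the value true come from the note above.

```python
from xml.sax.saxutils import escape


def site_xml(props: dict) -> str:
    """Render name/value pairs as a Hadoop-style <configuration> fragment."""
    lines = ["<configuration>"]
    for name, value in props.items():
        lines += [
            "  <property>",
            f"    <name>{escape(name)}</name>",
            f"    <value>{escape(value)}</value>",
            "  </property>",
        ]
    lines.append("</configuration>")
    return "\n".join(lines)


# Server side (OM) and client side each need their own switch.
om_side = site_xml({"ozone.hbase.enhancements.allowed": "true"})
client_side = site_xml({"ozone.client.hbase.enhancements.allowed": "true"})

print(om_side)
print(client_side)
```

Both fragments are needed: setting only one side leaves the HBase-related enhancements disabled on the other.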


  • HDDS-7593 | Major | Supporting HSync and lease recovery

Ozone 2.0 adds support for the output stream hsync/hflush APIs. In addition, support for the lease recovery (recoverLease()) and setSafeMode() file system APIs is added.


  • HDDS-11329 | Major | Update Ozone images to Rocky Linux-based runner

Provides a Rocky Linux-based convenience Ozone Docker image.


  • HDDS-11705 | Critical | Snapshot operations on linked buckets should work on actual underlying bucket

Ozone did not support snapshots on linked buckets before this release. However, a user could have inadvertently created snapshots on linked buckets. When upgrading from a version that does not support snapshots on linked buckets to one that does, make sure that no snapshots exist on linked buckets; otherwise they will linger. Any such snapshots must be deleted with the snapshot delete command:

ozone sh snapshot delete <vol>/<linked bucket name> <snapshot name>


Ozone’s Hadoop dependency version was updated from 3.3.6 to 3.4.1.


  • HDDS-8101 | Major | Add FSO repair tool to ozone CLI in read-only and repair modes

Added a new command “ozone repair om fso-tree” to detect and repair broken FSO trees caused by bugs such as HDDS-7592, which can orphan data in the OM.

Usage: ozone repair om fso-tree --db <dbPath> [--repair | --r] [--volume | -v <volName>] [--bucket | -b <bucketName>] [--verbose]
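A small sketch of assembling the two invocation modes from that usage line is shown below. It assumes the dashes in the usage line are double hyphens and that omitting the repair flag leaves the tool in read-only mode (suggested by the issue title, not stated explicitly). The helper name and the example path, volume and bucket are hypothetical. The sketch only builds the argument list; it does not run anything.

```python
import shlex


def fso_tree_command(db_path, repair=False, volume=None, bucket=None, verbose=False):
    """Build the argv list for 'ozone repair om fso-tree' from the documented options."""
    argv = ["ozone", "repair", "om", "fso-tree", "--db", db_path]
    if repair:
        argv.append("--repair")  # without this flag, assumed to be read-only
    if volume:
        argv += ["--volume", volume]
    if bucket:
        argv += ["--bucket", bucket]
    if verbose:
        argv.append("--verbose")
    return argv


# Hypothetical path and names, for illustration only.
check_only = fso_tree_command("/path/to/om.db", verbose=True)
with_repair = fso_tree_command("/path/to/om.db", repair=True, volume="vol1", bucket="bucket1")

print(shlex.join(check_only))
print(shlex.join(with_repair))
```

Running the check-only form first and reviewing its report before adding the repair flag is the cautious order of operations.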


  • HDDS-7852 | Major | SCM Decommissioning Support

A Storage Container Manager can now be decommissioned from a set of SCM nodes. See the user documentation for usage and more details: https://ozone.apache.org/docs/edge/feature/decommission.html


  • HDDS-11753 | Blocker | Deprecate file per chunk layout from datanode code

FILE_PER_CHUNK container layout (ozone.scm.container.layout) is deprecated. Starting from Apache Ozone 2.0, users will not be able to create new FILE_PER_CHUNK containers.

The support will be removed in a future release.


  • HDDS-12488 | Major | S3G should handle the signature calculation with trailers

AWS Java SDK V2 2.30.0 introduced an incompatible protocol change that caused file uploads to Ozone S3 Gateway to fail or to silently append trailer data. S3G is now updated to support AWS Java SDK V2 2.30.0 and later.


  • HDDS-12327 | Blocker | Restore non-HA (to HA) upgrade test

Upgrading a non-HA 1.4.1 cluster to 2.0.0 (in a non-rolling fashion) is tested.


  • HDDS-11754 | Blocker | Drop support for non-Ratis OM and SCM

Ozone Manager and Storage Container Manager will always run in HA (Ratis) mode. Clusters upgrading from non-Ratis (Standalone) mode will automatically run in single node HA (Ratis) mode.


  • HDDS-12750 | Major | Move StorageTypeProto from ScmServerDatanodeHeartbeatProtocol.proto to hdds.proto

Moved StorageTypeProto from OmClientProtocol.proto and ScmServerDatanodeHeartbeatProtocol.proto to hdds.proto. As a result, the java_outer_classname changed from OzoneManagerProtocolProtos and StorageContainerDatanodeProtocolProtos to HddsProtos.

This change is:

  • Proto wire/binary format: compatible (unchanged)
  • Proto text format: compatible (unchanged)
  • Java API: incompatible (changed java_outer_classname)

  • HDDS-9218 | Major | S3 secret management through HTTP

A set of S3 REST API endpoints is available to manage S3 secrets: /secret for getting a secret and /revoke for revoking an existing secret. For more details, see the Securing S3 user document: https://ozone.apache.org/docs/edge/security/securings3.html


Changelog

The format is based on Keep a Changelog,
and this project adheres to Semantic Versioning.

[2.0.0] - 2025-04-04

Added

  • New pipeline choosing policy: CapacityPipelineChoosePolicy. This policy randomly chooses pipelines with relatively lower utilization. To use, configure hdds.scm.pipeline.choose.policy.impl to org.apache.hadoop.hdds.scm.pipeline.choose.algorithms.CapacityPipelineChoosePolicy. (HDDS-9345)
  • APIs to fetch single datanode specific information, reducing data transfer from server to client. (HDDS-9648)
  • Support for symmetric keys for delegation tokens. (HDDS-8829)
  • A Storage Container Manager (SCM) can now be decommissioned from a set of SCM nodes. (HDDS-7852)
  • Option to close all pipelines via CLI (ozone admin pipeline close --all). (HDDS-10742)
  • Metrics to monitor bucket state including usage, quota, and available space. (HDDS-10476)
  • Unit tests and documentation for creating keys/files with EC replication config using ofs/o3fs. (HDDS-10553)
  • Support for passing Kerberos credentials in GrpcOmTransport. (HDDS-11041)
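
The CapacityPipelineChoosePolicy entry above describes picking pipelines randomly with a bias toward lower utilization. One common way to get that behavior is "two random choices": sample two candidates and keep the less utilized one. The sketch below illustrates that idea only. It is not the Ozone implementation, and the function name, pipeline IDs and utilization figures are hypothetical.

```python
import random


def choose_pipeline(utilization, rng=None):
    """Pick a pipeline ID, favoring lower utilization via two random choices.

    utilization maps a pipeline ID to its used fraction (0.0 to 1.0).
    """
    rng = rng or random
    ids = list(utilization)
    if not ids:
        raise ValueError("no pipelines to choose from")
    if len(ids) == 1:
        return ids[0]
    first, second = rng.sample(ids, 2)  # two distinct random candidates
    return first if utilization[first] <= utilization[second] else second


# Hypothetical pipelines and utilization values.
usage = {"pipeline-a": 0.90, "pipeline-b": 0.20, "pipeline-c": 0.50}
print(choose_pipeline(usage, random.Random(42)))
```

With this scheme the most utilized pipeline always loses its comparison, while the choice among the others stays randomized, which spreads load without always hitting the single emptiest pipeline.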

Changed

  • S3 Gateway endpoints for static content and admin purposes (/prom, /logs, etc.) are now served on a separate port (default: 19878). Config keys are under ozone.s3g.webadmin. (HDDS-7307)
  • Improved logging for container not found in CloseContainerCommandHandler to INFO level. (HDDS-9958)
  • Container scanner (hdds.datanode.container.scrub.enabled) is now enabled by default. (HDDS-10485)
  • Upgraded jgrapht to 1.4.0. (HDDS-10503)
  • Bumped follow-redirects to 1.15.6 in Ozone Recon. (HDDS-10526)
  • Bumped axios to 0.28.0 in Ozone Recon. (HDDS-10669)
  • Bumped es5-ext to 0.10.64 in Ozone Recon. (HDDS-10673)
  • Bumped ip to 1.1.9 in Ozone Recon. (HDDS-10674)
  • Bumped browserify-sign to 4.2.3 in Ozone Recon. (HDDS-10676)
  • Bumped plotly.js to 2.25.2 in Ozone Recon. (HDDS-10677)
  • Replaced ConcurrentHashMap with HashMap protected by ReadWriteLock in NodeStateMap for potential performance improvement. (HDDS-10830)
  • Replaced ConcurrentHashMap with HashMap in PipelineStateMap as access is already protected by locks. (HDDS-10971)
  • Bumped express to 4.21.0 in Ozone Recon. (HDDS-11460)
  • Bumped vite to 4.5.5 in Ozone Recon. (HDDS-11467)
  • Improved array handling efficiency, avoiding legacy conversions and double conversions. (HDDS-11544)
  • Extracted common Kubernetes definitions for HttpFS and Recon from getting-started example. (HDDS-11845)
  • Reverted workaround added by HDDS-8715 for thread renaming, as the underlying Hadoop issue HDFS-13566 is fixed in the current Hadoop version. (HDDS-12470)
  • Migrated Ozone Recon UI build process from react-scripts/Jest to Vite/vitest. (HDDS-11017)
  • Added wrapper methods for getting/setting port details (Standalone, Ratis, Rest) in DatanodeDetails, replacing direct usage. (HDDS-117)
  • Refactored OMRequest building in TrashOzoneFileSystem to reduce code duplication. (HDDS-6796)
  • Switched chunk file reading in Datanode to use Netty’s ChunkedNioFile for potential performance improvement. (HDDS-7188)
  • Improved multipart upload part ETag generation to use MD5 hash of content for consistency. (HDDS-9680)
  • Pipeline failure now triggers an immediate heartbeat to SCM to minimize client impact. (HDDS-9823)
  • Improved performance of processing IncrementalContainerReport requests from DN in Recon by batching SCM lookups and reducing client timeouts. (HDDS-9883)
  • Changed Recon datanode ‘Last Heartbeat’ display to show relative time values (e.g., “2s ago”) instead of absolute timestamps. (HDDS-9933)
  • SCM UI now shows cluster storage usage percentage in addition to absolute values. (HDDS-9988)
  • Added functionality to freon OmMetadataGenerator (ommg) Test. (HDDS-10025)
  • Improved logs for SCMDeletedBlockTransactionStatusManager. (HDDS-10029)
  • OzoneManagerRatisServer.getServer() now returns the specific Ratis Division for the group. (HDDS-10036)
  • Reduced buffer copying in OMRatisHelper by using ByteBuffer. (HDDS-10037)
  • Consolidated and added tests for the Ratis write path for prefix ACL operations. (HDDS-10066)
  • Refined SCM start-up logs for clarity and reduced noise (removed duplicate balancer config, reduced cert info verbosity). (HDDS-10271)
  • Removed unnecessary sorting when excluding Datanodes during Ratis Pipeline Creation based on pipeline limits. (HDDS-10345)
  • CopyObjectResponse ETag is now based on the content hash of the copied key, consistent with PutObject. (HDDS-10403)
  • Avoided unnecessary creation of ChunkInfo objects in container-service code by directly accessing proto fields. (HDDS-10410)
  • Prefix ACL checks now correctly resolve bucket links. (HDDS-10412)
  • Refined audit logging for bucket property update operations to include quota and replication details. (HDDS-10460)
  • Implemented logic to fail Datanode decommission early if the cluster doesn’t have enough nodes to maintain replication requirements. (HDDS-10462)
  • Refined audit logging for bucket creation to include quota, owner, and replication details. (HDDS-10475)
  • Standardized byte array to String conversion for RocksDB LiveFileMetaData using UTF-8 and StringUtils.bytes2String, removing BouncyCastle dependency. (HDDS-10744)
  • Added the ozone admin find-ec-missing-padding-blocks tool to detect keys affected by missing EC padding blocks (HDDS-10681). (HDDS-10751)
  • Improved logging for signature verification failures in OzoneDelegationTokenSecretManager to aid debugging. (HDDS-10802)
  • Implemented getHomeDirectory in OzoneFileSystem implementations to correctly return /user/<ugi user> in secure clusters, respecting impersonation. (HDDS-10905)
  • Reduced client watch requests by using CommitInfoProto from NotReplicatedException (requires Ratis 3.1.0+ and config tuning). (HDDS-10932)
  • Added Netty off-heap memory usage metrics to OM and SCM for better monitoring. (HDDS-11100)
  • Enhanced ozone admin containerbalancer status output with richer information including start time, parameters, progress details, and involved datanodes using -v or --verbose. (HDDS-11120)
  • Improved SCM WebUI display: formatted JVM properties, added DN version/UUID to list, formatted SCM HA info as a list. (HDDS-11196)
  • Added statistical indicators (min, max, median, stdev) for DataNode storage usage to SCM UI/metrics. (HDDS-11206)
  • Added statistics for Capacity, ScmUsed, Remaining, NonScmUsed storage space indicators. (HDDS-11252)
  • Improved CLI display for OM/SCM roles with a --table option. (HDDS-11268)
  • Added statistics for node status counts (Healthy, Dead, Decommissioning, EnteringMaintenance). (HDDS-11272)
  • Allowed disabling OM version-specific features via internal config (e.g., atomic rewrite key). (HDDS-11378)
  • Introduced schema versioning for Recon DB to handle upgrades and distinguish schema changes. (HDDS-11465)
  • Added statistics for Pipeline and Container counts/states to SCM UI/metrics. (HDDS-11469)
  • Improved --duration option handling in freon tests (ombg, ommg) for consistency with -n limit. (HDDS-11494)
  • Made SCMDBDefinition a singleton to reflect its immutability. (HDDS-11555)
  • Simplified DBColumnFamilyDefinition by removing redundant keyType/valueType fields (relying on Codec). (HDDS-11557)
  • Made ReconSCMDBDefinition a singleton. (HDDS-11589)
  • Clarified OM Ratis configuration change log message to avoid confusion about peer roles. (HDDS-11623)
  • Optimized OmUtils.normalizeKey to check isDebugEnabled before performing string comparison. (HDDS-11669)
  • Enhanced Recon metrics for background task status (lastRunStatus, currentTaskStatus) and queue monitoring. (HDDS-11680)
  • Implemented OM-side filtering for ranged GET requests for specific MPU parts to reduce network overhead. (HDDS-11699)
  • Refactored S3 request unmarshalling logic to reduce code duplication. (HDDS-11739)
  • Improved efficiency of BufferUtils.writeFully for ByteBuffer[] using GatheringByteChannel. (HDDS-11860)
  • The ozonefs-hadoop3-client jar may optionally be relocated to a different classpath prefix by specifying the Maven property proto.shaded.prefix. (HDDS-12116)
  • Changed default Replication Manager command deadline to 12 minutes (SCM) and Datanode offset to 6 minutes. (HDDS-12135)
  • Improved error messages in Ozone CLI for FileSystemExceptions (e.g., NoSuchFileException, AccessDeniedException) when not in verbose mode. (HDDS-12241)
  • Returned explicit QUOTA_EXCEEDED S3 error code instead of a generic 500 internal error. (HDDS-12329)
  • Optimized listMultipartUploads by removing duplicate key scanning in OmMetadataManagerImpl. (HDDS-12371)
  • Changed ContainerID to be a value-based class, enforcing factory methods and improving efficiency with cached proto/hash. (HDDS-12541)
  • Combined containerMap and replicaMap in SCM’s ContainerStateMap into a single map for simplicity and efficiency. (HDDS-12555)
  • Moved StorageTypeProto enum from OM/SCM specific proto files to the common hdds.proto. This is a Java API incompatible change for internal protocols but wire compatible. (HDDS-12750)
  • Added configuration (ozone.client.ratis.watch.type) to tune the replication level (ALL_COMMITTED or MAJORITY_COMMITTED) for client watch requests. (HDDS-2887)
  • SCM StateMachine now uses Ratis notifyLeaderReady API instead of relying solely on notifyTermIndexUpdated. (HDDS-10690)
  • Refactored OM request validateAndUpdateCache methods to pass ExecutionContext instead of just TermIndex. (HDDS-11975)
  • Reduced unnecessary object creation (RunningDatanodeState, EndpointTasks) during Datanode heartbeat processing when state is RUNNING. (HDDS-11083)
  • Improved replication metrics consistency across Datanode commands handled by ReplicationSupervisor and those handled directly. (HDDS-11376)
  • Improved logging in Container Balancer’s AbstractFindTargetGreedy to detail why potential targets are excluded. (HDDS-10198)
  • Refined ozone admin containerbalancer status output for better readability and detail, including time consumption and data units (MB/GB). (HDDS-11367)
  • Added Pipeline count to ozone admin datanode usageinfo output. (HDDS-11357)
  • Removed redundant CommandHandler thread pool size methods (already covered by ReplicationSupervisor metrics). (HDDS-11304)
  • Replaced clusterId parameter in KeyValueHandler with initialization via setClusterId to prevent potential NPE during concurrent container creation under high load. (HDDS-11396)
  • Added ozone.om.ratis.leader.election.minimum.timeout.duration.key config to OM RaftProperties for leader election timeout. (HDDS-10761)
  • Added configuration (ozone.om.rocksdb.max_open_files) to set RocksDB max_open_files option for OM DB. (HDDS-11191)
  • Standardized Datanode command metrics tracking across ReplicationSupervisor and direct command handlers. (HDDS-11444)
  • Optimized Recon List Keys API by reusing calculated path prefix for consecutive keys with the same parent ID. (HDDS-11668)
  • Optimized Recon List Keys API response generation by reducing object creation (avoiding OmKeyInfo) and memory buffering. (HDDS-11660)
  • Optimized Recon List Keys API filtering logic by replacing predicate lambdas with simple IF statements for performance. (HDDS-11649)
  • Added foundational schema upgrade action (InitialConstraintUpgradeAction) for Recon to handle constraints on existing tables (e.g., Unhealthy Containers) upon first upgrade to schema versioning. (HDDS-11615)
  • Added Ozone wrapper configurations (ozone.scm.ipc.server.read.threadpool.size, ozone.hdds.datanode.ipc.server.read.threadpool.size) to increase ipc.server.read.threadpool.size for SCM and Datanode RPC servers (default 10). (HDDS-11302)
  • Refactored ContainerStateMap to restrict ContainerAttribute generic type T to Enum, removing unused ownerMap/repConfigMap. (HDDS-12532)
  • Refactored ContainerStateManager interface to remove redundant ContainerID parameters when ContainerReplica (which contains the ID) is already passed. (HDDS-12572)
  • Refactored DB/Table classes to use the DB name as the thread name prefix implicitly, removing the explicit parameter. (HDDS-12590)
  • Included ContainerInfo within ContainerAttribute to avoid extra map lookups in ContainerStateManager methods. (HDDS-12591)
  • Enabled custom ValueCodec for TypedTable to allow performance optimizations like partial deserialization (e.g., OmKeyInfo without ACLs/locations). (HDDS-12582)
  • Made ozone admin scm safemode --verbose show rule status even when SCM is not in safe mode. (HDDS-12548)
  • Addressed thread safety issue in BlockOutputStream#failedServers by using a concurrent collection. (HDDS-12331)
  • Added DatanodeID validation for incoming ContainerCommandRequests and on Ratis group joins to prevent operations on incorrect nodes. (HDDS-11667)
  • Persisted the list of container IDs created on a Datanode to prevent recreation after volume failures, ensuring consistency for both Ratis and EC containers. (HDDS-11650)
  • Added check for rocks_tools native library in ozone checknative CLI command output. (HDDS-11347)
  • Added Ozone cluster growth rate metric (based on scm_node_manager_total_used rate) to Grafana dashboard using PromQL. (HDDS-12168)
  • Added robust error handling for Recon OM background tasks (e.g., NSSummary) to prevent data inconsistencies if Recon crashes during partial event processing. (HDDS-12062)
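
The multipart upload entry above (HDDS-9680) bases a part's ETag on the MD5 hash of its content. A minimal sketch of that computation follows. The function name is hypothetical, and the lowercase hex encoding is an assumption based on common S3 convention rather than something stated in these notes.

```python
import hashlib


def part_etag(content: bytes) -> str:
    """Return the MD5 digest of the part content as a lowercase hex string."""
    return hashlib.md5(content).hexdigest()


print(part_etag(b"hello"))  # prints 5d41402abc4b2a76b9719d911017c592
```

Because the value depends only on the bytes, uploading the same part twice yields the same ETag, which is what makes it usable for consistency checks.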

Deprecated

  • LegacyReplicationManager (hdds.scm.replication.enable.legacy=true) is removed and no longer supported. (HDDS-11759)
  • FILE_PER_CHUNK container layout (ozone.scm.container.layout) is deprecated. New containers cannot be created with this layout. Support will be removed in a future release. (HDDS-11753)

Removed

  • Removed LegacyReplicationManager implementation and the hdds.scm.replication.enable.legacy config property. (HDDS-11759)
  • Removed unused resultCache and getMatchingContainerIDs method from ContainerStateMap. (HDDS-12445)

Fixed

  • Fixed admin-only API handling for TriggerDBSyncEndpoint in Recon. (HDDS-11436)
  • Fixed potential NullPointerException in OzoneManagerProtocolClientSideTranslatorPB.listStatusLight when startKey is null (e.g., via s3a). (HDDS-10367)
  • Addressed memory leak caused by ThreadLocal usage in OMClientRequest (OMLockDetails). (HDDS-10385)
  • Fixed Container Balancer incorrectly selecting containers with 0 or negative size for moving. (HDDS-10483)
  • Fixed inability to write files when Datanode chunk data validation (hdds.datanode.chunk.data.validation.check) is enabled due to buffer position issue. (HDDS-10547)
  • Fixed Recon startup failure (“used space cannot be negative”) by handling Datanode reports with negative used space gracefully. (HDDS-10614)
  • Fixed IOException: ParentKeyInfo ... is null in Recon Namespace Summary task by handling cases where parent info might be missing. (HDDS-10855)
  • Fixed EC Reconstruction failure (IllegalArgumentException: The chunk list has X entries, but the checksum chunks has Y entries) potentially caused by out-of-order EC stripe writes leading to inaccurate chunk lists. (HDDS-10985)
  • Fixed OM crash (SnapshotChainManager: Failure while loading snapshot chain) caused by SstFilteringService directly updating snapshot info DB entries, potentially corrupting the chain if OM restarts before DoubleBuffer flush. (HDDS-11068)
  • Resolved ClassCastException (RepeatedOmKeyInfo to OmKeyInfo) in Recon’s FileSizeCountTask due to improper event handling in OMDBUpdatesHandler for conflicting keys across tables (e.g., file and directory with the same name). (HDDS-11187)
  • Fixed ContainerSizeCountTask in Recon logging ERROR for negative-sized containers; reduced log level as these are ignored functionally. (HDDS-12227)
  • Fixed duplicate key violation in Recon’s FileSizeCountTask by correctly handling the isDbTruncated flag to allow updates instead of only inserts. (HDDS-12228)
  • Made OzoneClientException extend IOException. (HDDS-64)
  • Fixed various S3 gateway issues including multipart upload and other improvements. (HDDS-1186)
  • Fixed SCM Decommissioning issue causing InvalidStateTransitionException after recommissioning the same SCM node. (HDDS-9608)
  • Fixed Recon Disk Usage page UI issues with large numbers of keys/buckets/volumes (pie chart usability, axis ticks, path overflow). (HDDS-9626)
  • Fixed Ozone admin namespace CLI du command printing incorrect validation error messages for root ("/") or volume paths. (HDDS-9644)
  • Fixed Recon incorrectly including out-of-service (decommissioned, maintenance) nodes when checking container health status (over/under/mis-replication). (HDDS-9645)
  • Fixed potential NullPointerException in ContainerStateMap.ContainerAttribute due to race condition between update and get operations. (HDDS-9527)
  • Fixed Recon potentially showing duplicate DEAD datanodes after decommission/reformat/recommission cycles (HDDS-10409). Recon now only allows removing DEAD nodes. (HDDS-11032)
  • Fixed potential memory overflow in Recon’s Container Health Task due to unbounded list growth. (HDDS-9819)
  • Reduced Ozone client heap memory utilization during writes by using pooled direct buffers for chunks. (HDDS-9843)
  • Fixed Pipeline.nodesInOrder using ThreadLocal, making it inaccessible to other threads after being set. (HDDS-9848)
  • Switched KeyValueContainerCheck.verifyChecksum to use direct/mapped buffers instead of heap buffers. (HDDS-9941)
  • Fixed TokenRenewer implementations (O3FS, OFS) not closing the created OzoneClient. Removed duplicate implementation. (HDDS-9943)
  • Fixed NSSummaryAdmin CLI commands not closing created OzoneClient instances and creating multiple instances unnecessarily. (HDDS-9944)
  • Fixed incorrect synchronization in RatisSnapshotInfo, potentially leading to inconsistent term/index values. Class removed as redundant to TransactionInfo. (HDDS-9984)
  • Fixed Options and ReadOptions instances not being closed properly in rocksdb-checkpoint-differ. (HDDS-10001)
  • Renamed ManagedSstFileReader in rocksdb-checkpoint-differ to SstFileSetReader to avoid name collision with the class in hdds-managed-rocksdb. (HDDS-10007)
  • Fixed potential NullPointerException in VolumeInfoMetrics.getCommitted() if HddsVolume.committedBytes is null. (HDDS-10027)
  • Refined SCM RPC handler counts to be configurable per protocol (Client, Block, Datanode) instead of a single global count. (HDDS-10088)
  • Removed static dbNameToCfHandleMap from RocksDatabase, using non-static columnFamilies map instead. (HDDS-10107)
  • Fixed potential NullPointerException in OMDBCheckpointServlet lock acquisition when SstFilteringService is accessed before initialization. (HDDS-10138)
  • Enabled Zero-Copy reads during container replication for improved performance. (HDDS-10144)
  • Corrected metric names createOmResoonseLatencyNs and validateAndUpdateCacneLatencyNs in OMPerformanceMetrics. (HDDS-10162)
  • Fixed OmMetadataManagerImpl creating a new S3Batcher instance for each S3 secret operation instead of reusing one. (HDDS-10202)
  • Ensured atomic updates in StateContext#updateCommandStatus using computeIfPresent to prevent race conditions. (HDDS-10210)
  • Fixed Grafana dashboards: removed UID/hostnames, included secure/unsecure ports, corrected datastore count. (HDDS-10229)
  • Prevented V3 Schema DatanodeStore from creating container DBs in incorrect locations under certain initialization paths. (HDDS-10230)
  • Fixed ContainerStateManager finalizing OPEN containers without a healthy pipeline on follower SCMs; moved logic to leader-only path via Ratis. (HDDS-10231)
  • Improved JSON response for Deleted Directories and Open Keys Insight Endpoints in Recon for better clarity (using actual names instead of Object IDs). (HDDS-10241)
  • Fixed ContainerReport admin command showing incorrect values immediately after SCM restart before Replication Manager runs. (HDDS-10272)
  • Fixed pagination on the OM DB Insights page in Recon. (HDDS-10282)
  • Added support for direct ByteBuffers in Checksum calculations, using reflection for Java 9+ API while maintaining Java 8 compatibility. (HDDS-10288)
  • Fixed ECReconstructionCoordinator ignoring ozone-site.xml client configurations and using default OzoneClientConfig. (HDDS-10294)
  • Fixed potential orphan blocks during key overwrite operations, especially involving the deleted key table. (HDDS-10296)
  • Fixed KeyManagerImpl#listKeys path normalization to correctly handle OBS/LEGACY buckets when ozone.om.enable.filesystem.paths is true. (HDDS-10319)
  • Fixed metadata not being updated when overwriting existing keys via S3 PutObject. (HDDS-10324)
  • Fixed SetTimes API not working with linked buckets due to missing link resolution. (HDDS-10369)
  • Fixed Recon not handling pre-existing MISSING_EMPTY containers correctly (introduced in HDDS-9695), leaving them marked as missing indefinitely. (HDDS-10370)
  • Fixed S3 listParts incompatibility for keys created before HDDS-9680 (missing ETag metadata) and NPE when ETag is null. (HDDS-10395)
  • Restricted directory deletion in LEGACY buckets via ozone sh key delete; users must use ozone fs interface. (HDDS-10397)
  • Fixed ArrayIndexOutOfBoundsException when listing keys in OBS buckets via S3/s3a under certain conditions. (HDDS-10399)
  • Fixed ozone admin CLI having hard-coded INFO log level, ignoring environment/config settings. (HDDS-10405)
  • Fixed Datanode startup failure (“Illegal configuration: raft.grpc.message.size.max must be 1m larger than …”) when using latest Ratis due to default config mismatch. (HDDS-11375)
  • Fixed Datanode startup failure (“checksum size setting 1024 is not in expected format”) due to incorrect type validation for hdds.ratis.raft.server.snapshot.creation.gap. (HDDS-10423)
  • Fixed Grafana dashboard Prometheus endpoint configuration for Datanodes and added missing Recon endpoint. (HDDS-10433)
  • Fixed Datanode Maintenance failing early incorrectly (logic refined). (HDDS-10463)
  • Fixed OM potentially crashing or failing requests if the configured S3 secret storage (Vault) is unavailable. (HDDS-10469)
  • Fixed audit log for key creation missing EC replication config details (parity, chunk size, codec). (HDDS-10472)
  • Fixed potential NullPointerException in OmUtils.getAllOMHAAddresses if OM HA config keys are missing. (HDDS-10508)
  • Fixed S3 GetObject ETag header returning "null" for objects without an ETag, causing issues with AWS SDK validation. Now omits the header if ETag is missing. (HDDS-10521)
  • Fixed MessageDigest instance in S3 endpoint potentially not being reset after exceptions (e.g., client cancellation), leading to incorrect ETags on subsequent requests using the same thread. (HDDS-10587)
  • Fixed issue where client might attempt Ratis streaming for keys defaulted to EC replication if bucket replication isn’t explicitly set. (HDDS-10832)
  • Fixed freon read/mixed operations failing with “Key not found” if prefix is unspecified; stopped adding random prefix. Fixed misleading random prefix log in ommg. (HDDS-10845)
  • Fixed Ozone CLI not respecting default ozone.om.service.id when only one service ID is configured. (HDDS-10861)
  • Fixed ClosePipelineCommandHandler potentially causing GroupMismatchException by calling removeGroup before getting peer list for propagation. (HDDS-10875)
  • Fixed Recon ReconContainerManager potentially throwing DuplicatedPipelineIdException when checking/adding containers due to race conditions or stale data. (HDDS-10880)
  • Improved logging clarity in Recon’s ReconNodeManager regarding datanode finalization status checks during upgrades. (HDDS-10883)
  • Fixed OM startup failure in single-node Docker container due to Ratis group directory mismatch when using default service ID. (HDDS-10909)
  • Fixed Recon startup failing silently or logging incorrect errors in non-HA SCM scenarios due to inability to fetch SCM roles or snapshot. (HDDS-10937)
  • Fixed OM decommission config (ozone.om.decommissioned.nodes) not working without service ID suffix when only one OM service ID is configured. (HDDS-10942)
  • Fixed EC key read corruption potentially occurring if a container’s replica index on a DN mismatches the index expected by the client (e.g., after container move). Added validation. (HDDS-10983)
  • Fixed S3 gateway potentially throwing exceptions (javax.xml.xpath.XPathExpressionException) during concurrent XML parsing (e.g., CompleteMultipartUpload, DeleteObjects). (HDDS-10777)
  • Fixed NullPointerException in XceiverClientRatis.watchForCommit when updateCommitInfosMap encounters a new Datanode ID in the response after a previous timeout removed it from commitInfoMap. (HDDS-10780)
  • Fixed potential OMLeaderNotReadyException after leader switch if transactions were pending in the double buffer, preventing lastNotifiedTermIndex update. (HDDS-10798)
  • Fixed various HTTP server components (Recon, SCM, OM, DN) failing to start if configured with a wildcard Kerberos principal (*) due to missing kerb-core dependency. (HDDS-10803)
  • Fixed S3 setBucketAcl causing UnsupportedOperationException due to attempting to modify an immutable list returned by OzoneVolume.getAcls(). (HDDS-11737)
  • Fixed SCM leadership metric (SCMLeader) potentially being reset to null by HTTP server initialization after the Raft server has already determined leadership. (HDDS-11742)
  • Fixed SnapshotDiffManager logging NativeLibraryNotLoadedException as ERROR even when native tools are optional; changed to WARN. (HDDS-11486)
  • Fixed potential NullPointerException when checking container balancer status (ozone admin containerbalancer status) if balancer is started but not fully initialized (e.g., waiting for DU info). (HDDS-11350)
  • Fixed ozone fs -rm -r prompt for volume deletion suggesting incorrect ozone sh volume delete options (-skipTrash, -id). (HDDS-11346)
  • Fixed ozone sh key list -h showing duplicate options (--all, --length) due to a picocli version issue, by reverting the picocli upgrade. (HDDS-11446)
  • Fixed S3 CompleteMultipartUpload returning 500 Internal Server Error instead of S3-compliant InvalidRequest error when no parts are specified in the request body. (HDDS-11457)
  • Fixed multiple IOzoneAuthorizer instances potentially being created and leaked if Ratis snapshot installation fails repeatedly after stopping the metadata manager. (HDDS-11472)
  • Fixed ozone sh volume delete command line parsing error for the -r option; resolved as part of the HDDS-11346 fix. (HDDS-11535)
  • Fixed NullPointerException in OM when overwriting an empty file using multipart upload in FSO buckets (versioning disabled). (HDDS-12131)
  • Fixed the description of the Replication Manager interval configuration (hdds.scm.replication.thread.interval) to correctly state milliseconds instead of seconds; resolved by removing unsupported types. (HDDS-12144)
  • Fixed Grafana dashboard for Chunk read/write rates using incorrect interval variable ($__interval instead of $__rate_interval). (HDDS-12112)
  • Fixed Replication Manager potentially expiring pending container deletes incorrectly instead of retrying them if the Datanode doesn’t confirm deletion within the deadline. (HDDS-12127)
  • Fixed Replication Manager non-deterministically selecting replicas for deletion if preferred target nodes are overloaded, potentially deleting required replicas. (HDDS-12115)
  • Fixed delete container commands potentially running indefinitely or past their deadline due to long lock waits or slow disk I/O; added lock timeout and moved ICR earlier. (HDDS-12114)
  • Fixed Recon UI potentially switching from old UI to new UI automatically upon page refresh. (HDDS-12084)
  • Fixed missing local refresh button in the new Recon UI’s Disk Usage page; the button reloads data for the current path without navigating back to root. (HDDS-12085)
  • Fixed unnecessary parameters “Source Volume” & “Source Bucket” appearing in the metadata table for non-link buckets in the new Recon UI Disk Usage page. (HDDS-12073)
  • Fixed Recon API endpoints /api/v1/volumes and /api/v1/buckets missing from Swagger documentation. (HDDS-11300)
  • Fixed potential NullPointerException in Recon /api/v1/volumes and /api/v1/buckets endpoints if accessed before Recon tables are fully initialized after startup. (HDDS-11349)
  • Fixed ozone freon cr (closed container replication) command failing with NPE due to metrics map lookup failure in ReplicationSupervisor. (HDDS-12040)
  • Fixed incorrect display of Ozone Service ID name in Recon UI (New UI showed “OM ID”, Old UI showed “OM Service”). Corrected to “Ozone Service ID”. (HDDS-12049)
  • Fixed difference in Cluster Capacity % calculation (floor vs round) and Container Pre-Allocated Size display (committed vs 0) between new and old Recon UI. (HDDS-12042)
  • Fixed long path names wrapping to the next line in the new Recon UI Disk Usage page; made it scrollable instead. (HDDS-11957)
  • Fixed Recon failing to update version_number in RECON_SCHEMA_VERSION table (always -1), causing upgrade actions to run unnecessarily on fresh installs. (HDDS-11846)
  • Fixed serialization error (Conflicting/ambiguous property name) in Recon’s listKeys API due to Jackson ambiguity between key field and isKey() getter. Renamed getter. (HDDS-11848)
  • Fixed potential deadlock in OM between DoubleBuffer flush thread (waiting for DeletedTable lock during snapshot checkpoint) and KeyDeletingService (holding DeletedTable lock, waiting for Ratis future). (HDDS-11124)
  • Fixed OM crashing with IOException: Rocks Database is closed during SnapshotMoveDeletedKeys request processing if the snapshot was purged concurrently. (HDDS-11152)
  • Fixed containers potentially stuck in DELETING state after upgrade if they were affected by HDDS-8129 (incorrect block counts) and datanodes rejected delete commands due to negative counts. Added recovery logic. (HDDS-11136)
  • Fixed fs -mkdir incorrectly creating directories in OBS buckets (bypassing layout validation added in HDDS-11235). Reverted the optimization for mkdir. (HDDS-11348)
  • Fixed Datanode potentially failing heartbeats or other operations due to deadlock on StateContext#pipelineActions map under high load. Replaced with concurrent map. (HDDS-11331)
  • Fixed S3 gateway returning 403 Forbidden instead of 302 Redirect for root path (/) requests containing Authorization: Negotiate header (used by newer curl versions). (HDDS-11096)
  • Fixed DELETE_TENANT request logging an unnecessary and uninformative UPDATE_VOLUME audit entry, even on failure. (HDDS-11119)
  • Fixed intermittent timeout in TestBlockDeletion.testBlockDeletion potentially caused by race conditions or slow command processing. (HDDS-9962)
  • Fixed “Bad file descriptor” error in TestOmSnapshotFsoWithNativeLib.testSnapshotCompactionDag when using native RocksDB tools library. (HDDS-10149)
  • Fixed ManagedStatistics objects not being closed properly in OM and DN when RocksDB statistics are enabled, leading to resource leaks. (HDDS-10184)
  • Fixed race condition in RocksDatabase where a close operation could occur between assertClose() check and database operation, causing JVM crash. (HDDS-9527)
  • Fixed ReplicationManager metrics not being re-registered after RM restart via CLI (stop/start), causing metrics to stop reporting. (HDDS-9235)
  • Fixed infinite loop in WritableRatisContainerProvider.getContainer if SCM restarts and existing pipeline nodes are not found (e.g., DNs stopped), causing log flooding. (HDDS-8982)
  • Fixed SCM follower nodes logging NotLeaderException errors when processing Pipeline Reports, which is expected behavior for followers. Suppressed error logging. (HDDS-11695)
  • Fixed FileNotFoundException: ... (Too many open files) and subsequent DN crashes during OM+DN decommission under heavy Freon load, likely due to excessive open file handles; potential causes were addressed. (HDDS-11391)
  • Fixed Recon showing incorrect (zero) count for DELETED containers in cluster state summary API (/api/v1/clusterState). (HDDS-11389)
  • Fixed OM failure when IOzoneAuthorizer (e.g., Ranger plugin) fails to initialize during snapshot installation and repeated reload attempts create multiple instances, leading to heap exhaustion. (HDDS-11472)
  • Fixed NoSuchUpload error when aborting multipart uploads for keys where the parent directory was missing (potentially due to FSO-related bugs or cleanup issues). (HDDS-11784)
  • Fixed secure acceptance tests failing on arm64 due to keytab checksum mismatch when using keytabs generated on amd64. Regenerated keytabs for multi-arch compatibility. (HDDS-11810)
  • Fixed race condition in datanode VERSION file creation where multiple threads could attempt to write using the same temporary file via AtomicFileOutputStream. (HDDS-12608)
  • Fixed SCM logging an error when updating sequence ID for a CLOSED container based on a replica report with higher BCSID; changed log level and added context. (HDDS-12409)
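
As an illustration of the failure mode described for HDDS-11737 above (modifying an immutable list raises UnsupportedOperationException), the following is a minimal Java sketch of the general pattern: copy a read-only list into a mutable one before changing it. The class AclCopyExample, its methods, and the ACL strings are hypothetical stand-ins for illustration only; they are not Ozone's actual code, and this is not necessarily how the fix was implemented.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class AclCopyExample {
    // Hypothetical stand-in for an API that hands back a read-only list,
    // similar in spirit to the immutable list mentioned in HDDS-11737.
    static List<String> getAcls() {
        List<String> acls = new ArrayList<>();
        acls.add("user:alice:rw");
        return Collections.unmodifiableList(acls);
    }

    // Safe pattern: copy into a mutable list, then modify the copy.
    static List<String> withAddedAcl(List<String> current, String acl) {
        List<String> copy = new ArrayList<>(current);
        copy.add(acl);
        return copy;
    }

    public static void main(String[] args) {
        List<String> acls = getAcls();
        try {
            // Modifying the read-only list directly is rejected.
            acls.add("user:bob:r");
        } catch (UnsupportedOperationException e) {
            System.out.println("direct modification rejected");
        }
        // Working on a copy succeeds and leaves the original untouched.
        List<String> updated = withAddedAcl(acls, "user:bob:r");
        System.out.println(updated);
    }
}
```

Running main prints "direct modification rejected" followed by [user:alice:rw, user:bob:r]; the list returned by getAcls() still holds a single entry.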

Security

  • Disabled S3 secret manipulation by username for non-admin users through the S3 Gateway secret REST endpoint. (HDDS-11040)

For more details, check out the Apache Ozone 2.0.0 JIRA list.

This is a generally available (GA) release. It represents a point of API stability and quality that we consider production-ready.

Generated-by: Google AI Studio + Gemini 2.5 Pro Preview 03-25, with input data from a filtered JIRA list using this prompt.

Image credit: Indiana Dunes National Lakeshore, Michigan City, Indiana, USA by Diego Delso, CC-BY-SA 3.0 / Text added to original