Hive

Apache Hive has supported Apache Ozone since Hive 4.0. To let Hive read and write Ozone paths, add the ozone-filesystem-hadoop3 JAR to the Hive classpath.

Supported Access Protocols

Hive supports the following protocols for accessing Ozone data:

  • ofs
  • o3fs
  • s3a
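
As a rough illustration of how the two Ozone-native schemes address the same directory, the statements below define one external table per scheme. The o3fs authority form (bucket.volume.om-service-id) is the usual Ozone convention; the table names t_ofs and t_o3fs are hypothetical, and the service ID ozone1, volume vol1, and bucket bucket1 are borrowed from the examples on this page.

```sql
-- ofs: rooted at the OM service; volume and bucket are path components.
CREATE EXTERNAL TABLE IF NOT EXISTS t_ofs (id INT, name STRING)
LOCATION 'ofs://ozone1/vol1/bucket1/table1';

-- o3fs: rooted at a single bucket; the authority is <bucket>.<volume>.<om-service-id>.
CREATE EXTERNAL TABLE IF NOT EXISTS t_o3fs (id INT, name STRING)
LOCATION 'o3fs://bucket1.vol1.ozone1/table1';
```

Both tables would read the same files, so this is only a way to compare the URI shapes, not a recommended layout.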

Supported Replication Types

Hive is compatible with Ozone buckets configured with either:

  • RATIS (Replication)
  • Erasure Coding

Accessing Ozone Data in Hive

Hive provides two methods to interact with data in Ozone:

  • Managed Tables
  • External Tables

Managed Tables

Configuring the Hive Warehouse Directory in Ozone

To store managed tables in Ozone, set the following property in the hive-site.xml configuration file:

<property>
  <name>hive.metastore.warehouse.dir</name>
  <value>ofs://ozone1/vol1/bucket1/warehouse/</value>
</property>

Creating a Managed Table

You can create a managed table with a standard CREATE TABLE statement:

CREATE TABLE myTable (
    id INT,
    name STRING
);
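
One way to check where the new table landed is to inspect its metadata. This is a minimal sketch, assuming the warehouse directory shown above is in effect; the Location field in the output would then be expected to fall under ofs://ozone1/vol1/bucket1/warehouse/.

```sql
-- Prints column, storage, and location details for the table.
-- Look for the "Location:" and "Table Type:" rows in the output.
DESCRIBE FORMATTED myTable;
```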

Loading Data into a Managed Table

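Data can be loaded into a Hive table from an Ozone location. Because LOCAL is not specified, Hive moves the source file into the table's directory rather than copying it: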

LOAD DATA INPATH 'ofs://ozone1/vol1/bucket1/table.csv' INTO TABLE myTable;
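
Hive's default text format does not split fields on commas, so a CSV file usually needs a table that declares its delimiter. The sketch below assumes table.csv holds comma-separated id,name rows with no header line; the table name csv_people is hypothetical.

```sql
-- Declare the field delimiter so each CSV column maps to a table column.
CREATE TABLE csv_people (
    id INT,
    name STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE;

-- Move the file from its Ozone location into the table's directory.
LOAD DATA INPATH 'ofs://ozone1/vol1/bucket1/table.csv' INTO TABLE csv_people;

-- Spot-check that the columns parsed as expected.
SELECT id, name FROM csv_people LIMIT 10;
```

If the deployment creates managed tables as transactional by default, the storage clause may need adjusting; that depends on site configuration not covered here.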

Specifying a Custom Ozone Path

You can define a custom Ozone path for a database using the MANAGEDLOCATION clause:

CREATE DATABASE d1 MANAGEDLOCATION 'ofs://ozone1/vol1/bucket1/data';

Managed tables created in the database d1 are stored under the specified path, ofs://ozone1/vol1/bucket1/data.
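
To see the database-level location applied to actual data, one option is to write a row and read back the file path through Hive's INPUT__FILE__NAME virtual column. The table name d1.sample and the inserted values are hypothetical; the reported path would be expected to begin with the MANAGEDLOCATION given above.

```sql
-- Create a managed table inside d1 and write one row to it.
CREATE TABLE d1.sample (
    id INT,
    name STRING
);
INSERT INTO d1.sample VALUES (1, 'example');

-- INPUT__FILE__NAME reports the file each row was read from.
SELECT INPUT__FILE__NAME, id, name FROM d1.sample;
```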

Verifying the Ozone Path

You can confirm that Hive references the correct Ozone path using:

SHOW CREATE DATABASE d1;

Output Example:

+----------------------------------------------------+
|                   createdb_stmt                    |
+----------------------------------------------------+
| CREATE DATABASE `d1`                               |
| LOCATION                                           |
|   'ofs://ozone1/vol1/bucket1/external/d1.db'       |
| MANAGEDLOCATION                                    |
|   'ofs://ozone1/vol1/bucket1/data'                 |
+----------------------------------------------------+

External Tables

Hive can create external tables to query data that already exists in Ozone.

Creating an External Table

CREATE EXTERNAL TABLE external_table (
    id INT,
    name STRING
)
LOCATION 'ofs://ozone1/vol1/bucket1/table1';

  • With external tables, the data is expected to be created and managed by another tool.
  • Hive queries the data as-is.
  • Note: Dropping an external table in Hive does not delete the associated data.
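
The drop behavior described above can be checked with a short sequence like the following. It is a sketch that assumes the table property external.table.purge has not been set to true on the table; the name external_table_again is hypothetical.

```sql
-- Removes only the metastore entry; files under the LOCATION stay in Ozone.
DROP TABLE external_table;

-- Re-attach a new table definition to the same, untouched directory.
CREATE EXTERNAL TABLE external_table_again (
    id INT,
    name STRING
)
LOCATION 'ofs://ozone1/vol1/bucket1/table1';

-- The rows written before the drop should still be readable.
SELECT COUNT(*) FROM external_table_again;
```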

To set a default path for external tables, configure the following property in the hive-site.xml file:

<property>
  <name>hive.metastore.warehouse.external.dir</name>
  <value>ofs://ozone1/vol1/bucket1/external/</value>
</property>

This property specifies the base directory for external tables when no explicit LOCATION is provided.
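
As a quick check of this default, an external table can be created without a LOCATION clause and then inspected. The table name events_ext is hypothetical, and the sketch assumes the property above is set and the table is created in the default database.

```sql
-- No LOCATION clause: Hive picks a directory under the external warehouse path.
CREATE EXTERNAL TABLE events_ext (
    id INT,
    name STRING
);

-- The "Location:" row is expected to start with ofs://ozone1/vol1/bucket1/external/.
DESCRIBE FORMATTED events_ext;
```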

Verifying the External Table Path

To confirm the table’s metadata and location, use:

SHOW CREATE TABLE external_table;

Output Example:

+----------------------------------------------------+
|                   createtab_stmt                   |
+----------------------------------------------------+
| CREATE EXTERNAL TABLE `external_table`(            |
|   `id` int,                                        |
|   `name` string)                                   |
| ROW FORMAT SERDE                                   |
|   'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'  |
| STORED AS INPUTFORMAT                              |
|   'org.apache.hadoop.mapred.TextInputFormat'       |
| OUTPUTFORMAT                                       |
|   'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' |
| LOCATION                                           |
|   'ofs://ozone1/vol1/bucket1/table1'               |
| TBLPROPERTIES (                                    |
|   'bucketing_version'='2',                         |
|   'transient_lastDdlTime'='1734725573')            |
+----------------------------------------------------+

Using the S3A Protocol

In addition to ofs, Hive can access Ozone using the S3 Gateway via the S3A file system.
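
A table over an S3A path looks the same as the ofs examples apart from the URI. The sketch below assumes the Hadoop S3A settings (such as fs.s3a.endpoint pointing at the Ozone S3 Gateway, path-style access, and credentials) are already configured for Hive, and that a bucket named bucket1 is visible through the gateway; the table name s3a_table is hypothetical.

```sql
-- s3a URIs name only the bucket; the gateway resolves it to an Ozone bucket.
CREATE EXTERNAL TABLE s3a_table (
    id INT,
    name STRING
)
LOCATION 's3a://bucket1/table1';

-- Reads go through the S3 Gateway rather than the native Ozone client.
SELECT id, name FROM s3a_table LIMIT 10;
```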

For more information, consult:
