
HttpFS Gateway

Ozone HttpFS can be used to integrate Ozone with other tools through a REST API.

Introduction

HttpFS is a service that provides a REST HTTP gateway supporting File System operations (read and write). It is interoperable with the WebHDFS REST HTTP API.

HttpFS can be used to access data on an Ozone cluster behind a firewall. In that setup, the HttpFS service acts as a gateway and is the only system that is allowed to cross the firewall into the cluster.

HttpFS can be used to access data in Ozone using HTTP utilities (such as curl and wget) and HTTP libraries from languages other than Java (such as Perl).

The WebHDFS client FileSystem implementation can be used to access HttpFS using the Ozone filesystem command line tool (ozone fs) as well as from Java applications using the Hadoop FileSystem Java API.

Getting started

To try it out, follow the instructions to start the Ozone cluster with Docker Compose.

docker compose up -d --scale datanode=3

The HttpFS gateway should now be running in Docker under a name like ozone_httpfs, and it can be accessed through localhost:14000. Each HttpFS web-service API call is an HTTP REST call that maps to an Ozone file system operation.
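To make that mapping concrete, the sketch below assembles the URL for one operation from its parts. The `httpfs_url` helper and the `HTTPFS_BASE` variable are hypothetical names introduced here for illustration and are not part of Ozone; the `/webhdfs/v1` prefix and the `op` and `user.name` parameters are the ones used in the examples on this page.

```shell
#!/bin/sh
# Gateway address from the Docker Compose setup; HTTPFS_BASE is a
# hypothetical variable name used only in this sketch.
HTTPFS_BASE="${HTTPFS_BASE:-http://localhost:14000}"

# httpfs_url PATH OP USER  (hypothetical helper, not shipped with Ozone)
# Prints the REST URL that asks HttpFS to run operation OP on PATH as USER.
httpfs_url() {
  rel="${1#/}"   # drop one leading slash so the path joins cleanly
  printf '%s/webhdfs/v1/%s?op=%s&user.name=%s\n' "$HTTPFS_BASE" "$rel" "$2" "$3"
}

# Path = volume/bucket/key, op = file system operation, user = caller.
httpfs_url /volume1/bucket1 MKDIRS hdfs
```

The printed URL can be handed straight to curl, for example `curl -i -X PUT "$(httpfs_url /volume1/bucket1 MKDIRS hdfs)"`.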

Here are some usage examples:

Create a volume

# creates a volume called `volume1`.
curl -i -X PUT "http://localhost:14000/webhdfs/v1/volume1?op=MKDIRS&user.name=hdfs"

Example Output:

HTTP/1.1 200 OK
Date: Sat, 18 Oct 2025 07:51:21 GMT
Cache-Control: no-cache
Expires: Sat, 18 Oct 2025 07:51:21 GMT
Pragma: no-cache
Content-Type: application/json
X-Content-Type-Options: nosniff
X-XSS-Protection: 1; mode=block
Set-Cookie: hadoop.auth="u=hdfs&p=hdfs&t=simple-dt&e=1760809881100&s=OCdVOi8eyMguFySkmEJxm5EkRfj6NbAM9agi5Gue1Iw="; Path=/; HttpOnly
Content-Length: 17

{"boolean":true}

Create a bucket

# creates a bucket called `bucket1`.
curl -i -X PUT "http://localhost:14000/webhdfs/v1/volume1/bucket1?op=MKDIRS&user.name=hdfs"

Example Output:

HTTP/1.1 200 OK
Date: Sat, 18 Oct 2025 07:52:06 GMT
Cache-Control: no-cache
Expires: Sat, 18 Oct 2025 07:52:06 GMT
Pragma: no-cache
Content-Type: application/json
X-Content-Type-Options: nosniff
X-XSS-Protection: 1; mode=block
Set-Cookie: hadoop.auth="u=hdfs&p=hdfs&t=simple-dt&e=1760809926682&s=yvOaeaRCVJZ+z+nZQ/rM/Y01pzEmS9Pe2mE9f0b+TWw="; Path=/; HttpOnly
Content-Length: 17

{"boolean":true}

Upload a file

echo "hello" > ./README.txt
curl -i -X PUT "http://localhost:14000/webhdfs/v1/volume1/bucket1/user/foo/README.txt?op=CREATE&data=true&user.name=hdfs" -T ./README.txt -H "Content-Type: application/octet-stream"

Example Output:

HTTP/1.1 100 Continue

HTTP/1.1 201 Created
Date: Sat, 18 Oct 2025 08:33:33 GMT
Cache-Control: no-cache
Expires: Sat, 18 Oct 2025 08:33:33 GMT
Pragma: no-cache
X-Content-Type-Options: nosniff
X-XSS-Protection: 1; mode=block
Set-Cookie: hadoop.auth="u=hdfs&p=hdfs&t=simple-dt&e=1760812413286&s=09t7xKu/p/fjCJiQNL3bvW/Q7mTw28IbeNqDGlslZ6w="; Path=/; HttpOnly
Location: http://localhost:14000/webhdfs/v1/volume1/bucket1/user/foo/README.txt
Content-Type: application/json
Content-Length: 84

{"Location":"http://localhost:14000/webhdfs/v1/volume1/bucket1/user/foo/README.txt"}

Read the file content

# returns the content of the key `/user/foo/README.txt`.
curl 'http://localhost:14000/webhdfs/v1/volume1/bucket1/user/foo/README.txt?op=OPEN&user.name=foo'
hello

Supported operations

The following tables list the WebHDFS REST API operations and their support status in Ozone.

File and Directory Operations

| Operation | Support |
| --- | --- |
| Create and Write to a File | supported |
| Append to a File | not implemented in Ozone |
| Concat File(s) | not implemented in Ozone |
| Open and Read a File | supported |
| Make a Directory | supported |
| Create a Symbolic Link | not implemented in Ozone |
| Rename a File/Directory | supported (with limitations) |
| Delete a File/Directory | supported |
| Truncate a File | not implemented in Ozone |
| Status of a File/Directory | supported |
| List a Directory | supported |
| List a File | supported |
| Iteratively List a Directory | unsupported |
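Each supported row above corresponds to an `op` value and an HTTP method defined by the WebHDFS REST API. The sketch below records that pairing for the supported operations and prints, rather than runs, a matching curl command. The `httpfs_method` helper is a hypothetical name; the path and user reuse the values from the examples earlier on this page.

```shell
#!/bin/sh
# httpfs_method OP  (hypothetical helper)
# Prints the HTTP method the WebHDFS REST API defines for OP.
# Only operations marked "supported" in the table are listed.
# Note: RENAME additionally needs a destination=<PATH> query parameter.
httpfs_method() {
  case "$1" in
    OPEN|GETFILESTATUS|LISTSTATUS) echo GET ;;
    CREATE|MKDIRS|RENAME)          echo PUT ;;
    DELETE)                        echo DELETE ;;
    *) echo "no supported mapping for op: $1" >&2; return 1 ;;
  esac
}

# Dry run: print the curl command for a few operations on the uploaded file.
for op in GETFILESTATUS LISTSTATUS DELETE; do
  printf 'curl -i -X %s "http://localhost:14000/webhdfs/v1/volume1/bucket1/user/foo/README.txt?op=%s&user.name=hdfs"\n' \
    "$(httpfs_method "$op")" "$op"
done
```

Printing the commands first makes it easy to review them before running a destructive operation such as DELETE.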

Other File System Operations

| Operation | Support |
| --- | --- |
| Get Content Summary of a Directory | supported |
| Get Quota Usage of a Directory | supported |
| Set Quota | not implemented in Ozone FileSystem API |
| Set Quota By Storage Type | not implemented in Ozone |
| Get File Checksum | unsupported (to be fixed) |
| Get Home Directory | unsupported (to be fixed) |
| Get Trash Root | unsupported |
| Set Permission | not implemented in Ozone FileSystem API |
| Set Owner | not implemented in Ozone FileSystem API |
| Set Replication Factor | not implemented in Ozone FileSystem API |
| Set Access or Modification Time | not implemented in Ozone FileSystem API |
| Modify ACL Entries | not implemented in Ozone FileSystem API |
| Remove ACL Entries | not implemented in Ozone FileSystem API |
| Remove Default ACL | not implemented in Ozone FileSystem API |
| Remove ACL | not implemented in Ozone FileSystem API |
| Set ACL | not implemented in Ozone FileSystem API |
| Get ACL Status | not implemented in Ozone FileSystem API |
| Check access | not implemented in Ozone FileSystem API |
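The two supported operations in this table are read-only GET requests. As a sketch, the snippet below prints the curl commands for both against the bucket created earlier. `GETCONTENTSUMMARY` and `GETQUOTAUSAGE` are the WebHDFS `op` names for these rows; `summary_url` is a hypothetical helper.

```shell
#!/bin/sh
# summary_url DIR OP  (hypothetical helper)
# Prints the URL for a read-only directory query as user hdfs.
summary_url() {
  printf 'http://localhost:14000/webhdfs/v1/%s?op=%s&user.name=hdfs\n' "$1" "$2"
}

# Both operations use HTTP GET, which is curl's default method.
for op in GETCONTENTSUMMARY GETQUOTAUSAGE; do
  printf 'curl "%s"\n' "$(summary_url volume1/bucket1 "$op")"
done
```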

Proxy User Configuration

HttpFS supports proxy user (user impersonation) functionality, which allows a user to perform operations on behalf of another user. This is useful when HttpFS is used as a gateway and you want to allow certain users to impersonate other users.

To configure proxy users, you need to add the following properties to httpfs-site.xml.

Configuration Properties

For each user that should be allowed to perform impersonation, you need to configure two properties:

  1. httpfs.proxyuser.#USER#.hosts: List of hosts from which the user is allowed to perform impersonation operations.
  2. httpfs.proxyuser.#USER#.groups: List of groups whose users can be impersonated by the specified user.

Replace #USER# with the actual username of the user who should be allowed to perform impersonation.

Example Configuration

<property>
  <name>httpfs.proxyuser.knoxuser.hosts</name>
  <value>*</value>
  <description>
    List of hosts the 'knoxuser' user is allowed to perform 'doAs'
    operations.

    The value can be the '*' wildcard or a comma-separated list of hostnames.

    For multiple users, copy this property and replace the user name
    in the property name.
  </description>
</property>

<property>
  <name>httpfs.proxyuser.knoxuser.groups</name>
  <value>*</value>
  <description>
    List of groups the 'knoxuser' user is allowed to impersonate users
    from to perform 'doAs' operations.

    The value can be the '*' wildcard or a comma-separated list of group names.

    For multiple users, copy this property and replace the user name
    in the property name.
  </description>
</property>

In this example, the user knoxuser is allowed to impersonate any user from any host. For production environments, it's recommended to restrict these values to specific hosts and groups instead of using the wildcard *.
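An impersonated request carries two identities: the authenticated proxy user and the user being impersonated. In the WebHDFS REST API the latter is passed in the `doas` query parameter. The sketch below builds such a URL; `doas_url` is a hypothetical helper, `user01` is an example user name, and simple (`user.name`) authentication is assumed, as in the earlier examples.

```shell
#!/bin/sh
# doas_url PATH OP PROXY_USER TARGET_USER  (hypothetical helper)
# PROXY_USER authenticates; TARGET_USER is the identity the operation runs as.
doas_url() {
  printf 'http://localhost:14000/webhdfs/v1/%s?op=%s&user.name=%s&doas=%s\n' \
    "$1" "$2" "$3" "$4"
}

# knoxuser lists the bucket on behalf of user01 (dry run: prints the command).
printf 'curl -i "%s"\n' "$(doas_url volume1/bucket1 LISTSTATUS knoxuser user01)"
```

If the proxy user properties do not cover the calling host or the target user's group, the gateway is expected to reject such a request.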

Troubleshooting

You may encounter an error like:

User: user/host@REALM is not allowed to impersonate user01

This indicates that the proxy user configuration is missing or incorrect. Ensure that:

  1. The httpfs.proxyuser.#USER#.hosts property is set with appropriate host values
  2. The httpfs.proxyuser.#USER#.groups property is set with appropriate group values
  3. The HttpFS service has been restarted after configuration changes
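The first two checks can be scripted. The sketch below reports which of the two properties for a given user are absent from a configuration file. `check_proxyuser` is a hypothetical helper, and the file location is an assumption that depends on the deployment; the check only looks for the property names and does not validate their values.

```shell
#!/bin/sh
# check_proxyuser FILE USER  (hypothetical helper)
# Returns 0 if FILE names both proxy user properties for USER, 1 otherwise.
check_proxyuser() {
  missing=0
  for suffix in hosts groups; do
    prop="httpfs.proxyuser.$2.$suffix"
    # -F matches the property name literally; -q suppresses grep's output.
    if ! grep -qF "<name>$prop</name>" "$1"; then
      echo "missing: $prop" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example call; the path is a placeholder, adjust it to your setup:
# check_proxyuser path/to/httpfs-site.xml knoxuser
```

A zero exit status only confirms that the properties are declared; the HttpFS service still has to be restarted for changes to take effect.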

For more details, see the Hadoop user and developer documentation about HttpFS.