Access Ozone using HTTPFS REST API (Docker + Python Requests)

This tutorial shows how to access Apache Ozone through the HTTPFS REST API using Python’s requests library. It walks through writing a file and reading it back with simple authentication.

Prerequisites

  • Docker and Docker Compose installed
  • Python 3.x environment

Steps

1️⃣ Start Ozone in Docker

Download the latest Docker Compose configuration file:

curl -O https://raw.githubusercontent.com/apache/ozone-docker/refs/heads/latest/docker-compose.yaml

Append the following httpfs service definition, including its environment variable overrides, to the bottom of docker-compose.yaml:

   httpfs:
     <<: *image
     ports:
       - 14000:14000
     environment:
       CORE-SITE.XML_fs.defaultFS: "ofs://om"
       CORE-SITE.XML_hadoop.proxyuser.hadoop.hosts: "*"
       CORE-SITE.XML_hadoop.proxyuser.hadoop.groups: "*"
       OZONE-SITE.XML_hdds.scm.safemode.min.datanode: ${OZONE_SAFEMODE_MIN_DATANODES:-1}
       <<: *common-config
     command: [ "ozone","httpfs" ]

Start the cluster:

docker compose up -d --scale datanode=3
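
Containers can take a moment to accept connections after docker compose up returns. The following is a minimal sketch of a readiness check, assuming it runs on the Docker host where port 14000 is published as in the compose snippet above. wait_for_port is a hypothetical helper, not part of Ozone, and an open port only shows that something is listening, not that the cluster has left safe mode.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 60.0, interval: float = 1.0) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect only proves that a listener exists on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Connection refused or timed out: wait briefly and try again.
            time.sleep(interval)
    return False
```

For example, wait_for_port("localhost", 14000) would return True once the httpfs container accepts connections on the published port.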

2️⃣ Create a Volume and Bucket

Connect to the SCM container:

docker exec -it <your-scm-container-name-or-id> bash

Replace <your-scm-container-name-or-id> with the name or ID of your SCM container; docker ps lists the running containers.

The remaining steps in this tutorial run inside this container.

Create a volume and a bucket:

ozone sh volume create vol1
ozone sh bucket create vol1/bucket1
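
With fs.defaultFS set to ofs://om, the volume and bucket names become the first two components of every HTTPFS path under /webhdfs/v1. The sketch below shows one way such URLs could be assembled using only the standard library. httpfs_url is a hypothetical helper; the host http://httpfs:14000 and the user ozone are the values used later in this tutorial, and GETFILESTATUS is a standard WebHDFS operation chosen here as an example of a read-only query.

```python
from urllib.parse import quote, urlencode


def httpfs_url(host, volume, bucket, key="", op="GETFILESTATUS", user="ozone", **extra):
    """Build an HTTPFS URL for a volume/bucket/key path and a WebHDFS operation."""
    # Skip empty parts so that a bucket-level URL has no trailing slash.
    path = "/".join(part for part in (volume, bucket, key) if part)
    # user.name carries the identity in simple authentication mode.
    query = urlencode({"op": op, "user.name": user, **extra})
    return f"{host}/webhdfs/v1/{quote(path)}?{query}"


print(httpfs_url("http://httpfs:14000", "vol1", "bucket1"))
print(httpfs_url("http://httpfs:14000", "vol1", "bucket1", "hello.txt", op="OPEN"))
```

Requesting the first URL with any HTTP client is one way to check that the bucket is visible through HTTPFS before writing to it.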

3️⃣ Install Required Python Library

Install the requests library:

pip install requests

4️⃣ Access Ozone HTTPFS via Python

Create a script (ozone_httpfs_example.py) with the following content:

#!/usr/bin/env python3
import requests

# Ozone HTTPFS endpoint and file path
host = "http://httpfs:14000"
volume = "vol1"
bucket = "bucket1"
filename = "hello.txt"
path = f"/webhdfs/v1/{volume}/{bucket}/{filename}"
user = "ozone"  # can be any value in simple auth mode

# Step 1: Initiate file creation (responds with 307 redirect)
params_create = {
    "op": "CREATE",
    "overwrite": "true",
    "user.name": user
}

print("Creating file...")
resp_create = requests.put(host + path, params=params_create, allow_redirects=False)

if resp_create.status_code != 307:
    print(f"Unexpected response: {resp_create.status_code}")
    print(resp_create.text)
    exit(1)

redirect_url = resp_create.headers['Location']
print(f"Redirected to: {redirect_url}")

# Step 2: Write data to the redirected location with correct headers
headers = {"Content-Type": "application/octet-stream"}
content = b"Hello from Ozone HTTPFS!\n"

resp_upload = requests.put(redirect_url, data=content, headers=headers)
if resp_upload.status_code != 201:
    print(f"Upload failed: {resp_upload.status_code}")
    print(resp_upload.text)
    exit(1)
print("File created successfully.")

# Step 3: Read the file back
params_open = {
    "op": "OPEN",
    "user.name": user
}

print("Reading file...")
resp_read = requests.get(host + path, params=params_open, allow_redirects=True)
if resp_read.ok:
    print("File contents:")
    print(resp_read.text)
else:
    print(f"Read failed: {resp_read.status_code}")
    print(resp_read.text)

Run the script:

python ozone_httpfs_example.py

✅ If everything is configured correctly, the script creates hello.txt in vol1/bucket1 through the REST API and prints its contents.
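
The two-step CREATE flow used in the script (an initial PUT answered with a 307 redirect, followed by a PUT of the data to the redirect location) can be wrapped in a reusable function. This is a sketch under stated assumptions, not an official client: put_file is a hypothetical name, and the http argument is assumed to be any object with a requests-compatible put method, such as requests.Session().

```python
def put_file(http, base_url, path, data, user):
    """Write bytes to an HTTPFS path using the two-step CREATE flow.

    `http` is assumed to expose a requests-style put() method.
    """
    params = {"op": "CREATE", "overwrite": "true", "user.name": user}

    # Step 1: ask where to send the data; do not follow the redirect automatically.
    first = http.put(base_url + path, params=params, allow_redirects=False)
    if first.status_code != 307:
        raise RuntimeError(f"expected 307 redirect, got {first.status_code}")

    # Step 2: send the payload to the redirect location with the required header.
    upload = http.put(
        first.headers["Location"],
        data=data,
        headers={"Content-Type": "application/octet-stream"},
    )
    if upload.status_code != 201:
        raise RuntimeError(f"upload failed with status {upload.status_code}")
    return upload.status_code
```

With requests installed, a call such as put_file(requests.Session(), "http://httpfs:14000", "/webhdfs/v1/vol1/bucket1/hello.txt", b"Hello\n", "ozone") would mirror the script above. Raising exceptions instead of calling exit lets the caller decide how to handle a failure.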

Troubleshooting Tips

  • 401 Unauthorized: Make sure user.name is passed as a query parameter and that proxy user settings are correct in core-site.xml.
  • 400 Bad Request: Send the Content-Type: application/octet-stream header with the upload (second PUT) request.
  • Connection Refused: Ensure the httpfs container is running and reachable on port 14000.
  • Volume or Bucket Not Found: Confirm you created vol1/bucket1 in step 2.
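
The tips above could be folded into a script as a small lookup that prints a hint next to an unexpected status code. This is only an illustrative sketch: troubleshooting_hint is a hypothetical helper, and mapping a missing volume or bucket to status 404 is an assumption, since the list above does not name a status code for that case.

```python
# Hints paraphrased from the troubleshooting list.
HINTS = {
    400: "Send 'Content-Type: application/octet-stream' with the upload request.",
    401: "Pass user.name as a query parameter and check the proxy user settings.",
    # Assumption: a missing volume or bucket is reported as 404.
    404: "Confirm that vol1/bucket1 was created in step 2.",
}


def troubleshooting_hint(status_code: int) -> str:
    """Return a hint for a known HTTP status code, or a generic fallback."""
    fallback = f"No specific hint for HTTP {status_code}; inspect the response body."
    return HINTS.get(status_code, fallback)


print(troubleshooting_hint(401))
```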
