Apache Ozone Best Practices at Didi: Scaling to Tens of Billions of Files

· 5 min read
Kaun-Hung (Rich) Huang
Apache Ozone Contributor
The Apache Ozone Community
Apache Ozone Project
Shilun Fan
Apache Ozone Contributor
Hongbing Wang
Apache Ozone Contributor
JiangHua Zhu
Apache Ozone Contributor
Ming Wei
Apache Ozone Contributor

Guest post by the Didi Engineering Team. For the full story with detailed slides, see Apache Ozone Best Practices at Didi (PDF).

As Didi's unstructured data surged into the hundreds of petabytes, spanning tens of billions of files, its traditional storage architecture faced severe scalability bottlenecks. This post summarizes how the team migrated from HDFS to Apache Ozone, the optimizations they implemented for high-performance reads, and how they contributed these improvements back to the community.