
A distributed file system with scalability and consistency

ChubaoFS

CFS (Chubao File System) is a distributed file system designed to natively support large-scale container platforms.

ChubaoFS (aka CFS) is a distributed file system with the following features:

  • scale-out metadata management

  • strong replication consistency

  • multiple volumes

  • POSIX-compatible

For more details, please refer to our SIGMOD 2019 paper "CFS: A Distributed File System for Large Scale Container Platforms".

CFS consists of a metadata subsystem, a data subsystem, and a resource manager. It can be accessed by different clients (each a set of application processes) hosted in containers through different file system instances called volumes.

The metadata subsystem stores the file metadata, and consists of a set of meta nodes. Each meta node consists of a set of meta partitions.

The data subsystem stores the file contents, and consists of a set of data nodes. Each data node consists of a set of data partitions.
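The node/partition layering described above can be sketched with a few Go types. This is a minimal illustration with hypothetical names, not CFS's actual internals; in particular, the inode-range fields are an assumption about how a meta partition's share of the namespace might be expressed.

```go
package main

import "fmt"

// Illustrative sketch only: type and field names are hypothetical,
// not CFS's real data structures.

// MetaPartition serves the file metadata for a contiguous range of
// inode IDs (the range fields are an assumption for illustration).
type MetaPartition struct {
	ID         uint64
	InodeStart uint64 // first inode ID covered by this partition
	InodeEnd   uint64 // last inode ID covered by this partition
}

// DataPartition stores file contents, replicated across data nodes.
type DataPartition struct {
	ID       uint64
	Replicas []string // addresses of the data nodes holding replicas
}

// MetaNode hosts a set of meta partitions.
type MetaNode struct {
	Addr       string
	Partitions []MetaPartition
}

// DataNode hosts a set of data partitions.
type DataNode struct {
	Addr       string
	Partitions []DataPartition
}

func main() {
	mn := MetaNode{
		Addr: "meta-1:9021", // hypothetical address
		Partitions: []MetaPartition{
			{ID: 1, InodeStart: 1, InodeEnd: 1 << 20},
		},
	}
	fmt.Printf("meta node %s serves %d partition(s)\n", mn.Addr, len(mn.Partitions))
}
```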

The volume is a logical concept in CFS and consists of one or more meta partitions and one or more data partitions. Each partition can be assigned to only a single volume. From a client’s perspective, the volume can be viewed as a file system instance that contains data accessible by the containers. A volume can be mounted to multiple containers so that files can be shared among different clients simultaneously, and it must be created before any file operation is performed. A CFS cluster deployed at each data center can have hundreds of thousands of volumes, whose data sizes vary from a few gigabytes to several terabytes.
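To make the rule that each partition belongs to exactly one volume concrete, here is a minimal Go sketch (all names hypothetical, not CFS's real API) in which assigning an already-owned partition to a second volume is rejected:

```go
package main

import "fmt"

// Volume groups one or more meta partitions and one or more data
// partitions under a single file system namespace. Illustrative
// sketch only; not CFS's actual data structures.
type Volume struct {
	Name           string
	MetaPartitions []uint64
	DataPartitions []uint64
}

// assignDataPartition enforces single ownership: assignment fails if
// the partition already belongs to another volume.
func assignDataPartition(owner map[uint64]string, id uint64, vol *Volume) error {
	if v, taken := owner[id]; taken {
		return fmt.Errorf("data partition %d already belongs to volume %q", id, v)
	}
	owner[id] = vol.Name
	vol.DataPartitions = append(vol.DataPartitions, id)
	return nil
}

func main() {
	owner := make(map[uint64]string) // data partition ID -> owning volume
	a := &Volume{Name: "vol-a", MetaPartitions: []uint64{1}}
	b := &Volume{Name: "vol-b", MetaPartitions: []uint64{2}}

	fmt.Println(assignDataPartition(owner, 10, a)) // <nil>: partition 10 now owned by vol-a
	fmt.Println(assignDataPartition(owner, 10, b)) // error: already owned by vol-a
}
```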

Generally speaking, the resource manager periodically communicates with the metadata subsystem and the data subsystem to manage the meta nodes and data nodes, respectively. Each client periodically communicates with the resource manager to obtain an up-to-date view of the mounted volume. A file operation usually initiates communication directly from the client to the corresponding meta node and data node, without involving the resource manager. The updated view of the mounted volume, as well as the file metadata, is usually cached at the client side to reduce communication overhead.
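The client-side caching described above can be sketched as a periodic refresh loop. This is a minimal illustration with hypothetical types and names (`VolumeView`, `fetchView`), not the real CFS client code:

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// VolumeView is the client's cached picture of the mounted volume:
// which partitions exist and where their replicas live. The shape is
// hypothetical, for illustration only.
type VolumeView struct {
	MetaPartitions map[uint64][]string // meta partition ID -> meta node addresses
	DataPartitions map[uint64][]string // data partition ID -> data node addresses
}

// Client caches the view locally so that file operations can contact
// the right meta/data node directly, without going through the
// resource manager on every request.
type Client struct {
	mu        sync.RWMutex
	view      VolumeView
	fetchView func() (VolumeView, error) // stand-in for the RPC to the resource manager
}

// refreshLoop periodically re-fetches the view; on failure it keeps
// serving the stale cached copy.
func (c *Client) refreshLoop(interval time.Duration, stop <-chan struct{}) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ticker.C:
			if v, err := c.fetchView(); err == nil {
				c.mu.Lock()
				c.view = v
				c.mu.Unlock()
			}
		case <-stop:
			return
		}
	}
}

func main() {
	c := &Client{
		fetchView: func() (VolumeView, error) { // stubbed resource manager response
			return VolumeView{
				DataPartitions: map[uint64][]string{1: {"data-1:17310", "data-2:17310"}},
			}, nil
		},
	}
	stop := make(chan struct{})
	go c.refreshLoop(100*time.Millisecond, stop)
	time.Sleep(250 * time.Millisecond)

	c.mu.RLock()
	fmt.Println("cached replicas of data partition 1:", c.view.DataPartitions[1])
	c.mu.RUnlock()
	close(stop)
}
```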
