Arvados is an open source platform for managing, processing, and sharing genomic and other large scientific and biomedical data. With Arvados, bioinformaticians run and scale compute-intensive workflows, developers create biomedical applications, and IT administrators manage large compute and storage resources.

The key components of Arvados are:

  • Keep: Keep is the Arvados storage system for managing and storing large
    collections of files. Keep combines content addressing and a
    distributed storage architecture, resulting in both high reliability
    and high throughput. Every file stored in Keep can be accurately
    verified every time it is retrieved. Keep supports the creation of
    collections as a flexible way to define data sets without having to
    re-organize or needlessly copy data. Keep works on a wide range of
    underlying filesystems and object stores.

  • Crunch: Crunch is the orchestration system for running Common Workflow Language workflows. It is
    designed to maintain data provenance and workflow
    reproducibility. Crunch automatically tracks data inputs and outputs
    through Keep and executes workflow processes in Docker containers. In
    a cloud environment, Crunch optimizes costs by scaling compute on demand.

  • Workbench: The Workbench web application allows users to interactively access
    Arvados functionality. It is especially helpful for querying and
    browsing data, visualizing provenance, and tracking the progress of
    workflows.

  • Command Line tools: The command line interface (CLI) provides convenient
    access to Arvados functionality from the command line.

  • API and SDKs: Arvados is designed to be integrated with existing infrastructure. All
    the services in Arvados are accessed through a RESTful API. SDKs are
    available for Python, Go, R, Perl, Ruby, and Java.
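To make the content addressing idea behind Keep more concrete, here is a minimal sketch in Python. It is an illustration only, not Keep's implementation: the `ContentAddressedStore` class, the in-memory dictionary, the choice of SHA-256, and the `digest+size` locator layout are all assumptions made for this example. It shows how a block can be verified on every read, and how a collection can be defined as a list of block references so that data sets are assembled without copying data.

```python
import hashlib


class ContentAddressedStore:
    """Hypothetical in-memory store; blocks are keyed by a hash of their content."""

    def __init__(self):
        self._blocks = {}

    def put(self, data: bytes) -> str:
        # The locator is derived from the content itself (digest plus size),
        # so identical data always maps to the same locator.
        locator = f"{hashlib.sha256(data).hexdigest()}+{len(data)}"
        self._blocks[locator] = data
        return locator

    def get(self, locator: str) -> bytes:
        data = self._blocks[locator]
        expected_digest = locator.split("+", 1)[0]
        # Re-hash on every read so corruption is detected at retrieval time.
        if hashlib.sha256(data).hexdigest() != expected_digest:
            raise ValueError(f"block {locator} failed verification")
        return data


store = ContentAddressedStore()
reads_a = store.put(b"ACGTACGT")
reads_b = store.put(b"TTGACCAA")

# A "collection" here is just a mapping from file names to block locators.
# Two collections can share a block without the data being stored twice.
collection_1 = {"sample_a.txt": [reads_a], "sample_b.txt": [reads_b]}
collection_2 = {"subset/sample_a.txt": [reads_a]}

content = b"".join(store.get(loc) for loc in collection_2["subset/sample_a.txt"])
```

Because the locator depends only on the data, storing the same bytes twice yields the same locator, which is what lets the two collections above reference one stored block.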

Quick start

To try out Arvados on your local workstation, you can use Arvbox, which
provides Arvados components pre-installed in a Docker container (requires
Docker 1.9+). After cloning the Arvados git repository:

$ cd arvados/tools/arvbox/bin
$ ./arvbox start localdemo

In this mode you will only be able to connect to Arvbox from the same host. To
configure Arvbox to be accessible over a network, and for other options, see
the Arvbox documentation.