NumaFlow


Summary

NumaFlow is a Kubernetes-native platform for running massively parallel data processing and streaming jobs.

Each pipeline is specified as a Kubernetes custom resource that consists of one or more source vertices, data processing vertices, and sink vertices. Each vertex runs zero or more pods, with autoscaling.
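To make the pipeline shape concrete, here is a minimal Go sketch of a pipeline as a graph of vertices and edges, with a small validity check. It is an illustration only, not NumaFlow's API: the names `Vertex`, `Edge`, `Pipeline`, `Validate`, and `examplePipeline` are hypothetical, and the rules it enforces (at least one source and one sink, data flowing away from sources and into sinks) are assumptions read from the description above.

```go
package main

import (
	"errors"
	"fmt"
)

// VertexKind is a hypothetical label for the three vertex roles.
type VertexKind string

const (
	Source    VertexKind = "source"
	Processor VertexKind = "processor"
	Sink      VertexKind = "sink"
)

// Vertex is an illustrative stand-in for one pipeline vertex,
// not NumaFlow's actual custom resource type.
type Vertex struct {
	Name     string
	Kind     VertexKind
	Replicas int // number of pods; zero is allowed
}

// Edge connects two vertices by name.
type Edge struct {
	From, To string
}

// Pipeline is a set of vertices plus the edges between them.
type Pipeline struct {
	Vertices []Vertex
	Edges    []Edge
}

// Validate checks the assumed structural rules of this sketch.
func (p Pipeline) Validate() error {
	kinds := make(map[string]VertexKind, len(p.Vertices))
	var sources, sinks int
	for _, v := range p.Vertices {
		if _, dup := kinds[v.Name]; dup {
			return fmt.Errorf("duplicate vertex %q", v.Name)
		}
		if v.Replicas < 0 {
			return fmt.Errorf("vertex %q has negative replicas", v.Name)
		}
		kinds[v.Name] = v.Kind
		switch v.Kind {
		case Source:
			sources++
		case Sink:
			sinks++
		case Processor:
		default:
			return fmt.Errorf("vertex %q has unknown kind %q", v.Name, v.Kind)
		}
	}
	if sources == 0 {
		return errors.New("pipeline needs at least one source vertex")
	}
	if sinks == 0 {
		return errors.New("pipeline needs at least one sink vertex")
	}
	for _, e := range p.Edges {
		from, ok := kinds[e.From]
		if !ok {
			return fmt.Errorf("edge references unknown vertex %q", e.From)
		}
		to, ok := kinds[e.To]
		if !ok {
			return fmt.Errorf("edge references unknown vertex %q", e.To)
		}
		if from == Sink {
			return fmt.Errorf("sink %q cannot have an outgoing edge", e.From)
		}
		if to == Source {
			return fmt.Errorf("source %q cannot have an incoming edge", e.To)
		}
	}
	return nil
}

// examplePipeline builds a three-vertex pipeline: in -> cat -> out.
func examplePipeline() Pipeline {
	return Pipeline{
		Vertices: []Vertex{
			{Name: "in", Kind: Source, Replicas: 1},
			{Name: "cat", Kind: Processor, Replicas: 2},
			{Name: "out", Kind: Sink, Replicas: 1},
		},
		Edges: []Edge{{From: "in", To: "cat"}, {From: "cat", To: "out"}},
	}
}

func main() {
	if err := examplePipeline().Validate(); err != nil {
		fmt.Println("invalid pipeline:", err)
		return
	}
	fmt.Println("pipeline is valid")
}
```

In a real deployment the pipeline is declared as a Kubernetes custom resource rather than built in code; the sketch only mirrors the source, processing, and sink structure.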

NumaFlow aims to achieve exactly-once semantics: each piece of data flowing from the source vertex to the sink vertex is processed exactly once.

Core Principles

  • Easy to use for engineers working in any language
  • Installed and running in under 1 minute (minimal setup)
  • Cheaper than Flink, Samza, etc. when throughput is below 10K TPS

Quick Start

Check QUICK START to try it out.

Development

Refer to DEVELOPMENT to set up a development environment.

Contributing

Refer to the CONTRIBUTING document.
