/ Database tool

Go package for sharding databases

Go package for sharding databases

Octillery

Octillery is a Go package for sharding databases.
It can use with every OR Mapping library ( xorm , gorp , gorm , dbr ...) implementing database/sql interface, or raw SQL.

Currently supports MySQL (for product) and SQLite3 (for testing) .

Motivation

We need database sharding library in Go. Of course, we know some libraries like ( https://github.com/evalphobia/wizard , https://github.com/go-pg/sharding ). But OR Mapping library they support are restricted and we want to write sharding configuration declaratively, also expect to pluggable for sharding algorithm or database adapter, and expect to configurable sharding key or whether use sequencer or not.

Features

  • Supports every OR Mapping library implementing database/sql interface ( xorm , gorp , gorm , dbr , ... )
  • Supports using database/sql ( raw SQL ) directly
  • Pluggable sharding algorithm ( preinstalled algorithms are modulo and hashmap )
  • Pluggable database adapter ( preinstalled adapters are mysql and sqlite3 )
  • Declarative describing for sharding configuration in YAML
  • Configurable sharding algorithm, database adapter, sharding key, whether use sequencer or not.
  • Supports capture read/write queries just before passing to database driver
  • Supports database migration by CLI ( powered by schemalex )
  • Supports import seeds from CSV

Install

Install as a CLI tool

go get -u go.knocknote.io/octillery/cmd/octillery

Install as a library

go get -u go.knocknote.io/octillery

How It Works

1. How database sharding works

We explain by using posts table.

posts table schema is

CREATE TABLE `posts` (
  `id` bigint unsigned NOT NULL AUTO_INCREMENT,
  `user_id` bigint unsigned NOT NULL,
  `created_at` datetime NOT NULL,
  `updated_at` datetime NOT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `uq_posts_01` (`user_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;

And we want to shard this table to four databases for load distribution.

In this case, we can try to two approach according to requirements.

1. Using Sequencer

If you want id to be unique in all databases, you should use this approach.
Architecture of this approach would be the following.

architecture

Application create SQL ( like insert into posts (id, user_id, ...) values (null, 1, ...) ), in this point, id value is null because still not decide. In accordance with the above graph, insert this query to one of the databases.

  1. Application requests id value to sequencer
  2. Sequencer generates next unique id in all shards
  3. Sequencer returns id value to application ( ex. id = 1 )
  4. Replace id value from null to 1 (ex. insert into posts (id, user_id, ...) values (1, 1, ...) )
  5. Decide target based of id value by sharding algorithm ( default modulo ) and insert record to selected database.

By using sequencer approach, you can get unique id value in all databases.
Therefore, if you insert multiple records, database records should looks like the following.

posts_shard_1 posts_shard_2 posts_shard_3 posts_shard_4
id user_id id user_id id user_id id user_id
1 1 2 2 3 3 4 4
5 5 6 6 7 7 8 8

2. Using Sharding Key ( without Sequencer )

If you don't care about uniqueness of id, you can use sharding key approach.
Architecture of this appraoch would be the following.

architecture2

  1. Decide target based of user_id value by sharding algorithm ( default modulo ) and insert record to selected database.

By using sharding key approach, same id value will appear in multiple databases.
Therefore, if you insert multiple records, database record should looks like the following.

posts_shard_1 posts_shard_2 posts_shard_3 posts_shard_4
id user_id id user_id id user_id id user_id
1 1 1 2 1 3 1 4
2 5 2 6 2 7 2 8

2. Requirements of database sharding library

We explained how to sharding database at section 1.
From this we define requirements of database sharding library.

  • Know about database sharding configuration
  • Capture query just before passing to database driver
  • Parse query and find sharding key
  • If use sequencer, requests id value to sequencer and replace value of id column by it
  • Select sharding target based of sharding key by sharding algorithm

3. How Octillery works

How To Capture Query

Octillery CLI tool supports transpose command.
It replace import statement of database/sql to go.knocknote.io/octillery/database/sql.

go.knocknote.io/octillery/database/sql package has compatible interface of database/sql.

Therefore, OR Mapping library call Octillery's interface. and it can capture all queries.

How To Parse SQL

Octillery use github.com/knocknote/vitess-sqlparser as SQL parser. It implements powered by vitess and tidb .

How To Use New Database Adapter

Octillery supports mysql and sqlite3 adapter by default.
If you want to use new database adapter, need to the following two steps.

  1. Write DBAdapter interface. ( see https://godoc.org/go.knocknote.io/octillery/connection/adapter )
  2. Put new adapter file to go.knocknote.io/octillery/plugin directory

How To Use New Database Sharding Algorithm

Octillery supports modulo and hashmap algorithm by default.
If you want to use new algorithm, need to the following two steps.

  1. Write ShardingAlgorithm interface. ( see https://godoc.org/go.knocknote.io/octillery/algorithm )
  2. Put new algorithm file to go.knocknote.io/octillery/algorithm directory

Usage

1. Install CLI tool

$ go get -u go.knocknote.io/octillery/cmd/octillery

2. Install library

$ go get -u go.knocknote.io/octillery

3. Replace already imported database/sql statement

$ octillery transpose

--dry-run option confirms without overwriting

4. Install database adapter

$ octillery install --mysql

5. Describe database cofiguration in YAML

databases.yml

default: &default
  adapter: mysql
  encoding: utf8mb4
  username: root
  master:
    - localhost:3306

tables:
  posts:
    shard: true
    shard_key: user_id
    shards:
      - post_shard_1:
          <<: *default
          database: posts_shard_1
      - post_shard_2:
          <<: *default
          database: posts_shard_2

6. Migrate database

$ octillery migrate --config databases.yml /path/to/schema

--dry-run option confirms migration plan

7. Load configuration file

package main

import (
	"go.knocknote.io/octillery"
	"go.knocknote.io/octillery/database/sql"
)

func main() {
	if err := octillery.LoadConfig("databases.yml"); err != nil {
		panic(err)
	}
	db, _ := sql.Open("mysql", "")
	db.QueryRow("...")
}

Document

See GoDoc

Development

Install dependencies

$ make deps

If update dependencies, the following

  1. Modify glide.yaml
  2. Run make update-deps
  3. Commit glide.yaml and glide.lock

Run tests

$ make test

GitHub