[Go] ChainKV: A Semantics-Aware Key-Value Store for Blockchain Systems
Features
- Limitations
- For the synchronization workload, we synchronize 4 groups of real workloads (1.6M, 2.3M, 3.4M, and 4.6M blocks).
- For the query workloads, we use 3 distributions to simulate all access behaviors.
- Building Data
Hardware Requirements
In practice, block replay is limited by storage performance, so using an SSD as the storage device is necessary. A large amount of RAM also accelerates synchronization.
Minimum:
- CPU with 2+ cores
- 4GB RAM
- 1TB free storage space to sync the Mainnet
- 8 MBit/sec download Internet service
Recommended:
- Fast CPU with 4+ cores
- 16GB+ RAM
- High-performance SSD with at least 1TB of free space
- 25+ MBit/sec download Internet service
Full node on the main Ethereum network
Our evaluation needs every state transition, so we must synchronize all past transactions.
geth --syncmode "full" \
  --cache "" \
  --trie.cache.gens "" \
  --datadir ""
Usage of DB
- State Separation
ChainKV divides the whole storage space into two independent zones, one for state data and one for non-state data, and the separation covers both the memory components and the disk components. As a result, ChainKV implements different interfaces for CRUD operations. For instance, ChainKV writes data to the two isolated zones by calling Put() and Put_s().
// Note that the data structures involved in these two calls are different.
db, _ := ethdb.NewLDBDatabase("PATH")
...
db.Put(nonStateKey, nonStateValue)  // writes to the non-state zone
db.Put_s(stateKey, stateValue)      // writes to the state zone
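To make the two-zone idea concrete, here is a minimal in-memory sketch. The type `twoZoneStore` and its constructor are hypothetical stand-ins, not ChainKV's implementation; only the method names Put and Put_s come from the snippet above, and the maps merely imitate the isolated zones.

```go
package main

import "fmt"

// twoZoneStore is a hypothetical in-memory stand-in for a store that keeps
// state and non-state data in two isolated zones.
type twoZoneStore struct {
	nonState map[string][]byte
	state    map[string][]byte
}

func newTwoZoneStore() *twoZoneStore {
	return &twoZoneStore{
		nonState: make(map[string][]byte),
		state:    make(map[string][]byte),
	}
}

// Put writes a KV pair into the non-state zone.
func (s *twoZoneStore) Put(key, value []byte) error {
	s.nonState[string(key)] = value
	return nil
}

// Put_s writes a KV pair into the state zone.
func (s *twoZoneStore) Put_s(key, value []byte) error {
	s.state[string(key)] = value
	return nil
}

func main() {
	db := newTwoZoneStore()
	_ = db.Put([]byte("block-header"), []byte("h"))
	_ = db.Put_s([]byte("account-node"), []byte("n"))
	fmt.Println("non-state items:", len(db.nonState), "state items:", len(db.state))
}
```

Because each call touches only its own zone, a write of state data never lands among non-state data.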
Prefix MPT
Prefix MPT aggregates nodes that have strong spatial and temporal locality. In practice, the Prefix MPT scheme is an encoding strategy that explicitly assigns different prefixes to different KV pairs so that the pairs sort lexicographically in the desired order. See /MPT/trie for more details.
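The effect of prefixing on key order can be sketched as follows. The one-byte prefixes and the helper names `prefixedKey` and `sortKeys` are assumptions made for illustration; the actual prefix layout is defined in /MPT/trie.

```go
package main

import (
	"bytes"
	"fmt"
	"sort"
)

// prefixedKey prepends a locality prefix to a node hash.
// The prefix layout here is hypothetical.
func prefixedKey(prefix, hash []byte) []byte {
	key := make([]byte, 0, len(prefix)+len(hash))
	key = append(key, prefix...)
	return append(key, hash...)
}

// sortKeys orders keys lexicographically by raw bytes, which is the order
// an LSM-tree based store uses when laying out KV pairs.
func sortKeys(keys [][]byte) {
	sort.Slice(keys, func(i, j int) bool {
		return bytes.Compare(keys[i], keys[j]) < 0
	})
}

func main() {
	keys := [][]byte{
		prefixedKey([]byte{0x02}, []byte{0x10}),
		prefixedKey([]byte{0x01}, []byte{0xff}),
		prefixedKey([]byte{0x02}, []byte{0x05}),
		prefixedKey([]byte{0x01}, []byte{0x00}),
	}
	sortKeys(keys)
	for _, k := range keys {
		fmt.Printf("%x\n", k) // keys sharing a prefix end up adjacent
	}
}
```

Since hashes are effectively random, unprefixed nodes would be scattered across the key space; a shared prefix is what pulls related nodes next to each other.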
SGC
There are two cache structures in SGC for each type of data: a real cache and a virtual ghost cache. The real cache holds hot KV items. The virtual ghost cache does not occupy real cache space; it holds only the metadata of the KV items evicted from the real cache, and so provides a hint for enlarging or shrinking the real cache.
A hit in the ghost cache means that the request would have been a real cache hit if the corresponding real cache were larger. Using the ghost caches, the sizes of the corresponding real caches can be adjusted dynamically. Based on the data type, the cache space is further subdivided into the non-state data real cache (r), the state data real cache (r1), the non-state data ghost cache (f), and the state data ghost cache (f1).
The basic data structure is as follows.
type SGC struct {
    mu       sync.Mutex // guards concurrent access
    capacity int        // combined capacity of r and r1
    trriger  int        // target that triggers the sliding window
    rused, fused, r1used, f1used int // current usage of r, f, r1, and f1
    recent   lruNode    // list header
    frequent lruNode    // list header
    r1, f1   lruNode    // list headers
}
See /goleveldb/leveldb/cache for more details.
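The interplay between real and ghost caches can be sketched in a few dozen lines. This is a simplified illustration under stated assumptions, not the code in /goleveldb/leveldb/cache: the types `part` and `Cache`, the one-slot adjustment step, and the ghost list bound are all hypothetical choices, and the sliding-window trigger is omitted.

```go
package main

import (
	"container/list"
	"fmt"
)

type entry struct{ key, val string }

// part is one partition (state or non-state): a real LRU list that stores
// values and a ghost LRU list that stores only the keys of evicted entries.
type part struct {
	target, ghostCap int
	real, ghost      *list.List // front = most recently used
	rIdx, gIdx       map[string]*list.Element
}

func newPart(target, ghostCap int) *part {
	return &part{target: target, ghostCap: ghostCap, real: list.New(), ghost: list.New(),
		rIdx: map[string]*list.Element{}, gIdx: map[string]*list.Element{}}
}

// shrink evicts from the real list until it fits target, recording each
// evicted key in the bounded ghost list.
func (p *part) shrink() {
	for p.real.Len() > p.target {
		e := p.real.Back()
		k := e.Value.(entry).key
		p.real.Remove(e)
		delete(p.rIdx, k)
		p.gIdx[k] = p.ghost.PushFront(k)
	}
	for p.ghost.Len() > p.ghostCap {
		e := p.ghost.Back()
		p.ghost.Remove(e)
		delete(p.gIdx, e.Value.(string))
	}
}

// Cache splits a fixed capacity between two partitions: 0 = non-state, 1 = state.
type Cache struct{ parts [2]*part }

func New(capacity int) *Cache {
	half := capacity / 2
	return &Cache{parts: [2]*part{newPart(half, capacity), newPart(capacity-half, capacity)}}
}

// Get returns a value from the real cache. On a ghost hit it moves one slot
// of capacity from the other partition to this one and still reports a miss.
func (c *Cache) Get(kind int, key string) (string, bool) {
	p, other := c.parts[kind], c.parts[1-kind]
	if e, ok := p.rIdx[key]; ok {
		p.real.MoveToFront(e)
		return e.Value.(entry).val, true
	}
	if g, ok := p.gIdx[key]; ok {
		p.ghost.Remove(g)
		delete(p.gIdx, key)
		if other.target > 1 {
			other.target--
			p.target++
			other.shrink()
		}
	}
	return "", false
}

// Put inserts or updates a value in the real cache of the given partition.
func (c *Cache) Put(kind int, key, val string) {
	p := c.parts[kind]
	if e, ok := p.rIdx[key]; ok {
		e.Value = entry{key, val}
		p.real.MoveToFront(e)
		return
	}
	if g, ok := p.gIdx[key]; ok { // drop stale ghost metadata
		p.ghost.Remove(g)
		delete(p.gIdx, key)
	}
	p.rIdx[key] = p.real.PushFront(entry{key, val})
	p.shrink()
}

func main() {
	c := New(4) // two slots per partition to start
	for _, k := range []string{"a", "b", "c"} {
		c.Put(1, k, "v") // "a" is evicted to the state ghost list
	}
	c.Get(1, "a") // ghost hit: the state partition grows by one slot
	fmt.Println("non-state target:", c.parts[0].target, "state target:", c.parts[1].target)
}
```

The total capacity stays fixed: a ghost hit in one partition only takes a slot from the other, so the two data types compete for space according to their observed miss patterns.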
Lightweight Node-failure
In the lightweight node-failure recovery design, we maintain a safe block for both the in-memory state memtable and the non-state (Non-s) memtable, and both are written to disk together with the original memtable flush operations. The purpose is to place a “marker” in persistent storage that indicates the progress of the latest flush operation. The safe block is a special-purpose KV item: its key is a pre-defined array of bytes, and its value is the latest block number stored in the SSTs. Therefore, to retrieve the latest successfully synchronized block number during the data recovery start-up phase, we simply query the two SST zones using the corresponding key.
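The marker mechanism can be sketched with two in-memory maps standing in for the two SST zones. Everything named here is hypothetical: the key bytes in `safeBlockKey`, the 8-byte big-endian value encoding, and the `zone` type. Taking the smaller of the two markers as the recovery point is also an assumption of this sketch, since the text above does not say how the two values are combined.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// safeBlockKey is a hypothetical reserved key; the real byte array is
// defined in the ChainKV source.
var safeBlockKey = []byte("safe-block")

// zone stands in for one persistent SST zone (state or non-state).
type zone map[string][]byte

// flush persists the memtable contents together with the safe-block marker.
func (z zone) flush(memtable map[string][]byte, blockNum uint64) {
	for k, v := range memtable {
		z[k] = v
	}
	buf := make([]byte, 8)
	binary.BigEndian.PutUint64(buf, blockNum)
	z[string(safeBlockKey)] = buf
}

// safeBlock reads the marker back; ok is false if no flush has completed.
func (z zone) safeBlock() (num uint64, ok bool) {
	v, found := z[string(safeBlockKey)]
	if !found || len(v) != 8 {
		return 0, false
	}
	return binary.BigEndian.Uint64(v), true
}

// recoveryPoint queries both zones and returns the smaller marker
// (assumption: replay must restart from the zone that is further behind).
// A zone with no marker counts as block 0.
func recoveryPoint(state, nonState zone) uint64 {
	s, _ := state.safeBlock()
	n, _ := nonState.safeBlock()
	if s < n {
		return s
	}
	return n
}

func main() {
	state, nonState := zone{}, zone{}
	state.flush(map[string][]byte{"acct": []byte("x")}, 100)
	nonState.flush(map[string][]byte{"hdr": []byte("y")}, 120)
	fmt.Println("resume from block", recoveryPoint(state, nonState))
}
```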
Setup
How to use ChainKV to reproduce the experiment
The exact location of the code is as follows:
WRITE: Use the InsertChain() interface to replay all historical blocks.
READ: All tests are located in /MPT/trie/exper_test.go
Before running the read tests, you must obtain the hashes of all historical transactions and all registered accounts.
Contribution
Thank you for considering helping out with the source code! We welcome contributions from anyone on the internet, and are grateful for even the smallest of fixes!