building-a-database-in-golang

building a simple database in golang

Introduction

As a web developer, I use relational databases every day at my job, but they’re a black box to me. Some questions I have:

What format is data saved in? (in memory and on disk)
When does it move from memory to disk?
Why can there only be one primary key per table?
How does rolling back a transaction work?
How are indexes formatted?
When and how does a full table scan happen?
What format is a prepared statement saved in?

In other words, how does a database work?

To figure things out, I’m writing a database from scratch. It’s modeled off sqlite because it is designed to be small with fewer features than MySQL or PostgreSQL, so I have a better hope of understanding it. The entire database is stored in a single file!

Sqlite

Tokenizer —> Parser —> Code Generator —> Virtual Machine —> B-Tree —> Page —> OS Interface

A query goes through a chain of components in order to retrieve or modify data. The front-end consists of the:

  • tokenizer
  • parser
  • code generator

The input to the front-end is a SQL query. The output is a sqlite virtual machine bytecode ( essentially a compiled program that can operate on the database ).

The back-end consist of the:

  • virtual machine
  • B-tree
  • pager
  • os interface

The virtual machine takes the bytecode generated by the front-end as the instructions. It can then perform operations on one or more tables or indexes, each of which is stored in a data structure called a B-tree. The VM is essentially a big switch statement on the type of bytecode instruction.

Each B-Tree consists of many nodes. Each node is one page in length. The B-tree can retrieve a page from disk or save it back to disk by issuing commands to the pager.

The pager recieves commands to read or write pages of data. It is responsible for reading/ writing at appropriate offsets in the database file. It also keeps a cache of recently-accessed pages in memory, and determines when those pages need to be written back.

The os interface is the layer that differs depending on which operating system sqlite was compiled for. In this tutorial, I’m not going to support multiple platforms.

Making a simple Read-Execute-Print-Loop

Sqlite starts a read-execute-print loop when you start it from the command line

~ sqlite3 SQLite version 3.16.0 2016-11-04 19:09:39 Enter ".help" for usage hints. Connected to a transient in-memory database. Use ".open FILENAME" to reopen on a persistent database. sqlite> create table users (id int, username varchar(255), email varchar(255)); sqlite> .tables users sqlite> .exit ~ To do that, our main function will have an infinite loop that prints the prompt, gets a line of input, then processes that line of input

showing the c version that i found on the internet right now, but will surely convert to golang later

int main(int argc, char* argv[]) {
  InputBuffer* input_buffer = new_input_buffer();
  while (true) {
    print_prompt();
    read_input(input_buffer);
    if (strcmp(input_buffer->buffer, ".exit") == 0) {
      close_input_buffer(input_buffer);
      exit(EXIT_SUCCESS);
    } else {
      printf("Unrecognized command '%s'.\n", input_buffer->buffer);
    }
  }
}

typedef struct {
  char* buffer;
  size_t buffer_length;
  ssize_t input_length;
} InputBuffer;

InputBuffer* new_input_buffer() {
  InputBuffer* input_buffer = (InputBuffer*)malloc(sizeof(InputBuffer));
  input_buffer->buffer = NULL;
  input_buffer->buffer_length = 0;
  input_buffer->input_length = 0;

  return input_buffer;
}

void read_input(InputBuffer* input_buffer) {
  ssize_t bytes_read =
      getline(&(input_buffer->buffer), &(input_buffer->buffer_length), stdin);

  if (bytes_read <= 0) {
    printf("Error reading input\n");
    exit(EXIT_FAILURE);
  }

  // Ignore trailing newline
  input_buffer->input_length = bytes_read - 1;
  input_buffer->buffer[bytes_read - 1] = 0;
}

void close_input_buffer(InputBuffer* input_buffer) {
    free(input_buffer->buffer);
    free(input_buffer);
}


World’s Simplest SQL Compiler and Virtual Machine

We’re making a clone of sqlite. The “front-end” of sqlite is a SQL compiler that parses a string and outputs an internal representation called bytecode.

This bytecode is passed to the virtual machine, which executes it

for reference, see the image sqlite-architecture.png in the root directory

Breaking things into two steps like this has a couple advantages:

  • Reduces the complexity of each part (e.g. virtual machine does not worry about syntax errors)
  • Allows compiling common queries once and caching the bytecode for improved performance

With this in mind, let’s refactor our main function and support two new keywords in the process:

int main(int argc, char* argv[]) {
   InputBuffer* input_buffer = new_input_buffer();
   while (true) {
     print_prompt();
     read_input(input_buffer);

-    if (strcmp(input_buffer->buffer, ".exit") == 0) {
-      exit(EXIT_SUCCESS);
-    } else {
-      printf("Unrecognized command '%s'.\n", input_buffer->buffer);
+    if (input_buffer->buffer[0] == '.') {
+      switch (do_meta_command(input_buffer)) {
+        case (META_COMMAND_SUCCESS):
+          continue;
+        case (META_COMMAND_UNRECOGNIZED_COMMAND):
+          printf("Unrecognized command '%s'\n", input_buffer->buffer);
+          continue;
+      }
     }
+
+    Statement statement;
+    switch (prepare_statement(input_buffer, &statement)) {
+      case (PREPARE_SUCCESS):
+        break;
+      case (PREPARE_UNRECOGNIZED_STATEMENT):
+        printf("Unrecognized keyword at start of '%s'.\n",
+               input_buffer->buffer);
+        continue;
+    }
+
+    execute_statement(&statement);
+    printf("Executed.\n");
   }
 }


Non-SQL statements like .exit are called “meta-commands”. They all start with a dot, so we check for them and handle them in a separate function.

Next, we add a step that converts the line of input into our internal representation of a statement. This is our hacky version of the sqlite front-end.

Lastly, we pass the prepared statement to execute_statement. This function will eventually become our virtual machine.

Notice that two of our new functions return enums indicating success or failure:

typedef enum {
  META_COMMAND_SUCCESS,
  META_COMMAND_UNRECOGNIZED_COMMAND
} MetaCommandResult;

typedef enum { PREPARE_SUCCESS, PREPARE_UNRECOGNIZED_STATEMENT } PrepareResult;

“Unrecognized statement”? That seems a bit like an exception. I prefer not to use exceptions (and C doesn’t even support them), so I’m using enum result codes wherever practical. The C compiler will complain if my switch statement doesn’t handle a member of the enum, so we can feel a little more confident we handle every result of a function. Expect more result codes to be added in the future.

do_meta_command is just a wrapper for existing functionality that leaves room for more commands:

MetaCommandResult do_meta_command(InputBuffer* input_buffer) {
  if (strcmp(input_buffer->buffer, ".exit") == 0) {
    exit(EXIT_SUCCESS);
  } else {
    return META_COMMAND_UNRECOGNIZED_COMMAND;
  }
}

Our “prepared statement” right now just contains an enum with two possible values. It will contain more data as we allow parameters in statements:

typedef enum { STATEMENT_INSERT, STATEMENT_SELECT } StatementType;

typedef struct {
  StatementType type;
} Statement;

prepare_statement (our “SQL Compiler”) does not understand SQL right now. In fact, it only understands two words:

PrepareResult prepare_statement(InputBuffer* input_buffer,
                                Statement* statement) {
  if (strncmp(input_buffer->buffer, "insert", 6) == 0) {
    statement->type = STATEMENT_INSERT;
    return PREPARE_SUCCESS;
  }
  if (strcmp(input_buffer->buffer, "select") == 0) {
    statement->type = STATEMENT_SELECT;
    return PREPARE_SUCCESS;
  }

  return PREPARE_UNRECOGNIZED_STATEMENT;
}

Note that we use strncmp for “insert” since the “insert” keyword will be followed by data. (e.g. insert 1 cstack [email protected])

Lastly, execute_statement contains a few stubs:

void execute_statement(Statement* statement) {
  switch (statement->type) {
    case (STATEMENT_INSERT):
      printf("This is where we would do an insert.\n");
      break;
    case (STATEMENT_SELECT):
      printf("This is where we would do a select.\n");
      break;
  }
}

Note that it doesn’t return any error codes because there’s nothing that could go wrong yet.

With these refactors, we now recognize two new keywords!

~ ./db
db > insert foo bar
This is where we would do an insert.
Executed.
db > delete foo
Unrecognized keyword at start of 'delete foo'.
db > select
This is where we would do a select.
Executed.
db > .tables
Unrecognized command '.tables'
db > .exit
~

GitHub

View Github