Skip to main content

Command Palette

Search for a command to run...

Write-Ahead Logging (WAL)

Published
6 min read

Modern databases need to guarantee that your data is not lost, even when things go wrong: crashes, power failures, or abrupt shutdowns. At the heart of this guarantee lies a simple yet powerful mechanism : Write-Ahead Logging (WAL).

This blog is the first in a series on how databases work under the hood, and we begin by addressing a fundamental challenge: how can a database ensure durability and consistency in the face of crashes?


The Problem: Data Loss on Crashes

Let’s say you’re building a simple key-value store that writes to disk like this:

void put(const char* key, const char* value) {
    FILE* f = fopen("store.txt", "w");
    fprintf(f, "%s=%s\n", key, value);
    fclose(f);
}

Every time you put, it overwrites the store file.

Seems simple, but what happens if your application crashes in the middle of writing? You might end up with:

  • A partially written file
  • A corrupted store
  • Loss of previously stored keys

Even with fsync() or fflush(), things can go wrong. For instance:

  • What if power is lost before the file is fully written?
  • What if a write partially succeeds, corrupting the file?

Clearly, we need a safer method.


The Need for a Safer Write Path

Overwriting the entire store every time is wasteful and risky. Instead, we want a method that:

  1. Does not destroy the current state on each update
  2. Allows us to recover from crashes
  3. Can be flushed to disk reliably

This leads us to a design that’s central to database internals: the append-only log.


Enter Write-Ahead Logging (WAL)

Write-Ahead Logging turns the problem on its head:

Instead of writing directly to the main store, we first write the update to a separate log file and only then apply it to the in-memory or persistent state.

In other words, before mutating any data structure, you record the intention to mutate it in a log.

Example Flow

  1. You want to put("name", "alice")
  2. You first append {"put": {"key": "name", "value": "alice"}} to a file called wal.log
  3. You flush this file to disk
  4. Then you apply the operation to the main key-value store

If your program crashes at any point after step 2, you can simply replay the log on restart and reconstruct the state.


Why WAL Works

The strength of WAL lies in two properties:

1. Append-Only Writes

Log entries are always written at the end of the file. This avoids partial overwrites and allows for atomic disk operations.

2. Replay Capability

Since the log contains the full history of changes, the current state can be rebuilt from scratch at any time by replaying the log.

3. Separation of Concerns

The WAL serves as a durable history of changes. The main store (e.g., a file or in-memory structure) can be optimized for performance and updated asynchronously.

WAL isn’t just for databases. Modern filesystems like XFS and Btrfs use a similar journaling mechanism for reliability, and even Microsoft Word uses a form of WAL to recover your unsaved documents.


Where WAL Is Used

Write-Ahead Logging is used in almost every major database:

  • PostgreSQL: Uses WAL to support crash recovery and replication
  • LevelDB / RocksDB: Use WAL before flushing to SSTables
  • SQLite: Switches to WAL mode for performance and durability
  • Kafka: Uses an append-only log as its core design

It’s not just a low-level hack, it’s a deliberate design for correctness.


DIY Time - Build Your Own Basic WAL (in C)

Let’s roll up our sleeves and build a minimal Write-Ahead Logging (WAL) system from scratch in C.

We’ll create a simple key-value database that:

  • Logs every put operation to an append-only file
  • Replays the log to recover the latest state
  • Can survive crashes using the WAL

You can replicate this in Python, Rust, or Go, the principles remain the same.


System Components

Following are the three core parts of our WAL system:

1. WAL File – wal.log

This file captures every write (aka put) operation, line by line:

name=Alice
age=30
  • Append-only: Never delete or modify old entries.
  • Durable: Persisted to disk immediately.

This is our reliable source of truth, used to replay state on recovery.


2. State File – state.txt

This represents the current snapshot of the database.

It is updated every time we call replay_log(), which processes all entries in the WAL and writes the latest value of each key.

  • Easy to parse
  • Rewritten fully on each update (for simplicity)

3. Crash Recovery Logic

Simulates recovery after a crash by reconstructing the full state from just the WAL file.

This is the essence of WAL:

If the app crashes before saving state, the WAL can replay it exactly.


Full C Implementation

Here's the code. We'll explain it block by block below:

// ------------------------------
// Write-Ahead Log (WAL) in C
// ------------------------------

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define LOG_FILE "wal.log"
#define STATE_FILE "state.txt"
#define MAX_LINE 256

These includes and macros set up the environment:

  • wal.log: Our log file
  • state.txt: Our latest state file

append_log(): Appending to WAL

void append_log(const char *key, const char *value) {
    FILE *log = fopen(LOG_FILE, "a");
    if (!log) {
        perror("Failed to open WAL");
        exit(1);
    }
    fprintf(log, "%s=%s\n", key, value);
    fclose(log);
}

What it does:

  • Appends a single key=value line to the WAL.
  • Simple and durable. If you fsync() after fclose(), it becomes crash-resilient.

Fun Fact: Almost every database, from SQLite to Postgres, has an equivalent log append function under the hood!


replay_log(): Rebuilding the State

void replay_log() {
    FILE *log = fopen(LOG_FILE, "r");
    if (!log) {
        perror("No WAL found. Starting fresh.");
        return;
    }

    FILE *state = fopen(STATE_FILE, "w");
    if (!state) {
        perror("Failed to open state file");
        exit(1);
    }

    char line[MAX_LINE];
    while (fgets(line, sizeof(line), log)) {
        fprintf(state, "%s", line); // naive overwrite
    }

    fclose(log);
    fclose(state);
}

Purpose:

  • Reads each log line.
  • Writes it to the state.txt file, not very smart yet, since duplicates aren’t removed.

Later, you could replace this with a hashmap to keep only the latest value per key!


get_value(): Querying the State File

void get_value(const char *key) {
    FILE *state = fopen(STATE_FILE, "r");
    if (!state) {
        printf("No state available.\n");
        return;
    }

    char line[MAX_LINE];
    while (fgets(line, sizeof(line), state)) {
        char fkey[64], fval[128];
        sscanf(line, "%[^=]=%s", fkey, fval);
        if (strcmp(fkey, key) == 0) {
            printf("%s => %s\n", fkey, fval);
            fclose(state);
            return;
        }
    }

    printf("Key '%s' not found.\n", key);
    fclose(state);
}

Why it matters:

  • Allows querying values from the rebuilt state.txt.
  • Uses sscanf() to parse key=value strings.
  • Currently shows the first match, which works because we overwrite the whole state each time.

main(): Tying It All Together

int main() {
    printf("Replaying WAL to reconstruct state...\n");
    replay_log();  // crash recovery

    char command[10];
    char key[64], value[128];

    while (1) {
        printf("\nEnter command (put/get/exit): ");
        scanf("%s", command);

        if (strcmp(command, "put") == 0) {
            printf("Key: ");
            scanf("%s", key);
            printf("Value: ");
            scanf("%s", value);

            append_log(key, value);
            replay_log();  // naive checkpointing
        } else if (strcmp(command, "get") == 0) {
            printf("Key: ");
            scanf("%s", key);
            get_value(key);
        } else if (strcmp(command, "exit") == 0) {
            break;
        } else {
            printf("Unknown command\n");
        }
    }

    return 0;
}

What's happening:

  • On start, we simulate a crash recovery by calling replay_log().
  • Then we enter a REPL (read-eval-print loop) for put, get, and exit.

What’s Next?

Here are some ways to evolve this system:

  • Use an in-memory map while replaying so only the latest key-values are kept
  • Add a delete command with tombstones (like key=__deleted__)
  • Add a fsync() for real durability
  • Implement checkpointing after N writes to flush WAL

Thanks for reading! I hope this hands-on introduction to Write-Ahead Logging gave you a solid understanding of how databases handle durability and recovery behind the scenes. Stay tuned and happy coding :)