ParadeDB recently launched!

"ParadeDB is a modern Elasticsearch alternative built on Postgres. Built for real-time, update-heavy workloads."

Founded by Philippe Noël

A New Postgres Block Storage Layout for Full Text Search

Image Credits: ParadeDB

ParadeDB recently completed one of their biggest engineering bets to date: migrating pg_search, a Postgres extension for full text search and analytics, to Postgres' block storage system. In doing so, pg_search is the first-ever extension to port an external file format to Postgres block storage.

For context: block storage is Postgres’ storage API that backs all of Postgres’ tables and built-in index types. Prior to this migration, pg_search operated outside of block storage, meaning the extension created files that were not managed by Postgres and read their contents directly from disk. While it’s not uncommon for Postgres extensions to do this, block storage has enabled pg_search to simultaneously achieve:

  1. Postgres write-ahead log (WAL) integration, which is necessary for physical replication of the index
  2. Crash and point-in-time recovery
  3. Full support for Postgres MVCC (multi-version concurrency control)
  4. Integration with Postgres’ buffer cache, which has led to massive improvements in index creation times and write throughput

Image Credits: ParadeDB

At first, they weren’t sure if reconciling the data access patterns and concurrency model of Postgres and Tantivy, pg_search's underlying search library, was possible without drastic changes to Tantivy. This post briefly dives into how they architected pg_search's new block storage layout and data access patterns.

In the near future, ParadeDB will release two more posts: one to discuss how they designed and tested pg_search to be MVCC-safe in update-heavy scenarios, and another to dive into how they customized the block storage layout for analytical workloads (e.g. faceted search, aggregates).

What Is Block Storage?

The fundamental unit of block storage is a block: a chunk of 8192 bytes. When executing a query, Postgres reads blocks into buffers, which are stored in Postgres’ buffer cache.
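
To make this concrete, here is a minimal sketch in C of how an index might read a block through the buffer cache, using Postgres’ internal buffer manager API (the read_index_block routine and its arguments are hypothetical, not pg_search’s actual code):

```c
#include "postgres.h"
#include "storage/bufmgr.h"
#include "storage/bufpage.h"

/*
 * Read one 8KB block of an index relation through Postgres' buffer
 * cache. "rel" and "blkno" are assumed to come from surrounding code.
 */
static void
read_index_block(Relation rel, BlockNumber blkno)
{
    /* Pin the buffer; the block is fetched from disk only on a cache miss */
    Buffer  buf = ReadBuffer(rel, blkno);

    /* Take a shared content lock before reading the page */
    LockBuffer(buf, BUFFER_LOCK_SHARE);
    Page    page = BufferGetPage(buf);

    /* ... read index entries from "page" here ... */

    /* Release the content lock and the pin */
    UnlockReleaseBuffer(buf);
}
```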

DML statements (INSERT, UPDATE, DELETE, COPY) do not modify blocks on disk directly. Instead, their changes are written to the underlying in-memory buffers, which are flushed to disk later, when they are evicted from the buffer cache or during a checkpoint.

If Postgres crashes, modifications to buffers that have not yet been flushed can be lost. To guard against this, any change to the index must also be written to the write-ahead log (WAL). During crash recovery, Postgres replays the WAL to restore the database to its most recent state.
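
For extensions on block storage, Postgres exposes a generic WAL API that handles this bookkeeping. As a minimal sketch (again with a hypothetical routine, not pg_search’s actual code), durably modifying a page looks roughly like:

```c
#include "postgres.h"
#include "access/generic_xlog.h"
#include "storage/bufmgr.h"

/*
 * Modify one page of an index and WAL-log the change using Postgres'
 * generic WAL API. "rel" and "blkno" are hypothetical inputs.
 */
static void
modify_index_block(Relation rel, BlockNumber blkno)
{
    Buffer            buf = ReadBuffer(rel, blkno);
    GenericXLogState *state;
    Page              page;

    LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);

    /* Start a generic WAL record and register the buffer we will change */
    state = GenericXLogStart(rel);
    page = GenericXLogRegisterBuffer(state, buf, 0);

    /* ... mutate "page" here; Postgres diffs it to build the WAL record ... */

    /* Emit the WAL record and mark the buffer dirty */
    GenericXLogFinish(state);

    UnlockReleaseBuffer(buf);
}
```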

What Is pg_search?

pg_search is a Postgres extension that implements a custom index for full text search and analytics. The extension is powered by Tantivy, a search library written in Rust and inspired by Lucene.

Why Migrate To Block Storage?

A custom Postgres index has two choices for persistence: Postgres block storage or the filesystem. At first, the filesystem may seem like the easier option, because integrating with block storage requires solving a series of problems:

  1. Some data structures may not fit within a single 8KB block. Splitting data across multiple blocks can create lock contention, garbage collection, and concurrency challenges.
  2. Once a block has been allocated to a Postgres index, it cannot be physically deleted, only recycled. This means that the size of an index strictly increases until a VACUUM FULL or REINDEX is run. The index must be careful to return blocks that have been tombstoned by deletes or vacuums to Postgres’ free space map for reuse (see the sketch after this list).
  3. In update-heavy scenarios, the index can become dominated by space that once belonged to dead (i.e. deleted) rows. This may increase the number of I/O operations required for searches and updates, which degrades performance. The index must find ways to reorganize and compact itself during vacuums.
  4. Because each Postgres backend is single-threaded, the index cannot use multiple threads to read from block storage concurrently (this will be covered in detail in Parts 2 and 3 of ParadeDB’s blog post series). The index may need to leverage Postgres’ parallel workers instead.
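
To illustrate the second problem, here is a minimal sketch of how an index can recycle blocks through Postgres’ free space map instead of always growing the file (the routines shown are hypothetical, not pg_search’s actual code):

```c
#include "postgres.h"
#include "storage/bufmgr.h"
#include "storage/bufpage.h"
#include "storage/indexfsm.h"

/*
 * Allocate a page for new index data, preferring blocks previously
 * returned to the free space map over physically extending the file.
 */
static Buffer
allocate_index_page(Relation rel)
{
    /* Ask the free space map for a recycled block first */
    BlockNumber blkno = GetFreeIndexPage(rel);
    Buffer      buf;

    if (blkno != InvalidBlockNumber)
        buf = ReadBuffer(rel, blkno);
    else
        buf = ReadBuffer(rel, P_NEW);   /* nothing to recycle: extend */

    LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
    PageInit(BufferGetPage(buf), BLCKSZ, 0);
    return buf;                         /* caller unlocks and releases */
}

/*
 * During vacuum, hand a tombstoned block back to the free space map.
 * (Production code would also call IndexFreeSpaceMapVacuum so freed
 * pages become visible to later searches of the map.)
 */
static void
recycle_index_page(Relation rel, BlockNumber blkno)
{
    RecordFreeIndexPage(rel, blkno);
}
```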

Once the index overcomes these hurdles, however, Postgres block storage does an incredible amount of heavy lifting. After a year of working with the filesystem, it became clear to ParadeDB that block storage was the way forward.

  1. Being able to use the buffer cache means a huge reduction in disk I/O and massive improvements to read and write throughput.
  2. Postgres provides a simple, battle-tested API to write buffers to the WAL. Without block storage, extensions must define custom WAL record types and implement their own WAL replay logic, which drastically increases complexity and surface area for bugs.
  3. Postgres handles the physical creation and cleanup of files. The index doesn’t need to clean up after aborted transactions or DROP INDEX statements.

The complete blog post by ParadeDB can be read here. In the following parts of the series, they will dive into more of the challenges they faced with block storage, with a focus on concurrency and analytical performance.

Learn More

🌐 Visit www.paradedb.com to learn more.
⭐ Check out the GitHub repo for more details or join the ParadeDB community with questions. And please don’t hesitate to give them a star!
👣 Follow ParadeDB on LinkedIn & X.

Posted February 27, 2025 in the Launch category
