Published April 13, 2026

Compute-Adjacent Storage at Work: 24 Drives, 9.6 TB, 1 Query

By Dave Boutcher, Distinguished Engineer at Ocient

A colleague from the scientific computing world asked me for a baseline to compare against other parallel file I/O benchmarks. My knee-jerk reaction? That’s backwards.

As a relational database engineer, I’ve spent my career optimizing systems to avoid disk I/O—through indexes, compression, schema design. Why would I benchmark the very thing I’m trying to minimize?

But the question stuck with me. Ocient’s architecture is fundamentally different from traditional databases. We don’t treat storage as something separate from compute. Our “compute-adjacent storage” model pushes processing down to the drive level itself. Data gets filtered and transformed right where it lives, not retrieved and then processed. For a distributed system handling trillions of rows, this matters. A lot.

So naturally, I got distracted by the challenge. I convinced our test team to let me borrow a small cluster—3 storage nodes, 8 NVMe drives each, 24 drives total. Ice Lake Xeon processors, DDR4 memory. Nothing fancy. The goal was simple: saturate all 24 drives with parallel reads and see what happens when you actually want maximum I/O.

Attempt 1: The Database Outsmarts Me

My plan was straightforward. Generate a table with 1.2 trillion random 64-bit integers. Query for a specific value. Force a full table scan. Random integers should be hard to compress, right?

Wrong.

Ocient has a convenient data generation table called sys.dummyN that produces N rows of monotonically incrementing integers. I wrote a quick pseudo-random number function and generated my 1.2 trillion rows. The database immediately optimized my “random” data during segment creation using delta-delta encoding. My carefully crafted random integers compressed beautifully.
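To see why a weak generator fed with incrementing integers can compress so well, here is a minimal sketch of delta-of-delta encoding in general. It is not Ocient's implementation, and the post does not show the original function: `weak_prng` is a hypothetical stand-in (a linear map modulo 2^64), and all names here are assumptions for illustration.

```python
import random

MODULUS = 2**64  # treat values as unsigned 64-bit integers


def delta_of_delta(values):
    """Return the second-order differences of a sequence, with 64-bit wraparound."""
    deltas = [(b - a) % MODULUS for a, b in zip(values, values[1:])]
    return [(b - a) % MODULUS for a, b in zip(deltas, deltas[1:])]


def weak_prng(i):
    # Hypothetical "quick" generator: a linear function of the row number.
    # The outputs look scrambled, but consecutive outputs differ by a constant.
    return (i * 6364136223846793005 + 1442695040888963407) % MODULUS


n = 1000
weak = [weak_prng(i) for i in range(n)]  # input is 0, 1, 2, ... like an incrementing row source

rng = random.Random(0)
strong = [rng.getrandbits(64) for _ in range(n)]  # Mersenne Twister output for comparison

weak_nonzero = sum(d != 0 for d in delta_of_delta(weak))
strong_nonzero = sum(d != 0 for d in delta_of_delta(strong))

print("non-zero second deltas, weak generator:  ", weak_nonzero)
print("non-zero second deltas, proper generator:", strong_nonzero)
```

With a linear generator every second-order difference is zero, so an encoder only needs to store the first value, the first delta, and a run of zeros. A proper generator leaves essentially nothing for the encoder to exploit.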

This was not the result I wanted.

Attempt 2: Defeating My Own Optimizer

The fix required a proper rand() function and a bit of database judo. I added a monotonically increasing cluster key as the first column, forcing the database to sort on that key while leaving my random integer column unordered and uncompressed. Since Ocient is columnar, queries only touch the columns they need, so the cluster key stays on disk, untouched.

Result: 1.2 trillion genuinely random, uncompressed 64-bit integers. Plus one extra column I didn’t care about but needed for the trick to work.

The Benchmark Itself

The query was dead simple:

This spawned 24 parallel engines—eight per node, one per drive—each streaming and filtering data independently. Only matching results propagate upward. In this case, each of those 24 storage-adjacent engines sends a single 8-byte value up the stack: the count of matching entries. Network bandwidth? Not a factor. All the work happens at the drive level.
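The pattern described above, filter and count next to the data and send only a small partial result upward, can be sketched as a toy simulation. This is not Ocient code: the in-memory lists standing in for drives, the thread pool, and every name below are assumptions made purely for illustration.

```python
import random
from concurrent.futures import ThreadPoolExecutor

NODES = 3
DRIVES_PER_NODE = 8
ROWS_PER_DRIVE = 10_000  # toy size; the real table held 1.2 trillion rows


def make_drive(seed):
    """Simulate one drive's share of the table as a list of random 64-bit integers."""
    rng = random.Random(seed)
    return [rng.getrandbits(64) for _ in range(ROWS_PER_DRIVE)]


def scan_and_count(rows, target):
    """Work done 'at the drive': scan every row, return only the match count."""
    return sum(1 for value in rows if value == target)


drives = [make_drive(seed) for seed in range(NODES * DRIVES_PER_NODE)]
target = drives[5][123]  # pick a value known to exist somewhere

# One worker per simulated drive; each returns a single integer.
with ThreadPoolExecutor(max_workers=len(drives)) as pool:
    partial_counts = list(pool.map(lambda rows: scan_and_count(rows, target), drives))

total = sum(partial_counts)
bytes_sent_up = 8 * len(partial_counts)  # one 8-byte count per engine

print("engines:", len(partial_counts))
print("matches:", total)
print("bytes sent up the stack:", bytes_sent_up)
```

However large the scanned data grows, the amount sent to the coordinator stays at one count per engine, which is why network bandwidth drops out of the picture.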

Runtime: 268 seconds. Consistent across multiple runs.

The Final Math

  • Total data: 1.2T records × 8 bytes = 9.6 TB
  • Aggregate throughput: 35.82 GB/s
  • Per-node throughput: 11.94 GB/s (3 nodes)
  • Per-drive throughput: 1.49 GB/s (24 drives)

For context, the Samsung NVMe drives we used are rated for 5.2 GB/s sequential reads and 850K IOPS for random 4KB reads (roughly 3.4 GB/s). At 1.49 GB/s per drive, we achieved about 29% of the sequential rating and 44% of the random-read rating, across all 24 drives in parallel, sustained over the full 4.5-minute runtime.
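The arithmetic behind these figures can be reproduced in a few lines. The inputs come from the post itself; the variable names are mine, decimal units (1 GB = 10^9 bytes) are assumed, and "4KB" is taken as 4,000 bytes to match the rounded 3.4 GB/s figure.

```python
# Inputs stated in the post
ROWS = 1_200_000_000_000      # 1.2 trillion records
BYTES_PER_ROW = 8             # 64-bit integers
RUNTIME_S = 268               # measured query runtime in seconds
NODES = 3
DRIVES = 24
RATED_SEQ_BPS = 5.2e9         # rated sequential read, bytes per second
RATED_RAND_IOPS = 850_000     # rated random 4KB read IOPS
IO_SIZE = 4_000               # assumption: "4KB" treated as 4,000 bytes

GB = 1e9  # assumption: decimal gigabytes

total_bytes = ROWS * BYTES_PER_ROW
aggregate = total_bytes / RUNTIME_S
per_node = aggregate / NODES
per_drive = aggregate / DRIVES

rated_rand_bps = RATED_RAND_IOPS * IO_SIZE
pct_of_seq = 100 * per_drive / RATED_SEQ_BPS
pct_of_rand = 100 * per_drive / rated_rand_bps

print(f"total data:     {total_bytes / 1e12:.1f} TB")
print(f"aggregate:      {aggregate / GB:.2f} GB/s")
print(f"per node:       {per_node / GB:.2f} GB/s")
print(f"per drive:      {per_drive / GB:.2f} GB/s")
print(f"vs sequential:  {pct_of_seq:.0f}%")
print(f"vs random 4KB:  {pct_of_rand:.0f}%")
```

The 29% figure compares per-drive throughput with the sequential rating, and the 44% figure compares it with the random-read rating converted to bytes per second.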

What This Actually Means

Achieving 29–44% of the drives' rated maximum for sustained parallel reads across two dozen drives is solid. Not theoretical-maximum-in-a-lab solid, but real-world-under-load solid. This is exactly what compute-adjacent storage is designed for: maximizing I/O throughput.

This was a fun exercise and I’m pleased with the result. I’m sure the storage team will read this and see it as a challenge to make it go even faster, but compared to competing solutions, I’ll take this as a massive win. As a database guy, of course, I itch to do:

I bet the query would run faster…

You can learn more about Compute-Adjacent Storage architecture on YouTube here.