Published March 23, 2026

Inside the Ocient Engine: A Deep Dive into Tuples

Unlock more with your data using tuples in Ocient

By Jason Arnold, Co-Founder and Distinguished Engineer

Welcome back. In my previous post, we looked at Ocient’s approach to arrays, treating them not as an afterthought, but as a core, high-performance part of the data model.

Today, we are taking the next logical step and looking at a close sibling to the array: the tuple.

In Ocient, the tuple data type acts like an anonymous struct. As with arrays, fields are accessed by position, starting at 1. Currently, there is no field-name-based access; the internal fields are simply ordered elements. But as we’ll see, the way Ocient processes, compares, and stores these structures opens up some incredibly powerful querying techniques.

Let’s dig in.

Constructing Tuples and Type Inference

First, let’s look at how to create a tuple and access its elements.

Accessing a specific element is done using standard bracket notation:

A couple of things to notice here. The tuple’s inner types in the DDL are specified using <<...>>. It feels a bit like C++ templates or Java generics, except the angle brackets are doubled up. (Also, if you fetch Ocient tuple types with the JDBC driver, it gives you back a standard java.sql.Struct instance.)

When constructing the tuple, you can either let the database auto-determine the types based on input, or you can explicitly define them:

Under the Hood: Virtual vs. Materialized

It’s worth briefly peeking under the hood at how Ocient stores these.

In tables on disk, all elements of a tuple are generally stored as separate columns. However, the view presented back to the user is a single, cohesive tuple column.

Essentially, tuples can be virtual (separate columns that the engine knows are related) or materialized (jammed into a single binary blob).

  • Tuples inside arrays are always materialized.

  • Tuples not in arrays are virtualized on disk, but might be materialized during intermediate query execution.

The optimizer is smart here: it tries to delay materializing the tuple for as long as possible, often only doing so right before sending rows back to the client.
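
To make the virtual-versus-materialized distinction concrete, here is a toy Python sketch. It is only an analogy and says nothing about Ocient's actual storage format or optimizer: the two lists `ids` and `names` and the helper `virtual_filter` are hypothetical stand-ins for separately stored tuple elements and a late-materializing scan.

```python
# Two separate "columns" that together form one logical tuple column.
# (Hypothetical data; a stand-in for virtual tuple storage.)
ids = [1, 2, 3, 4]
names = ["a", "b", "c", "d"]


def virtual_filter(min_id):
    """Filter on the first element, then build tuples only for survivors."""
    # Only the ids column is read while filtering; names is not touched yet.
    keep = [i for i, value in enumerate(ids) if value >= min_id]
    # Materialize tuples at the last moment, just for the rows that remain.
    return [(ids[i], names[i]) for i in keep]


materialized = virtual_filter(3)
print(materialized)
```

The point of the sketch is that the filter never has to assemble a tuple for rows it discards, which is the intuition behind delaying materialization.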

Operations and Lexicographical Sorting

All standard operations work with tuples. You can sort by them, group by them, and join with them.

Comparison is done lexicographically. Ocient compares the first element; if they are equal, it moves on to comparing the second element, and so on. Let’s look at a simple ORDER BY to see this in action:

Notice how (1, "Z") comes before (2, "A"). The first element dictates the primary sort.
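
Python's built-in tuples happen to compare the same way, so the rule is easy to try outside the database. The snippet below is Python rather than Ocient SQL, and the sample rows are made up for illustration.

```python
# Python tuples compare element by element, left to right,
# which mirrors the lexicographic rule described above.
rows = [(2, "A"), (1, "Z"), (1, "B")]

ordered = sorted(rows)
print(ordered)  # [(1, 'B'), (1, 'Z'), (2, 'A')]
```

The second element only matters when the first elements tie, which is why (1, "B") lands ahead of (1, "Z").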

Nesting Arrays and Tuples

There are no restrictions on what types can go inside a tuple. Tuples can contain other nested tuples, tuples can contain arrays, and arrays can contain tuples.

If we tie this back to last month’s post on arrays, we can use array_agg to build an ordered array of tuples. This is a great way to group paired data (like Region + Revenue) deterministically:
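
As a rough analogy in Python (not Ocient SQL), the same idea is "group the rows, collect each group's pairs, and sort them". The `sales` rows and the choice of year as the grouping key are assumptions made up for this sketch.

```python
from collections import defaultdict

# Hypothetical sales rows: (year, region, revenue).
sales = [
    (2025, "West", 120),
    (2024, "East", 300),
    (2025, "East", 310),
    (2024, "West", 95),
]

# Collect (region, revenue) pairs per year, like aggregating tuples into an array.
pairs_by_year = defaultdict(list)
for year, region, revenue in sales:
    pairs_by_year[year].append((region, revenue))

# Sort each group's tuples so the result does not depend on input order.
result = {year: sorted(pairs) for year, pairs in pairs_by_year.items()}
print(result)
```

Because the pairs are sorted lexicographically, the output is the same no matter what order the input rows arrive in.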

Indexing Tuples

When it comes to performance, you can index tuples, but with one important caveat: you cannot index the entire tuple object at once. You have to index the specific elements you actually want to search against.

Pro-Tip: The “Tuple Max” Trick for Point-in-Time Queries

Finally, let’s look at how we can use the lexicographical sorting of tuples to solve a very common, very painful real-world problem.

Common customer workloads involve append-only tables that represent Slowly Changing Dimensions (SCD). Finding the “latest value” as of a specific point in time usually requires expensive historical scans, self-joins, or complex window functions.

But because we know MAX() on a tuple evaluates the first element before the second, we can pack a timestamp and a value into a tuple, take the MAX(), and easily extract the associated value without a self-join!

(Note: “balance” is a reserved word, so we quote it in our query!)

By wrapping the timestamp and the balance together as tuple(ts, "balance"), the MAX function correctly identifies the row with the highest timestamp. We then simply ask for element [2] of that winning tuple. It’s an incredibly elegant workaround for extracting point-in-time state.
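
The same trick can be sketched in plain Python, which also compares tuples lexicographically. This is an illustration of the technique rather than the Ocient query itself: the `history` rows, the account ids, and the `balance_as_of` helper are all hypothetical. Note that Python indexes from 0, so `pair[1]` plays the role of element [2] in Ocient.

```python
from datetime import datetime

# Hypothetical append-only history: (account_id, ts, balance).
history = [
    ("acct-1", datetime(2026, 1, 5), 100),
    ("acct-1", datetime(2026, 2, 1), 250),
    ("acct-1", datetime(2026, 3, 10), 75),
    ("acct-2", datetime(2026, 1, 20), 900),
    ("acct-2", datetime(2026, 2, 15), 40),
]


def balance_as_of(rows, as_of):
    """Return {account_id: balance} from the latest row at or before as_of."""
    best = {}
    for account_id, ts, balance in rows:
        if ts > as_of:
            continue  # ignore rows written after the point in time
        candidate = (ts, balance)
        # Tuple comparison checks ts first, so the newest row wins.
        if account_id not in best or candidate > best[account_id]:
            best[account_id] = candidate
    # Second tuple element is the balance (index 1 here, [2] in Ocient).
    return {account_id: pair[1] for account_id, pair in best.items()}


snapshot = balance_as_of(history, datetime(2026, 2, 20))
print(snapshot)  # {'acct-1': 250, 'acct-2': 40}
```

For acct-2 the winning balance is 40 even though 900 is larger, because the timestamp is compared first. One consequence of lexicographic comparison worth keeping in mind: if two rows share the exact same timestamp, the larger value breaks the tie.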

Tuples in Ocient are far more than just anonymous structs for holding disparate data types. Because of how they are inherently structured, compared, and nested alongside arrays, they unlock elegant solutions for complex aggregations and time-series challenges.

Experiment with these structures in sys.dummy and try wrapping your own state data into tuples. You might find it saves you a self-join or two!

As always, you can find full details in the Ocient documentation.