By Jason Arnold, Co-Founder and Distinguished Engineer
Welcome back. In my previous post, we looked at Ocient’s approach to arrays, treating them not as an afterthought, but as a core, high-performance part of the data model.
Today, we are taking the next logical step and looking at a close sibling to the array: the tuple.
In Ocient, the tuple data type acts like an anonymous struct. Just like arrays, fields are accessed by position starting with 1. Currently, there is no field-name-based access—the internal fields are simply ordered elements. But as we’ll see, the way Ocient processes, compares, and stores these structures opens up some incredibly powerful querying techniques.
Let’s dig in.
Constructing Tuples and Type Inference
First, let’s look at how to create a tuple and access its elements.

Accessing a specific element is done using standard bracket notation:
A couple of things to notice here. The tuple’s inner types in the DDL are specified using <<...>>. It feels a bit like C++ templates or Java generics, except the angle brackets are doubled up. (Also, if you fetch Ocient tuple types through the JDBC driver, you get back a standard java.sql.Struct instance.)
When constructing the tuple, you can either let the database auto-determine the types based on input, or you can explicitly define them:
Under the Hood: Virtual vs. Materialized
It’s worth briefly peeking under the hood at how Ocient stores these.
In tables on disk, all elements of a tuple are generally stored as separate columns. However, the view presented back to the user is a single, cohesive tuple column.
Essentially, tuples can be virtual (separate columns that the engine knows are related) or materialized (jammed into a single binary blob).
- Tuples inside arrays are always materialized.
- Tuples not in arrays are virtualized on disk, but might be materialized during intermediate query execution.
The optimizer is smart here: it tries to delay materializing the tuple for as long as possible, often only doing so right before sending rows back to the client.
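To make the virtual versus materialized distinction concrete, here is a rough Python analogy. It is not how Ocient is implemented; the column names, the `filter_virtual` helper, and the `materialize` helper are all hypothetical and exist only to illustrate the idea of working on separate columns and building tuples as late as possible.

```python
# "Virtual" form: two separate columns that together make up one logical tuple column.
# (Hypothetical sample data, for illustration only.)
ids = [1, 2, 3]
names = ["a", "b", "c"]


def filter_virtual(id_col, name_col, min_id):
    """Filter on the first element without ever building a tuple."""
    keep = [i for i, value in enumerate(id_col) if value >= min_id]
    return [id_col[i] for i in keep], [name_col[i] for i in keep]


def materialize(id_col, name_col):
    """Combine the columns into tuple values, done only at the very end."""
    return list(zip(id_col, name_col))


kept_ids, kept_names = filter_virtual(ids, names, 2)
rows = materialize(kept_ids, kept_names)
print(rows)
```

In this sketch the filter touches only the column it needs, and the tuples are assembled once, just before the result is handed back.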
Operations and Lexicographical Sorting
All standard operations work with tuples. You can sort by them, group by them, and join with them.
Comparison is done lexicographically. Ocient compares the first element; if they are equal, it moves on to comparing the second element, and so on. Let’s look at a simple ORDER BY to see this in action:

Notice how (1, "Z") comes before (2, "A"). The first element dictates the primary sort.
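If you want to build intuition for this ordering outside the database, Python tuples happen to compare the same way: first element first, then the second on a tie. The snippet below is only an analogy with made-up sample values, not Ocient SQL.

```python
# Python tuples compare lexicographically, mirroring the ordering described above.
# (Sample values are made up for illustration.)
rows = [(2, "A"), (1, "Z"), (1, "B"), (2, "C")]

sorted_rows = sorted(rows)
print(sorted_rows)
```

The two tuples starting with 1 sort ahead of both tuples starting with 2, and the second element only decides the order within each of those groups.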
Nesting Arrays and Tuples
There are no restrictions on what types can go inside a tuple. Tuples can contain other nested tuples, tuples can contain arrays, and arrays can contain tuples.
If we tie this back to last month’s post on arrays, we can use array_agg to build an ordered array of tuples. This is a great way to group paired data (like Region + Revenue) deterministically:
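As a rough analogy in Python (not Ocient SQL), the idea of collecting paired values per group into an ordered array of tuples might look like the sketch below. The `sales` rows, the account names, and the `aggregate_pairs` helper are hypothetical.

```python
from collections import defaultdict

# Hypothetical sample rows: (account, region, revenue)
sales = [
    ("acme", "West", 300),
    ("acme", "East", 500),
    ("globex", "North", 200),
    ("acme", "East", 100),
]


def aggregate_pairs(rows):
    """Group rows by account and collect (region, revenue) tuples per group."""
    grouped = defaultdict(list)
    for account, region, revenue in rows:
        grouped[account].append((region, revenue))
    # Sorting each list of tuples yields a deterministic, lexicographic order:
    # by region first, then by revenue within the same region.
    return {account: sorted(pairs) for account, pairs in grouped.items()}


result = aggregate_pairs(sales)
print(result)
```

Because each pair travels as one tuple, the region and its revenue can never drift apart when the array is sorted.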

Indexing Tuples
When it comes to performance, you can index tuples, but with one important caveat: you cannot index the entire tuple object at once. You have to index the specific elements you actually want to search against.

Pro-Tip: The “Tuple Max” Trick for Point-in-Time Queries
Finally, let’s look at how we can use the lexicographical sorting of tuples to solve a very common, very painful real-world problem.
Common customer workloads involve append-only tables that represent Slowly Changing Dimensions (SCD). Finding the “latest value” as of a specific point in time usually requires expensive historical scans, self-joins, or complex window functions.
But because we know MAX() on a tuple evaluates the first element before the second, we can pack a timestamp and a value into a tuple, take the MAX(), and easily extract the associated value without a self-join!
(Note: “balance” is a reserved word, so we quote it in our query!)

By wrapping the timestamp and the balance together as tuple(ts, "balance"), the MAX function correctly identifies the row with the highest timestamp. If two rows share the same timestamp, the comparison falls through to the second element, so the larger balance wins the tie. We then simply ask for element [2] of that winning tuple. It’s an incredibly elegant workaround for extracting point-in-time state.
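The same logic can be sketched in Python, whose tuples also compare lexicographically. This is an illustration of the technique rather than Ocient SQL; the `history` rows, the account IDs, and the `balance_as_of` helper are hypothetical.

```python
from datetime import datetime

# Hypothetical append-only history: (account_id, ts, balance)
history = [
    (1, datetime(2024, 1, 1), 100.0),
    (1, datetime(2024, 3, 1), 250.0),
    (1, datetime(2024, 2, 1), 900.0),
    (2, datetime(2024, 1, 15), 40.0),
    (2, datetime(2024, 4, 1), 75.0),
]


def balance_as_of(rows, as_of):
    """Return the latest balance per account at or before `as_of`."""
    latest = {}
    for account_id, ts, balance in rows:
        if ts > as_of:
            continue  # ignore rows written after the point in time
        candidate = (ts, balance)
        # Tuple comparison looks at ts first, so the newest row wins.
        if account_id not in latest or candidate > latest[account_id]:
            latest[account_id] = candidate
    # Python indexes from 0, so [1] here plays the role of element [2] in Ocient.
    return {account_id: pair[1] for account_id, pair in latest.items()}


result = balance_as_of(history, datetime(2024, 3, 15))
print(result)
```

Note that account 1 has a balance of 900.0 in its history, yet the result is 250.0: the timestamp decides the winner, and the balance just rides along.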
Tuples in Ocient are far more than just anonymous structs for holding disparate data types. Because of how they are inherently structured, compared, and nested alongside arrays, they unlock elegant solutions for complex aggregations and time-series challenges.
Experiment with these structures in sys.dummy and try wrapping your own state data into tuples. You might find it saves you a self-join or two!
As always, you can find full documentation of Ocient here.

