Recursive Model Index (RMI)
- Recursive Model Index (RMI) is a learned indexing method that employs a hierarchy of models to approximate key-to-position mappings and narrow search ranges.
- It uses multi-layered predictive models where each level refines the approximation, thereby achieving sub-logarithmic average lookup performance.
- Benchmarks indicate that RMIs can deliver competitive lookup latencies and memory trade-offs compared to both traditional and other learned index structures.
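As a rough, hedged illustration of the RMI idea (a two-level hierarchy of linear models; all names and the routing scheme here are assumptions for the sketch, not the reference implementation):

```python
def fit_linear(xs, ys):
    """Least-squares line y = a*x + b through the given points."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    var = sum((x - mx) ** 2 for x in xs)
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / var if var else 0.0
    return a, my - a * mx

def _route(root, key, n, num_leaves):
    """Root model maps a key to one of num_leaves second-level models."""
    a, b = root
    return min(num_leaves - 1, max(0, int((a * key + b) * num_leaves / n)))

def build_rmi(keys, num_leaves):
    """Two-level RMI sketch: the root routes each key to a leaf model,
    and each leaf refines the position estimate for its key range."""
    n = len(keys)
    root = fit_linear(keys, range(n))
    buckets = [[] for _ in range(num_leaves)]
    for pos, key in enumerate(keys):
        buckets[_route(root, key, n, num_leaves)].append((key, pos))
    leaves = [fit_linear([k for k, _ in b], [p for _, p in b]) if b else root
              for b in buckets]
    return root, leaves

def rmi_predict(model, key, n, num_leaves):
    """Predict the array position of key (still needs a local search)."""
    root, leaves = model
    a, b = leaves[_route(root, key, n, num_leaves)]
    return a * key + b
```

In a real RMI the prediction is followed by a bounded local search around the predicted position; the sketch only shows the model hierarchy.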
RadixSpline is a learned index structure designed to approximate the mapping from sorted keys to their array positions with high efficiency. It combines a piecewise-linear error-bounded spline with a compact radix table to facilitate single-pass construction, competitive lookup performance, and simple parameter tuning. RadixSpline has been demonstrated to be competitive in size and lookup latency with state-of-the-art learned indexes, such as Recursive Model Indexes (RMIs), while being substantially simpler to implement and build (Kipf et al., 2020).
1. Formal Specification
A RadixSpline index is defined over a sorted list of key/position pairs
and consists of two components:
(a) Error-bounded spline approximation:
The spline is a piecewise-linear function f comprising knots (k_1, p_1), ..., (k_m, p_m), where the k_i are keys, the p_i are positions, and E is a moderate error bound. For each x in [k_i, k_{i+1}], f interpolates linearly between (k_i, p_i) and (k_{i+1}, p_{i+1}). For all data points (k, p), |f(k) - p| ≤ E.
(b) Flat radix table:
A table T maps the r most significant bits (after removing any fixed common leading prefix) of a query key to an interval in the knot array. For a radix value b,
- T[b]: smallest knot index i such that k_i's top-r-bit prefix is ≥ b
- T[b+1]: likewise for prefix b+1
At lookup, the spline segment for a query key with radix value b is bracketed between knots T[b] and T[b+1].
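A toy sketch of this bracketing, using 8-bit keys for readability; the function names and the ≥-fill convention are illustrative assumptions, not the reference code:

```python
NUM_BITS = 8  # toy key width; the paper targets 64-bit keys

def prefix(key: int, r: int) -> int:
    """Top-r bits of an (already prefix-stripped) key."""
    return key >> (NUM_BITS - r)

def build_table(knot_keys, r):
    """T[b] = smallest knot index whose top-r-bit prefix is >= b;
    the sentinel T[2^r] equals the number of knots."""
    m = len(knot_keys)
    table = [m] * (2**r + 1)
    for i in reversed(range(m)):
        table[prefix(knot_keys[i], r)] = i
    for b in reversed(range(2**r)):  # fill unassigned slots right-to-left
        table[b] = min(table[b], table[b + 1])
    return table

def bracket(table, key, r):
    """Return [lo, hi] such that the first knot with key greater than
    `key` has index in [lo, hi]; only this range needs binary search."""
    b = prefix(key, r)
    return table[b], table[b + 1]
```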
2. Single-Pass Construction
RadixSpline supports single-pass construction by employing the Greedy Spline Corridor algorithm. The spline and radix table are built together in a single scan over the sorted keys.
Key steps:
- Initialize the spline with the first data point.
- Track a corridor (range of feasible slopes) that maintains the error bound E.
- When the corridor is violated by a new data point, emit a new knot, and update the radix table for the affected prefix region.
- The process continues, emitting knots and filling the radix table on-the-fly.
- After the last point, fill any unassigned entries in T with the most recent value.
Complexity:
- Time: O(n); each key is examined once.
- Space: O(m + 2^r) for m spline knots and a 2^r-entry table, where m ≪ n for uniform data and r is user-specified.
Pseudocode Overview:
The construction pseudocode, as presented in the source, is as follows (abbreviated here for clarity; exact code and formulas in (Kipf et al., 2020)):
```
def BuildRadixSpline(D, E, r):
    initialize state, knots, T
    for each (k, p) in D[1:]:
        update slope corridor
        if violated:
            emit knot, fill radix table
            reset corridor
    emit final knot, fill radix table
    finalize T
    return (knots, T)
```
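The corridor idea can be made concrete with a runnable sketch, under stated assumptions (distinct sorted integer keys, at least two of them); this is a simplification, not the reference implementation:

```python
import bisect

def build_spline(keys, error):
    """One-pass greedy corridor: emit a knot whenever the exact slope
    from the current base knot to the new point leaves the corridor of
    slopes that keeps every seen point within +/- error."""
    n = len(keys)  # assumes n >= 2 distinct, sorted keys
    knots = [(keys[0], 0)]
    base_k, base_p = keys[0], 0
    lo, hi = float("-inf"), float("inf")
    prev_k, prev_p = keys[0], 0
    for p in range(1, n):
        k = keys[p]
        s = (p - base_p) / (k - base_k)  # exact slope to the new point
        if s < lo or s > hi:
            knots.append((prev_k, prev_p))  # corridor violated: emit knot
            base_k, base_p = prev_k, prev_p
            lo = (p - error - base_p) / (k - base_k)  # restart corridor
            hi = (p + error - base_p) / (k - base_k)
        else:
            lo = max(lo, (p - error - base_p) / (k - base_k))
            hi = min(hi, (p + error - base_p) / (k - base_k))
        prev_k, prev_p = k, p
    knots.append((keys[-1], n - 1))
    return knots

def predict(knots, key):
    """Interpolate the position of `key` between its bracketing knots."""
    ks = [k for k, _ in knots]
    i = bisect.bisect_right(ks, key) - 1
    if i >= len(knots) - 1:
        return knots[-1][1]
    (k0, p0), (k1, p1) = knots[i], knots[i + 1]
    return p0 + (key - k0) * (p1 - p0) / (k1 - k0)
```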
3. Lookup Procedure and Complexity
Index lookup consists of three phases:
- Radix Table Bracketing: Extract the top-r bits of the query key to obtain radix value b, then read T[b] and T[b+1].
- Binary Search on Knots: Within knots T[b] through T[b+1], identify the segment i such that k_i ≤ key < k_{i+1}.
- Final Binary Search in Array: Compute the predicted position p̂ via spline interpolation, then search positions [p̂ - E, p̂ + E] of the key array for the true key.
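A minimal sketch of the lookup phases, assuming for brevity that the radix bracketing (phase 1) is replaced by a binary search over all knots; `lookup`, `data`, and `knots` are illustrative names:

```python
import bisect

def lookup(data, knots, error, key):
    """Find `key` in sorted `data` using spline knots [(key, pos), ...]
    whose interpolation error is at most `error`."""
    # Phase 2: binary-search the knots for key's segment
    ks = [k for k, _ in knots]
    i = bisect.bisect_right(ks, key) - 1
    if i < 0:
        return None  # key below the first knot
    if i >= len(knots) - 1:
        pred = knots[-1][1]
    else:
        (k0, p0), (k1, p1) = knots[i], knots[i + 1]
        pred = p0 + (key - k0) * (p1 - p0) / (k1 - k0)
    # Phase 3: binary-search only the 2E-wide window around the prediction
    lo = max(0, int(pred) - error)
    hi = min(len(data), int(pred) + error + 2)
    j = bisect.bisect_left(data, key, lo, hi)
    return j if j < len(data) and data[j] == key else None
```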
Complexity:
- O(1) for radix table access.
- O(log(T[b+1] - T[b])) for the binary search over the bracketed knot range; typically near-constant with an appropriate r.
- O(log E) for the final binary search over the 2E + 1 candidate array slots.
The worst-case overall complexity is O(log n), but with tuned parameters typical queries achieve sub-logarithmic average performance.
4. Parameter Space and Trade-off Analysis
RadixSpline has two parameters:
- E: Error bound on the spline interpolation.
- r: Number of radix bits used (2^r table entries).
Trade-offs:
- Decreasing E yields more knots (larger m), increasing spline memory but shrinking the ±E refinement window searched during lookup.
- Increasing r enlarges the radix table but reduces the average number of knots per radix slot, thus limiting the binary-search span over knots.
Heuristic:
Select E so that the resulting number of knots m fits the memory available for the spline, then set r to use the remaining memory budget for the radix table.
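The heuristic can be grounded with a back-of-envelope size model; the per-entry byte counts (8-byte keys and positions per knot, 32-bit table entries) are assumptions for the sketch, not measurements from the paper:

```python
def index_size_bytes(num_knots: int, r: int) -> int:
    """Approximate index footprint: spline knots plus flat radix table."""
    spline_bytes = num_knots * (8 + 8)  # assumed 8-byte key + 8-byte position
    table_bytes = (2**r) * 4            # assumed 32-bit table entries
    return spline_bytes + table_bytes
```

Under this model, for example, one million knots with r = 18 cost roughly 15 MiB of spline plus exactly 1 MiB of radix table.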
Concrete Example (face dataset):
- A tight error bound E: many knots, total index size 650 MiB, lowest latency.
- A looser E: far fewer knots, index size 200 MiB, and lookups only 11.5% slower; the space/time trade-off improves significantly for a modest latency inflation.
5. Experimental Performance and Evaluation
Evaluation was performed using the SOSD benchmark with 200M 64-bit keys on an AWS c5.4xlarge (single-threaded) (Kipf et al., 2020).
- Competing methods: Binary Search (BS), STX B+-tree (stride=32), Adaptive Radix Tree (ART), Recursive Model Index (RMI), and RadixSpline (RS).
- Datasets: amzn, face, logn, osmc, wiki, etc.
Summary Table: Performance Metrics
| Index | Build Time (s) | Lookup Latency (ns/op) | Size (MiB) |
|---|---|---|---|
| BS | none | ~850 | ~100 |
| BTree | ~0.7 | ~600 | ~100 |
| ART | ~0.7 | ~300 | ~100 |
| RMI | 3–6 | 120–250 | ~100 |
| RS (tuned) | ~0.9 | 130–280 | 100–650 (dataset-dependent) |
Key findings highlight that RS achieves:
- Single-pass build time (~0.9 s), faster than RMI (3–6 s).
- Competitive lookup latency (130–280 ns).
- Tunable index size, matching or exceeding the compactness of non-learned and learned competitors depending on the chosen E and r.
Trade-off trends (face):
- As E increases: build time drops (1.2 s → 0.4 s), memory shrinks (650 MiB → ≤100 MiB), and latency moderately degrades (180 ns → 300 ns).
- As r increases: table memory grows (4 MiB → 256 MiB), with only a moderate latency reduction from better knot bracketing.
LSM-Tree Case Study (RocksDB, osmc):
- Replacing the per-SSTable B+-tree index with RS on a 400M-operation mixed workload:
- Read latency decreases by 20%
- Write latency increases by 4% (owing to one-pass builds on compaction)
- Total time reduced from 712s to 521s (−27%)
- Index memory reduced by 45% (freeing space for larger Bloom filters)
6. Implementation Recommendations and Optimization
- Minimalistic codebase (~100 lines of idiomatic C++) without any third-party ML dependencies.
- Drop any common most-significant key prefix prior to radix table construction to keep r small and the table compact.
- Store knot arrays contiguously; use 32-bit integers for table entries (sufficient while the number of knots is below 2^32).
- Fill the radix table in a left-to-right pass interleaved with knot emission for improved cache performance.
- In highly skewed data, radix slots may span more knots than average; fallback strategies (e.g., local tree index) may be applied per slot.
- The sole required algorithmic dependency is a fast binary search over integers.
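The prefix-stripping recommendation above can be sketched as follows; the XOR trick and the function name are illustrative, not taken from the reference code:

```python
def common_prefix_bits(min_key: int, max_key: int, num_bits: int = 64) -> int:
    """Leading bits shared by every key in [min_key, max_key]; these
    bits carry no information, so dropping them before extracting the
    r radix bits keeps the table small and its slots discriminative."""
    diff = min_key ^ max_key  # high bits where min and max agree are 0
    return num_bits if diff == 0 else num_bits - diff.bit_length()
```

Because the keys are sorted, the shared prefix of the minimum and maximum key is also shared by every key in between.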
7. Context and Related Work
RadixSpline leverages the GreedySplineCorridor algorithm of Neumann and Michel for efficient, one-pass, error-bounded spline approximation. It differs fundamentally from RMIs by:
- Being trainable in a single pass, versus multi-epoch or multi-model training for RMI
- Using only two interpretable parameters for tuning
- Supporting direct analytical reasoning about space/time trade-offs
The design intent is to enable system programmers to reliably tune and deploy single-pass learned indexes competitive with the best multi-pass techniques, using only standard library operations and algorithmic constructs (Kipf et al., 2020).