Lock-Free Learned Index
- Lock-Free Learned Index is a concurrent in-memory structure that uses ML predictions and atomic operations to optimize search and update tasks.
- Frameworks like BLI and Kanva blend statistical models with lock-free techniques, enabling efficient data access in multi-core environments.
- Both designs guarantee correctness and progress through linearizability and lock-freedom, and are validated by empirical performance results.
A lock-free learned index is a concurrent in-memory data structure that uses machine learning models to accelerate search and update operations while guaranteeing progress without blocking threads, relying on lock-free mechanisms such as atomic operations and Read-Copy-Update (RCU) pointer swapping. This class of index combines the efficiency of learned indexes, where predictions from statistical models replace or augment classical comparison logic, with the scalability of lock-free data structures, yielding significant performance gains on modern multi-core architectures. Recent designs include the Bucket-based Learned Index (BLI) (Dong et al., 14 Feb 2025) and the Kanva framework (Bhardwaj et al., 2023), both of which systematically relax traditional total-order constraints to enable lock-free updates and concurrent access.
1. Structural Principles of Lock-Free Learned Indexes
BLI and Kanva exemplify a shift from strictly sorted data structures to hybrid organizations that are “globally ordered, locally unsorted” or employ shallow trees of model-guided nodes linked to lock-free bins.
In BLI (Dong et al., 14 Feb 2025):
- The index is organized as a hierarchy of Segments, each covering a disjoint key range. Each Segment maintains a linear regression model to predict the child bucket likely to contain a query key and a sorted array of child pointers grouped into segment-buckets (S-Buckets).
- The leaf Segments point to Data-Buckets (D-Buckets): arrays of unsorted (key, value) pairs. While the set of D-Buckets is globally ordered by their minimum keys, internal slot order is arbitrary. A key's location within a D-Bucket is determined quickly via a hint function h(k), typically a hash of the key or a linear mapping of the key to a slot index.
- Since insertion does not disrupt within-bucket order, and only requires finding a free slot using CAS or atomic set of a valid bit, concurrent updates avoid the need for locking during both lookup and update.
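The slot-claiming discipline just described can be sketched in a few lines. The following is a minimal, illustrative C++ sketch, not BLI's actual implementation: names such as `DBucket` and `kSlots` are assumptions, probing is a plain scan rather than hint-guided, and split triggering is left to the caller.

```cpp
#include <atomic>
#include <cstdint>
#include <optional>

// Illustrative D-Bucket: unsorted slots, each guarded by an atomic
// state word instead of a lock.
constexpr int kSlots = 8;

enum SlotState : uint8_t { EMPTY = 0, RESERVED = 1, VALID = 2 };

struct Slot {
    std::atomic<uint8_t> state{EMPTY};
    uint64_t key;
    uint64_t value;
};

struct DBucket {
    Slot slots[kSlots];

    // Lock-free insert: claim a free slot with CAS, fill it, then
    // publish it with a release store of the valid bit.
    bool insert(uint64_t k, uint64_t v) {
        for (int i = 0; i < kSlots; ++i) {
            uint8_t expected = EMPTY;
            if (slots[i].state.compare_exchange_strong(
                    expected, RESERVED, std::memory_order_acq_rel)) {
                slots[i].key = k;
                slots[i].value = v;
                slots[i].state.store(VALID, std::memory_order_release);
                return true;
            }
        }
        return false;  // bucket full: caller would trigger a split
    }

    // Pure-read lookup: only slots whose valid bit is set are inspected,
    // so a reader never observes a half-written pair.
    std::optional<uint64_t> lookup(uint64_t k) const {
        for (int i = 0; i < kSlots; ++i) {
            if (slots[i].state.load(std::memory_order_acquire) == VALID &&
                slots[i].key == k)
                return slots[i].value;
        }
        return std::nullopt;
    }
};
```

The release store on the valid bit is what lets readers probe without any synchronization beyond an acquire load: the (key, value) write is guaranteed to be visible before the slot is observed as VALID.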
In Kanva (Bhardwaj et al., 2023):
- The structure consists of Modelled Nodes (MNodes) organized as a shallow, possibly unbalanced tree. Each node uses an array of split points (keys), a parallel array of simple linear models, and a set of child pointers.
- The leaves are “Bins,” realized as non-blocking structures (single-level or two-level linked lists) admitting lock-free updates via atomic pointer manipulations (e.g., Harris’s CAS-based linked list).
- Bins dynamically upgrade themselves to model nodes (“freezing” and conversion) as their local size or structure dictates.
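As an illustration of a lock-free Bin, the sketch below implements an insert-only sorted linked list synchronized by CAS on next pointers. Harris's full algorithm additionally marks pointers to support lock-free deletion; all names here are illustrative rather than Kanva's actual API.

```cpp
#include <atomic>
#include <cstdint>

// Simplified lock-free sorted list standing in for a Kanva Bin.
struct Node {
    uint64_t key;
    uint64_t value;
    std::atomic<Node*> next{nullptr};
    Node(uint64_t k, uint64_t v) : key(k), value(v) {}
};

struct Bin {
    Node head{0, 0};  // sentinel

    bool insert(uint64_t k, uint64_t v) {
        Node* node = new Node(k, v);
        while (true) {
            // Find the window (pred, curr) with pred->key < k <= curr->key.
            Node* pred = &head;
            Node* curr = pred->next.load(std::memory_order_acquire);
            while (curr && curr->key < k) {
                pred = curr;
                curr = curr->next.load(std::memory_order_acquire);
            }
            if (curr && curr->key == k) { delete node; return false; }  // duplicate
            node->next.store(curr, std::memory_order_relaxed);
            // Linearization point: CAS pred->next from curr to node.
            if (pred->next.compare_exchange_weak(curr, node,
                                                 std::memory_order_release))
                return true;
            // CAS failed: a concurrent insert changed the window; retry.
        }
    }

    bool contains(uint64_t k) const {
        const Node* n = head.next.load(std::memory_order_acquire);
        while (n && n->key < k) n = n->next.load(std::memory_order_acquire);
        return n && n->key == k;
    }
};
```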
By decoupling global addressability from local order-invariance, these structures achieve a high degree of concurrency, minimize coordination overhead, and enable in-place lock-free mutation.
2. Lock-Free Algorithms for Query and Update
Lookup and insert operations are executed via strictly lock-free routines, employing atomic instructions for synchronization and ensuring progress even under contention.
BLI Lookup (Dong et al., 14 Feb 2025):
- Descend the Segment hierarchy using linear regression models and neighbor scans to route to the appropriate leaf segment and ultimately to a D-Bucket.
- In the D-Bucket, compute the hint h(k), then probe slots for the key k, returning the value if found, or aborting on encountering an invalid (never-written) slot.
- All probes are purely reads; writers only atomically set the valid bit post-insertion.
BLI Insert (Dong et al., 14 Feb 2025):
- Locate the target D-Bucket as in lookup, and search for an unoccupied slot to claim using an atomic store on the valid bit.
- If full, trigger a D-Bucket split, partitioning keys and installing new buckets via RCU-style pointer swap, with the old bucket garbage-collected post-drain.
- Segment/array rewriting is effected by off-line allocation and atomic pointer swaps, never requiring locks.
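The build-off-line-then-atomically-install pattern behind D-Bucket splits can be sketched as follows. This is a hypothetical C++ reduction: it models the parent's child array as a single atomically swappable snapshot, and it omits the grace-period reclamation that real BLI performs before freeing the displaced array and bucket.

```cpp
#include <atomic>
#include <vector>
#include <cstdint>
#include <cstddef>
#include <algorithm>

struct Bucket { std::vector<uint64_t> keys; };  // unsorted in BLI

struct SegmentLeaf {
    // Readers load this snapshot pointer once; writers replace it wholesale.
    std::atomic<std::vector<Bucket*>*> buckets;

    explicit SegmentLeaf(std::vector<Bucket*>* init) : buckets(init) {}

    bool splitBucket(size_t idx) {
        auto* oldArr = buckets.load(std::memory_order_acquire);
        Bucket* old = (*oldArr)[idx];

        // Off-line phase: partition the full bucket around its median key.
        std::vector<uint64_t> ks = old->keys;
        std::sort(ks.begin(), ks.end());
        size_t mid = ks.size() / 2;
        auto* left  = new Bucket{std::vector<uint64_t>(ks.begin(), ks.begin() + mid)};
        auto* right = new Bucket{std::vector<uint64_t>(ks.begin() + mid, ks.end())};

        auto* newArr = new std::vector<Bucket*>(*oldArr);
        (*newArr)[idx] = left;
        newArr->insert(newArr->begin() + idx + 1, right);

        // Publish phase: one CAS installs the new array. A failed CAS
        // means a concurrent SMO won; discard and let the caller retry.
        // Displaced memory is freed only after a grace period (not shown).
        auto* expected = oldArr;
        if (buckets.compare_exchange_strong(expected, newArr)) return true;
        delete left; delete right; delete newArr;
        return false;
    }
};
```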
Kanva Operations (Bhardwaj et al., 2023):
- Insert and delete traverse the MNode hierarchy to the correct Bin; Bin-level insert utilizes CAS-based linked list operations or triggers Bin-to-MNode conversion via atomic pointer replacement if threshold conditions are surpassed.
- Search operations are non-blocking, traversing models and split-point arrays to route queries.
This lock-free protocol is realized via strictly atomic stores/compares on slot validity or pointer fields, with structured helping and retry for progress.
3. Mathematical Models and Error Control
Both BLI and Kanva bound model prediction error with explicit error bounds, ensuring finite, predictable search intervals for learned index traversal.
In BLI (Dong et al., 14 Feb 2025):
- Each Segment fits a linear function f over its key range, minimizing average squared error. The modeling error is upper-bounded by a constant ε: for every indexed key k, the predicted position f(k) deviates from the true position by at most ε.
- The D-Bucket hint function can take the form of a hash, h(k) = hash(k) mod B, or a linear mapping, h(k) = ⌊(k − k_min) · B / (k_max − k_min)⌋, where B is the number of slots in the bucket.
- Split and merge operations for buckets, and re-segmentation for inner nodes, are triggered by fill ratio or error-threshold conditions.
In Kanva (Bhardwaj et al., 2023):
- Each linear model satisfies an explicit error bound ε: the model's predicted position for any key it covers deviates from that key's true position by at most ε.
- Bin thresholds are set so that the cost of in-bin search remains subdominant to the cost of correcting a model misprediction.
- All time complexities are explicitly characterized in terms of the number of model levels L, the average number of models per node m, the error bound ε, and the Bin threshold t.
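Both designs ultimately rely on error-bounded correction: a model prediction narrows the search to a window of at most 2ε + 1 positions, which is then resolved by a short bounded search. A hedged C++ sketch of this step, with `LinearModel` and its fields as illustrative names:

```cpp
#include <cstdint>
#include <cstddef>
#include <algorithm>
#include <vector>

// Error-bounded search: the true position of `key` is guaranteed to
// lie within ±eps of the model's prediction, so correction is a
// binary search over at most 2*eps + 1 slots.
struct LinearModel {
    double slope, intercept;
    size_t eps;  // trained worst-case error bound

    // Returns the index of `key` in `sorted`, or SIZE_MAX if absent.
    size_t find(const std::vector<uint64_t>& sorted, uint64_t key) const {
        if (sorted.empty()) return SIZE_MAX;
        double p = slope * static_cast<double>(key) + intercept;
        size_t pred = static_cast<size_t>(
            std::clamp(p, 0.0, static_cast<double>(sorted.size() - 1)));
        size_t lo = pred > eps ? pred - eps : 0;
        size_t hi = std::min(pred + eps + 1, sorted.size());
        auto it = std::lower_bound(sorted.begin() + lo, sorted.begin() + hi, key);
        return (it != sorted.begin() + hi && *it == key)
                   ? static_cast<size_t>(it - sorted.begin()) : SIZE_MAX;
    }
};
```

Because the correction window has fixed width 2ε + 1, lookup cost depends on ε rather than on the total number of keys, which is what makes the per-level complexity terms above well defined.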
4. Lock-Free Maintenance: Splits, Merges, and Retraining
Lock-free learned indexes require safe, concurrent support for structure-modification operations (SMOs), including splits, merges, and model retraining.
In BLI (Dong et al., 14 Feb 2025):
- D-Bucket splits are triggered when all slots are filled; the contents are repartitioned around the median key, new buckets are constructed off-line, and installed via CAS pointer swaps.
- Segment scaling, split, and merge operations are controlled via error thresholds and executed by computing new regression models and rebuilding affected segments in parallel, with atomic pointer swaps (RCU) replacing segment or bucket arrays in the parent.
- Readers are protected by a grace period, ensuring memory safety, and never require locks.
In Kanva (Bhardwaj et al., 2023):
- Bin-to-MNode promotions (“helpMakeModel”) employ a freeze bit to signal structure conversion. All key-version pairs are recopied into new nodes and models built offline, with a CAS pointer replacement as the linearization point.
- The window of “helping” ensures any thread encountering a frozen Bin can participate in and complete the conversion.
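The freeze-and-help protocol can be sketched as below. This is an illustrative C++ reduction in which the Bin's contents are a plain vector; `helpMakeModel`, `NodeRef`, and `buildReplacement` are assumed names and signatures, not Kanva's actual interface.

```cpp
#include <atomic>
#include <vector>
#include <cstdint>

struct Bin {
    std::atomic<bool> frozen{false};
    std::vector<uint64_t> keys;  // simplified: real Bins are lock-free lists
};

struct NodeRef { std::atomic<Bin*> ptr{nullptr}; };

// Hypothetical stand-in for building a model node from a frozen Bin:
// all key-version pairs are recopied into a fresh structure off-line.
Bin* buildReplacement(const Bin* old) {
    Bin* fresh = new Bin;
    fresh->keys = old->keys;
    return fresh;
}

// Any thread may call this on a Bin it found frozen ("helping"):
// build the replacement off-line, then race to swing the parent
// pointer. Exactly one CAS wins, and that CAS is the linearization
// point of the conversion.
bool helpMakeModel(NodeRef& parent, Bin* old) {
    old->frozen.store(true, std::memory_order_release);  // idempotent
    Bin* fresh = buildReplacement(old);
    Bin* expected = old;
    if (parent.ptr.compare_exchange_strong(expected, fresh)) return true;
    delete fresh;  // another helper finished first
    return false;
}
```

Because every helper builds an equivalent replacement from the frozen (and hence immutable) Bin, it does not matter which thread's CAS succeeds; losers simply discard their copy.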
These strategies ensure atomicity and linearizability by restricting all shared-memory updates to atomic swaps, maintaining invariants of path uniqueness and no-key-loss throughout concurrent SMOs.
5. Multi-Threaded Performance and Empirical Findings
Both BLI and Kanva demonstrate that relaxing local ordering and replacing lock-based synchronization with lock-free primitives yields substantial performance gains under concurrent workloads.
Performance Metrics (BLI) (Dong et al., 14 Feb 2025):
- Single-thread (mixed 1:1 read/write, facebook dataset): BLI achieves approximately 2.21× higher throughput than state-of-the-art (ALEX or LIPP).
- 24-core, 7:3 read/write mix: BLI reaches up to 3.91× the throughput of ALEX+ or LIPP+.
- Measured latencies: Segment lookup per level ≈ 20 ns; D-Bucket lookup ≈ 70 ns; 99th percentile leaf-node scan ≈ 100–150 ns.
Performance Metrics (Kanva) (Bhardwaj et al., 2023):
- Read-heavy, uniform distribution (64 threads): Kanva (no GC) achieves 18 million ops/sec, outpacing FINEdex (lock-based learned index), C-IST, and LFABT.
- Update-heavy: Kanva maintains superior performance, with linear scaling demonstrated up to 64 threads.
- Cache behavior: Kanva reduces LLC misses compared to classical lock-free trees.
These empirical results confirm that lock-free learned indexes afford both high raw throughput and robust scalability, substantially outclassing both classical and lock-based learned structures under realistic multi-core conditions.
6. Correctness and Theoretical Guarantees
Both frameworks establish formal correctness by demonstrating linearizability and lock-freedom.
In Kanva (Bhardwaj et al., 2023):
- Unique path invariance ensures every key query follows a single deterministic root-to-leaf path.
- No-key-loss is guaranteed through atomic collect-and-reinstall during Bin-to-MNode conversions.
- Linearization points correspond to successful atomic operations (CAS), with operations ordered by these events yielding a sequentially legal history.
- Lock-freedom is ensured by the property that some thread always completes an operation after a finite number of retries/helping steps.
In BLI (Dong et al., 14 Feb 2025), analogous reasoning based on RCU safety, versioning via atomic valid bits, and pointer swaps ensures concurrent safety and progress.
7. Limitations and Open Directions
Lock-free learned indexes exhibit several known limitations and open areas for future research:
- Under high skew/hotspot contention, especially with shallow trees (Kanva), cache and coordination contention may degrade performance.
- In adversarial or pathological key distributions, learned models may incur larger error bounds or necessitate re-segmentation, impacting worst-case operational efficiency.
- Fine-tuning parameters such as the model error threshold (ε), the Bin threshold (t), and merge policies remains workload-dependent and may require adaptive strategies.
- Future directions include more expressive piecewise-linear or spline regression models, dynamically adaptive thresholds based on observed contention, and hybrid synchronization for bins under varying workloads.
Continued advances in model-guided data partitioning, combined with advances in lock-free memory reclamation and atomic SMO orchestration, are likely to further extend the applicability and efficiency of lock-free learned index structures in high-concurrency storage and retrieval applications (Dong et al., 14 Feb 2025, Bhardwaj et al., 2023).