Multi-dimensional iterators over numeric ranges with configurable access patterns.
- Distribution-based iteration: Sequential, uniform random, Zipfian, and more
- Distribution trait: Standardized sampling interface with struct-based implementations
- Multi-dimensional: Iterate over Cartesian products with per-dimension distributions
- Group splitting: Split ranges into overlapping groups for concurrent workloads
- Bitmap filtering: Filter iteration based on atomic bitmap state
- Constant memory: O(dimensions) memory regardless of range sizes
- Thread-safe: Lock-free atomic bitmap for concurrent access
- Zero-allocation iteration: SmallVec-based coordinates for ≤4 dimensions
- Optimized strides: Const-generic stride computation for common dimension counts
This crate uses the following external dependencies for performance:
| Dependency | Version | Purpose |
|---|---|---|
| `fastrand` | 2.0 | High-performance PRNG for stochastic distributions |
| `smallvec` | 1.11 | Stack-allocated vectors for coordinate storage (≤4 dims) |
| `num-integer` | 0.1 | Optimized GCD computation for coprime multipliers |
```rust
use iterators_rs::{MultiDimIterator, Dist, Coords};

// Create a 2D iterator with sequential access on both dimensions
let iter = MultiDimIterator::new(vec![
    (0..10, Dist::Sequential),
    (0..5, Dist::Sequential),
]);

// Iterate over all 50 combinations
// Coords is SmallVec<[u64; 4]> - stack-allocated for ≤4 dimensions
for coords in iter {
    println!("({}, {})", coords[0], coords[1]);
}
```

The `Dist` enum and `DistState` pair a distribution with its sampling state:

```rust
use iterators_rs::{Dist, DistState};
// Check distribution ordering
assert!(Dist::Sequential.is_ordered());
assert!(!Dist::Uniform.is_ordered());
// Use DistState for sampling
let mut state = DistState::new(Dist::Sequential, 10, 12345);
assert_eq!(state.sample(), 0);
assert_eq!(state.sample(), 1);
// Use DistState for random sampling
let mut uniform = DistState::new(Dist::Uniform, 1000, 12345);
let value = uniform.sample();
assert!(value < 1000);
```

The `Distribution` trait provides a standardized interface for sampling:
```rust
use iterators_rs::{Distribution, SequentialDist, PermuteDist};
use fastrand::Rng;
// Use struct-based distributions with the Distribution trait
let mut seq = SequentialDist::new(10);
let mut rng = Rng::with_seed(42);
assert_eq!(seq.sample(&mut rng), 0);
assert_eq!(seq.sample(&mut rng), 1);
assert!(seq.is_ordered());
// Compose distributions using wrappers
let mut permuted = PermuteDist::with_coprime(SequentialDist::new(100));
let sample = permuted.sample(&mut rng);
assert!(sample < 100);
```

Available distribution structs:
- `SequentialDist` - Sequential: 0, 1, 2, ..., N-1 (ordered)
- `SemiSequentialDist` - Permuted sequential (ordered)
- `UniformDist` - Uniform random (unordered)
- `ZipfianDist` - Zipfian distribution (unordered)
- `LatestDist` - Latest-N bias (unordered)
- `ExponentialDist` - Exponential decay (unordered)
- `NormalDist` - Normal/Gaussian (unordered)
- `HotspotDist` - Hotspot distribution (unordered)
- `PermuteDist<D>` - Permutation wrapper
- `ReverseDist<D>` - Reverse wrapper
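Because all of these structs implement the same `Distribution` trait, sampling code can be written once and reused. A minimal sketch, assuming the `sample(&mut fastrand::Rng)` signature from the example above returns `u64` (the `take_samples` helper is ours, not part of the crate):

```rust
use iterators_rs::{Distribution, PermuteDist, SequentialDist};
use fastrand::Rng;

// Generic driver over any Distribution implementation (illustrative only)
fn take_samples<D: Distribution>(dist: &mut D, rng: &mut Rng, n: usize) -> Vec<u64> {
    (0..n).map(|_| dist.sample(rng)).collect()
}

let mut rng = Rng::with_seed(7);

// Ordered base distribution...
let mut seq = SequentialDist::new(8);
assert_eq!(take_samples(&mut seq, &mut rng, 3), vec![0, 1, 2]);

// ...and a wrapper composed around it, driven by the same helper
let mut permuted = PermuteDist::with_coprime(SequentialDist::new(8));
for s in take_samples(&mut permuted, &mut rng, 3) {
    assert!(s < 8);
}
```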
The `Coords` type is a `SmallVec<[u64; 4]>` that provides stack allocation for the common 1-4 dimension cases:
```rust
use iterators_rs::{MultiDimIterator, Dist, Coords};

let iter = MultiDimIterator::new(vec![
    (0..10, Dist::Sequential),
    (0..5, Dist::Sequential),
]);

// Coords is stack-allocated for ≤4 dimensions
for coords in iter.take(3) {
    // Access coordinates like a slice
    println!("({}, {})", coords[0], coords[1]);
}
```

Split iteration ranges into overlapping groups for concurrent workloads:
```rust
use iterators_rs::{MultiDimIterator, Dist};

let iter = MultiDimIterator::new(vec![
    (0..100, Dist::Sequential),
    (0..50, Dist::Sequential),
]);

// Split into 4 groups with 25% overlap on dim0, no overlap on dim1
let groups = iter.split_into_groups(4, vec![0.25, 0.0]);
assert_eq!(groups.len(), 4);

// Each group is an independent iterator
for (i, group) in groups.into_iter().enumerate() {
    println!("Group {} has {} dimensions", i, group.num_dimensions());
}
```

When splitting an iterator with a bitmap filter attached, the filter configuration is preserved for each group. All groups share the same underlying bitmap via `Arc`, enabling concurrent filtered iteration.
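Even without a filter, a plain split is enough to drive concurrent iteration, since each group is an independent iterator that can be moved into its own thread (the atomic-claiming example further below relies on the same property). A minimal sketch:

```rust
use iterators_rs::{MultiDimIterator, Dist};
use std::thread;

let iter = MultiDimIterator::new(vec![
    (0..100, Dist::Sequential),
    (0..50, Dist::Sequential),
]);

// Disjoint partitions: no overlap on either dimension
let groups = iter.split_into_groups(4, vec![0.0, 0.0]);

// Each thread consumes its own group; no shared state is involved
let handles: Vec<_> = groups
    .into_iter()
    .map(|group| thread::spawn(move || group.count()))
    .collect();

let total: usize = handles.into_iter().map(|h| h.join().unwrap()).sum();
println!("visited {total} coordinates across all groups");
```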
Filter iteration based on atomic bitmap state:
```rust
use iterators_rs::{MultiDimIterator, Dist, FilterCondition, AtomicBitmap};
use std::sync::Arc;

let bitmap = Arc::new(AtomicBitmap::with_capacity(100));
bitmap.set(42);
bitmap.set(67);

// Create iterator with bitmap filter - only return coords where the bit is set
let iter = MultiDimIterator::new(vec![(0..100, Dist::Sequential)])
    .with_bitmap_filter(bitmap, FilterCondition::Set, 1);

// Will only yield coordinates 42 and 67
let results: Vec<_> = iter.collect();
assert_eq!(results.len(), 2);
```

Atomically claim unique entries across multiple threads:
```rust
use iterators_rs::{MultiDimIterator, Dist, FilterCondition, UpdateAction, AtomicBitmap};
use std::sync::Arc;
use std::thread;

let bitmap = Arc::new(AtomicBitmap::with_capacity(1000));

// Create iterator with bitmap filter attached BEFORE splitting
// The split_into_groups method preserves the bitmap filter for each group
let iter = MultiDimIterator::new(vec![(0..1000, Dist::Uniform)])
    .with_limit(100)
    .with_bitmap_filter(bitmap.clone(), FilterCondition::Unset, 1);

// Split into 4 groups - each group inherits the bitmap filter configuration
let groups = iter.split_into_groups(4, vec![0.0]);

// Each thread receives a pre-configured iterator with bitmap filter
let handles: Vec<_> = groups.into_iter().map(|mut group| {
    thread::spawn(move || {
        let mut claimed = 0u64;
        while let Some(_coords) = group.next_and_update(UpdateAction::Set) {
            claimed += 1;
        }
        claimed
    })
}).collect();

let total: u64 = handles.into_iter().map(|h| h.join().unwrap()).sum();
// Each claimed entry is unique across all threads
println!("claimed {total} entries in total");
```

The `split_into_groups` method preserves bitmap filters, so you can configure the iterator once and split it for concurrent use. All groups share the same underlying bitmap via `Arc`, enabling thread-safe atomic operations.
Filter iteration based on multiple bitmap conditions using AND logic. This is useful for scenarios like "delete entries that exist AND are NOT protected":
```rust
use iterators_rs::{MultiDimIterator, Dist, FilterCondition, UpdateAction, AtomicBitmap};
use std::sync::Arc;

let exists_bitmap = Arc::new(AtomicBitmap::with_capacity(1000));
let protected_bitmap = Arc::new(AtomicBitmap::with_capacity(1000));

// Initialize: some entries exist, some are protected
for i in 0..500 {
    exists_bitmap.set(i);
}
for i in [10, 20, 30, 100, 200] {
    protected_bitmap.set(i);
}

// Create iterator with composite filter:
// - Primary: exists_bitmap is SET (entry exists) - can be updated
// - Secondary: protected_bitmap is UNSET (entry is NOT protected) - read-only
let mut iter = MultiDimIterator::new(vec![(0..1000, Dist::Sequential)])
    .with_bitmap_filter(exists_bitmap.clone(), FilterCondition::Set, 1)
    .with_secondary_filter(protected_bitmap.clone(), FilterCondition::Unset);

// Atomically delete non-protected entries
let mut deleted = 0u64;
while let Some(_coords) = iter.next_and_update(UpdateAction::Clear) {
    deleted += 1;
    // Entry is now cleared from exists_bitmap
}
assert_eq!(deleted, 495); // 500 existing entries minus the 5 protected ones

// Protected entries (10, 20, 30, 100, 200) remain in exists_bitmap
assert!(exists_bitmap.test(10));
assert!(exists_bitmap.test(20));
```

Key points:
- The first bitmap (primary) can be updated via `next_and_update()`
- Secondary filters are read-only and provide additional filtering
- All conditions must match (AND logic)
- Chain multiple `.with_secondary_filter()` calls for complex conditions (see the sketch below)
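As a sketch of the last point, two read-only conditions stacked on the primary filter; the `quarantined` bitmap here is a hypothetical third state introduced only for illustration:

```rust
use iterators_rs::{MultiDimIterator, Dist, FilterCondition, AtomicBitmap};
use std::sync::Arc;

let exists = Arc::new(AtomicBitmap::with_capacity(100));
let protected = Arc::new(AtomicBitmap::with_capacity(100));
let quarantined = Arc::new(AtomicBitmap::with_capacity(100));
for i in 0..100 {
    exists.set(i);
}
protected.set(7);
quarantined.set(9);

// Yield coords where: exists is SET AND protected is UNSET AND quarantined is UNSET
let iter = MultiDimIterator::new(vec![(0..100, Dist::Sequential)])
    .with_bitmap_filter(exists.clone(), FilterCondition::Set, 1)
    .with_secondary_filter(protected.clone(), FilterCondition::Unset)
    .with_secondary_filter(quarantined.clone(), FilterCondition::Unset);

// 100 candidates minus the two excluded entries
assert_eq!(iter.count(), 98);
```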
`MultiDimIterator` implements the standard Rust iterator traits:
- `Iterator` - Core iteration
- `FusedIterator` - Guarantees `None` after exhaustion
- `ExactSizeIterator` - For fully-ordered iterators without a bitmap filter
```rust
use iterators_rs::{MultiDimIterator, Dist};

let iter = MultiDimIterator::new(vec![
    (0..10, Dist::Sequential),
    (0..5, Dist::Sequential),
]);

// ExactSizeIterator: know the exact remaining count
assert_eq!(iter.len(), 50);

// Standard adapters work as expected
let first_10: Vec<_> = iter.take(10).collect();
assert_eq!(first_10.len(), 10);
```

Choose distributions based on your workload characteristics:
| Distribution | Use Case | Parameters |
|---|---|---|
| `Sequential` | Bulk load, scans, ordered iteration | - |
| `SemiSequential` | Deterministic scatter (reproducible) | `multiplier` |
| `Uniform` | Random access, uniform load | - |
| `Zipfian` | Web cache (~80/20 rule), hot keys | `skew` (0.99 typical) |
| `Hotspot` | Explicit hot/cold split | `hot_pct`, `hot_prob` |
| `Latest` | Append-heavy (logs, streams) | `recent_pct`, `recent_prob` |
| `Exponential` | Time-decay (sessions, TTL) | `lambda` |
| `Normal` | Range queries, centered access | `mean_pct`, `std_pct` |
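To make the table concrete, here is how two of the rows map onto `DistState`. Only the `Sequential`, `Uniform`, and `Zipfian { skew }` constructors appear elsewhere in this document, so treat the pattern, not the other variants' exact field names, as the takeaway:

```rust
use iterators_rs::{Dist, DistState};

// Bulk load phase: ordered sequential scan
let mut load = DistState::new(Dist::Sequential, 1_000_000, 1);
assert_eq!(load.sample(), 0);

// Query phase: cache-like skew (0.99 is the common YCSB default)
let mut read = DistState::new(Dist::Zipfian { skew: 0.99 }, 1_000_000, 2);
assert!(read.sample() < 1_000_000);
```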
When splitting ranges into groups, overlap controls how much adjacent groups share:
No overlap (O=0.0) - Disjoint partitions:
```text
Range 0..100, 4 groups, overlap=0.0
Group 0:  0..25
Group 1: 25..50
Group 2: 50..75
Group 3: 75..100
```
Partial overlap (O=0.25) - 25% shared between adjacent groups:
```text
Range 0..100, 4 groups, overlap=0.25
Group 0:  0..31   (31% of range)
Group 1: 23..54   (25% overlap with group 0)
Group 2: 46..77   (25% overlap with group 1)
Group 3: 69..100  (25% overlap with group 2, wraps to group 0)
```
Full overlap (O=1.0) - All groups see entire range (for contention testing).
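The arithmetic behind these boundaries can be reconstructed from the three cases above: if adjacent groups share a fraction O of their width W, the stride is W·(1-O), and covering a range of length N with G groups forces W = N / (1 + (1-O)·(G-1)). A back-of-envelope sketch (our own reconstruction, not the crate's internal code):

```rust
// Reconstructed group-boundary arithmetic: width grows with overlap,
// stride shrinks, and the last group always ends at the range end.
fn group_bounds(n: u64, groups: u64, overlap: f64) -> Vec<(u64, u64)> {
    let width = (n as f64 / (1.0 + (1.0 - overlap) * (groups - 1) as f64)).ceil() as u64;
    let stride = ((width as f64) * (1.0 - overlap)) as u64;
    (0..groups)
        .map(|g| {
            let start = g * stride;
            (start, (start + width).min(n))
        })
        .collect()
}

// Matches the overlap=0.25 example: (0,31), (23,54), (46,77), (69,100)
assert_eq!(group_bounds(100, 4, 0.25), vec![(0, 31), (23, 54), (46, 77), (69, 100)]);
// Disjoint case: (0,25), (25,50), (50,75), (75,100)
assert_eq!(group_bounds(100, 4, 0.0), vec![(0, 25), (25, 50), (50, 75), (75, 100)]);
// Full overlap: every group spans the whole range
assert_eq!(group_bounds(100, 4, 1.0), vec![(0, 100); 4]);
```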
The iterator supports hierarchical key spaces common in databases:
```text
group_i  = {prefix_i:k   | k ∈ range_i }   # Keys grouped by prefix
group_ik = {prefix_i:k:j | j ∈ range_ik}   # Fields within key k
```
Example:
```text
user:0 → {field:0, field:1, field:2}   # user:0 has 3 fields
user:1 → {field:0, field:1}            # user:1 has 2 fields
hash:0 → {name:0, name:1, name:2}
```
Model this with multi-dimensional iteration:
```rust
use iterators_rs::{MultiDimIterator, Dist};

// 1000 keys × 100 fields each
let iter = MultiDimIterator::new(vec![
    (0..1000, Dist::Zipfian { skew: 0.99 }), // key dimension (hot keys)
    (0..100, Dist::Uniform),                 // field dimension
]);
```

Run the example workloads:
```sh
# Database workload patterns (KV, hash, vector, time-series, graph)
cargo run --example workload_demo --release

# Advanced simulations (hot/cold tiers, GT protection, multi-threaded)
cargo run --example benchmark_simulations --release
```

The `workload_demo` example demonstrates common database benchmark patterns:
| Scenario | Description | Key Features |
|---|---|---|
| 1. Simple KV | Prefill → overwrite → get | Sequential + Zipfian |
| 2. Hash Keys | 3M keys × 100 fields | 2D iteration |
| 3. Multi-threaded | Disjoint prefill, overlapping queries | split_into_groups |
| 4. Delete Cycle | Prefill → delete → rewrite → query | Bitmap filtering |
| 5. Vector Search | 1M vectors × 128 dims | Batch insert, hotspot |
| 6. Time-Series | Append-heavy with range queries | Latest + Exponential |
| 7. YCSB Workloads | Standard A-F patterns | Various distributions |
| 8. MT Vector Search | Concurrent insert/search/delete | Atomic claiming |
| 9. Graph DB | Vertices + edges, traversals | Power-law (Zipfian 1.2) |
| 10. Benchmark Suite | Bulk → warmup → steady → spike → cleanup | Mixed patterns |
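As one concrete illustration of row 7, a YCSB workload A style loop (50% reads, 50% updates over a Zipfian-skewed key space) can be sketched with `DistState` plus a coin flip from `fastrand`; the read/update branches below are placeholders for real storage calls:

```rust
use iterators_rs::{Dist, DistState};

let records: u64 = 1_000_000;
let mut keys = DistState::new(Dist::Zipfian { skew: 0.99 }, records, 42);
let mut rng = fastrand::Rng::with_seed(7);

for _ in 0..1_000 {
    let key = keys.sample();
    // 50/50 read/update mix, as in YCSB workload A
    let op = if rng.bool() { "read" } else { "update" };
    println!("{op} key {key}");
}
```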
The `benchmark_simulations` example covers advanced scenarios with bitmap state management:
| Simulation | Description | Key Features |
|---|---|---|
| 1. Hot/Cold Tiers | 1%/9%/90% tier distribution | Hotspot distribution |
| 2. Vector GT Protection | 50K ground-truth vectors never deleted | with_secondary_filter |
| 3. MT GT Protection | Multi-threaded with protected entries | Split preserves filters |
| 4. Continuous Cycles | Load/delete/query steady-state | Atomic next_and_update |
| 5. Tiered Storage | L0/L1/L2 with promotion tracking | Multiple bitmaps |
Run benchmarks with:
```sh
cargo bench
```

Benchmarks cover:
- PRNG performance (fastrand)
- SmallVec vs Vec allocation
- Distribution sampling
- Nested distribution performance
- Bitmap-filtered iteration
See LICENSE file for details.