Conversation
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
Introduces a new hash-sorted-map crate implementing the first part of an insertion-only HashSortedMap with SIMD-accelerated group scanning and overflow chaining, plus accompanying benchmarks and design documentation.
Changes:
- Adds core
HashSortedMapimplementation (insert/get/entry API) with growth and overflow chaining. - Adds platform-specific group scanning primitives (SSE2/NEON/scalar) and a basic test suite.
- Adds a Criterion benchmarks crate and initial README/optimization docs.
Show a summary per file
| File | Description |
|---|---|
| crates/hash-sorted-map/src/lib.rs | Exposes group_ops and hash_sorted_map modules. |
| crates/hash-sorted-map/src/hash_sorted_map.rs | Implements the HashSortedMap data structure, growth logic, and Entry API, plus tests. |
| crates/hash-sorted-map/src/group_ops.rs | Adds SIMD/scalar helpers for ctrl-byte matching and mask iteration. |
| crates/hash-sorted-map/benchmarks/performance.rs | Adds Criterion benchmarks comparing multiple map implementations. |
| crates/hash-sorted-map/benchmarks/lib.rs | Benchmark utilities (trigram generation, identity hasher). |
| crates/hash-sorted-map/benchmarks/Cargo.toml | Defines the benchmarks crate and its dependencies. |
| crates/hash-sorted-map/README.md | Documents motivation/design and includes benchmark results + running instructions. |
| crates/hash-sorted-map/OPTIMIZATIONS.md | Design/optimization analysis and rationale. |
| crates/hash-sorted-map/Cargo.toml | Declares the new hash-sorted-map crate metadata. |
| Cargo.toml | Adds the benchmarks crate to the workspace members. |
Copilot's findings
Tip
Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Comments suppressed due to low confidence (3)
crates/hash-sorted-map/src/hash_sorted_map.rs:560
insert_after_growtreatsInsertion::NeedsOverflowas unreachable after a resize, but it can still happen for adversarial / extremely-colliding hash distributions (the table may still need overflow groups, even immediately after growing). Instead ofunreachable!, handleNeedsOverflowby allocating/linking an overflow group (with the same capacity check as the normal insert path) or by retrying via the normalinsert_hashedlogic.
// After grow, the new primary group for `key` cannot be full (see
// function docs), and the key wasn't in the table before grow.
FindResult::Vacant(Insertion::NeedsOverflow { .. }) | FindResult::Found(_) => {
unreachable!("post-grow walk must hit an empty slot")
}
crates/hash-sorted-map/README.md:90
- The benchmark invocation doesn’t match the benchmark that’s defined in
benchmarks/Cargo.toml([[bench]] name = "performance").cargo bench --bench hashmap_insertwill fail; update the README to the correct command (likelycargo bench -p hash-sorted-map-benchmarks --bench performance, or run from thebenchmarks/crate).
```sh
cargo bench --bench hashmap_insert
**crates/hash-sorted-map/src/hash_sorted_map.rs:367**
* `insert_for_grow` allocates overflow groups without checking `self.num_groups` against `self.groups.len()`. Under heavy hash collisions during rehash, `new_gi` can reach `self.groups.len()` and the subsequent indexing will panic. Consider mirroring the capacity check used in `insert_hashed` (e.g., trigger another `grow()` / allocate more groups) so growth remains panic-free even for pathological hash distributions.
} else {
let new_gi = self.num_groups as usize;
self.groups[gi].overflow = new_gi as u32;
self.num_groups += 1;
group = &mut self.groups[new_gi];
break;
</details>
- **Files reviewed:** 10/10 changed files
- **Comments generated:** 3
| │ Vec<Group<K,V>> where each Group (AoS): │ | ||
| │ { ctrl: [u8; 8], keys: [MaybeUninit<K>; 8], │ | ||
| │ values: [MaybeUninit<V>; 8], overflow: u32 } │ | ||
| │ │ | ||
| │ • Overflow chaining (linked groups) │ | ||
| │ • 8-byte groups with NEON/SSE2/scalar SIMD scan │ | ||
| │ • EMPTY / FULL tag states only (insertion-only, no deletion) │ |
There was a problem hiding this comment.
This architecture diagram hard-codes 8-slot groups (ctrl: [u8; 8], keys: ...; 8], etc.) and says “8-byte groups with NEON/SSE2”, but the implementation uses GROUP_SIZE = 16 on x86_64. Please update the documentation to reflect the 8-or-16 group size (or describe it as GROUP_SIZE).
| [dependencies] | ||
| hash-sorted-map = { path = ".." } | ||
| criterion = "0.7" | ||
| rand = "0.9" | ||
| rustc-hash = "2" | ||
| ahash = "0.8" | ||
| hashbrown = "0.15" | ||
| foldhash = "0.1" | ||
| fnv = "1" |
There was a problem hiding this comment.
The workspace appears to standardize on rand = "0.10" and criterion = "0.8" in other benchmark/test crates, but this benchmarks crate uses criterion = "0.7" and rand = "0.9". This will pull in multiple versions and may break compilation if the rand APIs used in benchmarks/lib.rs expect the 0.10 API. Consider aligning these versions with the rest of the repository.
| group = &mut self.groups[overflow as usize]; | ||
| } else { | ||
| let new_gi = self.num_groups as usize; | ||
| self.groups[gi].overflow = new_gi as u32; |
There was a problem hiding this comment.
insert_for_grow links a newly allocated overflow group from self.groups[gi] (the primary group index) instead of from the current tail group. If the chain already had overflow groups, this will overwrite the primary’s overflow pointer and break the chain. Update this to set overflow on the current group being extended (the one whose overflow was NO_OVERFLOW).
This issue also appears in the following locations of the same file:
- line 362
- line 556
| self.groups[gi].overflow = new_gi as u32; | |
| group.overflow = new_gi as u32; |
jorendorff
left a comment
There was a problem hiding this comment.
This is cool.
Can I see the merge/iteration operations you have in mind, before stamping this?
| scanning the group. Gives a direct hit on most inserts at low load. | ||
| - **SIMD group scanning** — uses NEON on aarch64, SSE2 on x86\_64, and a | ||
| scalar fallback elsewhere to scan 8–16 control bytes in parallel. | ||
| - **AoS group layout** — each group stores its control bytes, keys, and values |
There was a problem hiding this comment.
Suggest linking Wikipedia here: https://en.wikipedia.org/wiki/AoS_and_SoA#Array_of_structures_of_arrays
Wikipedia calls this AoSoA (since each Group is a struct of arrays) or "tiled AoS"
| } | ||
|
|
||
| pub fn with_capacity_and_hasher(capacity: usize, hash_builder: S) -> Self { | ||
| let adjusted = capacity.checked_mul(8).unwrap_or(usize::MAX) / 7; |
There was a problem hiding this comment.
I think we need to adjust more than that, because the expected load factor is not as high as hashbrown.
I did a few simulations and we need to multiply by about 1.30 here, or 1.42 if GROUP_SIZE is 8. That is assuming alloc_groups overallocates by a further 1/8 and that we want with_capacity to be an honest effort to accommodate that many entries without growing.
There was a problem hiding this comment.
👍
(I had copilot compute those numbers as well, or at least something similar, just forgot to change the code)
Co-authored-by: Jason Orendorff <jorendorff@github.com>
|
Here is the very simple sorting of each group: In the ported version, we will need to introduce a new type, since the hash map will be broken after the sorting. Merging is implemented here: We could provide a helper function in the rust crate for this. Feel free to merge/make changes, since I will be OOO tomorrow. |
In a subsequent step, merging functions and sorted iterators will be added.