Skip to content

First part of a HashSortedMap#107

Open
aneubeck wants to merge 16 commits intomainfrom
aneubeck/prefixmap
Open

First part of a HashSortedMap#107
aneubeck wants to merge 16 commits intomainfrom
aneubeck/prefixmap

Conversation

@aneubeck
Copy link
Copy Markdown
Collaborator

In a subsequent step, merging functions and sorted iterators will be added.

Copilot AI review requested due to automatic review settings April 29, 2026 13:25
@aneubeck aneubeck requested a review from a team as a code owner April 29, 2026 13:25
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Introduces a new hash-sorted-map crate implementing the first part of an insertion-only HashSortedMap with SIMD-accelerated group scanning and overflow chaining, plus accompanying benchmarks and design documentation.

Changes:

  • Adds core HashSortedMap implementation (insert/get/entry API) with growth and overflow chaining.
  • Adds platform-specific group scanning primitives (SSE2/NEON/scalar) and a basic test suite.
  • Adds a Criterion benchmarks crate and initial README/optimization docs.
Show a summary per file
File Description
crates/hash-sorted-map/src/lib.rs Exposes group_ops and hash_sorted_map modules.
crates/hash-sorted-map/src/hash_sorted_map.rs Implements the HashSortedMap data structure, growth logic, and Entry API, plus tests.
crates/hash-sorted-map/src/group_ops.rs Adds SIMD/scalar helpers for ctrl-byte matching and mask iteration.
crates/hash-sorted-map/benchmarks/performance.rs Adds Criterion benchmarks comparing multiple map implementations.
crates/hash-sorted-map/benchmarks/lib.rs Benchmark utilities (trigram generation, identity hasher).
crates/hash-sorted-map/benchmarks/Cargo.toml Defines the benchmarks crate and its dependencies.
crates/hash-sorted-map/README.md Documents motivation/design and includes benchmark results + running instructions.
crates/hash-sorted-map/OPTIMIZATIONS.md Design/optimization analysis and rationale.
crates/hash-sorted-map/Cargo.toml Declares the new hash-sorted-map crate metadata.
Cargo.toml Adds the benchmarks crate to the workspace members.

Copilot's findings

Tip

Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comments suppressed due to low confidence (3)

crates/hash-sorted-map/src/hash_sorted_map.rs:560

  • insert_after_grow treats Insertion::NeedsOverflow as unreachable after a resize, but it can still happen for adversarial / extremely-colliding hash distributions (the table may still need overflow groups, even immediately after growing). Instead of unreachable!, handle NeedsOverflow by allocating/linking an overflow group (with the same capacity check as the normal insert path) or by retrying via the normal insert_hashed logic.
        // After grow, the new primary group for `key` cannot be full (see
        // function docs), and the key wasn't in the table before grow.
        FindResult::Vacant(Insertion::NeedsOverflow { .. }) | FindResult::Found(_) => {
            unreachable!("post-grow walk must hit an empty slot")
        }

crates/hash-sorted-map/README.md:90

  • The benchmark invocation doesn’t match the benchmark that’s defined in benchmarks/Cargo.toml ([[bench]] name = "performance"). cargo bench --bench hashmap_insert will fail; update the README to the correct command (likely cargo bench -p hash-sorted-map-benchmarks --bench performance, or run from the benchmarks/ crate).

```sh
cargo bench --bench hashmap_insert
**crates/hash-sorted-map/src/hash_sorted_map.rs:367**
* `insert_for_grow` allocates overflow groups without checking `self.num_groups` against `self.groups.len()`. Under heavy hash collisions during rehash, `new_gi` can reach `self.groups.len()` and the subsequent indexing will panic. Consider mirroring the capacity check used in `insert_hashed` (e.g., trigger another `grow()` / allocate more groups) so growth remains panic-free even for pathological hash distributions.
        } else {
            let new_gi = self.num_groups as usize;
            self.groups[gi].overflow = new_gi as u32;
            self.num_groups += 1;
            group = &mut self.groups[new_gi];
            break;
</details>


- **Files reviewed:** 10/10 changed files
- **Comments generated:** 3


Comment on lines +34 to +40
│ Vec<Group<K,V>> where each Group (AoS): │
│ { ctrl: [u8; 8], keys: [MaybeUninit<K>; 8], │
│ values: [MaybeUninit<V>; 8], overflow: u32 } │
│ │
│ • Overflow chaining (linked groups) │
│ • 8-byte groups with NEON/SSE2/scalar SIMD scan │
│ • EMPTY / FULL tag states only (insertion-only, no deletion) │
Copy link

Copilot AI Apr 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This architecture diagram hard-codes 8-slot groups (ctrl: [u8; 8], keys: ...; 8], etc.) and says “8-byte groups with NEON/SSE2”, but the implementation uses GROUP_SIZE = 16 on x86_64. Please update the documentation to reflect the 8-or-16 group size (or describe it as GROUP_SIZE).

Copilot uses AI. Check for mistakes.
Comment on lines +15 to +23
[dependencies]
hash-sorted-map = { path = ".." }
criterion = "0.7"
rand = "0.9"
rustc-hash = "2"
ahash = "0.8"
hashbrown = "0.15"
foldhash = "0.1"
fnv = "1"
Copy link

Copilot AI Apr 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The workspace appears to standardize on rand = "0.10" and criterion = "0.8" in other benchmark/test crates, but this benchmarks crate uses criterion = "0.7" and rand = "0.9". This will pull in multiple versions and may break compilation if the rand APIs used in benchmarks/lib.rs expect the 0.10 API. Consider aligning these versions with the rest of the repository.

Copilot uses AI. Check for mistakes.
group = &mut self.groups[overflow as usize];
} else {
let new_gi = self.num_groups as usize;
self.groups[gi].overflow = new_gi as u32;
Copy link

Copilot AI Apr 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

insert_for_grow links a newly allocated overflow group from self.groups[gi] (the primary group index) instead of from the current tail group. If the chain already had overflow groups, this will overwrite the primary’s overflow pointer and break the chain. Update this to set overflow on the current group being extended (the one whose overflow was NO_OVERFLOW).

This issue also appears in the following locations of the same file:

  • line 362
  • line 556
Suggested change
self.groups[gi].overflow = new_gi as u32;
group.overflow = new_gi as u32;

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor

@jorendorff jorendorff left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is cool.

Can I see the merge/iteration operations you have in mind, before stamping this?

Comment thread crates/hash-sorted-map/src/lib.rs Outdated
scanning the group. Gives a direct hit on most inserts at low load.
- **SIMD group scanning** — uses NEON on aarch64, SSE2 on x86\_64, and a
scalar fallback elsewhere to scan 8–16 control bytes in parallel.
- **AoS group layout** — each group stores its control bytes, keys, and values
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggest linking Wikipedia here: https://en.wikipedia.org/wiki/AoS_and_SoA#Array_of_structures_of_arrays

Wikipedia calls this AoSoA (since each Group is a struct of arrays) or "tiled AoS"

}

pub fn with_capacity_and_hasher(capacity: usize, hash_builder: S) -> Self {
let adjusted = capacity.checked_mul(8).unwrap_or(usize::MAX) / 7;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need to adjust more than that, because the expected load factor is not as high as hashbrown.

I did a few simulations and we need to multiply by about 1.30 here, or 1.42 if GROUP_SIZE is 8. That is assuming alloc_groups overallocates by a further 1/8 and that we want with_capacity to be an honest effort to accommodate that many entries without growing.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍
(I had copilot compute those numbers as well, or at least something similar, just forgot to change the code)

Co-authored-by: Jason Orendorff <jorendorff@github.com>
@aneubeck
Copy link
Copy Markdown
Collaborator Author

Here is the very simple sorting of each group:
https://github.com/github/blackbird/blob/aneubeck/reindex4/crates/index/src/indexer/prefix_map.rs#L288
and here the sorted iterator:
https://github.com/github/blackbird/blob/aneubeck/reindex4/crates/index/src/indexer/prefix_map.rs#L390

In the ported version, we will need to introduce a new type, since the hash map will be broken after the sorting.
But one might want to iterate multiple times.

Merging is implemented here:
https://github.com/github/blackbird/blob/aneubeck/reindex4/crates/index/src/indexer/memory_index.rs#L191-L204

We could provide a helper function in the rust crate for this.

Feel free to merge/make changes, since I will be OOO tomorrow.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants