keychain: cache derived priv key in PubKeyECDH to avoid per-call DB txn #10779

Open
erickcestari wants to merge 2 commits into lightningnetwork:master from erickcestari:improve-onion-processing-performance
@erickcestari
Collaborator

@erickcestari erickcestari commented Apr 29, 2026

PubKeyECDH.ECDH is on the hot path of every onion-message hop: the
sphinx router holds it as its onionKey and calls into it from
ProcessOnionPacket, DecryptBlindedHopData, and NextEphemeral, which
makes three calls per message in the deliver path. On the local-keyring
backend, each call is dispatched to BtcWalletKeyRing.ECDH, which calls
DerivePrivKey. Because nodeKeyDesc.PubKey is set in
server.go, the in-memory waddrmgr fast path is skipped and the
call falls through to a walletdb.Update read-write transaction.
Each transaction forces a bbolt meta-page write and an fdatasync per
ECDH operation.

The wrapped key descriptor is immutable for the lifetime of a
PubKeyECDH instance, so this PR derives the private key once at
construction time and reuses it for every subsequent ECDH operation.
When the underlying ring cannot expose private keys (e.g. a remote
signer), cachedPriv stays nil and calls are forwarded to the ring
as before. The in-memory ECDH math is also extracted into a shared
helper so PubKeyECDH (after caching) and PrivKeyECDH go through
the same code path.
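
The caching pattern described above can be illustrated with a minimal, self-contained sketch. It is not the PR's code: the keyRing interface, countingRing, newPubKeyECDH, and demo are hypothetical names, the signature given to ecdhFromPriv is assumed, and the standard library's crypto/ecdh with P-256 stands in for lnd's secp256k1 types so the example runs without external dependencies.

```go
package main

import (
	"crypto/ecdh"
	"crypto/rand"
	"crypto/sha256"
	"errors"
	"fmt"
)

// keyRing is a hypothetical stand-in for the wallet-backed key ring.
// DerivePriv models the expensive derivation (a DB transaction in the
// PR description); ECDH models forwarding the operation to the ring.
type keyRing interface {
	DerivePriv() (*ecdh.PrivateKey, error)
	ECDH(pub *ecdh.PublicKey) ([32]byte, error)
}

var errNoPrivKey = errors.New("ring cannot expose private keys")

// countingRing counts calls so the effect of caching is observable.
type countingRing struct {
	priv        *ecdh.PrivateKey
	exportable  bool // false models a remote signer
	derivations int
	ringCalls   int
}

func (r *countingRing) DerivePriv() (*ecdh.PrivateKey, error) {
	r.derivations++
	if !r.exportable {
		return nil, errNoPrivKey
	}
	return r.priv, nil
}

func (r *countingRing) ECDH(pub *ecdh.PublicKey) ([32]byte, error) {
	r.ringCalls++
	return ecdhFromPriv(r.priv, pub)
}

// ecdhFromPriv is the shared in-memory helper (assumed signature):
// SHA-256 over the raw shared secret.
func ecdhFromPriv(priv *ecdh.PrivateKey, pub *ecdh.PublicKey) ([32]byte, error) {
	secret, err := priv.ECDH(pub)
	if err != nil {
		return [32]byte{}, err
	}
	return sha256.Sum256(secret), nil
}

// pubKeyECDH derives the private key once and reuses it afterwards.
type pubKeyECDH struct {
	ring       keyRing
	cachedPriv *ecdh.PrivateKey // nil when the ring cannot expose keys
}

func newPubKeyECDH(ring keyRing) *pubKeyECDH {
	p := &pubKeyECDH{ring: ring}
	if priv, err := ring.DerivePriv(); err == nil {
		p.cachedPriv = priv
	}
	return p
}

func (p *pubKeyECDH) ECDH(pub *ecdh.PublicKey) ([32]byte, error) {
	if p.cachedPriv != nil {
		return ecdhFromPriv(p.cachedPriv, pub)
	}
	return p.ring.ECDH(pub) // fall back to the ring as before
}

// demo performs three ECDH calls (one per call site named in the PR) and
// reports derivation count, ring call count, and whether results matched.
func demo(exportable bool) (int, int, bool, error) {
	curve := ecdh.P256()
	local, err := curve.GenerateKey(rand.Reader)
	if err != nil {
		return 0, 0, false, err
	}
	remote, err := curve.GenerateKey(rand.Reader)
	if err != nil {
		return 0, 0, false, err
	}
	ring := &countingRing{priv: local, exportable: exportable}
	k := newPubKeyECDH(ring)
	want, err := ecdhFromPriv(remote, local.PublicKey())
	if err != nil {
		return 0, 0, false, err
	}
	match := true
	for i := 0; i < 3; i++ {
		got, err := k.ECDH(remote.PublicKey())
		if err != nil {
			return 0, 0, false, err
		}
		match = match && got == want
	}
	return ring.derivations, ring.ringCalls, match, nil
}

func main() {
	for _, exportable := range []bool{true, false} {
		d, r, ok, err := demo(exportable)
		fmt.Println(exportable, d, r, ok, err)
	}
}
```

With an exportable ring, the three calls share a single derivation and never reach the ring. With a non-exportable ring, the one derivation attempt at construction fails and all three calls are forwarded.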

Measured impact

Microbenchmark (BenchmarkOnionMessagePipeline, deliver path, real BtcWalletKeyRing + keychain.PubKeyECDH, single goroutine)

| Metric | Before | After | Delta |
| --- | --- | --- | --- |
| ns/op | 3,008,424 | 726,040 | 4.14× faster |
| B/op | 110,574 | 27,552 | 4.0× smaller |
| allocs/op | 1,177 | 141 | 8.3× fewer |
| msg/s/goroutine | ~332 | ~1,377 | +315% |

The 1,036 alloc/op reduction is the structural fingerprint of three
bbolt R/W transactions per message disappearing, confirming the three
keychain-routed ECDH call sites collapse to the in-memory fast path.
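
For readers who want to re-derive the table, the following sketch recomputes the deltas from the raw benchmark figures reported in this PR. The benchResult type and msgsPerSecond helper are illustrative names, not part of lnd.

```go
package main

import "fmt"

// benchResult holds the three figures reported by `go test -bench`.
type benchResult struct {
	nsPerOp     float64
	bytesPerOp  float64
	allocsPerOp float64
}

// msgsPerSecond converts a per-operation latency in nanoseconds into
// single-goroutine throughput.
func msgsPerSecond(nsPerOp float64) float64 {
	return 1e9 / nsPerOp
}

func main() {
	// Figures copied from the raw benchmark output in this PR.
	before := benchResult{nsPerOp: 3008424, bytesPerOp: 110574, allocsPerOp: 1177}
	after := benchResult{nsPerOp: 726040, bytesPerOp: 27552, allocsPerOp: 141}

	fmt.Printf("latency speedup: %.2fx\n", before.nsPerOp/after.nsPerOp)
	fmt.Printf("bytes ratio:     %.2fx\n", before.bytesPerOp/after.bytesPerOp)
	fmt.Printf("allocs ratio:    %.2fx\n", before.allocsPerOp/after.allocsPerOp)
	fmt.Printf("allocs removed:  %.0f per op\n", before.allocsPerOp-after.allocsPerOp)
	fmt.Printf("throughput:      %.0f -> %.0f msg/s\n",
		msgsPerSecond(before.nsPerOp), msgsPerSecond(after.nsPerOp))
}
```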

Raw go test output

Before:

goos: linux
goarch: amd64
pkg: github.com/lightningnetwork/lnd/onionmessage
cpu: 12th Gen Intel(R) Core(TM) i5-12500H
BenchmarkOnionMessagePipeline        799           3008424 ns/op          110574 B/op       1177 allocs/op
PASS
ok      github.com/lightningnetwork/lnd/onionmessage    2.732s

1 / 3008424 ns ≈ 332 msg/s

After:

goos: linux
goarch: amd64
pkg: github.com/lightningnetwork/lnd/onionmessage
cpu: 12th Gen Intel(R) Core(TM) i5-12500H
BenchmarkOnionMessagePipeline       3278            726040 ns/op           27552 B/op        141 allocs/op
PASS
ok      github.com/lightningnetwork/lnd/onionmessage    2.480s

1 / 726040 ns ≈ 1,377 msg/s

End-to-end (regtest forwarding scenario, monitor-drops.sh alice, sustained ~28 s window, 0.0% drops in both runs)

| Metric | Before | After | Delta |
| --- | --- | --- | --- |
| recv/s (avg) | ~69 | ~870 | ~12.6× faster |
| Sustained range | 63–73 recv/s | 830–930 recv/s | |

Raw monitor-drops.sh alice output

Before:

elapsed   time                     recv/s      fwd/s fwd_fail/s   drop_b/s  drop_nb/s   drop%         Δrecv        Δfwd   Δfwd_fail     Δdrop_b    Δdrop_nb
1.0    s  2026-04-29 15:13:59       69.38      69.38       0.00       0.00       0.00    0.0%             70           70            0            0            0
2.0    s  2026-04-29 15:14:00       62.96      62.96       0.00       0.00       0.00    0.0%            134          134            0            0            0
3.0    s  2026-04-29 15:14:01       73.23      73.23       0.00       0.00       0.00    0.0%            208          208            0            0            0
4.0    s  2026-04-29 15:14:02       71.13      71.13       0.00       0.00       0.00    0.0%            280          280            0            0            0
5.1    s  2026-04-29 15:14:03       69.98      69.98       0.00       0.00       0.00    0.0%            351          351            0            0            0
6.1    s  2026-04-29 15:14:04       69.74      69.74       0.00       0.00       0.00    0.0%            422          422            0            0            0
7.1    s  2026-04-29 15:14:05       69.06      69.06       0.00       0.00       0.00    0.0%            492          492            0            0            0
8.1    s  2026-04-29 15:14:06       68.96      68.96       0.00       0.00       0.00    0.0%            562          562            0            0            0
9.1    s  2026-04-29 15:14:08       70.14      70.14       0.00       0.00       0.00    0.0%            634          634            0            0            0
10.1   s  2026-04-29 15:14:09       73.03      73.03       0.00       0.00       0.00    0.0%            708          708            0            0            0
11.2   s  2026-04-29 15:14:10       69.21      69.21       0.00       0.00       0.00    0.0%            778          778            0            0            0
12.2   s  2026-04-29 15:14:11       65.07      65.07       0.00       0.00       0.00    0.0%            844          844            0            0            0
13.2   s  2026-04-29 15:14:12       71.36      71.36       0.00       0.00       0.00    0.0%            916          916            0            0            0
14.2   s  2026-04-29 15:14:13       67.07      67.07       0.00       0.00       0.00    0.0%            984          984            0            0            0
15.2   s  2026-04-29 15:14:14       69.21      69.21       0.00       0.00       0.00    0.0%           1054         1054            0            0            0
16.2   s  2026-04-29 15:14:15       67.06      67.06       0.00       0.00       0.00    0.0%           1122         1122            0            0            0
17.2   s  2026-04-29 15:14:16       67.91      67.91       0.00       0.00       0.00    0.0%           1191         1191            0            0            0
18.3   s  2026-04-29 15:14:17       69.07      69.07       0.00       0.00       0.00    0.0%           1261         1261            0            0            0
19.3   s  2026-04-29 15:14:18       66.01      66.01       0.00       0.00       0.00    0.0%           1328         1328            0            0            0
20.3   s  2026-04-29 15:14:19       63.13      63.13       0.00       0.00       0.00    0.0%           1392         1392            0            0            0
21.3   s  2026-04-29 15:14:20       68.81      68.81       0.00       0.00       0.00    0.0%           1462         1462            0            0            0
22.3   s  2026-04-29 15:14:21       69.26      69.26       0.00       0.00       0.00    0.0%           1532         1532            0            0            0
23.3   s  2026-04-29 15:14:22       68.12      68.12       0.00       0.00       0.00    0.0%           1601         1601            0            0            0
24.3   s  2026-04-29 15:14:23       71.15      71.15       0.00       0.00       0.00    0.0%           1673         1673            0            0            0
25.3   s  2026-04-29 15:14:24       67.01      67.01       0.00       0.00       0.00    0.0%           1741         1741            0            0            0
26.4   s  2026-04-29 15:14:25       70.26      70.26       0.00       0.00       0.00    0.0%           1813         1813            0            0            0
27.4   s  2026-04-29 15:14:26       67.10      67.10       0.00       0.00       0.00    0.0%           1881         1881            0            0            0

After:

./lightning-scenarios/monitor-drops.sh alice
elapsed   time                     recv/s      fwd/s fwd_fail/s   drop_b/s  drop_nb/s   drop%         Δrecv        Δfwd   Δfwd_fail     Δdrop_b    Δdrop_nb
1.0    s  2026-04-29 14:11:30      865.12     865.12       0.00       0.00       0.00    0.0%            873          873            0            0            0
2.0    s  2026-04-29 14:11:31      930.86     930.86       0.00       0.00       0.00    0.0%           1813         1813            0            0            0
3.0    s  2026-04-29 14:11:32      875.45     875.45       0.00       0.00       0.00    0.0%           2698         2698            0            0            0
4.0    s  2026-04-29 14:11:33      872.01     872.01       0.00       0.00       0.00    0.0%           3579         3579            0            0            0
5.1    s  2026-04-29 14:11:34      883.89     883.89       0.00       0.00       0.00    0.0%           4472         4472            0            0            0
6.1    s  2026-04-29 14:11:35      867.01     867.01       0.00       0.00       0.00    0.0%           5348         5348            0            0            0
7.1    s  2026-04-29 14:11:36      836.89     836.89       0.00       0.00       0.00    0.0%           6194         6194            0            0            0
8.1    s  2026-04-29 14:11:37      905.46     905.46       0.00       0.00       0.00    0.0%           7109         7109            0            0            0
9.1    s  2026-04-29 14:11:38      878.38     878.38       0.00       0.00       0.00    0.0%           7997         7997            0            0            0
10.1   s  2026-04-29 14:11:39      888.11     888.11       0.00       0.00       0.00    0.0%           8894         8894            0            0            0
11.1   s  2026-04-29 14:11:40      845.53     845.53       0.00       0.00       0.00    0.0%           9747         9747            0            0            0
12.1   s  2026-04-29 14:11:41      885.22     885.22       0.00       0.00       0.00    0.0%          10640        10640            0            0            0
13.1   s  2026-04-29 14:11:42      874.99     874.99       0.00       0.00       0.00    0.0%          11524        11524            0            0            0
14.1   s  2026-04-29 14:11:43      889.28     889.28       0.00       0.00       0.00    0.0%          12423        12423            0            0            0
15.2   s  2026-04-29 14:11:44      832.37     832.37       0.00       0.00       0.00    0.0%          13264        13264            0            0            0
16.2   s  2026-04-29 14:11:45      878.71     878.71       0.00       0.00       0.00    0.0%          14152        14152            0            0            0
17.2   s  2026-04-29 14:11:46      863.66     863.66       0.00       0.00       0.00    0.0%          15025        15025            0            0            0
18.2   s  2026-04-29 14:11:47      882.69     882.69       0.00       0.00       0.00    0.0%          15916        15916            0            0            0
19.2   s  2026-04-29 14:11:48      831.92     831.92       0.00       0.00       0.00    0.0%          16757        16757            0            0            0
20.2   s  2026-04-29 14:11:49      835.85     835.85       0.00       0.00       0.00    0.0%          17602        17602            0            0            0
21.2   s  2026-04-29 14:11:50      832.98     832.98       0.00       0.00       0.00    0.0%          18444        18444            0            0            0
22.2   s  2026-04-29 14:11:51      841.40     841.40       0.00       0.00       0.00    0.0%          19293        19293            0            0            0
23.2   s  2026-04-29 14:11:52      887.89     887.89       0.00       0.00       0.00    0.0%          20190        20190            0            0            0
24.2   s  2026-04-29 14:11:53      845.71     845.71       0.00       0.00       0.00    0.0%          21044        21044            0            0            0
25.3   s  2026-04-29 14:11:54      864.52     864.52       0.00       0.00       0.00    0.0%          21916        21916            0            0            0
26.3   s  2026-04-29 14:11:55      885.28     885.28       0.00       0.00       0.00    0.0%          22809        22809            0            0            0
27.3   s  2026-04-29 14:11:56      868.00     868.00       0.00       0.00       0.00    0.0%          23685        23685            0            0            0
28.3   s  2026-04-29 14:11:57      888.71     888.71       0.00       0.00       0.00    0.0%          24583        24583            0            0            0
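
The averages in the end-to-end table can be cross-checked against the last row of each run, assuming the Δrecv column is cumulative since the start of the run (which the monotonically increasing values suggest). The avgRate helper below is an illustrative name.

```go
package main

import "fmt"

// avgRate returns the mean message rate over a window, given the
// cumulative count at the end of the window and its length in seconds.
func avgRate(total, elapsedSeconds float64) float64 {
	return total / elapsedSeconds
}

func main() {
	// Final rows of the two monitor-drops.sh runs above.
	before := avgRate(1881, 27.4)
	after := avgRate(24583, 28.3)

	fmt.Printf("before: %.1f recv/s\n", before)
	fmt.Printf("after:  %.1f recv/s\n", after)
	fmt.Printf("ratio:  %.2fx\n", after/before)
}
```

The ratio of the unrounded averages lands slightly above the table's ~12.6×, which is computed from the rounded ~870 and ~69.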

The end-to-end gain exceeds the microbenchmark gain because the
bench's b.TempDir() sits on tmpfs, where fdatasync is effectively
a no-op. Prod runs on durable storage, so every bbolt read-write
transaction pays a real fdatasync syscall. The bbolt single-writer
lock also serializes the three per-message ECDH txns against
concurrent wallet writers (chain sync, channel state), so the
syscall overhead dominates. Caching the derived priv key removes
those transactions entirely, which is why prod sees a much larger
speedup than the in-memory microbench.
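
A rough way to see how much a per-write sync costs on a given filesystem is sketched below. It is only an approximation of what bbolt does: Go's (*os.File).Sync issues fsync rather than fdatasync on Linux, and timeSyncedWrites is a hypothetical helper, not part of lnd or bbolt. Pointing it at a tmpfs directory and then at a directory on durable storage should show the kind of gap described above, though the actual numbers depend entirely on the machine.

```go
package main

import (
	"fmt"
	"os"
	"time"
)

// timeSyncedWrites rewrites one 4 KiB page n times in a temp file under
// dir (the default temp dir when dir is empty), optionally calling Sync
// after every write, and returns the total elapsed time.
func timeSyncedWrites(dir string, n int, sync bool) (time.Duration, error) {
	f, err := os.CreateTemp(dir, "sync-probe-*")
	if err != nil {
		return 0, err
	}
	// Deferred calls run in reverse order: close first, then remove.
	defer os.Remove(f.Name())
	defer f.Close()

	page := make([]byte, 4096)
	start := time.Now()
	for i := 0; i < n; i++ {
		if _, err := f.WriteAt(page, 0); err != nil {
			return 0, err
		}
		if sync {
			if err := f.Sync(); err != nil {
				return 0, err
			}
		}
	}
	return time.Since(start), nil
}

func main() {
	// Optional first argument: directory to probe.
	dir := ""
	if len(os.Args) > 1 {
		dir = os.Args[1]
	}

	const n = 200
	for _, sync := range []bool{false, true} {
		d, err := timeSyncedWrites(dir, n, sync)
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		fmt.Printf("sync=%v: %v per write\n", sync, d/time.Duration(n))
	}
}
```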

Syscall profile during the onion-message blast

Before:

Screenshot From 2026-04-28 16-55-44

After:

Screenshot From 2026-04-29 13-57-19

Thanks to @morehouse for reporting the performance issues with LND onion message routing.

@gemini-code-assist

Summary of Changes

Hello, I'm Gemini Code Assist [1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request optimizes the performance of onion-message processing by caching the derived private key within the PubKeyECDH structure. Previously, every ECDH operation triggered a read-write database transaction, causing significant overhead due to disk I/O. By caching the key at construction time, these operations are now performed in-memory, resulting in substantial throughput improvements in production environments.

Highlights

  • Performance Optimization: Implemented caching of the derived private key in PubKeyECDH to eliminate redundant read-write database transactions on the onion-message hot path.
  • Code Consolidation: Extracted ECDH math into a shared helper function, ecdhFromPriv, to unify the code path for both cached and non-cached ECDH operations.
  • Benchmarking: Added a new benchmark, BenchmarkOnionMessagePipeline, to measure the performance impact of the changes, demonstrating a significant reduction in allocations and latency.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@github-actions github-actions Bot added the severity-critical Requires expert review - security/consensus critical label Apr 29, 2026
@github-actions

🔴 PR Severity: CRITICAL

Classified from changed files | 1 non-test file | 59 lines changed

🔴 Critical (1 file)
  • keychain/ecdh.go - Private key derivation and management (keychain/*)
🟢 Low / Excluded (1 file)
  • onionmessage/actor_bench_test.go - Test file, excluded from severity classification

Analysis

This PR modifies keychain/ecdh.go, which falls under the keychain/* package responsible for private key derivation and management — a CRITICAL security-sensitive area. Changes here affect ECDH operations which are foundational to cryptographic operations in the node (e.g., key exchange, onion encryption).

The second file (onionmessage/actor_bench_test.go) is a benchmark test file and is excluded from severity classification per policy.

Severity bump check: Only 1 non-test file and 59 lines changed — no bump conditions triggered.

Expert review is required given the cryptographic sensitivity of the keychain package.


To override, add a severity-override-{critical,high,medium,low} label.
@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request optimizes ECDH operations by caching derived private keys in PubKeyECDH when a local secret key ring is used, which avoids redundant database transactions. It also refactors shared ECDH logic into a new helper function and adds a comprehensive benchmark for the onion message pipeline. I have no feedback to provide.

@erickcestari erickcestari self-assigned this Apr 29, 2026
Each call to PubKeyECDH.ECDH on the local-keyring path opened a
read-write wallet DB transaction to derive the private key, forcing a
bbolt meta-page write and fdatasync per call. Since the wrapped key
descriptor is immutable for the lifetime of a PubKeyECDH, derive the
private key once at construction and reuse it for every subsequent
ECDH operation. When the underlying ring cannot expose private keys
(e.g. a remote signer), cachedPriv stays nil and calls are forwarded
to the ring as before.

Also extract the in-memory ECDH computation into a shared helper used
by both PubKeyECDH (after caching) and PrivKeyECDH.
@erickcestari erickcestari force-pushed the improve-onion-processing-performance branch from dacf4a8 to 6579171 Compare April 29, 2026 19:59
@yyforyongyu yyforyongyu added the enhancement Improvements to existing features / behaviour label Apr 30, 2026
@yyforyongyu yyforyongyu added this to the v0.22.0 milestone Apr 30, 2026
Measures the end-to-end deliver path through OnionPeerActor.Receive
using the same wiring as server.go: a real BtcWalletKeyRing backing
keychain.PubKeyECDH as the sphinx router's onion key. This gives a
prod-shaped per-goroutine throughput ceiling and exposes the cost
(or savings) of changes to the keychain ECDH path on a realistic
load, not a synthetic in-memory shortcut.
@erickcestari erickcestari force-pushed the improve-onion-processing-performance branch from 6579171 to 25b406c Compare April 30, 2026 13:00
@saubyk saubyk added this to lnd v0.22 Apr 30, 2026
@github-project-automation github-project-automation Bot moved this to Backlog in lnd v0.22 Apr 30, 2026
@saubyk saubyk moved this from Backlog to In progress in lnd v0.22 Apr 30, 2026