@TheBlueMatt
Collaborator

Reading `ChannelMonitor`s on startup is one of the slowest parts of
LDK initialization. Now that we have an async `KVStore`, there's no
need for that: we can simply parallelize their loading, which we do
here.

Sadly, because Rust futures are pretty unergonomic, we have to add
some `unsafe {}` here, but arguing it's fine is relatively
straightforward.
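
For readers less familiar with hand-polling futures, here is a minimal std-only sketch of what "parallelize their loading" can look like on a single task: every pending read is polled in turn until all have completed. `FakeRead`, `fake_read` and `join_all_blocking` are hypothetical names invented for this illustration, not LDK APIs, and this safe version boxes each future, which the PR itself may well avoid.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// A waker that does nothing; good enough for a busy-polling demo driver.
struct NoopWaker;
impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Hypothetical stand-in for an async store read: reports `Pending` for
/// `polls_left` polls, then yields its payload.
struct FakeRead {
    polls_left: u32,
    payload: Option<Vec<u8>>,
}

impl Future for FakeRead {
    type Output = Vec<u8>;
    fn poll(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<Vec<u8>> {
        if self.polls_left == 0 {
            Poll::Ready(self.payload.take().expect("polled after completion"))
        } else {
            self.polls_left -= 1;
            Poll::Pending
        }
    }
}

/// Convenience constructor returning the boxed, pinned form used below.
fn fake_read(polls_left: u32, payload: Vec<u8>) -> Pin<Box<dyn Future<Output = Vec<u8>>>> {
    Box::pin(FakeRead { polls_left, payload: Some(payload) })
}

/// Drives all futures on the current thread, polling each unfinished one in
/// turn until every future has completed. Results keep the input order.
fn join_all_blocking<T>(futs: Vec<Pin<Box<dyn Future<Output = T>>>>) -> Vec<T> {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut pending: Vec<Option<Pin<Box<dyn Future<Output = T>>>>> =
        futs.into_iter().map(Some).collect();
    let mut results: Vec<Option<T>> = pending.iter().map(|_| None).collect();
    while results.iter().any(Option::is_none) {
        for (slot, result) in pending.iter_mut().zip(results.iter_mut()) {
            if let Some(fut) = slot.as_mut() {
                if let Poll::Ready(value) = fut.as_mut().poll(&mut cx) {
                    *result = Some(value);
                    *slot = None; // never poll a completed future again
                }
            }
        }
    }
    results.into_iter().map(|r| r.unwrap()).collect()
}
```

All reads make progress together, so the total wait is roughly that of the slowest read rather than the sum of all of them. Any CPU work done inside the futures still runs on one thread, though.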

Based on #4146 just cause I don't want to resolve conflicts.

@ldk-reviews-bot

ldk-reviews-bot commented Oct 7, 2025

👋 Hi! This PR is now in draft status.
I'll wait to assign reviewers until you mark it as ready for review.
Just convert it out of draft status when you're ready for review!

@codecov

codecov bot commented Oct 7, 2025

Codecov Report

❌ Patch coverage is 65.62500% with 132 lines in your changes missing coverage. Please review.
✅ Project coverage is 89.29%. Comparing base (e42e74e) to head (0503f6b).

Files with missing lines Patch % Lines
lightning/src/util/persist.rs 49.39% 37 Missing and 5 partials ⚠️
lightning-background-processor/src/lib.rs 0.00% 20 Missing ⚠️
lightning-block-sync/src/gossip.rs 0.00% 14 Missing ⚠️
lightning/src/util/native_async.rs 56.25% 14 Missing ⚠️
lightning-block-sync/src/rest.rs 27.77% 13 Missing ⚠️
lightning-block-sync/src/rpc.rs 27.77% 13 Missing ⚠️
lightning/src/util/test_utils.rs 25.00% 6 Missing ⚠️
lightning-persister/src/fs_store.rs 90.90% 4 Missing ⚠️
lightning/src/util/async_poll.rs 85.00% 3 Missing ⚠️
lightning-net-tokio/src/lib.rs 60.00% 2 Missing ⚠️
... and 1 more
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4147      +/-   ##
==========================================
- Coverage   89.33%   89.29%   -0.04%     
==========================================
  Files         180      180              
  Lines      138055   138188     +133     
  Branches   138055   138188     +133     
==========================================
+ Hits       123326   123393      +67     
- Misses      12122    12191      +69     
+ Partials     2607     2604       -3     
Flag Coverage Δ
fuzzing 33.53% <7.94%> (-0.04%) ⬇️
tests 88.68% <65.62%> (-0.04%) ⬇️


@ldk-reviews-bot ldk-reviews-bot requested a review from jkczyz October 7, 2025 23:28
@TheBlueMatt TheBlueMatt force-pushed the 2025-10-parallel-reads branch from 4a773c9 to 31fbd16 Compare October 8, 2025 10:54
@TheBlueMatt TheBlueMatt marked this pull request as draft October 8, 2025 11:40
@TheBlueMatt TheBlueMatt removed the request for review from jkczyz October 8, 2025 11:40
@TheBlueMatt
Collaborator Author

This doesn't actually work. Monitor loading is (at least for a semi-local disk) CPU-bound, so we really need to spawn each monitor load as its own task rather than running them all in one task.

@TheBlueMatt TheBlueMatt force-pushed the 2025-10-parallel-reads branch 3 times, most recently from 64412a5 to 2c807b9 Compare October 9, 2025 17:58
@TheBlueMatt TheBlueMatt marked this pull request as ready for review October 9, 2025 19:54
@TheBlueMatt TheBlueMatt force-pushed the 2025-10-parallel-reads branch from 2c807b9 to 0a2547a Compare October 9, 2025 19:54
@TheBlueMatt
Collaborator Author

This requires GATs, which require an MSRV bump (to 1.65), but we're planning on doing that soon anyway.

@ldk-reviews-bot ldk-reviews-bot requested a review from jkczyz October 9, 2025 19:55
@TheBlueMatt TheBlueMatt force-pushed the 2025-10-parallel-reads branch from 0a2547a to a4a778c Compare October 12, 2025 23:02
@ldk-reviews-bot

🔔 1st Reminder

Hey @jkczyz! This PR has been waiting for your review.
Please take a look when you have a chance. If you're unable to review, please let us know so we can find another reviewer.

@TheBlueMatt TheBlueMatt force-pushed the 2025-10-parallel-reads branch from a4a778c to 3346cba Compare October 13, 2025 13:27
Now that it has the same MSRV as everything else in the workspace,
it doesn't need to live on its own.

Now that our MSRV is at least 1.68, we can use the `pin!` macro to
avoid having to `Box` various futures, avoiding some allocations,
especially in `lightning-net-tokio`, where the boxing happened in a
tight loop.
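
As a rough illustration of the `pin!` change (a sketch, not code from this PR), the hypothetical `block_on` helper below pins its future to the stack with `std::pin::pin!` where older code would have needed `Box::pin`:

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// A waker that does nothing; good enough for a busy-polling demo driver.
struct NoopWaker;
impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Polls `fut` to completion on the current thread by busy-polling.
fn block_on<F: Future>(fut: F) -> F::Output {
    // `pin!` pins the future in this stack frame. Before Rust 1.68 the safe,
    // std-only way to get a `Pin<...>` here was `Box::pin(fut)`, which
    // allocates on the heap.
    let mut fut = pin!(fut);
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}

/// Trivial async function used to exercise `block_on`.
async fn add(a: u32, b: u32) -> u32 {
    a + b
}
```

Calling `block_on(add(2, 3))` runs the future without any heap allocation for the pin itself.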

Now that our MSRV is 1.75, we can return `impl Trait` from trait
methods. Here we use this to clean up `KVStore` methods, dropping
the `Pin<Box<dyn ...>>` we had to use to have trait methods return
a concrete type. Sadly, there are two places where we can't drop a
`Box::pin` until we switch to edition 2024.
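
A minimal sketch of the pattern, assuming a trimmed-down store trait: `AsyncStore`, `MemStore` and `block_on` are hypothetical names for this example, and the real `KVStore` signatures differ.

```rust
use std::collections::HashMap;
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical, trimmed-down stand-in for an async key-value store trait.
trait AsyncStore {
    /// With Rust 1.75+ the method can return an anonymous future type
    /// directly, where older code needed something like
    /// `Pin<Box<dyn Future<Output = Option<Vec<u8>>> + Send>>`.
    fn read(&self, key: &str) -> impl Future<Output = Option<Vec<u8>>> + Send;
}

/// Simple in-memory implementation used only for this illustration.
struct MemStore {
    entries: HashMap<String, Vec<u8>>,
}

impl AsyncStore for MemStore {
    fn read(&self, key: &str) -> impl Future<Output = Option<Vec<u8>>> + Send {
        // Look the value up eagerly so the future owns everything it needs.
        let value = self.entries.get(key).cloned();
        async move { value }
    }
}

/// A waker that does nothing; good enough for a busy-polling demo driver.
struct NoopWaker;
impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Polls `fut` to completion on the current thread by busy-polling.
fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = pin!(fut);
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}
```

The caller gets a statically-typed future back, so no allocation or dynamic dispatch is needed per call.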

Now that our MSRV is 1.75, we can return `impl Trait` from trait
methods. Here we use this to clean up `lightning-block-sync` trait
methods, dropping the `Pin<Box<dyn ...>>` we had to use to have
trait methods return a concrete type.

Now that our MSRV is 1.75, we can return `impl Trait` from trait
methods. Here we use this to clean up `lightning` crate trait
methods, dropping the `Pin<Box<dyn ...>>`/`AsyncResult` we had to
use to have trait methods return a concrete type.

Now that we have an MSRV that supports returning `impl Trait` in
trait methods, we can use it to avoid the `Box<dyn ...>` we had
spewed all over our BOLT 11 invoice serialization.
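
The same language feature applies to non-future return types. The sketch below assumes the boxed values were iterators, which the commit message does not state; `ToNibbles` and `Bytes` are hypothetical names, not the actual invoice serialization traits.

```rust
/// Hypothetical serialization trait: yields a value as a lazy stream of
/// 4-bit groups. Before Rust 1.75 a trait method like this would typically
/// have returned `Box<dyn Iterator<Item = u8> + '_>`.
trait ToNibbles {
    fn nibbles(&self) -> impl Iterator<Item = u8> + '_;
}

/// Newtype wrapper over raw bytes, used only for this illustration.
struct Bytes(Vec<u8>);

impl ToNibbles for Bytes {
    fn nibbles(&self) -> impl Iterator<Item = u8> + '_ {
        // Split each byte into its high and low nibble without allocating.
        self.0.iter().flat_map(|&b| [b >> 4, b & 0x0f])
    }
}
```

For example, `Bytes(vec![0xab]).nibbles()` yields `0x0a` followed by `0x0b`.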
@TheBlueMatt TheBlueMatt force-pushed the 2025-10-parallel-reads branch from 3346cba to ba55b75 Compare November 10, 2025 15:54
Reading `ChannelMonitor`s on startup is one of the slowest parts of
LDK initialization. Now that we have an async `KVStore`, there's no
need for that: we can simply parallelize their loading, which we do
here.

Sadly, because Rust futures are pretty unergonomic, we have to add
some `unsafe {}` here, but arguing it's fine is relatively
straightforward.
@TheBlueMatt TheBlueMatt force-pushed the 2025-10-parallel-reads branch from ba55b75 to 5e45e0e Compare November 10, 2025 16:10
@TheBlueMatt
Collaborator Author

Rebased on #4175, which avoids some further conflicts.

`tokio::spawn` can be used either to spawn a forever-running
background task or to spawn a task which gets `poll`ed
independently and eventually returns a result which the callsite
wants.

In LDK, we have only ever needed the first, and thus didn't bother
defining a return type for `FutureSpawner::spawn`. However, in the
next commit we'll start using `FutureSpawner` in a context where we
actually do want the spawned future's result. Thus, here, we add a
result output to `FutureSpawner::spawn`, mirroring the
`tokio::spawn` API.
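
One way such an API could be shaped is sketched below, assuming a generic associated type for the returned handle. `Spawner`, `ThreadSpawner`, `ThreadHandle` and `block_on` are hypothetical names for this example; the real `FutureSpawner` signature may differ, and a real implementation would hand the future to an async runtime rather than a fresh OS thread.

```rust
use std::future::Future;
use std::pin::{pin, Pin};
use std::sync::{Arc, Mutex};
use std::task::{Context, Poll, Wake, Waker};
use std::thread;

/// Hypothetical spawner trait whose `spawn` hands back the task's output.
trait Spawner {
    /// Generic associated type: a handle that is itself a future resolving
    /// to the spawned task's output.
    type Handle<O: Send + 'static>: Future<Output = O>;
    fn spawn<O, F>(&self, fut: F) -> Self::Handle<O>
    where
        O: Send + 'static,
        F: Future<Output = O> + Send + 'static;
}

/// State shared between the spawned task and its handle.
struct Shared<O> {
    result: Option<O>,
    waker: Option<Waker>,
}

/// Handle returned by `ThreadSpawner`; resolves once the task has finished.
struct ThreadHandle<O>(Arc<Mutex<Shared<O>>>);

impl<O> Future for ThreadHandle<O> {
    type Output = O;
    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<O> {
        let mut shared = self.0.lock().unwrap();
        match shared.result.take() {
            Some(out) => Poll::Ready(out),
            None => {
                // Remember who to wake once the task stores its result.
                shared.waker = Some(cx.waker().clone());
                Poll::Pending
            }
        }
    }
}

/// Wakes a parked thread; used by `block_on` below.
struct ThreadWaker(thread::Thread);
impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

/// Polls `fut` to completion, parking the thread while it is pending.
fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = pin!(fut);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    loop {
        match fut.as_mut().poll(&mut cx) {
            Poll::Ready(out) => return out,
            Poll::Pending => thread::park(),
        }
    }
}

/// Demo spawner that runs each future to completion on its own OS thread.
struct ThreadSpawner;

impl Spawner for ThreadSpawner {
    type Handle<O: Send + 'static> = ThreadHandle<O>;
    fn spawn<O, F>(&self, fut: F) -> Self::Handle<O>
    where
        O: Send + 'static,
        F: Future<Output = O> + Send + 'static,
    {
        let shared = Arc::new(Mutex::new(Shared { result: None, waker: None }));
        let task_shared = Arc::clone(&shared);
        thread::spawn(move || {
            let out = block_on(fut);
            let mut guard = task_shared.lock().unwrap();
            guard.result = Some(out);
            if let Some(waker) = guard.waker.take() {
                waker.wake();
            }
        });
        ThreadHandle(shared)
    }
}
```

A caller that only wants a background task can drop the handle, while a caller that wants the result awaits it, which matches the two uses of `tokio::spawn` described above.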

`MonitorUpdatingPersister::read_all_channel_monitors_with_updates`
was made to do the IO operations in parallel in a previous commit;
however, in practice this doesn't provide material parallelism for
large routing nodes. Because deserializing `ChannelMonitor`s is the
bulk of the work (when IO operations are sufficiently fast), we end
up blocked in single-threaded work nearly the entire time.

Here, we add an alternative option - a new
`read_all_channel_monitors_with_updates_parallel` method which uses
the `FutureSpawner` to cause the deserialization operations to
proceed in parallel.
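
To make the CPU-bound point concrete, here is a small sketch that uses scoped OS threads in place of a `FutureSpawner`, which is a substitution made for this example. `decode` is a hypothetical stand-in for `ChannelMonitor` deserialization and does no real parsing.

```rust
use std::thread;

/// Hypothetical stand-in for deserialization: CPU-bound work over raw bytes.
fn decode(bytes: &[u8]) -> u64 {
    bytes
        .iter()
        .fold(0u64, |acc, b| acc.wrapping_mul(31).wrapping_add(*b as u64))
}

/// Decodes every blob on its own thread, returning results in input order.
fn decode_all_parallel(blobs: &[Vec<u8>]) -> Vec<u64> {
    thread::scope(|scope| {
        // Spawn all tasks first so they run concurrently...
        let handles: Vec<_> = blobs
            .iter()
            .map(|blob| scope.spawn(move || decode(blob)))
            .collect();
        // ...then collect the results in the original order.
        handles
            .into_iter()
            .map(|handle| handle.join().expect("decode task panicked"))
            .collect()
    })
}
```

Polling many futures from one task only overlaps waiting; separate tasks (threads here) are what let the decoding itself run on several cores.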

When reading `ChannelMonitor`s from a `MonitorUpdatingPersister` on
startup, we have to make sure to load any `ChannelMonitorUpdate`s
and re-apply them as well. For users of async persistence who don't
have any `ChannelMonitorUpdate`s (e.g. because they set
`maximum_pending_updates` to 0 or, in the future, we avoid
persisting updates for small `ChannelMonitor`s), this means two
round-trips to the storage backend, one to load the
`ChannelMonitor` and one to try to read the next
`ChannelMonitorUpdate` only to have it fail.

Instead, here, we use `KVStore::list` to fetch the list of stored
`ChannelMonitorUpdate`s, which for async `KVStore` users allows us
to list the updates in parallel with loading the `ChannelMonitor`
itself. Then we know exactly when to stop reading
`ChannelMonitorUpdate`s, including reading none if there are none
to read. This also avoids relying on `KVStore::read` correctly
returning `NotFound` in order to discover when to stop reading
`ChannelMonitorUpdate`s.
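
The idea of deriving the exact set of updates to read from a single listing can be sketched as follows. `MemStore` and `pending_update_ids` are hypothetical, the store is synchronous and in-memory, and the assumptions that update keys are decimal update IDs and that only IDs above the monitor's own need applying are made for this example.

```rust
use std::collections::BTreeMap;

/// Hypothetical in-memory store keyed by (namespace, key); this is not the
/// `KVStore` API, just enough structure to show the listing idea.
struct MemStore {
    entries: BTreeMap<(String, String), Vec<u8>>,
}

impl MemStore {
    /// Returns every key stored under `namespace`.
    fn list(&self, namespace: &str) -> Vec<String> {
        self.entries
            .keys()
            .filter(|(ns, _)| ns.as_str() == namespace)
            .map(|(_, key)| key.clone())
            .collect()
    }
}

/// Uses one `list` call to work out which updates must be read, in the order
/// they should be applied. An empty result means no further reads at all.
fn pending_update_ids(store: &MemStore, namespace: &str, monitor_update_id: u64) -> Vec<u64> {
    let mut ids: Vec<u64> = store
        .list(namespace)
        .iter()
        // Assumption: update keys are decimal update IDs; skip anything else.
        .filter_map(|key| key.parse::<u64>().ok())
        // Assumption: only updates newer than the monitor need re-applying.
        .filter(|id| *id > monitor_update_id)
        .collect();
    // Sort numerically: as strings, "10" would sort before "9".
    ids.sort_unstable();
    ids
}
```

Because the full set of IDs is known up front, the corresponding reads can all be issued at once instead of probing one key at a time until a read fails.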

When reading `ChannelMonitor`s from a `MonitorUpdatingPersister` on
startup, we have to make sure to load any `ChannelMonitorUpdate`s
and re-apply them as well. Now that we know which
`ChannelMonitorUpdate`s to load from `list`ing the entries from the
`KVStore`, we can parallelize the reads themselves, which we do
here.

Now, loading all `ChannelMonitor`s from an async `KVStore` requires
only three full RTTs - one to list the set of `ChannelMonitor`s,
one to both fetch the `ChannelMonitor` and list the set of
`ChannelMonitorUpdate`s, and one to fetch all the
`ChannelMonitorUpdate`s (with the last one skipped when there are
no `ChannelMonitorUpdate`s to read).
@TheBlueMatt TheBlueMatt force-pushed the 2025-10-parallel-reads branch from 5e45e0e to 0503f6b Compare November 10, 2025 18:44
@TheBlueMatt
Collaborator Author

Marking draft until the parent is merged.

@TheBlueMatt TheBlueMatt marked this pull request as draft November 11, 2025 21:05