[NFC] Refactor delta debugging to use coroutines#8657
Conversation
Add a generator utility in a new support/coroutine.h header and use it to refactor away the callback in the delta debugging utility. Now the utility is a struct providing access to the test and working sets as well as `accept()` and `reject()` methods that cause the test and working sets to be updated appropriately. Rather than being refactored into an explicit state machine, the implementation of the delta debugging algorithm remains readable straight-line code the does a co_yield whenever it is ready to return control to the user. It co_yields a pointer to local state object that exposes all the information that the delta debugging utility exposes in its public API. This local object stays alive across suspend points. When the delta debugging algorithm is complete, we suspend the coroutine one final time and make sure never to resume it, which ensures the state remains alive and available after delta debugging has finished. It will ultimately be cleaned up when the outer `DeltaDebugger` struct is cleaned up.
|
This is an alternative to #8651. It took some iteration, but I'm pretty happy with how it turned out. The arcane C++ coroutine nonsense is pretty well encapsulated in coroutine.h and the delta debugging implementation is essentially just as readable as before. |
|
What about compiler support for coroutines - https://en.cppreference.com/cpp/compiler_support suggests clang on windows may not be done yet, but perhaps that page is out of date? |
|
Looks like this is still a problem :( https://clang.llvm.org/cxx_status.html#:~:text=Clang%2017-,Coroutines,-P0912R5. But I think the ABI problem only affects 32-bit x86, and it doesn't look like we do any 32-bit releases at all. Maybe this is good enough for us? Here's the relevant LLVM bug: llvm/llvm-project#59382 |
|
If it only affects 32-bit windows I think we are ok here. But the wording in those links is a little ambiguous to me if that is the case? |
|
The opening post on the LLVM issue says "The 32-bit Windows ABI passes objects of non-trivially-copyable class type by value on the stack" and never mentions any other problems, so I think we're good. (And I asked an expert internally and he confirmed this understanding.) |
kripken
left a comment
There was a problem hiding this comment.
Sounds good about compiler support!
| if (working.empty()) { | ||
| finished = true; | ||
| co_yield &state; | ||
| co_return; |
There was a problem hiding this comment.
Why does this need to yield before returning? Isn't the output in the right place already?
There was a problem hiding this comment.
A tricky thing here is that we need to prevent the coroutine from ever returning because we depend on its local state staying live for the lifetime of the outer DeltaDebugger. So we yield before returning here and below, then make sure we never resume the coroutine again.
There was a problem hiding this comment.
I see, thanks, that's what I was missing. Please document that, it is indeed tricky...
There was a problem hiding this comment.
Or, could we std::move the final state from the coroutine?
There was a problem hiding this comment.
Unfortunately there is not a great way to do that. This is by far the simplest approach I tried. Will add comments.
| return false; | ||
| } | ||
| PromiseType* await_resume() const noexcept { return promise; } | ||
| }; |
There was a problem hiding this comment.
Maybe add some docs for these classes? I'm not really sure what "GetPromise" means or does just from this code (which seems so generic as to do almost nothing but store a "promise"..?)
There was a problem hiding this comment.
All these methods are well-known to the compiler and configure the suspending and resuming behavior of our Generator utility. Unfortunately this is just a bunch of unavoidable boilerplate that doesn't do anything interesting (or comprehensible to non-experts). I'll document the interesting user-exposed methods, but for most of this there's not anything more to say than // Unavoidable boilerplate.
|
I tested current LLVM
Ideally, it would be nice if Binaryen could take a goal that it would build wherever LLVM builds. I.e. target Clang-13 at the moment? Testing briefly, I see that Binaryen no longer builds with Visual Studio 2019, or older Visual Studio 2022 on Windows. Do you know what macOS user version requirements coroutines would bring? Would it only be a build-time requirement to have Xcode 16, or would it also mean that e.g. macOS 14 would be required on user systems at minimum? Are the coroutines support only needed for the wasm-reduce test case reducer tool? If so, would it be possible to restrict coroutines use to wasm-reduce, and e.g. have a |
|
On the runtime vs compile time point, I believe all new C++ features are build time only. i.e. don't effect the version of macOS you can target with Or at least, binaryen hardcodes |
|
BTW, thanks for taking the time to time to do that comprehensive survey on real machines. I imagine that must have take a while, and its great to have hard evidence. |
In general this is stronger than LLVM's own policy, since LLVM may build on compilers older than their documented minimum requirements. I would strongly object to adopting stricter requirements than LLVM has. LLVM's policy for updating their minimum requirements is that they can adopt LLVM and GCC versions as recent as three years old as minimum requirements. There is an RFC in flight that would raise the minimum to clang 15, released on September 6, 2022, which corresponds to XCode version to 14.3, which was released March 30, 2023. The first LLVM version to have good enough support for coroutines is 17, which was released on September 9, 2023. We are only four months away from this meeting LLVM's three-year-old requirement. The first XCode version to be based on clang 17 is XCode 16.0, released on September 16, 2024. From my point of view, folks building emsdk will have to drop support for older compilers eventually as LLVM updates its minimum requirements, so this is already a problem such folks have to be prepared to solve. The question is how soon folks will have to solve it. I would very much like to use new language features, but I'd like to understand the other side of that trade off better. |
|
Thanks for detailing the links. I have two main motivations with backwards compatibility:
All that being said, I don't think we want to conclude that "shouldn't use C++20 since it's not available on a Mac from 2013". I wonder instead, if we should plan on introducing an intermediate bootstrap compiler build into e.g. the Emsdk ecosystem, so that these old systems could first acquire a newer compiler, and then use that newer compiler to compile LLVM & Binaryen. When that mechanism is in place, we would be guaranteed to have C++20 support available for example in the Emsdk scripts. That would resolve emscripten-core/emsdk#1704. So:
|
|
@juj, thanks for all the details. It's not urgent that we start using C++20, but having a plan to allow us to use it soon would be great. Adding a bootstrap step to the emsdk build sounds like a good solution to me, and I would be happy to land #8668 and #8669 and hold off on introducing new C++20 usage while you develop that. cc @sbc100 and @dschuff for their thoughts as well. |
|
Yeah, I like the idea of a bootstrap compiler in emsdk. Do we want/need this on platforms other than MacOS? If no, then we maybe still want to figure out minimum supported versions of MSVC and gcc/clang/libc++/libstdc++. If yes, then I guess it will just take more work to build the bootstrap (including a libc++ if necessary) but I think it should be doable. |
|
.. deleted my comment.. sorry I didn't finish reading you earlier reply @juj. |
Add a generator utility in a new support/coroutine.h header and use it to refactor away the callback in the delta debugging utility. Now the utility is a struct providing access to the test and working sets as well as
accept()andreject()methods that cause the test and working sets to be updated appropriately. Rather than being refactored into an explicit state machine, the implementation of the delta debugging algorithm remains readable straight-line code the does a co_yield whenever it is ready to return control to the user. It co_yields a pointer to local state object that exposes all the information that the delta debugging utility exposes in its public API. This local object stays alive across suspend points. When the delta debugging algorithm is complete, we suspend the coroutine one final time and make sure never to resume it, which ensures the state remains alive and available after delta debugging has finished. It will ultimately be cleaned up when the outerDeltaDebuggerstruct is cleaned up.