fix: resolve lost-wakeup races in InMemoryTaskStore.wait_for_update#2536

Open
blackwell-systems wants to merge 1 commit into modelcontextprotocol:main from blackwell-systems:fix/task-store-lost-wakeup

Conversation

@blackwell-systems

Summary

This PR fixes two races in InMemoryTaskStore.wait_for_update that can cause indefinite hangs when polling task status.

Race 1: Concurrent waiters

wait_for_update overwrites _update_events[task_id] with a fresh event on every call. If two clients poll the same task, the second overwrites the first's event. notify_update sets only the latest event, so the first waiter hangs forever.

Fix: Use a list of events per task_id. Each waiter appends its own event. notify_update sets and removes all events for the task.
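A minimal sketch of the list-of-events approach (class and method names here are condensed assumptions for illustration, not the store's full interface):

```python
import asyncio
from collections import defaultdict


class TaskStoreSketch:
    """Condensed sketch: one event per active waiter, keyed by task_id."""

    def __init__(self) -> None:
        self._update_events: dict[str, list[asyncio.Event]] = defaultdict(list)

    async def wait_for_update(self, task_id: str) -> None:
        # Each waiter appends its own event instead of overwriting a shared one.
        event = asyncio.Event()
        self._update_events[task_id].append(event)
        try:
            await event.wait()
        finally:
            # Clean up our event if we were cancelled before being notified.
            if event in self._update_events[task_id]:
                self._update_events[task_id].remove(event)

    def notify_update(self, task_id: str) -> None:
        # Wake *all* waiters for this task, not just the most recent one.
        for event in self._update_events.pop(task_id, []):
            event.set()


async def demo() -> bool:
    store = TaskStoreSketch()
    a = asyncio.ensure_future(store.wait_for_update("t1"))
    b = asyncio.ensure_future(store.wait_for_update("t1"))
    await asyncio.sleep(0)  # let both waiters register their events
    store.notify_update("t1")
    await asyncio.wait_for(asyncio.gather(a, b), timeout=1)
    return True
```

The `finally` cleanup matters: a waiter cancelled by a client timeout should not leave a dead event in the list.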

Race 2: Notify before wait

If update_task calls notify_update before any wait_for_update is active, the signal is lost (no event exists to set). The next wait_for_update creates a fresh event and waits for an update that already happened.

Fix: Track pending updates in a set. notify_update adds to the set when no waiters exist. wait_for_update checks and consumes the pending flag before creating an event.
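The pending-set path can be sketched the same way (again with assumed, condensed names):

```python
import asyncio


class TaskStoreSketch:
    """Condensed sketch: a pending set covers notify-before-wait."""

    def __init__(self) -> None:
        self._update_events: dict[str, list[asyncio.Event]] = {}
        self._pending_updates: set[str] = set()

    async def wait_for_update(self, task_id: str) -> None:
        # Consume a pending notification first, so an update that landed
        # before we started waiting is not lost.
        if task_id in self._pending_updates:
            self._pending_updates.discard(task_id)
            return
        event = asyncio.Event()
        self._update_events.setdefault(task_id, []).append(event)
        await event.wait()

    def notify_update(self, task_id: str) -> None:
        waiters = self._update_events.pop(task_id, [])
        if not waiters:
            # No one is waiting yet; remember the update for the next waiter.
            self._pending_updates.add(task_id)
        for event in waiters:
            event.set()


async def demo() -> str:
    store = TaskStoreSketch()
    store.notify_update("t1")  # update arrives before any waiter exists
    await asyncio.wait_for(store.wait_for_update("t1"), timeout=1)
    return "woke"
```

Checking and consuming the flag before creating an event closes the window: a notification can no longer fall between "no event exists" and "waiter registered".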

Reproducer

Both races are reproducible with a 30-line script (included in issue #2535). Output before fix:

Race 1: concurrent waiters
  waiter a: HUNG
  waiter b: woke
  FAIL: lost wakeup

Race 2: notify before wait
  FAIL: signal lost, waiter hung forever

After fix:

Race 1: concurrent waiters
  waiter a: woke
  waiter b: woke
  PASS

Race 2: notify before wait
  PASS
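For reference, race 1 condenses to a few lines. This is a hypothetical distillation of the pre-fix behavior, not the actual script attached to issue #2535:

```python
import asyncio


class BuggyStore:
    """Hypothetical distillation of the pre-fix single-event behavior."""

    def __init__(self) -> None:
        self._update_events: dict[str, asyncio.Event] = {}

    async def wait_for_update(self, task_id: str) -> None:
        # Bug: every call overwrites the previous waiter's event.
        event = asyncio.Event()
        self._update_events[task_id] = event
        await event.wait()

    def notify_update(self, task_id: str) -> None:
        # Only the most recently stored event is set.
        event = self._update_events.pop(task_id, None)
        if event is not None:
            event.set()


async def demo() -> tuple[bool, bool]:
    store = BuggyStore()
    a = asyncio.ensure_future(store.wait_for_update("t1"))
    b = asyncio.ensure_future(store.wait_for_update("t1"))
    await asyncio.sleep(0)  # both register; b's event overwrites a's
    store.notify_update("t1")
    await asyncio.sleep(0.1)
    woke_a, woke_b = a.done(), b.done()
    a.cancel()  # waiter a is hung; clean it up before exiting
    return woke_a, woke_b
```

Waiter `a` never wakes because its event was silently replaced, matching the HUNG line in the before-fix output above.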

Test plan

  • Reproducer script confirms both races are fixed
  • All 217 existing task store tests pass (pytest tests/experimental/tasks/ -v)
  • 0 regressions

Fixes #2535

Two races in wait_for_update:

1. Concurrent waiters: second caller overwrites the first's event in
   _update_events[task_id], so the first waiter hangs forever.
   Fix: use a list of events per task_id so each waiter gets its own.

2. Notify before wait: if update_task completes before wait_for_update
   is called, the signal is lost because no event exists yet.
   Fix: track pending updates in a set; wait_for_update checks and
   consumes pending flags before creating an event.

Both races are reachable via task_result_handler.py:126 when multiple
clients poll the same task or when a task completes between status
checks.

Adds two tests: concurrent waiters and notify-before-wait.

Fixes modelcontextprotocol#2535
@blackwell-systems force-pushed the fix/task-store-lost-wakeup branch from 17b3ab2 to 065264d on May 2, 2026 at 15:01

Development

Successfully merging this pull request may close these issues.

InMemoryTaskStore.wait_for_update has lost-wakeup races (concurrent waiters, notify-before-wait)
