InMemoryTaskStore.wait_for_update has lost-wakeup races (concurrent waiters, notify-before-wait) #2535

@blackwell-systems

wait_for_update overwrites _update_events[task_id] with a fresh event on every call. This causes two races:

Race 1: Concurrent waiters

If two callers poll the same task simultaneously, the second overwrites the first's event. notify_update only sets the latest event, so the first waiter hangs forever.
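The overwrite can be shown in isolation with a simplified model. This is a sketch built only from the description above, not the SDK source: `ModelStore` is a hypothetical stand-in, and it uses `asyncio.Event` from the standard library in place of the store's real event type so that it runs without dependencies.

```python
import asyncio


class ModelStore:
    """Hypothetical model of the pattern described above (not the SDK code)."""

    def __init__(self) -> None:
        self._update_events: dict[str, asyncio.Event] = {}

    async def wait_for_update(self, task_id: str) -> None:
        # A fresh event on every call replaces any earlier waiter's event.
        event = asyncio.Event()
        self._update_events[task_id] = event
        await event.wait()

    def notify_update(self, task_id: str) -> None:
        # Only the most recently stored event is set.
        event = self._update_events.get(task_id)
        if event is not None:
            event.set()


async def demo() -> tuple[bool, bool]:
    store = ModelStore()
    a = asyncio.create_task(store.wait_for_update("t"))
    await asyncio.sleep(0)  # let waiter "a" store its event
    b = asyncio.create_task(store.wait_for_update("t"))
    await asyncio.sleep(0)  # waiter "b" overwrites it
    store.notify_update("t")
    done, pending = await asyncio.wait({a, b}, timeout=0.2)
    for task in pending:  # clean up the waiter that never woke
        task.cancel()
    await asyncio.gather(*pending, return_exceptions=True)
    return a in done, b in done


a_woke, b_woke = asyncio.run(demo())
print(f"a woke: {a_woke}, b woke: {b_woke}")
```

In this model, waiter "a" is left holding an event that nothing references anymore, which matches the hang seen in the reproducer.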

Race 2: Notify before wait

If update_task completes (calling notify_update) before wait_for_update is called, the signal is lost because no event exists yet. The waiter blocks until the next update, which never comes for a terminal task.

Both are reachable via task_result_handler.py:126, which calls _wait_for_task_update in a polling loop.
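One possible direction for a fix, sketched under assumptions: keep a per-task version counter, and replace the shared event on notify rather than on wait. Waiters pass the last version they observed, so a notification that arrived before the wait is detected immediately, and concurrent waiters all share the same event. `UpdateSignal` and the `seen_version` parameter are hypothetical and not part of the current `InMemoryTaskStore` API. The sketch uses `asyncio.Event` from the standard library to stay self-contained; the same set/wait pattern should carry over to `anyio.Event`.

```python
import asyncio


class UpdateSignal:
    """Hypothetical per-task signal; not part of the SDK."""

    def __init__(self) -> None:
        self._version = 0
        self._event = asyncio.Event()

    @property
    def version(self) -> int:
        return self._version

    def notify(self) -> None:
        # Bump the version, install a fresh event for future waiters,
        # then wake everyone parked on the old one.
        self._version += 1
        old, self._event = self._event, asyncio.Event()
        old.set()

    async def wait(self, seen_version: int) -> int:
        # Returns at once if an update already happened since seen_version.
        while self._version == seen_version:
            await self._event.wait()
        return self._version


async def demo() -> dict[str, bool]:
    signal = UpdateSignal()
    woke = {"a": False, "b": False, "late": False}

    async def waiter(name: str, seen: int) -> None:
        await signal.wait(seen)
        woke[name] = True

    async def scenario() -> None:
        # Race 1: two concurrent waiters share one event.
        a = asyncio.create_task(waiter("a", 0))
        b = asyncio.create_task(waiter("b", 0))
        await asyncio.sleep(0.01)
        signal.notify()
        await asyncio.gather(a, b)
        # Race 2: notify already happened; caller last saw version 0.
        await waiter("late", 0)

    await asyncio.wait_for(scenario(), timeout=2)
    return woke


result = asyncio.run(demo())
print(result)
```

The trade-off in this sketch is that callers must track the version they last saw. A narrower alternative for Race 2 alone would be for `wait_for_update` to return immediately when the task is already in a terminal state, though that would not address concurrent waiters.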

Reproducer

import anyio
from mcp.shared.experimental.tasks.in_memory_task_store import InMemoryTaskStore
from mcp.types import TaskMetadata

async def main():
    store = InMemoryTaskStore()
    task = await store.create_task(TaskMetadata())

    # Race 1: concurrent waiters
    woke = {"a": False, "b": False}
    async def waiter(name):
        await store.wait_for_update(task.task_id)
        woke[name] = True
    async def updater():
        await anyio.sleep(0.05)
        await store.update_task(task.task_id, status="completed")
    try:
        with anyio.fail_after(2):
            async with anyio.create_task_group() as tg:
                tg.start_soon(waiter, "a")
                await anyio.sleep(0.01)
                tg.start_soon(waiter, "b")
                tg.start_soon(updater)
    except TimeoutError:
        pass
    print(f"a: {'woke' if woke['a'] else 'HUNG'}, b: {'woke' if woke['b'] else 'HUNG'}")

    # Race 2: notify before wait
    store2 = InMemoryTaskStore()
    task2 = await store2.create_task(TaskMetadata())
    await store2.update_task(task2.task_id, status="completed")
    try:
        with anyio.fail_after(1):
            await store2.wait_for_update(task2.task_id)
            print("wait returned")
    except TimeoutError:
        print("HUNG: signal lost")

anyio.run(main)

Output:

a: HUNG, b: woke
HUNG: signal lost
