draft: MRTR (SEP-2322) lowlevel plumbing + handler-shape comparison #2322
Lowlevel plumbing for Multi Round-Trip Requests:

Types:
- IncompleteResult with result_type discriminator, input_requests, request_state
- InputRequest/InputResponse unions (elicitation, sampling, roots)
- input_responses + request_state fields on RequestParams

Server (lowlevel):
- on_call_tool return widened to include IncompleteResult

Session:
- send_request accepts TypeAdapter (overload) for union result parsing
- call_tool_mrtr() returns CallToolResult | IncompleteResult
- call_tool() stays narrow, raises on IncompleteResult with migration hint

Client:
- call_tool() drives MRTR retry loop internally: dispatches embedded input requests to elicitation/sampling/list_roots callbacks, retries with collected responses + echoed request_state
- max_mrtr_rounds bound (default 8)

The client-side delta from today's code is zero: elicitation_callback is the same function whether it fires from SSE push or MRTR retry.
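The retry loop described under "Client" can be sketched roughly as follows. This is a minimal illustration, not the SDK's code: the dataclasses are hypothetical stand-ins for the real types, `call_tool_with_retries` and the `send` transport are invented names, and keying callbacks by a `method` field on each input request is an assumption. It also assumes the round bound counts every request sent, which the commit message does not specify.

```python
import asyncio
from dataclasses import dataclass
from typing import Any, Awaitable, Callable, Optional, Union


# Hypothetical stand-ins for the SDK types; field names follow the commit message.
@dataclass
class IncompleteResult:
    input_requests: dict
    request_state: Optional[str] = None


@dataclass
class CallToolResult:
    text: str


Callback = Callable[[dict], Awaitable[Any]]
Transport = Callable[[dict], Awaitable[Union[CallToolResult, IncompleteResult]]]


async def call_tool_with_retries(
    send: Transport,
    arguments: dict,
    callbacks: dict,
    max_mrtr_rounds: int = 8,
) -> CallToolResult:
    """Send the call, answer embedded input requests, and retry until complete."""
    params: dict = {"arguments": arguments}
    for _ in range(max_mrtr_rounds):
        result = await send(params)
        if isinstance(result, CallToolResult):
            return result
        responses = {}
        for key, request in result.input_requests.items():
            callback: Optional[Callback] = callbacks.get(request["method"])
            if callback is None:
                raise RuntimeError(f"no callback registered for {request['method']}")
            responses[key] = await callback(request)
        # Retry with this round's answers and the server's state echoed back unchanged.
        params = {
            "arguments": arguments,
            "input_responses": responses,
            "request_state": result.request_state,
        }
    raise RuntimeError(f"tool call still incomplete after {max_mrtr_rounds} rounds")


# Toy server: asks for units once, then completes.
async def fake_server(params: dict) -> Union[CallToolResult, IncompleteResult]:
    if "input_responses" not in params:
        request = {"method": "elicitation/create", "message": "Which units?"}
        return IncompleteResult({"units": request}, request_state="round-1")
    units = params["input_responses"]["units"]
    return CallToolResult(f"{params['arguments']['location']}: 22°{units}")


async def elicitation_callback(request: dict) -> str:
    return "C"


final = asyncio.run(
    call_tool_with_retries(
        fake_server, {"location": "Oslo"}, {"elicitation/create": elicitation_callback}
    )
)
```

The callback has no knowledge of whether it was invoked from a pushed request or from an embedded one, which is the sense in which the client-side delta is zero.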
Python-SDK counterpart to typescript-sdk#1701. Seven ways to write the same weather-lookup tool so the diff between files is the argument.

SDK primitives (src/mcp/server/experimental/mrtr.py):
- MrtrCtx.once(): idempotency guard tracked in request_state (Option F)
- ToolBuilder: structural step decomposition; end_step runs exactly once regardless of round count (Option G)
- input_response(): sugar for the guard-first pattern
- sse_retry_shim(): Option A comparison artifact (pragma no-cover until LATEST_PROTOCOL_VERSION bumps past the MRTR gate)
- dispatch_by_version(): Option D comparison artifact

Option examples (examples/servers/mrtr-options/):
- E (degrade-only): the SDK default. MRTR-native; pre-MRTR gets a default or error. Both quadrant rows collapse here.
- A (SSE shim): SDK emulates retry over SSE. Safe re-entry, hidden loop.
- B (await shim): exception-based. UNSAFE: hidden double-execution above await. Not a ship target; for contrast.
- C (version branch): explicit if/else in handler body.
- D (dual handler): two functions, SDK picks by version.
- F (ctx.once): idempotency guard, opt-in per side-effect.
- G (ToolBuilder): no above-the-guard zone; end_step structurally unreachable until all elicitations complete.

The invariant test (tests/experimental/test_mrtr.py) parametrises E/F/G against the same Client + callback to prove identical wire behaviour: the server's internal choice doesn't leak. The footgun test measures audit_log count to prove F and G actually hold the guard (naive handler fires twice; F and G fire once).

Both F and G depend on request_state integrity. The demos use plain base64-JSON; a production SDK MUST HMAC-sign the blob.
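To make the Option F guard concrete, here is a minimal sketch of an idempotency guard whose bookkeeping travels in request_state. `OnceCtx`, `handle`, and the dict-shaped results are hypothetical illustrations, not the SDK's `MrtrCtx`; the state is unsigned base64-JSON as in the demos, so it is not suitable for production.

```python
import base64
import json
from typing import Callable, Optional


def encode_state(state: dict) -> str:
    # Plain base64-JSON, unsigned: demo only.
    return base64.urlsafe_b64encode(json.dumps(state).encode()).decode()


def decode_state(blob: Optional[str]) -> dict:
    return json.loads(base64.urlsafe_b64decode(blob)) if blob else {}


class OnceCtx:
    """Hypothetical stand-in for MrtrCtx: remembers which guarded keys already ran."""

    def __init__(self, request_state: Optional[str]) -> None:
        self._state = decode_state(request_state)
        self._state.setdefault("done", [])

    def once(self, key: str, fn: Callable[[], None]) -> None:
        if key in self._state["done"]:
            return  # an earlier round already ran this side effect
        fn()
        self._state["done"].append(key)

    def request_state(self) -> str:
        return encode_state(self._state)


audit_log: list = []


def handle(location: str, input_responses: Optional[dict], request_state: Optional[str]) -> dict:
    ctx = OnceCtx(request_state)
    ctx.once("audit", lambda: audit_log.append(location))  # guarded side effect
    if not input_responses or "units" not in input_responses:
        # Missing input: ask for it and hand the guard bookkeeping to the client.
        return {"incomplete": True, "request_state": ctx.request_state()}
    return {"text": f"{location}: 22°{input_responses['units']}"}


first = handle("Oslo", None, None)
second = handle("Oslo", {"units": "C"}, first["request_state"])
```

The handler body runs twice here, once per round, but the guarded call appends to `audit_log` only in the first round because the second round sees "audit" in the decoded state.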
Two standalone reference examples before the comparison deck:
- basic.py: the simple-tool equivalent for MRTR. One IncompleteResult, one retry. Comments walk through the two moves every MRTR handler makes: check input_responses, return IncompleteResult if missing. Runnable end-to-end against the in-memory Client.
- basic_multiround.py: the ADO-rules SEP example translated. Two cascading elicitation rounds with request_state carrying accumulated context so any server instance can handle any round. Shows the key gotcha: input_responses carries only the latest round's answers, not accumulated; anything that must survive goes in request_state.
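The multi-round gotcha can be illustrated with a small sketch. The handler name `create_rule`, the `project`/`rule_type` questions, and the dict-shaped results are hypothetical and not taken from basic_multiround.py; the point is only that each round folds its answers into request_state, because the next round's input_responses will not repeat them.

```python
import base64
import json
from typing import Optional


def encode_state(state: dict) -> str:
    return base64.urlsafe_b64encode(json.dumps(state).encode()).decode()


def decode_state(blob: Optional[str]) -> dict:
    return json.loads(base64.urlsafe_b64decode(blob)) if blob else {}


def create_rule(input_responses: Optional[dict], request_state: Optional[str]) -> dict:
    # Earlier answers arrive only through request_state; merge this round's on top.
    state = decode_state(request_state)
    state.update(input_responses or {})

    if "project" not in state:
        return {
            "input_requests": {"project": "Which project?"},
            "request_state": encode_state(state),
        }
    if "rule_type" not in state:
        # The second question depends on the first answer (cascading rounds).
        return {
            "input_requests": {"rule_type": f"Which rule type for {state['project']}?"},
            "request_state": encode_state(state),
        }
    return {"text": f"created {state['rule_type']} rule in {state['project']}"}


# Three independent calls; any server instance could serve any of them.
r1 = create_rule(None, None)
r2 = create_rule({"project": "alpha"}, r1["request_state"])
r3 = create_rule({"rule_type": "branch"}, r2["request_state"])
```

In the third call `input_responses` contains only `rule_type`; the project name survives solely because the second round wrote it into request_state.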
mrtr.py → mrtr/
├── __init__.py   # package docstring + re-exports
├── _state.py     # encode_state/decode_state + input_response helper
├── context.py    # MrtrCtx (Option F, ship target)
├── builder.py    # ToolBuilder (Option G, ship target)
└── compat.py     # sse_retry_shim + dispatch_by_version (comparison artifacts)

Ship targets (F/G) now live separately from the dual-path compat shims. All imports from mcp.server.experimental.mrtr unchanged.
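An earlier commit message notes that the demos use plain base64-JSON for request_state and that a production SDK must HMAC-sign the blob. A signed variant of the encode_state/decode_state pair might look like the sketch below. The signatures, the `payload.tag` layout, and the hard-coded secret are assumptions for illustration and are not the contents of `_state.py`.

```python
import base64
import hashlib
import hmac
import json

SECRET = b"demo-only-secret"  # assumption: a real deployment loads this from config


def encode_state(state: dict, secret: bytes = SECRET) -> str:
    """Serialise state and append an HMAC-SHA256 tag over the encoded payload."""
    payload = base64.urlsafe_b64encode(json.dumps(state, sort_keys=True).encode())
    tag = hmac.new(secret, payload, hashlib.sha256).hexdigest()
    return f"{payload.decode()}.{tag}"


def decode_state(blob: str, secret: bytes = SECRET) -> dict:
    """Verify the tag before trusting anything the client echoed back."""
    payload, sep, tag = blob.rpartition(".")
    expected = hmac.new(secret, payload.encode(), hashlib.sha256).hexdigest()
    # compare_digest avoids leaking how many leading characters matched.
    if not sep or not hmac.compare_digest(tag.encode(), expected.encode()):
        raise ValueError("request_state failed integrity check")
    return json.loads(base64.urlsafe_b64decode(payload))


blob = encode_state({"done": ["audit"]})
restored = decode_state(blob)
```

Signing matters for both ship targets because a client that can edit the blob could otherwise mark a guarded step as done, or skip ahead in a builder's step sequence.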
The Option B footgun was: await elicit() looks like a suspension point but
is actually a re-entry point, so everything above it runs twice. Option H
fixes that by making it a REAL suspension point — the coroutine frame is
held in a ContinuationStore across MRTR rounds, keyed by request_state.
Handler code stays exactly as it was in the SSE era:
    async def my_tool(ctx: LinearCtx, location: str) -> str:
        audit_log(location)  # runs exactly once
        units = await ctx.elicit("Which units?", UnitsSchema)
        return f"{location}: 22°{units.u}"
The wrapper linear_mrtr(my_tool, store=...) translates this into a standard
MRTR on_call_tool handler. Round 1 starts the coroutine; elicit() sends
IncompleteResult back through the wrapper and parks on a stream. Round 2's
retry wakes it with the answer. The coroutine continues from where it
stopped — no re-entry, no double-execution.
Trade-off: server holds the frame in memory between rounds. Client sees
pure MRTR (no SSE, independent requests), but server is stateful within
a single tool call. Horizontally-scaled deployments need sticky routing on
the request_state token. Same operational shape as Option A's SSE hold,
without the long-lived connection.
SDK pieces (src/mcp/server/experimental/mrtr/linear.py):
- LinearCtx with async elicit(message, PydanticSchema) -> instance
- ContinuationStore — owns the task group, TTL-based frame expiry
- linear_mrtr(handler, store=...) — the wrapper
- ElicitDeclined raised when user declines/cancels
7 E2E tests including the key assertion: side-effects above await fire
exactly once (the test measures audit_log count).
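The suspension mechanism can be sketched with plain asyncio. This is a simplified illustration under stated assumptions, not the code in linear.py: it swaps in asyncio queues and tasks for the SDK's task group and stream, uses a random token as request_state, takes a string answer instead of a Pydantic schema, and omits TTL expiry and decline handling. `run_round` is a hypothetical name.

```python
import asyncio
import uuid
from typing import Any, Optional


class LinearCtx:
    """Simplified context: elicit() parks the coroutine until the next round."""

    def __init__(self) -> None:
        self.outbox: asyncio.Queue = asyncio.Queue()  # handler -> wrapper
        self.inbox: asyncio.Queue = asyncio.Queue()   # wrapper -> handler

    async def elicit(self, message: str) -> Any:
        await self.outbox.put(("incomplete", message))
        return await self.inbox.get()  # real suspension point, no re-entry


class ContinuationStore:
    """Holds live handler frames between rounds, keyed by the request_state token."""

    def __init__(self) -> None:
        self._frames: dict = {}

    async def run_round(self, handler, args: dict, request_state: Optional[str], answer: Any) -> dict:
        if request_state is None:
            ctx = LinearCtx()

            async def run() -> None:
                try:
                    await ctx.outbox.put(("complete", await handler(ctx, **args)))
                except Exception as exc:  # surface handler failures to the caller
                    await ctx.outbox.put(("error", exc))

            token = uuid.uuid4().hex
            # Keep a reference to the task so the parked frame is not collected.
            self._frames[token] = (ctx, asyncio.create_task(run()))
        else:
            token = request_state
            ctx, _task = self._frames[token]
            await ctx.inbox.put(answer)  # wake the parked coroutine

        kind, payload = await ctx.outbox.get()
        if kind == "incomplete":
            return {"incomplete": payload, "request_state": token}
        del self._frames[token]
        if kind == "error":
            raise payload
        return {"text": payload}


audit_log: list = []


async def my_tool(ctx: LinearCtx, location: str) -> str:
    audit_log.append(location)  # runs exactly once
    units = await ctx.elicit("Which units?")
    return f"{location}: 22°{units}"


async def demo():
    store = ContinuationStore()
    first = await store.run_round(my_tool, {"location": "Oslo"}, None, None)
    second = await store.run_round(my_tool, {"location": "Oslo"}, first["request_state"], "C")
    return first, second


first, second = asyncio.run(demo())
```

Because the frame lives in one process's memory, the second round must reach the same store instance, which is the sticky-routing requirement noted in the trade-off above.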
Draft implementation of Multi Round-Trip Requests (SEP-2322) for the Python SDK. Two commits: lowlevel plumbing, then the handler-shape comparison deck.
Counterpart to typescript-sdk#1701: same weather-lookup tool throughout, so the diff between option files is the argument. Unlike the TS demos, the lowlevel plumbing here is real (not smuggled through JSON text blocks); every option round-trips `IncompleteResult` through the actual wire protocol.

## Commit 1: types + lowlevel + client retry loop

| File | Change |
| --- | --- |
| `src/mcp/types/_types.py` | `IncompleteResult` (discriminated by `result_type`), `InputRequest`/`InputResponse` unions, `input_responses` + `request_state` folded into `RequestParams` |
| `src/mcp/server/lowlevel/server.py` | `on_call_tool` return widened to `CallToolResult \| IncompleteResult \| CreateTaskResult` |
| `src/mcp/shared/session.py` | `send_request` accepts `TypeAdapter` via overload, which enables union result parsing |
| `src/mcp/client/session.py` | `call_tool_mrtr()` returns the union; `call_tool()` stays narrow, raises clearly on `IncompleteResult` |
| `src/mcp/client/client.py` | `call_tool()` drives the retry loop internally: dispatches embedded input requests to `elicitation_callback`/`sampling_callback`/`list_roots_callback`, retries with collected responses + echoed `request_state`. `max_mrtr_rounds=8` bound. |

The client-side delta from today's code is zero: `elicitation_callback` is the same function whether it fires from SSE push or MRTR retry.

## Commit 2: handler-shape comparison

SDK primitives in `src/mcp/server/experimental/mrtr.py`:

- `MrtrCtx.once(key, fn)`: idempotency guard tracked in `request_state` (Option F)
- `ToolBuilder`: `incomplete_step(...).end_step(...).build()`; `end_step` runs exactly once regardless of round count (Option G)
- `input_response(params, key)`: sugar for the guard-first pattern
- `sse_retry_shim()` + `dispatch_by_version()`: comparison artifacts for A/D

Option examples in
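`examples/servers/mrtr-options/`:

| Option | Shape | Notes |
| --- | --- | --- |
| E | degrade-only | The SDK default. MRTR-native; pre-MRTR gets a default or error. |
| A | SSE shim | SDK emulates retry over SSE. Safe re-entry, hidden loop. |
| B | await shim | Exception-based `await elicit()`. Unsafe: hidden double-execution above the await. Not a ship target; for contrast. |
| C | version branch | Explicit `if version` branch in the handler body. |
| D | dual handler | Two functions, SDK picks by version. |
| F | `ctx.once` | Opt-in per side-effect; the `once()` guard is tracked in `request_state`. |
| G | `ToolBuilder` | Steps assembled with `.build()`; no above-the-guard zone, `end_step` unreachable until all elicitations complete. |

## Testing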
`tests/experimental/test_mrtr.py` parametrises E/F/G against the same `Client` + callback to prove identical wire behaviour: the server's internal choice doesn't leak. The footgun test measures `audit_log` count: naive handler fires twice for one tool call, F and G fire once.

`tests/client/test_client.py` has 8 new E2E tests covering the retry loop (single-round elicitation, multi-round with `request_state` accumulation, sampling/roots dispatch, round-limit, missing-callback error paths).

## Not in scope

- `ServerTaskContext` already does `input_required`; MRTR integration is a separate PR
- `mrtrOnly` client flag: trivial to add, not demoed
- `MCPServer` integration (`@server.tool` decorator shape): lowlevel-first, this PR stops at `Server`

Exploratory, not intended to merge as-is. Open questions: which of F/G (or both) to ship as SDK primitives, whether to keep `call_tool_mrtr` as public or fold the union into `call_tool` once the SEP finalises, whether `sse_retry_shim` belongs in the SDK at all vs docs-only.

## Types of changes