[CUDA] Extend Pad support through opset 25 with wrap mode by Copilot · Pull Request #27774 · microsoft/onnxruntime

Copilot · 2026-03-19T19:44:25Z

Description

This PR consolidates PRs #27416 and #27708 to extend CUDA Pad kernel support through opset 25, including wrap mode implementation.

Motivation and Context

The CUDA execution provider (EP) previously registered the Pad kernel only up to opset 18 and did not implement wrap mode. When an ONNX model exported with opset 19 or later ran on the CUDA EP, the Pad operation fell back to the CPU, which degraded performance significantly. This PR aligns CUDA Pad registration with the ONNX Pad schema evolution through opset 25 and provides a correct wrap mode implementation.

Related issues: #26393
Related PRs: #27416, #27708

Summary of Changes

Kernel registration and opset coverage

| File | Change |
| --- | --- |
| `onnxruntime/core/providers/cuda/tensor/pad.cc` | Adds CUDA Pad kernel registrations for opset ranges 18, 19-20, 21-22, 23, 24, and 25. |
| `onnxruntime/core/providers/cuda/cuda_execution_provider.cc` | Registers the new Pad kernel versions in the CUDA EP registry under the existing per-opset sections. |

CUDA Pad implementation

| File | Change |
| --- | --- |
| `onnxruntime/core/providers/cuda/tensor/pad_impl.h` | Extends the Pad kernel interface to pass effective sliced extents and per-axis input offsets. |
| `onnxruntime/core/providers/cuda/tensor/pad_impl.cu` | Adds CUDA wrap mode using a `WrapCoordinate` device helper with `if constexpr` compile-time specialization. Removes dead wrap code from the NCHW-specialized kernel path. |
| `onnxruntime/core/providers/cuda/tensor/pad.cc` | Computes effective sliced input extents/offsets for wrap behavior with negative pads. Bypasses the NCHW fast-path for wrap mode and routes through the generic implementation. |

Documentation

| File | Change |
| --- | --- |
| `docs/OperatorKernels.md` | Updates the CUDA Pad kernel opset coverage to reflect the new version splits (25+, 24, 23, [21,22], [19,20], 18) up to opset 25. |

Test coverage

| File | Change |
| --- | --- |
| `onnxruntime/test/providers/cpu/tensor/pad_test.cc` | Adds CUDA-only Pad coverage for `edge` across opsets 18-25 and `wrap` across opsets 19-25. Updates existing wrap test comment. |
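
As a point of comparison for the `edge` cases mentioned above, edge padding clamps an out-of-range coordinate to the nearest valid index instead of wrapping it. The following is a minimal host-side sketch for one axis with non-negative pads; `PadEdge1D` is a hypothetical name and is not taken from the test file.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// 1-D edge padding sketch (illustrative only): output index o reads the
// input at clamp(o - pad_begin, 0, n - 1), so border values are repeated.
std::vector<int> PadEdge1D(const std::vector<int>& in,
                           int64_t pad_begin, int64_t pad_end) {
  const int64_t n = static_cast<int64_t>(in.size());
  std::vector<int> out;
  // This sketch only handles non-negative pads on a non-empty input.
  if (n == 0 || pad_begin < 0 || pad_end < 0) return out;
  const int64_t out_dim = n + pad_begin + pad_end;
  out.reserve(static_cast<size_t>(out_dim));
  for (int64_t o = 0; o < out_dim; ++o) {
    const int64_t i =
        std::min<int64_t>(std::max<int64_t>(o - pad_begin, 0), n - 1);
    out.push_back(in[static_cast<size_t>(i)]);
  }
  return out;
}
```

For example, `PadEdge1D({1, 2, 3}, 2, 1)` yields `{1, 1, 1, 2, 3, 3}`.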

Checklist

Tests added/updated
No breaking changes


ShirasawaSama · 2026-03-19T20:57:36Z

Thanks!

tianleiwu · 2026-03-19T20:58:36Z

@copilot, please update `docs/OperatorKernels.md`. You can follow PR #27708.


Copilot · 2026-03-19T21:09:07Z

> @copilot, please update `docs/OperatorKernels.md`. You can follow PR #27708.

Updated `docs/OperatorKernels.md` in commit f5e748c with the CUDA Pad opset breakdown (25+, 24, 23, [21,22], [19,20], 18), following the pattern from PR #27708.

tianleiwu · 2026-03-19T21:43:32Z

/azp run Windows GPU Doc Gen CI Pipeline, Python format

azure-pipelines · 2026-03-19T21:43:44Z

Azure Pipelines successfully started running 1 pipeline(s).

ShirasawaSama and others added 2 commits March 19, 2026 19:37

Add pad op version 19 to 23 support for CUDA

6af8aa0

Apply recommended changes: extend CUDA Pad through opset 25, cleaner …

edacc32

…wrap impl, CUDA-only tests Co-authored-by: tianleiwu <30328909+tianleiwu@users.noreply.github.com>

Copilot AI assigned Copilot and tianleiwu Mar 19, 2026

Copilot created this pull request from a session on behalf of tianleiwu March 19, 2026 19:44

Copilot AI requested a review from tianleiwu March 19, 2026 19:45

Copilot finished work on behalf of tianleiwu March 19, 2026 19:45

This was referenced Mar 19, 2026

[CUDA EP] Add pad op version from 19 to 23 support for CUDA #27416

Closed

[CUDA] Extend Pad support through opset 25 #27708

Closed

Copilot started work on behalf of tianleiwu March 19, 2026 20:58

Update docs/OperatorKernels.md with CUDA Pad opset 18-25 coverage

f5e748c

Co-authored-by: tianleiwu <30328909+tianleiwu@users.noreply.github.com>

Copilot finished work on behalf of tianleiwu March 19, 2026 21:09

Copilot wants to merge 3 commits into main from copilot/evaluate-cuda-kernels-wrap-mode


3 participants
