CUDA: add CONV_3D operator support #17255

YaelGitAccount · 2025-11-14T01:04:35Z

Summary

Adds CUDA support for GGML_OP_CONV_3D, enabling full 3D convolution on NVIDIA GPUs with correct multi-dimensional indexing.
The implementation matches the CPU semantics exactly, including fused channel dimensions and nb[] byte-stride layout.

Changes

Added conv3d.cu and conv3d.cuh with CUDA kernel and helpers
Added dispatch path in ggml-cuda.cu
Updated operator registration in ggml-cuda.cu
Updated docs/ops.md and docs/ops/CUDA.csv to include CONV_3D

Implementation

One CUDA thread per output element (batch × OC × OD × OH × OW)
Fused-dimension addressing (batch and channel indices share a single tensor dimension):
- Input: b * IC + ic
- Kernel: oc * IC + ic
- Output: b * OC + oc
Full nb[] stride-aware indexing matching CPU layout
Supports F32 input/output and F16/F32 kernel weights
Fully respects stride, padding, dilation, and 3D spatial dimensions
Follows existing CUDA backend structure and coding conventions
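The indexing described above can be sketched on the host side as follows. This is a minimal illustration, not the code from conv3d.cu: the names `Conv3DIndex`, `decompose_output_index`, `conv_out_size`, `fused_input_index`, `fused_kernel_index`, `fused_output_index`, and `byte_offset` are hypothetical, and the choice of OW as the fastest-varying output axis is an assumption. In a kernel, the flat index would typically come from `blockIdx.x * blockDim.x + threadIdx.x`.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical container for one output coordinate (not taken from the PR).
struct Conv3DIndex {
    int64_t b, oc, od, oh, ow;
};

// Split a flat element index into (b, oc, od, oh, ow).
// Assumption: ow varies fastest, then oh, od, oc, and finally b.
inline Conv3DIndex decompose_output_index(int64_t idx, int64_t OC, int64_t OD,
                                          int64_t OH, int64_t OW) {
    Conv3DIndex r;
    r.ow = idx % OW; idx /= OW;
    r.oh = idx % OH; idx /= OH;
    r.od = idx % OD; idx /= OD;
    r.oc = idx % OC; idx /= OC;
    r.b  = idx;
    return r;
}

// Output extent along one spatial axis for a given stride, padding, and dilation.
inline int64_t conv_out_size(int64_t in, int64_t k, int64_t stride,
                             int64_t pad, int64_t dil) {
    return (in + 2 * pad - dil * (k - 1) - 1) / stride + 1;
}

// Fused indices as listed in the PR description.
inline int64_t fused_input_index (int64_t b,  int64_t ic, int64_t IC) { return b  * IC + ic; }
inline int64_t fused_kernel_index(int64_t oc, int64_t ic, int64_t IC) { return oc * IC + ic; }
inline int64_t fused_output_index(int64_t b,  int64_t oc, int64_t OC) { return b  * OC + oc; }

// Byte offset of element (i0, i1, i2, i3) in a tensor with byte strides nb[0..3].
inline size_t byte_offset(const size_t nb[4], int64_t i0, int64_t i1,
                          int64_t i2, int64_t i3) {
    return (size_t) i0 * nb[0] + (size_t) i1 * nb[1] +
           (size_t) i2 * nb[2] + (size_t) i3 * nb[3];
}
```

Using byte strides rather than element counts keeps the addressing valid for non-contiguous tensors, which is presumably why the description stresses nb[] awareness.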

Testing

All CONV_3D backend tests pass for CUDA (F32 and F16 kernels, all tested shapes)
Numerical parity with CPU across all tested configurations
No regressions in CUDA backend test suite
Full backend test suite passes (no global regressions)
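For readers who want to reproduce a parity check by hand, one possible approach is to compare the CPU and CUDA outputs with a normalized mean squared error. The sketch below assumes that metric and an illustrative tolerance; the helper names `nmse` and `outputs_match` and the default threshold are assumptions for this example, not values taken from the PR.

```cpp
#include <cstddef>
#include <vector>

// Normalized mean squared error: sum((out - ref)^2) / sum(ref^2).
// Falls back to the raw squared error when the reference is all zeros.
inline double nmse(const std::vector<float>& ref, const std::vector<float>& out) {
    double err = 0.0, norm = 0.0;
    for (size_t i = 0; i < ref.size(); ++i) {
        const double d = (double) out[i] - (double) ref[i];
        err  += d * d;
        norm += (double) ref[i] * (double) ref[i];
    }
    return norm > 0.0 ? err / norm : err;
}

// Treat two outputs as matching when sizes agree and the NMSE is within tol.
// The default tolerance is an illustrative assumption.
inline bool outputs_match(const std::vector<float>& ref, const std::vector<float>& out,
                          double tol = 1e-7) {
    return ref.size() == out.size() && nmse(ref, out) <= tol;
}
```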

Compatibility

CUDA backend only
CPU path unchanged
No external dependencies added
Preserves GGML tensor layout conventions

YaelGitAccount · 2025-11-14T08:56:29Z

This PR is ready for review.
Tagging @CISC and @slaren: your feedback would be greatly appreciated whenever you have the chance.
Thanks for your work on maintaining and improving the CUDA backend!

CISC · 2025-11-14T09:17:08Z

Unfortunately, the reason no backends support CONV_3D is that ggml_conv_3d uses the IM2COL_3D op instead, so CONV_3D is an unused op.

Green-Sky · 2025-11-14T09:32:27Z

There is also #16948. You can use the conv3d test program from that PR to compare performance.

YaelGitAccount added 2 commits November 14, 2025 01:07

feat(cuda): initial conv3d implementation

79247f8

chore(cuda): clean up comments and update Conv3D support matrix

dd9b440

YaelGitAccount requested a review from slaren as a code owner November 14, 2025 01:04

github-actions bot added the documentation, Nvidia GPU, and ggml labels Nov 14, 2025

Update ggml-cuda.cu

fcfb24a

DajanaV mentioned this pull request Nov 14, 2025

UPSTREAM PR #17255: CUDA: add CONV_3D operator support auroralabs-loci/llama.cpp#204

Open
