
Conversation

@zewenli98 zewenli98 (Collaborator) commented Oct 28, 2025

Description

Weak typing behavior in TensorRT is deprecated. However, mixed precision remains a good way to maximize performance. Therefore, we want to create a similar PyTorch-native system for use with Torch-TensorRT that recovers some of this behavior.

Fixes #3869

Type of change

  • New feature (non-breaking change which adds functionality)

Checklist:

  • My code follows the style guidelines of this project (You can use the linters)
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas and hacks
  • I have made corresponding changes to the documentation
  • I have added tests to verify my fix or my feature
  • New and existing unit tests pass locally with my changes
  • I have added the relevant labels to my PR so that relevant reviewers are notified

@zewenli98 zewenli98 self-assigned this Oct 28, 2025
@meta-cla meta-cla bot added the cla signed label Oct 28, 2025
@github-actions github-actions bot added the component: lowering, component: conversion, component: core, component: api [Python], component: runtime, and component: dynamo labels Oct 28, 2025
@github-actions github-actions bot requested a review from apbose October 28, 2025 05:16
@zewenli98 zewenli98 removed the request for review from apbose October 28, 2025 05:16
@github-actions github-actions bot removed the component: conversion label Oct 29, 2025
Comment on lines 437 to 444
enable_autocast: bool = _defaults.ENABLE_AUTOCAST,
low_precision_type: Optional[
    Union[torch.dtype, dtype]
] = _defaults.LOW_PRECISION_TYPE,
nodes_to_exclude: Collection[str] = _defaults.NODES_TO_EXCLUDE,
targets_to_exclude: Collection[Target] = _defaults.TARGETS_TO_EXCLUDE,
data_max: float = _defaults.DATA_MAX,
max_depth_of_reduction: Optional[int] = _defaults.MAX_DEPTH_OF_REDUCTION,
zewenli98 (Collaborator, Author):

Before merging, these args should be added to other compile functions in this file.
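
For illustration, a minimal sketch of what that could look like once the settings are plumbed through the public torch_tensorrt.compile entry point; the flag names mirror the signature above, but the call, the model, and the chosen values are placeholders, not from this PR:

import torch
import torch_tensorrt


class TinyModel(torch.nn.Module):
    def forward(self, x):
        return torch.nn.functional.relu(x @ x.T)


model = TinyModel().eval().cuda()
inputs = [torch.randn(8, 128).cuda()]

# Hypothetical call: autocast settings forwarded alongside strong typing.
trt_model = torch_tensorrt.compile(
    model,
    inputs=inputs,
    use_explicit_typing=True,          # autocast builds on strong typing
    enable_autocast=True,
    low_precision_type=torch.float16,
    nodes_to_exclude=set(),            # node names to keep at full precision
    data_max=512.0,                    # assumed overflow-guard threshold
    max_depth_of_reduction=None,       # no cap on reduction depth
)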

]:
    # GEMM: A (M, K) @ B (K, N) = C (M, N)
    self.reduction_depth = input_0_dims[-1]
# TODO: Add more reduction ops here
zewenli98 (Collaborator, Author):

Should any more reduction targets be added?
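
For discussion, a hedged sketch of how additional reduction targets could slot into the same rule; the convolution branch is a guess for illustration, not something this PR implements:

import torch
from torch.fx import Node


def estimate_reduction_depth(node: Node) -> int:
    """Illustrative only: deeper reductions accumulate more low-precision
    error, so nodes whose depth exceeds max_depth_of_reduction can be
    kept at full precision."""
    input_0_dims = node.args[0].meta["tensor_meta"].shape
    if node.target == torch.ops.aten.mm.default:
        # GEMM: A (M, K) @ B (K, N) = C (M, N) -> reduces over K
        return input_0_dims[-1]
    if node.target == torch.ops.aten.convolution.default:
        # Guess: reduces over C_in (times kernel elements, omitted here)
        return input_0_dims[1]
    return 0  # not a reduction op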

@peri044 peri044 (Collaborator) left a comment.

@narendasan narendasan (Collaborator) commented Nov 6, 2025

For tests:

  1. External PyTorch autocast combined with strong typing (a sketch follows after this list)
  2. Whole-graph autocast pass
  3. A test case that exercises the max_output_threshold fallback

L1 or L2 tests
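
A rough sketch of what test (1) might look like, assuming the enable_autocast flag lands as named in this PR and with an arbitrarily chosen tolerance:

import torch
import torch_tensorrt


class MixedModel(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(64, 64)

    def forward(self, x):
        # External PyTorch autocast region; Torch-TensorRT should respect
        # the precision chosen here rather than re-deciding it.
        with torch.autocast("cuda", dtype=torch.float16):
            return self.linear(x)


def test_external_autocast_with_strong_typing():
    model = MixedModel().eval().cuda()
    x = torch.randn(4, 64).cuda()
    trt_model = torch_tensorrt.compile(
        model,
        inputs=[x],
        use_explicit_typing=True,  # strong typing stays on
        enable_autocast=False,     # only the external autocast applies
    )
    torch.testing.assert_close(trt_model(x), model(x), rtol=5e-2, atol=5e-2)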

@github-actions github-actions bot added the component: tests label and removed the component: core label Nov 8, 2025
@github-actions github-actions bot added the documentation label Nov 14, 2025
- If we compile the above model using Torch-TensorRT, layer profiling logs indicate that all the layers are
- run in FP32. This is because TensorRT picks the kernels for layers which result in the best performance.
+ If we compile the above model using Torch-TensorRT with the following settings, layer profiling logs indicate that all the layers are
+ run in FP32. This is because TensorRT picks the kernels for layers which result in the best performance (i.e., weak typing in TensorRT).
Collaborator:

We may want to reorient around strong typing first, and then weak typing as an optimization. Right now this is a bit confusing.

Collaborator:

So, like in the tutorial:

  1. Demonstrate strong typing and explain that it's going to be the default behavior
  2. Show the weak typing behavior and talk about how the TRT graph changed (and maybe why)
  3. Show how you can recover the weak typing behavior using autocast for TRT 11 and beyond

zewenli98 (Collaborator, Author):

Since TRT has deprecated weak typing, should we mention that weak typing is deprecated, so autocast needs to be used instead? Thus, we have only two modes (see the sketch below):

User-defined precision:        use_explicit_typing=True + enable_autocast=False
Autocast chooses precision:    use_explicit_typing=True + enable_autocast=True
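
A minimal sketch of the two modes side by side, assuming model and inputs are already defined and that enable_autocast lands as named in this PR:

import torch
import torch_tensorrt

# Mode 1: user-defined precision -- the dtypes in the graph are honored as-is.
trt_user = torch_tensorrt.compile(
    model,
    inputs=inputs,
    use_explicit_typing=True,
    enable_autocast=False,
)

# Mode 2: autocast chooses precision -- the rule-based pass downcasts
# eligible nodes before conversion.
trt_auto = torch_tensorrt.compile(
    model,
    inputs=inputs,
    use_explicit_typing=True,
    enable_autocast=True,
    low_precision_type=torch.float16,
)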

Autocast
---------------

Weak typing behavior in TensorRT is deprecated. However it is a good way to maximize performance. Therefore, in Torch-TensorRT,
Collaborator:

However mixed precision is a good way to maximize performance

reduced precision on the rest of the nodes. Torch-TensorRT Autocast also lets users specify which nodes to exclude from Autocast,
since some nodes might be more sensitive with respect to accuracy. In addition, Torch-TensorRT Autocast can cooperate with PyTorch
native Autocast, allowing users to use both PyTorch and Torch-TensorRT Autocast in the same model. Torch-TensorRT respects the precision
of the nodes within PyTorch Autocast regions.
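
For instance, a hedged sketch of excluding accuracy-sensitive nodes by name; the model, inputs, and pattern string here are illustrative, not from the PR:

import torch
import torch_tensorrt

# Keep accuracy-sensitive nodes (e.g., normalization layers) at full
# precision while autocast downcasts the rest.
trt_model = torch_tensorrt.compile(
    model,
    inputs=inputs,
    use_explicit_typing=True,
    enable_autocast=True,
    low_precision_type=torch.float16,
    nodes_to_exclude={"layer_norm"},  # illustrative node-name pattern
)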
Collaborator:

Can you explain the difference between PyTorch and Torch-TensorRT autocast?

@@ -0,0 +1,70 @@
import torch
Collaborator:

Can you add comments to this doc? Here is an example of what I'm looking for: https://docs.pytorch.org/TensorRT/tutorials/_rendered_examples/dynamo/converter_overloading.html

return out


if __name__ == "__main__":
Collaborator:

I know it's not best practice, but let's just make them pure scripts so they render better.
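
For reference, a pure-script sketch in the sphinx-gallery style the rendered examples use; the `# %%` cell markers are one common convention, assumed here:

# %%
# Run the model
# ^^^^^^^^^^^^^
# Top-level statements execute directly, so the gallery renders their
# output inline; no ``if __name__ == "__main__":`` guard is needed.
import torch

out = torch.relu(torch.randn(2, 3))
print(out.shape)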

pre_lowering_pass_list = [
    remove_detach,
    remove_assert_nodes,
    rule_based_autocast,
Collaborator:

Should this pass be conditionally added to the pre_lowering_pass_list?

zewenli98 (Collaborator, Author):

There's a condition inside of rule_based_autocast (sketched below).
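
That is, something along these lines; a sketch of the guard, with the settings plumbing assumed:

from torch.fx import GraphModule


def rule_based_autocast(gm: GraphModule, settings) -> GraphModule:
    # The pass always sits in pre_lowering_pass_list, but it is a no-op
    # unless autocast was enabled in the compilation settings.
    if not getattr(settings, "enable_autocast", False):
        return gm
    # ... rule-based downcasting of eligible nodes would happen here ...
    return gm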

@narendasan narendasan (Collaborator) left a comment:

Nice, it's looking good. Some final polishing details, then I think it's good to go.


Labels

  • cla signed
  • component: api ([Python] Issues re: Python API)
  • component: dynamo (Issues relating to the `torch.compile` or `torch._dynamo.export` paths)
  • component: lowering (Issues re: The lowering / preprocessing passes)
  • component: runtime
  • component: tests (Issues re: Tests)
  • documentation (Improvements or additions to documentation)


4 participants