
Update default strides in AvgPool1dConfig, AvgPool2dConfig, MaxPool1dConfig, and MaxPool2dConfig to match kernel sizes #3338


Merged
merged 4 commits into tracel-ai:main from fix-issue-663 on Jul 8, 2025

Conversation

lucianyao (Contributor)

Checklist

  • Confirmed that the `cargo run-checks` command has been executed.

Related Issues/PRs

Fixes #663

Changes

  • Updated the constructor of MaxPool2dConfig to set the default strides equal to the kernel size.
  • This ensures that Burn’s behavior mirrors PyTorch’s default stride behavior, which prevents unexpected output shape mismatches.
  • Previously, Burn used [1, 1] as the default strides, resulting in behavior inconsistent with common frameworks.
  • This is a breaking change that should be included before the next release to avoid propagating invalid assumptions.
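
As a quick illustration of the new default behavior (a minimal sketch; `with_strides` and `init` follow burn's Config-derived builder API):

```rust
use burn::nn::pool::{MaxPool2d, MaxPool2dConfig};

fn example() -> (MaxPool2d, MaxPool2d) {
    // With this change, omitting strides defaults them to the kernel size,
    // so this pool uses strides = [2, 2]:
    let default_pool = MaxPool2dConfig::new([2, 2]).init();

    // Overlapping windows now require an explicit smaller stride:
    let overlapping = MaxPool2dConfig::new([2, 2]).with_strides([1, 1]).init();

    (default_pool, overlapping)
}
```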

Testing

  • Added a unit test, default_strides_match_kernel_size, using rstest to verify that the default strides match the kernel size (a sketch follows below).
  • Tested with multiple kernel shapes including [2, 2] and [1, 2].
  • Ran `RUSTC_BOOTSTRAP=1 RUSTFLAGS="-Zmacro-backtrace" cargo run-checks`. Encountered a panic in tests::memory_checks::test_memory_leaks originating from burn-fusion, which persisted after reverting this PR, confirming it is a preexisting upstream issue:

```
==== Fusion Memory Report ====
- Handles: 0
- Streams: 1
  - StreamId(6279463968646665849) => operations: 0 cursor: 2220
==============================
```
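
For reference, a sketch of what the added test could look like (hypothetical shape; the actual test lives alongside the pool configs and the field access may differ):

```rust
use burn::nn::pool::MaxPool2dConfig;
use rstest::rstest;

#[rstest]
#[case([2, 2])]
#[case([1, 2])]
fn default_strides_match_kernel_size(#[case] kernel_size: [usize; 2]) {
    // The generated constructor should now default strides to the kernel size.
    let config = MaxPool2dConfig::new(kernel_size);
    assert_eq!(config.strides, kernel_size);
}
```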

@antimora (Collaborator) commented Jul 1, 2025

For consistency, we should also update other pool modules: https://github.com/tracel-ai/burn/tree/main/crates/burn-core/src/nn/pool

@lucianyao (Contributor, Author)

> For consistency, we should also update other pool modules: https://github.com/tracel-ai/burn/tree/main/crates/burn-core/src/nn/pool

I was wondering whether the other pooling configs (like those under crates/onnx-ir/src/node/max_pool2d.rs, max_pool1d, avg_pool1d, avg_pool2d) also need to follow the same stride = kernel size behavior?

And by the way, can we also use the Config macro to generate constructors and builders for those configs, as we've done with max_pool2d?
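
For reference, the burn-core pattern being referred to looks roughly like this (an abridged, illustrative sketch; the real configs have more fields, and the default values shown are placeholders):

```rust
use burn::config::Config;

// `#[derive(Config)]` generates a `new()` constructor taking the required
// fields, plus `with_*` builder methods for the optional ones.
#[derive(Config, Debug)]
pub struct ExamplePool2dConfig {
    /// Required argument of the generated `new()`.
    pub kernel_size: [usize; 2],
    /// Optional; settable via the generated `with_strides()`.
    #[config(default = "[1, 1]")]
    pub strides: [usize; 2],
}
```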

@antimora (Collaborator) commented Jul 1, 2025

> For consistency, we should also update other pool modules: https://github.com/tracel-ai/burn/tree/main/crates/burn-core/src/nn/pool
>
> I was wondering whether the other pooling configs (like those under crates/onnx-ir/src/node/max_pool2d.rs, max_pool1d, avg_pool1d, avg_pool2d) also need to follow the same stride = kernel size behavior?
>
> And by the way, can we also use the Config macro to generate constructors and builders for those configs, as we've done with max_pool2d?

Yes, I think you can do it. I didn't know it was possible, but it seems to work 😄

codecov bot commented Jul 1, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 30.98%. Comparing base (eadc281) to head (3ac5a1c).
Report is 17 commits behind head on main.

Additional details and impacted files
```
@@           Coverage Diff           @@
##             main    #3338   +/-   ##
=======================================
  Coverage   30.98%   30.98%
=======================================
  Files         413      413
  Lines       60373    60373
=======================================
  Hits        18709    18709
  Misses      41664    41664
```

☔ View full report in Codecov by Sentry.

@lucianyao (Contributor, Author)

> For consistency, we should also update other pool modules: https://github.com/tracel-ai/burn/tree/main/crates/burn-core/src/nn/pool
>
> I was wondering whether the other pooling configs (like those under crates/onnx-ir/src/node/max_pool2d.rs, max_pool1d, avg_pool1d, avg_pool2d) also need to follow the same stride = kernel size behavior?
> And by the way, can we also use the Config macro to generate constructors and builders for those configs, as we've done with max_pool2d?
>
> Yes, I think you can do it. I didn't know it was possible, but it seems to work 😄

I’ve compared the config constructors under the burn-core directory with their counterparts under onnx-ir. I noticed that max_pool1d and max_pool2d in onnx-ir already align with the default values defined in burn-core, but the avg_pool configs currently don’t define defaults.

Would you like me to update the average pooling configs to match the default values from burn-core as well, for consistency?

@antimora (Collaborator) commented Jul 1, 2025

> For consistency, we should also update other pool modules: https://github.com/tracel-ai/burn/tree/main/crates/burn-core/src/nn/pool
>
> I was wondering whether the other pooling configs (like those under crates/onnx-ir/src/node/max_pool2d.rs, max_pool1d, avg_pool1d, avg_pool2d) also need to follow the same stride = kernel size behavior?
> And by the way, can we also use the Config macro to generate constructors and builders for those configs, as we've done with max_pool2d?
>
> Yes, I think you can do it. I didn't know it was possible, but it seems to work 😄
>
> I’ve compared the config constructors under the burn-core directory with their counterparts under onnx-ir. I noticed that max_pool1d and max_pool2d in onnx-ir already align with the default values defined in burn-core, but the avg_pool configs currently don’t define defaults.
>
> Would you like me to update the average pooling configs to match the default values from burn-core as well, for consistency?

The ONNX-related defaults should follow the ONNX spec, like this one: https://onnx.ai/onnx/operators/onnx__AveragePool.html#attributes, which defaults strides to 1 along each spatial axis and thus differs from PyTorch's. If you notice that's not the case in ONNX, please let us know and we will fix the ONNX defaults in another PR.

@lucianyao (Contributor, Author)

Question 1:

I've updated the comparison of stride default values across implementations:

| Pool | PyTorch | burn | ONNX | burn-ONNX |
| --- | --- | --- | --- | --- |
| Max Pool | kernel_size | 1 -> kernel_size (in the PR) | 1 | 1 |
| Avg Pool | kernel_size | 1 -> kernel_size (in the PR) | 1 | (not implemented) |

It looks like:

  • PyTorch uses stride = kernel_size by default
  • burn (before this PR) and ONNX both default to 1
  • burn-ONNX matches that for MaxPool, but AvgPool currently doesn’t specify a default

Would you prefer we align the AvgPool behavior in onnx-ir with the ONNX default (1), or keep it explicit?

Question 2:

I've also noticed that the constructor parameter lists in the ONNX configs differ significantly from those in the current burn onnx-ir implementations (e.g. in required vs optional fields, argument order, etc.).

Should we aim to align the constructor signatures more closely with ONNX spec semantics, or is the current structure in burn-ONNX preferred for internal consistency?

@lucianyao lucianyao requested a review from antimora July 3, 2025 13:06
@lucianyao (Contributor, Author)

Regarding Question 2, I’ve seen that the constructor design choice was intentional per SUPPORTED-ONNX-OPS.md, so feel free to disregard that.

@laggui (Member) left a comment

Just a general note: we should not aim to "match PyTorch" as it is not a goal for Burn.

In this case, I would say that using the kernel size as the default stride actually matches the most common intent with pooling operations. That is, reducing spatial dimensions.

So from a user perspective, having stride = kernel_size as the default makes it more intuitive. For example, MaxPool2dConfig::new([2, 2]) will downsample by 2x (typically expected). And overlapping windows have to be explicitly defined by using a smaller stride.

We will have to document the breaking change 🙂 but defaults LGTM unless there are objections @antimora
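
To make the shape implications concrete, a small sketch (assuming the burn-ndarray backend and an NCHW input; this example is not from the PR itself):

```rust
use burn::nn::pool::MaxPool2dConfig;
use burn::tensor::Tensor;
use burn_ndarray::NdArray;

type B = NdArray<f32>;

fn main() {
    let device = Default::default();
    let x = Tensor::<B, 4>::zeros([1, 1, 8, 8], &device);

    // New default: strides = kernel size, so [2, 2] downsamples by 2x.
    let pool = MaxPool2dConfig::new([2, 2]).init();
    assert_eq!(pool.forward(x.clone()).dims(), [1, 1, 4, 4]);

    // Overlapping windows must now be requested explicitly.
    let pool = MaxPool2dConfig::new([2, 2]).with_strides([1, 1]).init();
    assert_eq!(pool.forward(x).dims(), [1, 1, 7, 7]); // (8 - 2) / 1 + 1 = 7
}
```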

@laggui (Member) commented Jul 4, 2025

> Would you prefer we align the AvgPool behavior in onnx-ir with the ONNX default (1), or keep it explicit?

onnx-ir should match the ONNX default when parsing the node config... but I think it already does, no?

```rust
/// Create an AvgPool1dConfig from the attributes of the node
pub fn avg_pool1d_config(curr: &Node) -> AvgPool1dConfig {
    let mut kernel_shape = Vec::new();
    let mut strides = vec![1];        // ONNX default: stride of 1
    let mut pads = vec![0, 0];
    let mut count_include_pad: i64 = 0;
    let mut ceil_mode: i64 = 0;
    // ...
```

```rust
/// Create a AvgPool2dConfig from the attributes of the node
pub fn avg_pool2d_config(curr: &Node) -> AvgPool2dConfig {
    let mut kernel_shape = Vec::new();
    let mut strides = vec![1, 1];     // ONNX default: stride of 1 per axis
    let mut pads = vec![0, 0, 0, 0];
    let mut count_include_pad: i64 = 0;
    let mut ceil_mode: i64 = 0;
    // ...
```

@antimora (Collaborator) left a comment

LGTM!

Thanks for taking care of this issue.

@antimora changed the title from "fix(maxpool2d): use kernel size as default stride to match PyTorch (#663)" to "Update default strides in AvgPool1dConfig, AvgPool2dConfig, MaxPool1dConfig, and MaxPool2dConfig to match kernel sizes" on Jul 4, 2025
@lucianyao (Contributor, Author)

> Just a general note: we should not aim to "match PyTorch" as it is not a goal for Burn.
>
> In this case, I would say that using the kernel size as the default stride actually matches the most common intent with pooling operations. That is, reducing spatial dimensions.
>
> So from a user perspective, having stride = kernel_size as the default makes it more intuitive. For example, MaxPool2dConfig::new([2, 2]) will downsample by 2x (typically expected). And overlapping windows have to be explicitly defined by using a smaller stride.
>
> We will have to document the breaking change 🙂 but defaults LGTM unless there are objections @antimora

Indeed, the parsing function sets stride = 1 when it's absent, which matches the ONNX spec.

My point was more about the constructor itself: unlike burn-core, we haven’t provided a dedicated new() constructor for AvgPool1dConfig with ONNX-aligned defaults.

Just wondering if it would be worth introducing that, for consistency across config usage patterns.

@antimora (Collaborator) commented Jul 4, 2025

> Just a general note: we should not aim to "match PyTorch" as it is not a goal for Burn.
>
> In this case, I would say that using the kernel size as the default stride actually matches the most common intent with pooling operations. That is, reducing spatial dimensions.
>
> So from a user perspective, having stride = kernel_size as the default makes it more intuitive. For example, MaxPool2dConfig::new([2, 2]) will downsample by 2x (typically expected). And overlapping windows have to be explicitly defined by using a smaller stride.
>
> We will have to document the breaking change 🙂 but defaults LGTM unless there are objections @antimora
>
> Indeed, the parsing function sets stride = 1 when it's absent, which matches the ONNX spec.
>
> My point was more about the constructor itself: unlike burn-core, we haven’t provided a dedicated new() constructor for AvgPool1dConfig with ONNX-aligned defaults.
>
> Just wondering if it would be worth introducing that, for consistency across config usage patterns.

It was by design to leave defaults out of the constructor and rely on user-passed data. The defaults depend on the parsed configuration and opset version, and I wanted to make sure the default values are determined in only one place: the configuration function.

End users won't be using these structs for ONNX the way they do in burn-core.

@antimora antimora merged commit 16be063 into tracel-ai:main Jul 8, 2025
26 of 38 checks passed
@lucianyao lucianyao deleted the fix-issue-663 branch July 17, 2025 13:30
Successfully merging this pull request may close these issues.

MaxPool2d not outputting proper shape on burn-ndarray and burn-wgpu backends