@ftynse commented Sep 5, 2025

This is more representative of the real workload

Add an option to support input permutations in the signature and
generate sample arguments accordingly.

Refactor the Permutation class out of conv into generic utilities.
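
To illustrate what a layout permutation in the signature implies for sample argument generation, here is a minimal sketch. The `Permutation` class below is a hypothetical stand-in and not the class refactored in this PR; `sample_argument` and its parameters are likewise assumptions made for the example.

```python
import random


class Permutation:
    """Hypothetical permutation of dimension indices, e.g. (0, 2, 3, 1)."""

    def __init__(self, order):
        order = tuple(order)
        if sorted(order) != list(range(len(order))):
            raise ValueError(f"not a permutation: {order}")
        self.order = order

    def apply(self, items):
        """Return items reordered so that result[i] == items[order[i]]."""
        if len(items) != len(self.order):
            raise ValueError("rank mismatch")
        return [items[i] for i in self.order]

    def inverse(self):
        """Return the permutation that undoes this one."""
        inv = [0] * len(self.order)
        for new_pos, old_pos in enumerate(self.order):
            inv[old_pos] = new_pos
        return Permutation(inv)


def sample_argument(logical_shape, perm, seed=0):
    """Return the permuted (physical) shape and a flat random buffer for it."""
    physical_shape = perm.apply(list(logical_shape))
    size = 1
    for dim in physical_shape:
        size *= dim
    rng = random.Random(seed)
    data = [rng.random() for _ in range(size)]
    return physical_shape, data


# Logical NCHW shape stored as NHWC.
perm = Permutation((0, 2, 3, 1))
physical_shape, data = sample_argument((2, 3, 4, 5), perm)
roundtrip = perm.inverse().apply(physical_shape)
```

Applying the inverse permutation to the physical shape recovers the logical shape, which is a cheap sanity check when the signature carries the permutation.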

Signed-off-by: Alex Zinenko <git@ozinenko.com>

Go through `torch.compile` in the boo driver instead of calling the
kernel directly. This makes it possible to collect more realistic
execution time statistics, closer to what the end user will see after
integration. Direct kernel invocation is kept as a reference option
for evaluating the overhead.
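
The overhead comparison can be sketched with a small timing harness. This is only an illustration under stated assumptions: `direct_kernel` and `framework_path` are hypothetical stand-ins written in plain Python, where in the driver the second would be a callable obtained through `torch.compile`, and GPU timing would additionally need device synchronization.

```python
import statistics
import time


def mean_time_us(fn, args, warmup=3, iters=50):
    """Return the mean wall-clock time of fn(*args) in microseconds."""
    for _ in range(warmup):
        fn(*args)  # discard warmup calls, which absorb one-time costs
    samples = []
    for _ in range(iters):
        start = time.perf_counter()
        fn(*args)
        samples.append((time.perf_counter() - start) * 1e6)
    return statistics.mean(samples)


def direct_kernel(xs, scale):
    # Hypothetical stand-in for invoking the kernel directly.
    return [x * scale for x in xs]


def framework_path(xs, scale):
    # Hypothetical stand-in for the path an end user would hit; it only
    # adds an extra copy and a layer of dispatch around the kernel.
    return direct_kernel(list(xs), scale)


args = (list(range(1000)), 2)
direct_us = mean_time_us(direct_kernel, args)
framework_us = mean_time_us(framework_path, args)
overhead_us = framework_us - direct_us
print(f"direct: {direct_us:.1f} us, framework: {framework_us:.1f} us, "
      f"overhead: {overhead_us:.1f} us")
```

Keeping both paths behind the same harness means the only difference between the two numbers is the invocation route.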

Tweak the profiler configuration to ignore initialization and cleanup
steps so as to avoid skewing aggregate statistics.
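
The effect of excluding initialization and cleanup can be shown with plain numbers. The sketch below does not reproduce the profiler configuration from this PR; `aggregate_steady_state`, its parameters, and the sample timings are all made up for illustration.

```python
import statistics


def aggregate_steady_state(step_times_us, skip_first=1, skip_last=1):
    """Return (mean, median) over steps, excluding leading and trailing ones."""
    end = len(step_times_us) - skip_last
    steady = step_times_us[skip_first:end]
    if not steady:
        raise ValueError("no steps left after trimming")
    return statistics.mean(steady), statistics.median(steady)


# Hypothetical per-step timings in microseconds: a slow first step
# (initialization), three steady-state steps, and a slow last step (cleanup).
samples = [950.0, 102.0, 98.0, 100.0, 400.0]
naive_mean = statistics.mean(samples)
steady_mean, steady_median = aggregate_steady_state(samples)
print(naive_mean, steady_mean, steady_median)
```

With these made-up numbers the naive mean is more than three times the steady-state mean, which is the kind of skew the change is meant to avoid.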

Signed-off-by: Alex Zinenko <git@ozinenko.com>
This is more representative of the real workload

Signed-off-by: Alex Zinenko <git@ozinenko.com>