[SPARK-52936][INFRA][TESTS] Benchmark result update automation #51643

Closed
wants to merge 9 commits into from
30 changes: 28 additions & 2 deletions .github/workflows/benchmark.yml
@@ -50,6 +50,11 @@ on:
description: 'Number of job splits'
required: true
default: '1'
create-commit:
type: boolean
description: 'Commit the benchmark results to the current branch'
required: true
default: false
Member

Thank you for making this false by default.


jobs:
matrix-gen:
@@ -195,10 +200,31 @@ jobs:
# To keep the directory structure and file permissions, tar them
# See also https://github.com/actions/upload-artifact#maintaining-file-permissions-and-case-sensitive-files
echo "Preparing the benchmark results:"
tar -cvf benchmark-results-${{ inputs.jdk }}-${{ inputs.scala }}.tar `git diff --name-only` `git ls-files --others --exclude=tpcds-sf-1 --exclude=tpcds-sf-1-text --exclude-standard`
tar -cvf target/benchmark-results-${{ inputs.jdk }}-${{ inputs.scala }}.tar `git diff --name-only` `git ls-files --others --exclude=tpcds-sf-1 --exclude=tpcds-sf-1-text --exclude-standard`
- name: Create a pull request with the results
if: ${{ inputs.create-commit && success() }}
run: |
git config --local user.name "${{ github.actor }}"
git config --local user.email "${{ github.event.pusher.email || format('{0}@users.noreply.github.com', github.actor) }}"
git add -A
git commit -m "Benchmark results for ${{ inputs.class }} (JDK ${{ inputs.jdk }}, Scala ${{ inputs.scala }}, split ${{ matrix.split }} of ${{ inputs.num-splits }})"
for i in {1..5}; do
Member

Is 5 a magic number? Why do we need this repetition?

Member Author

Yes, it is. With a large number of job splits, each job needs to re-sync the current branch with the target branch whenever another job has already pushed a commit.

echo "Attempt $i to push..."
git fetch origin ${{ github.ref_name }}
git rebase origin/${{ github.ref_name }}
if git push origin ${{ github.ref_name }}:${{ github.ref_name }}; then
echo "Push successful."
exit 0
else
echo "Push failed, retrying in 3 seconds..."
sleep 3
fi
done
echo "Error: Failed to push after 5 attempts."
exit 1
- name: Upload benchmark results
uses: actions/upload-artifact@v4
with:
name: benchmark-results-${{ inputs.jdk }}-${{ inputs.scala }}-${{ matrix.split }}
path: benchmark-results-${{ inputs.jdk }}-${{ inputs.scala }}.tar
path: target/benchmark-results-${{ inputs.jdk }}-${{ inputs.scala }}.tar
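
As a rough illustration of the retry discussion above, the fetch/rebase/push loop could be factored into a small reusable helper so that the attempt count and delay are explicit parameters rather than magic numbers. This is only a sketch: `retry` and `sync_and_push` are hypothetical names that do not appear in this PR, and the branch name is assumed to be passed in by the caller (the workflow uses `github.ref_name`).

```shell
#!/usr/bin/env bash
# Hypothetical helpers, not part of the workflow in this PR.

# Usage: retry MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or MAX_ATTEMPTS is reached.
retry() {
  local max_attempts="$1" delay="$2"
  shift 2
  local attempt
  for ((attempt = 1; attempt <= max_attempts; attempt++)); do
    echo "Attempt ${attempt} of ${max_attempts}: $*"
    if "$@"; then
      return 0
    fi
    # Only sleep when another attempt will follow.
    if ((attempt < max_attempts)); then
      echo "Command failed, retrying in ${delay} seconds..."
      sleep "${delay}"
    fi
  done
  echo "Error: command failed after ${max_attempts} attempts." >&2
  return 1
}

# Re-sync the local branch with the remote branch, then try to push.
# Any failing step makes the whole function fail, so retry runs it again.
sync_and_push() {
  local branch="$1"
  git fetch origin "${branch}" &&
    git rebase "origin/${branch}" &&
    git push origin "${branch}:${branch}"
}

# Example invocation (branch name is an assumption):
#   retry 5 3 sync_and_push "my-branch"
```

One behavioral difference to keep in mind: in this sketch a failed fetch or rebase is retried like a failed push, whereas a real rebase conflict would not resolve itself on a later attempt and would probably need to fail the step outright.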

4 changes: 4 additions & 0 deletions .gitignore
@@ -125,3 +125,7 @@ sql/api/gen/
sql/api/src/main/gen/
sql/api/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBaseLexer.tokens
sql/api/src/main/antlr4/org/apache/spark/sql/catalyst/parser/gen/

tpcds-sf-1/
tpcds-sf-1-text/
tpcds-kit/