Add job benchmark loop #2226

Kobzol · 2025-08-20T10:32:31Z

This PR adds the main logic required to execute job benchmarks in the collector.

It handles:

Updating the collector's heartbeat periodically
Quick loading of already downloaded sysroots, to avoid redownloading them between jobs and across collector restarts (useful for local testing)
Dequeuing jobs, including in-progress jobs, and expanding benchmark sets
Distinguishing between transient and permanent job errors, storing job errors into the DB, and marking jobs as failed or successful
Marking jobs that have been dequeued too many times as failed
Reconnecting to the DB if a transient I/O, network, or DB error happens, to refresh the DB connection
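
To illustrate how the error handling described above could fit together, here is a minimal sketch, not the code from this PR. The names `JobError`, `JobStatus`, `Job`, `benchmark_loop`, and the `MAX_DEQUEUE_COUNT` value are all hypothetical, and an in-memory queue stands in for the DB-backed job queue. Permanent errors mark the job as failed, transient errors trigger a reconnect and a retry, and a job that has been dequeued too many times is marked as failed.

```rust
use std::collections::VecDeque;

/// Hypothetical error classification; the PR's real types may differ.
#[derive(Debug)]
enum JobError {
    /// Worth retrying (for example an I/O, network, or DB hiccup).
    Transient(String),
    /// Retrying will not help, so the job is marked as failed.
    Permanent(String),
}

#[derive(Debug, PartialEq)]
enum JobStatus {
    Success,
    Failed(String),
}

/// Assumed limit; the value used by the collector is not stated in the PR.
const MAX_DEQUEUE_COUNT: u32 = 3;

struct Job {
    id: u32,
    dequeue_count: u32,
}

/// Drains the queue, calling `run` for each job and `reconnect` after
/// every transient error. Returns the final status of each job.
fn benchmark_loop(
    queue: &mut VecDeque<Job>,
    mut run: impl FnMut(&Job) -> Result<(), JobError>,
    mut reconnect: impl FnMut(),
) -> Vec<(u32, JobStatus)> {
    let mut finished = Vec::new();
    while let Some(mut job) = queue.pop_front() {
        job.dequeue_count += 1;
        // A job that keeps coming back is given up on.
        if job.dequeue_count > MAX_DEQUEUE_COUNT {
            let reason = "dequeued too many times".to_string();
            finished.push((job.id, JobStatus::Failed(reason)));
            continue;
        }
        match run(&job) {
            Ok(()) => finished.push((job.id, JobStatus::Success)),
            Err(JobError::Permanent(msg)) => {
                finished.push((job.id, JobStatus::Failed(msg)));
            }
            Err(JobError::Transient(_)) => {
                // Refresh the connection, then put the job back for a retry.
                reconnect();
                queue.push_back(job);
            }
        }
    }
    finished
}
```

In this sketch the caller passes closures for running a job and for reconnecting, which keeps the loop testable without a real database. The actual collector presumably persists the dequeue count and job errors in the DB instead of in memory.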


Jamesbarford · 2025-08-20T12:17:20Z

collector/src/toolchain.rs

-            component,
-            urls
-        ))
+        if !non_404_error {


I could be mistaken, but this reads slightly oddly: if the HTTP error is not a 404 (not found), we produce a "SHA not found" error? And if it was a 404, we fall through to an IO error, even though the SHA actually was not found.

The variable name wasn't great; I renamed it and added a comment.

database/src/pool/postgres.rs

Jamesbarford · 2025-08-20T12:24:30Z

collector/src/toolchain.rs

+        if !non_404_error {
+            Err(SysrootDownloadError::SysrootShaNotFound)
+        } else {
+            Err(SysrootDownloadError::IO(anyhow!(


We can use resp.error_for_status() so that we also get the actual error message from the response, which could be useful for debugging: https://docs.rs/reqwest/latest/reqwest/struct.Response.html#method.error_for_status

error_for_status() is nice, but the reason we don't provide more context here is simply that we have up to three (potentially different) errors. Now that we detect 404s explicitly, we could just bail out on the first non-404 error, but I'm a bit worried about backwards compatibility: I don't know if "toolchain not found" is always reported with a 404.
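
For readers following this thread, the logic under discussion can be sketched roughly as follows. This is an illustration under assumptions, not the code in collector/src/toolchain.rs: `download_first_available` and the `fetch` closure (a stand-in for an HTTP GET that returns only a status code) are hypothetical, and the `IO` variant carries a plain `String` here instead of an `anyhow` error. Several URLs are tried in turn; if every failure was a 404, the SHA is reported as not found, otherwise a generic IO error is returned.

```rust
/// Mirrors the two error kinds visible in the diff; payloads are simplified.
#[derive(Debug, PartialEq)]
enum SysrootDownloadError {
    SysrootShaNotFound,
    IO(String),
}

/// Tries each URL in order and returns the first one that succeeds.
/// `fetch` is a hypothetical stand-in that returns an HTTP status code.
fn download_first_available(
    urls: &[&str],
    mut fetch: impl FnMut(&str) -> u16,
) -> Result<String, SysrootDownloadError> {
    // Set to true as soon as any attempt fails with something other than 404.
    let mut non_404_error = false;
    for &url in urls {
        match fetch(url) {
            200 => return Ok(url.to_string()),
            404 => {}
            _ => non_404_error = true,
        }
    }
    if !non_404_error {
        // Every attempt returned 404, so the artifact most likely does not exist.
        Err(SysrootDownloadError::SysrootShaNotFound)
    } else {
        // At least one attempt failed for another reason (server error, etc.).
        Err(SysrootDownloadError::IO(
            "download failed with a non-404 error".to_string(),
        ))
    }
}
```

Read this way, the double negative in `!non_404_error` means "all failures were 404s", which is the case that maps to the SHA not being found. Continuing through all URLs instead of bailing out on the first non-404 error matches the backwards-compatibility concern raised above.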

Kobzol added 9 commits August 20, 2025 09:20

Remove unused code

293c8e7

Implement conversion from a DB profile to collector's profile

56637e3

Implement job queue deque and basic benchmark loop in the collector

61b0090

Add support for runtime benchmarks

015475c

Do not re-download sysroot from CI if it already exists on disk

aa31f21

Add a function to update collector heartbeat

216c1ed

Distinguish transient and permanent errors that happen during a job benchmark

f22b8fc

Small cleanups

cd926b0

Reconnect to the database when a transient error occurs

3e1db73

Kobzol requested a review from Jamesbarford August 20, 2025 10:32

Jamesbarford requested changes Aug 20, 2025


Review remarks

6c33242
