dfawley
Member

@dfawley dfawley commented Jul 18, 2025

Fixes: #8453

This is part of a two-step change to eliminate the potentially multiple PickerUpdated events we emit per call today. Other languages emit at most one event for this, when the pick finally completes, but in Go we implemented it as a call every time the picker was updated. The plan is to implement the full change as follows:

  1. Add DelayedPickComplete and make PickerUpdated a type alias of it, and change the semantics to match other languages (v1.x).
  2. Delete PickerUpdated (v1.x+1)

RELEASE NOTES:

  • stats: introduce the DelayedPickComplete event and make PickerUpdated a type alias of it. This (combined) event will now be emitted only once per call, when a transport is successfully selected for the attempt. OpenTelemetry metrics will no longer have multiple "Delayed LB pick complete" events in Go, matching other gRPC languages. A future release will delete PickerUpdated.
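The change in semantics described above can be sketched with a small simulation. This is a simplified model, not grpc-go code: `simulate` and its parameters are hypothetical names, and the "old" count assumes one event per picker update seen by a waiting RPC, which is how the description characterizes the previous behavior.

```go
package main

import "fmt"

// simulate models one RPC attempt waiting for a usable picker.
// attempts[i] reports whether the i-th picker the RPC sees can supply a
// transport. All names here are hypothetical, not grpc-go internals.
func simulate(attempts []bool) (oldEvents, newEvents int, picked bool) {
	blocked := false
	for i, ready := range attempts {
		if i > 0 {
			// The RPC woke up because the picker was updated.
			// Assumed old behavior: one event per such update.
			oldEvents++
		}
		if ready {
			if blocked {
				// New behavior: a single event, only for a delayed pick.
				newEvents = 1
			}
			return oldEvents, newEvents, true
		}
		blocked = true
	}
	// No transport was ever selected, so the new event is not emitted.
	return oldEvents, 0, false
}

func main() {
	oldEvents, newEvents, picked := simulate([]bool{false, false, true})
	fmt.Println(oldEvents, newEvents, picked) // prints "2 1 true"
}
```

A pick that succeeds on the first try (`[]bool{true}`) produces no event under either model, since the pick was never delayed.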

@dfawley dfawley added this to the 1.75 Release milestone Jul 18, 2025
@dfawley dfawley requested a review from arjan-bal July 18, 2025 20:47
@dfawley dfawley added the Type: Behavior Change Behavior changes not categorized as bugs label Jul 18, 2025

codecov bot commented Jul 18, 2025

Codecov Report

Attention: Patch coverage is 90.90909% with 2 lines in your changes missing coverage. Please review.

Project coverage is 82.54%. Comparing base (cc46259) to head (2dae321).
Report is 4 commits behind head on master.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| stats/stats.go | 0.00% | 2 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #8465      +/-   ##
==========================================
+ Coverage   82.42%   82.54%   +0.12%     
==========================================
  Files         414      414              
  Lines       40434    40464      +30     
==========================================
+ Hits        33326    33400      +74     
+ Misses       5752     5718      -34     
+ Partials     1356     1346      -10     
| Files with missing lines | Coverage Δ |
|---|---|
| clientconn.go | 90.79% <100.00%> (+0.22%) ⬆️ |
| picker_wrapper.go | 97.00% <100.00%> (-0.09%) ⬇️ |
| stats/opentelemetry/trace.go | 88.88% <100.00%> (ø) |
| stream.go | 81.73% <100.00%> (-0.18%) ⬇️ |
| stats/stats.go | 68.42% <0.00%> (ø) |

... and 19 files with indirect coverage changes

Contributor

@arjan-bal arjan-bal left a comment

LGTM other than a couple of minor comments. I've linked the associated issue in the description. Thank you for fixing this!

FYI @vinothkumarr227.

stats/stats.go Outdated
// PickerUpdated indicates that the RPC is unblocked following a delay in
// selecting a connection for the call.
//
// Deprecated: will be removed in a future release.
Contributor

Should we also specify that DelayedPickComplete should be used instead?

Member Author

Yes. Related: I think technically we're not supposed to put "Deprecated" on this until after DelayedPickComplete exists for a release. But since this package is experimental, I think we can skip that and call it deprecated immediately.
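A deprecation notice that names the replacement might look like the sketch below. The types are local stand-ins declared in `package main`, and the comment wording is an illustration of the Go `Deprecated:` convention, not the text that was merged.

```go
package main

import "fmt"

// DelayedPickComplete indicates that the RPC is unblocked following a delay
// in selecting a connection for the call.
type DelayedPickComplete struct{}

// PickerUpdated indicates that the RPC is unblocked following a delay in
// selecting a connection for the call.
//
// Deprecated: use DelayedPickComplete instead. PickerUpdated will be removed
// in a future release.
type PickerUpdated = DelayedPickComplete

func main() {
	// Both names denote the same type, so this assignment compiles.
	var ev *DelayedPickComplete = &PickerUpdated{}
	fmt.Printf("%T\n", ev)
}
```

Starting the paragraph with `Deprecated:` is what lets tooling such as gopls and staticcheck flag remaining uses of the old name.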

@@ -481,6 +482,11 @@ func (a *csAttempt) getTransport() error {
	if a.trInfo != nil {
		a.trInfo.firstLine.SetRemoteAddr(a.transport.RemoteAddr())
	}
	if pick.blocked {
		for _, sh := range a.statsHandlers {
			sh.HandleRPC(a.ctx, &stats.DelayedPickComplete{})
Contributor

Should we also call stats.PickerUpdated to keep the previous behaviour until PickerUpdated is removed?

Member Author

It's a type alias so the stats handler will get one event that matches both of the events.
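The effect of the alias on a handler's type switch can be sketched as follows. `RPCStats`, `DelayedPickComplete`, and `PickerUpdated` are declared locally as stand-ins for the `stats` package types, and `describe` is a hypothetical helper, so this only illustrates the Go language behavior.

```go
package main

import "fmt"

// Local stand-ins for the stats package types (assumption: simplified).
type RPCStats interface{ IsClient() bool }

type DelayedPickComplete struct{}

func (*DelayedPickComplete) IsClient() bool { return true }

// PickerUpdated is an alias, so it is the same type as DelayedPickComplete.
type PickerUpdated = DelayedPickComplete

// describe reports which branch of a handler-style type switch matches.
func describe(s RPCStats) string {
	switch s.(type) {
	case *PickerUpdated:
		// A separate `case *DelayedPickComplete:` in this switch would be
		// rejected by the compiler as a duplicate case.
		return "delayed pick complete"
	default:
		return "other"
	}
}

func main() {
	fmt.Println(describe(&DelayedPickComplete{})) // prints "delayed pick complete"
	fmt.Println(describe(&PickerUpdated{}))       // prints "delayed pick complete"
}
```

An existing handler that switches on `*PickerUpdated` therefore keeps receiving the event without a second emission, which is why emitting both names is unnecessary.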

if pickerUpdatedCount != 1 {
t.Fatalf("sh.pickerUpdated count: %v, want: %v", pickerUpdatedCount, 2)
if delayedPickCompleteCount != 1 {
t.Fatalf("sh.delayedPickComplete count: %v, want: %v", delayedPickCompleteCount, 2)
Contributor

We compare delayedPickCompleteCount to 1, but the log mentions 2. I think the log message needs to be fixed.

Member Author

Fixed.
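One way to keep the compared value and the reported value from drifting apart is to pass a single `want` to both. The sketch below uses a hypothetical `checkCount` helper that returns an error instead of calling `t.Fatalf`, so it runs outside a test; it is not the fix that was applied in the PR.

```go
package main

import "fmt"

// checkCount is a hypothetical helper: the same want value drives both the
// comparison and the message, so the two cannot disagree.
func checkCount(name string, got, want int) error {
	if got != want {
		return fmt.Errorf("%s count: %v, want: %v", name, got, want)
	}
	return nil
}

func main() {
	const wantDelayedPickComplete = 1
	if err := checkCount("sh.delayedPickComplete", 2, wantDelayedPickComplete); err != nil {
		fmt.Println(err) // prints "sh.delayedPickComplete count: 2, want: 1"
	}
}
```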

@arjan-bal arjan-bal assigned dfawley and unassigned arjan-bal Jul 21, 2025
@dfawley dfawley closed this Jul 21, 2025
@dfawley dfawley reopened this Jul 21, 2025
@dfawley dfawley merged commit a5e7cd6 into grpc:master Jul 21, 2025
43 of 47 checks passed
dimpavloff pushed a commit to dimpavloff/grpc-go that referenced this pull request Aug 22, 2025
Labels
Type: Behavior Change Behavior changes not categorized as bugs
Development

Successfully merging this pull request may close these issues.

Duplicate Delayed LB pick complete event emitted by old pick_first policy.
2 participants