-
Notifications
You must be signed in to change notification settings - Fork 4k
mmaprototype: introduce store status #156938
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Your pull request contains more than 1000 changes. It is strongly encouraged to split big PRs into smaller chunks. 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sumeerbhola reviewed 2 of 2 files at r8, 1 of 5 files at r9, 6 of 6 files at r10, all commit messages.
Reviewable status:complete! 1 of 0 LGTMs obtained (waiting on @wenyihu6)
pkg/kv/kvserver/allocator/mmaprototype/store_status.go line 48 at r10 (raw file):
// receive leases. LeaseDispositionOK LeaseDisposition = iota // LeaseDispositionRefusing dennotes that the store should not be considered
nit: denotes
pkg/kv/kvserver/allocator/mmaprototype/store_status.go line 51 at r10 (raw file):
// as a target for leases. LeaseDispositionRefusing // LeaseDispositionShedding dennotes that the store should actively be
ditto
pkg/kv/kvserver/allocator/mmaprototype/store_status.go line 60 at r10 (raw file):
// ReplicaDispositionRefusing dennotes that the store can receive replicas. ReplicaDispositionOK ReplicaDisposition = iota // ReplicaDispositionRefusing dennotes that the store should not be considered as
ditto
pkg/kv/kvserver/allocator/mmaprototype/store_status.go line 63 at r10 (raw file):
// a target for replica placement. ReplicaDispositionRefusing // ReplicaDispositionShedding dennotes that the store should actively be
ditto
pkg/kv/kvserver/allocator/mmaprototype/store_status.go line 100 at r10 (raw file):
// are necessarily used by it yet. // // INVARIANT: if Health != HealthOK, Disposition.{Lease,Replica} = Shedding.
I wonder whether we should have one enum type with Refusing, Shedding, Ok.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mod Sumeer’s comments. Maybe just due to unfamiliarity, I found this much easier to understand than the existing allocator’s
StoreFilter.
Maybe this is too far into the future - do we have a preference for whether we want to preserve the best vs. good selector logic if mma starts doing upreplication https://github.com/cockroachdb/cockroach/blob/master/pkg/kv/kvserver/allocator/allocatorimpl/allocator.go#L1328-L1332?
Not needed right now, but in the future it might be useful to review how filtering is done
| candidateStoreList, aliveStoreCount, throttled := storePool.GetStoreList(storepool.StoreFilterThrottled) |
Reviewable status:
complete! 2 of 0 LGTMs obtained (waiting on @sumeerbhola and @tbg)
pkg/kv/kvserver/allocator/mmaprototype/store_status.go line 116 at r8 (raw file):
func (ss *Status) assert() error { if ss.Health < 0 || ss.Health > healthCount {
Why not >= for this sentinel?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status:
complete! 2 of 0 LGTMs obtained (waiting on @sumeerbhola and @tbg)
pkg/kv/kvserver/allocator/mmaprototype/store_status.go line 62 at r10 (raw file):
// ReplicaDispositionRefusing dennotes that the store should not be considered as // a target for replica placement. ReplicaDispositionRefusing
Super nit so feel free to ignore - I find reject to be clearer to me than refuse.
It is included in the Stringer, but not otherwise hooked up to anything.
This allows TestClusterState to update the store status. It continues to be true that no semantics are attached to the store status at this point.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe this is too far into the future - do we have a preference for whether we want to preserve the best vs. good selector logic if mma starts doing upreplication https://github.com/cockroachdb/cockroach/blob/master/pkg/kv/kvserver/allocator/allocatorimpl/allocator.go#L1328-L1332?
Looking at what good and best mean in this context:
this seems like a higher level decision anyway. We will need something that picks from a set of candidates based on the Status, the constraint conformance, and the storeLoadSummary. In the worst case, we can introduce a score-based ranking, though I hope we're not going to end up with a balance and diversity score as they get me confused every time because they do very different things depending on the context and are hard to remember the semantics and purpose of.
Not needed right now, but in the future it might be useful to review how filtering is done
Yes, I was planning on looking into this soon. For now I am thinking (though this plan may hit some snag when I try to do it) to rely mostly on the store pool directly (vs its input signals). That is, whenever we want to update the MMA's view of stores' statuses, we look at the store pool storeStatuses:
@tbg dismissed @sumeerbhola and @wenyihu6 from 6 discussions.
Reviewable status:complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @sumeerbhola and @wenyihu6)
pkg/kv/kvserver/allocator/mmaprototype/store_status.go line 116 at r8 (raw file):
Previously, wenyihu6 (Wenyi Hu) wrote…
Why not >= for this sentinel?
Because silly bug, thanks for catching that. Fixed.
pkg/kv/kvserver/allocator/mmaprototype/store_status.go line 51 at r10 (raw file):
Previously, sumeerbhola wrote…
ditto
Done.
pkg/kv/kvserver/allocator/mmaprototype/store_status.go line 60 at r10 (raw file):
Previously, sumeerbhola wrote…
ditto
Done.
pkg/kv/kvserver/allocator/mmaprototype/store_status.go line 62 at r10 (raw file):
Previously, wenyihu6 (Wenyi Hu) wrote…
Super nit so feel free to ignore - I find reject to be clearer to me than refuse.
Done.
pkg/kv/kvserver/allocator/mmaprototype/store_status.go line 63 at r10 (raw file):
Previously, sumeerbhola wrote…
ditto
Done.
pkg/kv/kvserver/allocator/mmaprototype/store_status.go line 100 at r10 (raw file):
Previously, sumeerbhola wrote…
I wonder whether we should have one enum type with Refusing, Shedding, Ok.
Mhm, I think that could lead to situations in which it isn't clear which of the purposes we're comparing. I want as much type system support as possible. I'll leave as is for now.
|
bors r+ |
This introduces the
Statusstruct and hooks it up to theTestClusterStatetest.It doesn't yet attach any actual semantics to the new member of
storeStatusthat is being added as part of doing so. This will happen in follow-up commits,
loosely replacing the integration points removed in #156933.
Touches #156776.
Epic: CRDB-55052