Conversation

@nikParasyr
Contributor

@nikParasyr nikParasyr commented Nov 1, 2025

What this PR does / why we need it:
When (re)creating a pool member, wait until its
provisioning_status is ACTIVE. This avoids scenarios
where CAPO continues removing members before the
new ones are actually active.
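
A minimal sketch of the wait described above, not the code in this PR. It assumes a hypothetical `memberStatusGetter` interface in place of the real load balancer client; `waitForMemberActive`, `fakeGetter`, and `runActiveDemo` are illustrative names only. It polls the member's provisioning_status and returns once it is ACTIVE or the context expires.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

const memberStatusActive = "ACTIVE"

// memberStatusGetter is a hypothetical abstraction over the load balancer
// client; it is not part of CAPO or gophercloud.
type memberStatusGetter interface {
	ProvisioningStatus(ctx context.Context, poolID, memberID string) (string, error)
}

// waitForMemberActive polls until the member reports ACTIVE, the lookup
// fails, or ctx is cancelled or times out.
func waitForMemberActive(ctx context.Context, g memberStatusGetter, poolID, memberID string, interval time.Duration) error {
	for {
		status, err := g.ProvisioningStatus(ctx, poolID, memberID)
		if err != nil {
			return fmt.Errorf("getting member %s: %w", memberID, err)
		}
		if status == memberStatusActive {
			return nil
		}
		select {
		case <-ctx.Done():
			return fmt.Errorf("member %s still %s: %w", memberID, status, ctx.Err())
		case <-time.After(interval):
		}
	}
}

// fakeGetter replays a fixed list of statuses, repeating the last one.
type fakeGetter struct {
	statuses []string
	calls    int
}

func (f *fakeGetter) ProvisioningStatus(_ context.Context, _, _ string) (string, error) {
	i := f.calls
	if i >= len(f.statuses) {
		i = len(f.statuses) - 1
	}
	f.calls++
	return f.statuses[i], nil
}

// runActiveDemo waits on a fake member that becomes ACTIVE on the third poll
// and returns how many polls were made.
func runActiveDemo() (int, error) {
	g := &fakeGetter{statuses: []string{"PENDING_CREATE", "PENDING_CREATE", "ACTIVE"}}
	ctx, cancel := context.WithTimeout(context.Background(), time.Second)
	defer cancel()
	err := waitForMemberActive(ctx, g, "pool-1", "member-1", time.Millisecond)
	return g.calls, err
}

func main() {
	calls, err := runActiveDemo()
	fmt.Println(calls, err)
}
```

Only after such a wait succeeds for the new member would the old members be removed, which is the ordering this change is meant to guarantee.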

Which issue(s) this PR fixes:
Fixes #2763

Special notes for your reviewer:

  1. I have not implemented a "waitForPoolMemberDelete" as I'm not sure whether this is required. If it is, let me know.
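
For discussion only, a rough sketch of what such a delete wait could look like if it turns out to be needed. It is not implemented in this PR. It assumes that a deleted member eventually yields a not-found error from the API; `memberGetter`, `errMemberNotFound`, `waitForMemberDeleted`, and `runDeleteDemo` are hypothetical names.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// errMemberNotFound stands in for whatever not-found error the real client
// returns; it is an assumption made for this sketch.
var errMemberNotFound = errors.New("member not found")

// memberGetter is a hypothetical lookup returning the member's
// provisioning status.
type memberGetter func(ctx context.Context, poolID, memberID string) (string, error)

// waitForMemberDeleted polls until the lookup reports not-found, another
// error occurs, or ctx is cancelled or times out.
func waitForMemberDeleted(ctx context.Context, get memberGetter, poolID, memberID string, interval time.Duration) error {
	for {
		status, err := get(ctx, poolID, memberID)
		if errors.Is(err, errMemberNotFound) {
			return nil // the member is gone
		}
		if err != nil {
			return fmt.Errorf("getting member %s: %w", memberID, err)
		}
		select {
		case <-ctx.Done():
			return fmt.Errorf("member %s still present (%s): %w", memberID, status, ctx.Err())
		case <-time.After(interval):
		}
	}
}

// runDeleteDemo simulates a member that reports PENDING_DELETE twice and is
// gone on the third poll; it returns the number of polls made.
func runDeleteDemo() (int, error) {
	calls := 0
	get := func(_ context.Context, _, _ string) (string, error) {
		calls++
		if calls < 3 {
			return "PENDING_DELETE", nil
		}
		return "", errMemberNotFound
	}
	ctx, cancel := context.WithTimeout(context.Background(), time.Second)
	defer cancel()
	err := waitForMemberDeleted(ctx, get, "pool-1", "member-1", time.Millisecond)
	return calls, err
}

func main() {
	calls, err := runDeleteDemo()
	fmt.Println(calls, err)
}
```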

TODOs:

  • squashed commits
  • if necessary:
    • includes documentation
    • adds unit tests

/hold

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Nov 1, 2025
@netlify

netlify bot commented Nov 1, 2025

Deploy Preview for kubernetes-sigs-cluster-api-openstack ready!

Name Link
🔨 Latest commit bd31bfc
🔍 Latest deploy log https://app.netlify.com/projects/kubernetes-sigs-cluster-api-openstack/deploys/690c5a8b68ca3a0008b1560a
😎 Deploy Preview https://deploy-preview-2815--kubernetes-sigs-cluster-api-openstack.netlify.app
@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Nov 1, 2025
@k8s-ci-robot
Contributor

Hi @nikParasyr. Thanks for your PR.

I'm waiting for a github.com member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@nikParasyr nikParasyr changed the title from "🐛 Ensure pool member reaches active state" to "🐛 Ensure pool member reach active state" Nov 1, 2025
@lentzi90
Contributor

lentzi90 commented Nov 3, 2025

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Nov 3, 2025
Contributor

@lentzi90 lentzi90 left a comment

LGTM, but let's wait for #2799 and those unit tests before merging.

@nikParasyr nikParasyr force-pushed the issue-2763 branch 2 times, most recently from 2f55d53 to 69d00f2 Compare November 5, 2025 11:58
@nikParasyr
Contributor Author

> LGTM, but let's wait for #2799 and those unit tests before merging.

Done

const (
loadBalancerProvisioningStatusActive = "ACTIVE"
loadBalancerProvisioningStatusPendingDelete = "PENDING_DELETE"
poolMembeProvisioningStatusActive = "ACTIVE"

Contributor

poolMembeProvisioningStatusActive => poolMemberProvisioningStatusActive

Contributor Author

good catch! fixed

When (re)creating a pool member, wait until its
provisioning_status is ACTIVE. This avoids scenarios
where CAPO continues removing members before new ones
are actually active
Contributor

@bnallapeta bnallapeta left a comment

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 6, 2025
Contributor

@lentzi90 lentzi90 left a comment

/approve
/hold cancel

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Nov 6, 2025
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: bnallapeta, lentzi90

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 6, 2025
@lentzi90
Contributor

lentzi90 commented Nov 6, 2025

We seem to have some issues with the cleanup of the e2e tests sometimes. They get stuck after completion.
This one passed but it is still "running"...

 Ran 11 of 16 Specs in 3129.372 seconds
SUCCESS! -- 11 Passed | 0 Failed | 0 Pending | 5 Skipped
Ginkgo ran 1 suite in 52m11.558997621s
Test Suite Passed
real	52m11.566s
user	1m24.301s
sys	0m15.667s
+ test_status=0
+ '[' -z boskos.test-pods.svc.cluster.local ']'
+ python3 hack/boskos.py --release
+ exit 0
+ cleanup
+ [[ -z 575 ]]
+ kill -9 575
./scripts/ci-e2e.sh: line 42: kill: (575) - No such process
+ EXIT_VALUE=1
+ set +o xtrace
Cleaning up after docker in docker.
================================================================================
Waiting 30 seconds for pods stopped with terminationGracePeriod:30
Cleaning up after docker
Waiting for docker to stop for 30 seconds
Stopping Docker: dockerProgram process in pidfile '/var/run/docker-ssd.pid', 1 process(es), refused to die.
================================================================================
Done cleaning up after docker in docker. 

Retriggering
/test pull-cluster-api-provider-openstack-e2e-test

@lentzi90
Contributor

lentzi90 commented Nov 6, 2025

Looks like a flake. If this one also fails, we need to look deeper.
/test pull-cluster-api-provider-openstack-e2e-test

@k8s-ci-robot k8s-ci-robot merged commit a9d872e into kubernetes-sigs:main Nov 6, 2025
12 checks passed
@github-project-automation github-project-automation bot moved this from Inbox to Done in CAPO Roadmap Nov 6, 2025
@nikParasyr nikParasyr deleted the issue-2763 branch November 6, 2025 14:23
@mnaser
Contributor

mnaser commented Nov 6, 2025

@lentzi90 should we backport this?

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files.
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA.
lgtm "Looks good to me", indicates that a PR is ready to be merged.
ok-to-test Indicates a non-member PR verified by an org member that is safe to test.
size/M Denotes a PR that changes 30-99 lines, ignoring generated files.

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

CAPO does not observe Member provisioning state when adding/deleting members of a pool

5 participants