@FGasper
Collaborator

@FGasper FGasper commented Nov 4, 2025

For some time the verifier has had a race condition: immediately after passing a batch of change events to the recheck-enqueue thread, it would persist the change stream’s resume token, without waiting for that batch’s rechecks to be persisted. Thus, if the recheck-enqueue thread failed, the verifier could restart from the already-persisted token and skip the documents in the lost batch.

PR #156 aggravated this by storing multiple batches of change events in the channel between the reader and recheck-enqueue threads. Now, if there’s a failure after persisting a resume token, there are very good odds that documents will be skipped.

This changeset fixes that by moving the resume token’s persistence to the recheck-enqueue thread. Now each resume token is sent along with its batch to the recheck-enqueue thread, and only after that batch is persisted is its resume token persisted.
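
To illustrate the ordering described above, here is a minimal Go sketch. It is not the verifier's actual code: `eventBatch`, `store`, `persistRechecks`, `enqueueRechecks`, and `runDemo` are all hypothetical names, and the in-memory `store` merely stands in for real persistence. It only shows the idea that the token travels with its batch and is written after the batch's rechecks are.

```go
package main

import (
	"errors"
	"fmt"
)

// eventBatch pairs a batch of change events with the resume token that
// follows them. Hypothetical type, not the verifier's own.
type eventBatch struct {
	events      []string
	resumeToken string
}

// store is an in-memory stand-in for the verifier's persisted state.
type store struct {
	rechecks    []string
	resumeToken string
	failOn      string // simulate a persistence failure on this event
}

// persistRechecks saves the whole batch or nothing.
func (s *store) persistRechecks(events []string) error {
	for _, e := range events {
		if e == s.failOn {
			return errors.New("simulated failure persisting " + e)
		}
	}
	s.rechecks = append(s.rechecks, events...)
	return nil
}

// enqueueRechecks plays the role of the recheck-enqueue thread. It persists
// a batch's resume token only after that batch's rechecks are persisted.
func enqueueRechecks(batches <-chan eventBatch, s *store) error {
	for b := range batches {
		if err := s.persistRechecks(b.events); err != nil {
			// The token for this batch is never written, so a restart
			// resumes from the last fully persisted batch.
			return err
		}
		s.resumeToken = b.resumeToken
	}
	return nil
}

// runDemo queues three batches, as a reader thread might, then drains them.
func runDemo(failOn string) (*store, error) {
	s := &store{failOn: failOn}
	batches := make(chan eventBatch, 3)
	batches <- eventBatch{[]string{"doc1", "doc2"}, "token-A"}
	batches <- eventBatch{[]string{"doc3"}, "token-B"}
	batches <- eventBatch{[]string{"doc4"}, "token-C"}
	close(batches)
	err := enqueueRechecks(batches, s)
	return s, err
}

func main() {
	s, err := runDemo("doc3")
	fmt.Println(s.resumeToken, s.rechecks, err)
}
```

In this sketch, a failure while persisting the second batch leaves the stored token at the first batch's value, so a restart would re-read the second and third batches rather than skip them.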

@FGasper FGasper marked this pull request as ready for review November 4, 2025 15:37
@FGasper FGasper requested a review from tdq45gj November 5, 2025 20:03
Collaborator

@tdq45gj tdq45gj left a comment

I'm not sure whether oaux/timer is used anywhere, but this PR imports it. The rest of the changes LGTM.

@FGasper FGasper requested a review from tdq45gj November 7, 2025 14:27
Collaborator

@tdq45gj tdq45gj left a comment

LGTM

@FGasper FGasper merged commit f69ce63 into mongodb-labs:main Nov 7, 2025
99 checks passed
@FGasper FGasper deleted the REP-6785-persist-token-in-handler branch November 7, 2025 15:08
2 participants