Fix issue with unassigned worker initialization when restoring ClusterConfig #943

Open
Mathos1432 wants to merge 6 commits into main from users/matrembl/cluster-init


Mathos1432 (Contributor) commented Jan 22, 2025

We saw issues when running the forget command: nodes would return invalid information, making it look as if a different node was responding. The cause was that the Role of worker[0] in the cluster config was left at its default value, primary, because worker[0] was not initialized in this constructor and its initialization was also skipped here:

public static ClusterConfig FromByteArray(byte[] other)

As a result, the check in the forget command that tests whether we know about the node was always true, and we deleted the node at index 0 of the array, which caused all kinds of issues down the line.
https://github.com/microsoft/garnet/blob/c85e281acede27498f239dab41c3f28684abfa57/libs/cluster/Server/ClusterManagerWorkerState.cs#L64C1-L65C1
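To illustrate the failure mode described above, here is a minimal sketch. It assumes that the role enum's first member is the primary role, so that an uninitialized worker slot reads as primary. The `NodeRole`, `Worker`, and `WorkerInitSketch` names and both restore helpers are hypothetical simplifications for illustration and are not Garnet's actual types or restore code.

```csharp
using System;

// Hypothetical, simplified stand-ins; not Garnet's actual types.
public enum NodeRole : byte
{
    PRIMARY = 0,   // first member, so it is also default(NodeRole)
    REPLICA,
    UNASSIGNED,
}

public struct Worker
{
    public string NodeId;
    public NodeRole Role;
}

public static class WorkerInitSketch
{
    // Mimics a restore path that fills slots 1..n and never touches slot 0.
    public static Worker[] RestoreSkippingSlotZero(string[] nodeIds)
    {
        var workers = new Worker[nodeIds.Length + 1];
        for (int i = 0; i < nodeIds.Length; i++)
            workers[i + 1] = new Worker { NodeId = nodeIds[i], Role = NodeRole.REPLICA };
        return workers;
    }

    // Same restore, but slot 0 is explicitly marked as unassigned afterwards.
    public static Worker[] RestoreWithSlotZeroInitialized(string[] nodeIds)
    {
        var workers = RestoreSkippingSlotZero(nodeIds);
        workers[0] = new Worker { NodeId = null, Role = NodeRole.UNASSIGNED };
        return workers;
    }

    public static void Main()
    {
        var ids = new[] { "node-a", "node-b" };

        // Slot 0 was never written, so it holds default(Worker).
        Console.WriteLine(RestoreSkippingSlotZero(ids)[0].Role);        // PRIMARY

        // Slot 0 was explicitly initialized.
        Console.WriteLine(RestoreWithSlotZeroInitialized(ids)[0].Role); // UNASSIGNED
    }
}
```

Any check that treats slot 0 as a real node whenever its role is not unassigned would pass in the first case, which matches the symptom reported here.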

This was validated with a new test, and by copying the binaries onto a broken cluster and verifying that it no longer showed the issue after a restart.

Mathos1432 force-pushed the users/matrembl/cluster-init branch from 3837939 to 4323360 January 22, 2025 12:52
badrishc requested a review from vazois January 22, 2025 19:41
[Category("CLUSTER-CONFIG"), CancelAfter(1000)]
public void ClusterForgetAfterNodeRestartTest()
{
int nbInstances = 4;
please fix this warning

3 participants