Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

merge tree: Validate exact range obliterate with expansion #23614

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

jzaffiro
Copy link
Contributor

@jzaffiro jzaffiro commented Jan 21, 2025

While some of the bugs with overlapping ranges are still being resolved, add coverage for conflicting obliterates with the exact range being replaced.

AB#26798

@github-actions github-actions bot added area: dds Issues related to distributed data structures base: main PRs targeted against main branch labels Jan 21, 2025
@jzaffiro jzaffiro changed the title Updated conflict farm merge tree: Validate exact range obliterate with expansion Jan 21, 2025
@jzaffiro jzaffiro marked this pull request as ready for review January 21, 2025 19:50
@Copilot Copilot bot review requested due to automatic review settings January 21, 2025 19:50
@jzaffiro jzaffiro requested a review from a team as a code owner January 21, 2025 19:50

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

Comments suppressed due to low confidence (2)

packages/dds/merge-tree/src/test/mergeTreeOperationRunner.ts:49

  • [nitpick] The function numberRange returns endPos as startPos + 4, which could be misleading. Consider renaming or clarifying its purpose.
};

packages/dds/merge-tree/src/test/mergeTreeOperationRunner.ts:32

  • [nitpick] The function insertField does not have a return type specified. Consider adding a return type for clarity.
export const insertField: TestOperation = (
name: "obliterate fields",
config: {
...opts,
operations: [insertField, obliterateField],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think we will still want regular inserts and removes here, but they will also need to know about fields, so they don't do partial changes to them.

I also think the generateOperationMessagesForClients function may need to change to not split fields.

also, what happens if you enable applyOpDuringGeneration

opEnd: number,
random: IRandom,
) => {
const numberText = random.string(5, "0123456789");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe do something like (client.longClientId.codePointAt(0) %10).repeat(5) so that each client gets a uniqe-ish field id which can make debugging and reading logs easier.

Copy link
Contributor

@anthony-murphy anthony-murphy Jan 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i also think fields will need endpoint chars, like |number| otherwise your obliterate field function can't find just one field if two are next to each to each other

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would also be good to have various lengths.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like the subsequent code prevents adjacency.

const numberText = random.string(5, "0123456789");
if (
// start is not a number
Number.isNaN(Number(client.getText(opStart, opStart + 1))) &&
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Number.isNaN(Number(client.getText(opStart, opStart + 1))) &&
Number.isInteger(client.getText(opStart, opStart + 1))) &&

start: number,
): { startPos: number | undefined; endPos: number | undefined } => {
let startPos = start;
while (startPos > 0 && !Number.isNaN(Number(client.getText(startPos, startPos + 1)))) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't you need to check is both directions?

Copy link
Contributor

@anthony-murphy anthony-murphy Jan 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh. i see. you depend on the format, so you find the start and know the end. this still has the problem of not being able to find fields which are next to one another, you will only even find the first one. making the format more structured will help with this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you should add some comments here, and where the fields are generated, as this code is tightly coupled, and if either side changes the test could break

}
};

const numberRange = (
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i would probably call this something like getFieldExtents and return {start, end} | undefined and use this everywhere you inspect to see if a position is a field. if it is, you get the start and end, if it is not, you get undefined.

const numberRange = (
client: TestClient,
start: number,
): { startPos: number | undefined; endPos: number | undefined } => {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

startPos and endPos should never be undefined

return { startPos, endPos: startPos + 4 };
};

export const obliterateField: TestOperation = (
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the case we care about, the field value never only gets obliterated. It's always obliterate [with endpoint expansion] + insert updated value. (If it's just obliterating, then we start hitting the zero-length case.) Not sure how much this distinction matters here; I think it might when two clients are competing to replace the same field value.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area: dds Issues related to distributed data structures base: main PRs targeted against main branch
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants