Feature suggestion: type accessors #611

Open
alecgibson opened this issue May 22, 2023 · 3 comments
Comments

@alecgibson
Collaborator

Introduction

Mutating objects in sharedb is not necessarily the most intuitive thing to do, which is a shame for a library whose purpose is to mutate objects.

It often requires in-depth understanding of the types being used (by default json0) and how to construct ops for that type.

For users of TypeScript, constructing ops like this also loses type safety and autocompletion suggestions.

For example, consider a simple json0 document:

{"foo": "bar"}

To mutate this, we need to consult the json0 docs, and then manually assemble an op:

[
  {
    "p": ["foo"],
    "od": "bar",
    "oi": "baz"
  }
]

As well as being cumbersome to write (and to read!), this can also be error-prone. If we're using TypeScript, we run into these potentially avoidable errors:

  • accessing an invalid path
  • inserting a property with an incorrect type
  • deleting the incorrect value
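With the traditional API, an op like the one above also has to be built and submitted by hand. A minimal sketch of that flow, using a hypothetical `replaceOp()` helper (not part of sharedb) to build the op:

```javascript
// Hypothetical helper (not part of sharedb) that builds a json0
// object-replace op for a given path.
function replaceOp(path, oldValue, newValue) {
  return [{p: path, od: oldValue, oi: newValue}];
}

// Building the op for {"foo": "bar"} -> {"foo": "baz"} by hand:
const op = replaceOp(['foo'], 'bar', 'baz');
// With a real sharedb Doc you would then call doc.submitOp(op, callback).
```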

Proposal

I'm proposing an extension of the sharedb API to allow for a more fluent syntax, where consumers can (seemingly) mutate doc.data directly:

doc.data.foo = 'baz';

This would be far easier for consumers to use, and would preserve TypeScript type checks.

It would also remove the danger of consumers accidentally mutating the doc.data object directly (since "mutating" it is now allowed, and is turned into ops!).

Implementation

For a fully-worked json0 example, please see below.

The basic concept is to use JavaScript Proxys to convert setter calls into ops.

Since conversion into ops is type-dependent, the implementation would belong in the types. We could potentially add an optional method to the type spec: accessor(), which returns a proxy object which may be "mutated" directly and instead submits ops.
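One possible shape for such an accessor() method, sketched for json0. All names and signatures here are assumptions, not an agreed API; the real design would need to handle nested paths, arrays, and so on (see the fuller example below):

```javascript
// Sketch only: a hypothetical optional addition to the OT type spec.
// The accessor() method name and signature are assumptions.
const json0 = {
  name: 'json0',
  uri: 'http://sharejs.org/types/JSONv0',
  // ...existing type methods (apply, transform, compose, etc.)...

  // Returns a Proxy over the snapshot; "mutations" are reported to
  // the caller via the onOp callback instead of touching the snapshot.
  accessor(snapshot, onOp) {
    return new Proxy(snapshot, {
      set(target, key, value) {
        onOp([{p: [key], od: target[key], oi: value}]);
        return true;
      },
    });
  },
};
```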

If we want the type to be unaware of the Doc class (probably for the best), this may be done by having the accessor emit events that the Doc can subscribe to and then submit.

If we wanted to use super-modern JS features, we could potentially use for await...of and have the doc asynchronously iterate ops from the accessor.
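A minimal sketch of the async-iteration idea, again with assumed names (createAccessor, ops):

```javascript
// The accessor exposes its ops as an async iterable, which a Doc
// could drain with for await...of.
function createAccessor(snapshot) {
  const pending = [];
  let notify = null;
  const proxy = new Proxy(snapshot, {
    set(target, key, value) {
      pending.push([{p: [key], od: target[key], oi: value}]);
      if (notify) notify();
      return true;
    },
  });
  async function* ops() {
    for (;;) {
      while (pending.length) yield pending.shift();
      await new Promise((resolve) => (notify = resolve));
    }
  }
  return {proxy, ops};
}

// The Doc would then consume ops asynchronously:
// for await (const op of accessor.ops()) doc.submitOp(op);
```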

Acknowledgement & error handling

Since consumers don't directly call submit() with this API, they also won't have an opportunity to register an error handler. For simple use cases, this may be "good enough" — ShareDB will still emit an error event if anything goes wrong, which can either be handled or left unhandled to let the Node.js process crash.

However, consumers may also want to wait for the op to be acknowledged by the server for some reason (eg UI updates; performing a dependent task; etc.). They may also want to handle errors on a per-op basis (especially if they want sensible stack traces).

Since setters aren't asynchronous, this would require addition of some mechanism to capture errors from these accessor updates.

One possible way is to just add something like:

class Doc {
  lastOpAcknowledged(callback) {
    if (!this.pendingOps.length) return callback();
    this.pendingOps[this.pendingOps.length - 1].callbacks.push(callback);
  }
}

Since ops will only be submitted in the next event loop tick, we can synchronously queue a number of changes, and then synchronously attach a callback:

doc.data.foo = 'baz';
doc.data.updated = Date.now();
doc.lastOpAcknowledged((error) => {
  // Handle errors, etc.
});

We could alternatively add some sort of helper method that will actively batch ops and attach a callback. Consumer code might look like:

doc.submitBatch(
  function() {
    this.data.foo = 'baz';
    this.data.updated = Date.now();
  },
  function(error) {
    // Handle errors, etc.
  },
);

Disadvantages

  • This is pretty magical
  • Could create confusion when moving between types that don't implement accessor()
  • Hides type details from consumers, possibly obfuscating more appropriate op features (eg using lm instead of li+ld; using na instead of oi+od; etc.)
  • Increased CPU + memory usage (although could be made an opt-in feature)
  • It's always hard to tell with JS, but I think Proxy was only finalised in the ES6 spec (and we currently target ES3)

json0 example

Below is a basic example of what this might look like for json0.

The main complications come from handling arrays (we manually override a number of array methods, an approach taken from Vue's reactivity docs). There are also some corner cases around creating sparse arrays, which json0 doesn't support.

export function json0Accessor(doc) {
  const proxyCache = new WeakMap();
  const pathCache = new WeakMap();

  const ARRAY_OVERRIDES = {
    pop: (target) => function() {
      const [popped] = splice(target)(-1, 1);
      return popped;
    },
    push: (target) => function(...items) {
      const newLength = target.length + items.length;
      splice(target)(target.length, 0, ...items);
      return newLength;
    },
    unshift: (target) => function(...items) {
      const newLength = target.length + items.length;
      splice(target)(0, 0, ...items);
      return newLength;
    },
    shift: (target) => function() {
      const [shifted] = splice(target)(0, 1);
      return shifted;
    },
    splice,
    sort: () => () => {
      // Left as an exercise to the reader to implement using `lm`
      throw new Error("Unsupported method 'sort'");
    },
    reverse: () => () => {
      // Left as an exercise to the reader to implement using `lm`
      throw new Error("Unsupported method 'reverse'");
    },
  };

  return new Proxy(doc, {
    get(target, key) {
      if (key === 'data') return doc.data && deepProxy(doc.data);
      return target[key];
    },
  });

  function deepProxy(obj) {
    return new Proxy(obj, {
      get(target, key) {
        const value = target[key];

        if (Array.isArray(target) && key in ARRAY_OVERRIDES) {
          return ARRAY_OVERRIDES[key](target);
        }

        if (!value || typeof value !== 'object') return value;
        const parentPath = pathCache.get(target) || [];
        const path = parentPath.concat(normalizedKey(target, key));
        pathCache.set(value, path);
        if (proxyCache.has(value)) return proxyCache.get(value);
        const proxy = deepProxy(value);
        proxyCache.set(value, proxy);
        return proxy;
      },
      set(target, key, value) {
        key = normalizedKey(target, key);
        const p = (pathCache.get(target) || []).concat(key);
        let op = {p, od: target[key], oi: value};
        if (Array.isArray(target)) {
          // Setting length directly can create a sparse Array, which we can't do with json0
          if (key === 'length') throw new Error('Cannot set Array.length through an accessor');
          else if (typeof key !== 'number') throw new Error('Cannot set non-numeric keys on Arrays');
          // Setting keys outside our length also creates a sparse Array
          if (key >= target.length) throw new Error('Cannot set index outside Array bounds');
          op = {p, ld: target[key], li: value};
        }
        submitOp(op);
        return true;
      },
      deleteProperty(target, key) {
        // Using the delete keyword with an Array creates a sparse Array, which we can't do with json0
        if (Array.isArray(target)) throw new Error('Cannot use delete with Arrays');
        key = normalizedKey(target, key);
        const p = (pathCache.get(target) || []).concat(key);
        submitOp({p, od: target[key]});
        return true;
      },
    });
  }

  function submitOp(op) {
    if (doc.type.uri !== 'http://sharejs.org/types/JSONv0') {
      throw new Error(`Cannot create json0 ops for type '${doc.type.uri}'`);
    }
    doc.submitOp(op);
  }

  function normalizedKey(target, key) {
    if (!Array.isArray(target)) return key;
    const normalized = +key.toString();
    return Number.isInteger(normalized) ? normalized : key;
  }

  function splice(target) {
    return function(start, deleteCount, ...items) {
      start = +start;
      deleteCount = arguments.length > 1 ? +deleteCount : target.length;
      if (start < 0) start = target.length + start;
      start = Math.min(start, target.length);
      start = Math.max(start, 0);

      const path = pathCache.get(target) || [];
      const p = path.concat(start);
      const deleted = target.slice(start, start + deleteCount);
      for (const ld of deleted) submitOp({p, ld});
      let i = start;
      for (const item of items) {
        submitOp({p: path.concat(i), li: item});
        i++;
      }
      return deleted;
    };
  }
}
@ericyhwang
Contributor

From discussion today:

  • Opt in is good
  • submitBatch option could provide the mutable doc data for you, which prevents issues with forgetting to handle op submission errors that would cause unhandled errors / promise rejections
  • Doing it as a plugin should mean cleaner code since the plugin can be Doc-aware, and then it's not tied to core json0

@Qquanwei

I think that for simple scenarios a proxy can avoid constructing json0 op objects by hand, but in some cases there will be problems. For example, if I have an array and I want to swap the positions of elements 1 and 3, I might write:

tmp = data[3];
data[3] = data[1];
data[1] = tmp;

In this case we cannot tell whether the intent is to swap the order of the array or to assign values to two elements.

@alecgibson
Collaborator Author

@Qquanwei yes I agree; I've stated that in the "Disadvantages" section. I think for things like this, consumers would be expected to use the "traditional" submitOp() machinery when they want specific semantics.
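For reference, the swap above can be expressed with explicit json0 `lm` (list-move) ops. Here's a tiny stand-in for json0's list-move semantics (illustrative only, not the real json0 implementation):

```javascript
// Toy implementation of json0's lm (list-move) semantics:
// {p: [from], lm: to} removes the element at `from` and re-inserts it at `to`.
function applyListMove(list, op) {
  const [from] = op.p;
  const [moved] = list.splice(from, 1);
  list.splice(op.lm, 0, moved);
  return list;
}

const data = ['a', 'b', 'c', 'd'];
// Swap indices 1 and 3: move 'b' to the end, then move 'd' back to index 1.
applyListMove(data, {p: [1], lm: 3}); // ['a', 'c', 'd', 'b']
applyListMove(data, {p: [2], lm: 1}); // ['a', 'd', 'c', 'b']
```

A proxy-based accessor can't infer this intent from two assignments, which is exactly why the explicit API still matters here.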
