Skip to content
This repository has been archived by the owner on Oct 18, 2023. It is now read-only.

Same-thread frame injection #695

Merged
merged 10 commits into from
Oct 13, 2023
Merged

Same-thread frame injection #695

merged 10 commits into from
Oct 13, 2023

Conversation

MarinPostma
Copy link
Collaborator

@MarinPostma MarinPostma commented Sep 24, 2023

A series of improvements to the frame injection process:

  • in-place injection: there is no need anymore to have a separate thread for the injector connection
  • faster replication from snapshots: the replicator was re-performing a handshake in between snapshot
  • Frame API refactor to allow in-place frame header mutation (this one might need a bit of rework later on. We force the alignment of the whole frame when only the header needs to be aligned, meaning we have to perform a full copy on the frame each time. Thinking about it now, we can do better, work with unaligned frames, and mutate the header by writing it back to the frame data)

@MarinPostma MarinPostma force-pushed the same-thread-frame-injector branch 8 times, most recently from 1c52981 to 30ec997 Compare September 26, 2023 09:33
@MarinPostma MarinPostma marked this pull request as ready for review September 26, 2023 14:57
Copy link
Contributor

@psarna psarna left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR is too vast for my tiny brain to review in one go, so I'll do it in stages. First batch of comments:


Ok(())
} else {
anyhow::bail!("failed to apply pages");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we bother updating is_txn here, should we?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we don't bother, because this is caught as a fatal replication error anyway, and everything blows up

OpenFlags::SQLITE_OPEN_READ_WRITE
| OpenFlags::SQLITE_OPEN_CREATE
| OpenFlags::SQLITE_OPEN_URI
| OpenFlags::SQLITE_OPEN_NO_MUTEX,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note: once we migrate to libsql crate, we need to audit that NO_MUTEX is still ok, because there's been discussions around which threading model we should use.

sqld/src/replication/replica/injector/mod.rs Outdated Show resolved Hide resolved
sqld/src/replication/replica/injector/mod.rs Outdated Show resolved Hide resolved
sqld/src/replication/replica/injector/mod.rs Outdated Show resolved Hide resolved
sqld/src/rpc/replication_log.rs Show resolved Hide resolved
impl Frame {
/// size of a single frame
pub const SIZE: usize = size_of::<FrameHeader>() + WAL_PAGE_SIZE as usize;
/// Owned version of a frame, on the heap
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm missing the purpose of the new FrameMut and FrameBorrowed. Why do we need it? Feel free to also post the answer in the commit message

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry, only got back to this PR now. So basically, FrameBorrowed is a stack data structure that contains the frame's raw data. But since it's quite big, you don't really ever get an owned version. Instead, you get a Frame and a FrameMut that are the stack-allocated equivalents. Both deref to a FrameBorrowed, with the difference between FrameMut and Frame being that Frame is a shared pointer that is cheaply clonable, and FrameMut is an exclusive reference to the Frame.

@MarinPostma MarinPostma marked this pull request as draft October 6, 2023 09:43
@MarinPostma MarinPostma force-pushed the same-thread-frame-injector branch 2 times, most recently from 43c0012 to 2d645cd Compare October 10, 2023 14:59
@MarinPostma MarinPostma marked this pull request as ready for review October 10, 2023 16:11
&TRANSPARENT_METHODS,
// safety: hook is dropped after connection
(),
DEFAULT_AUTO_CHECKPOINT,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Autocheckpoint should be preventively disabled in this case. If we somehow end up having 999 pages in WAL while recreating this table (e.g. because the user mistakenly dropped it), this call will trigger an autocheckpoint outside of our virtual WAL. And that we don't want, especially if we later introduce checkpoint logic that only allows checkpointing frames that were backed up in S3.

sqld/src/replication/replica/replicator.rs Show resolved Hide resolved
let connection = self.connection.lock();
connection.execute("INSERT INTO libsql_temp_injection VALUES (42)", ())?;
// force call to xframe
match connection.cache_flush() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@MarinPostma did you observe xFrames not being called without it? I found it weird, the docs https://www.sqlite.org/c3ref/db_cacheflush.html suggest that it's for flushing pages mid-transaction, and a single insert is a whole transaction on its own. It should be flushed to disk the moment you finish calling execute. And rusqlite's cache_flush is just a wrapper over sqlite3_db_cacheflush.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we are in the context of a transaction here, so the call to INSERT never causes xFrame to get called

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we are in the context of a transaction here

Ok, that's new, and proves I didn't follow the idea closely enough.. I see the begin_txn now. And I suppose we take it so that we can inject frames in multiple xFrames calls, but we need it to happen atomically anyway?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea sir, this way we can apply very large transactions, such as snapshots

Copy link
Contributor

@psarna psarna left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks ok, although I'm still anxious if we never leave this new transaction hanging forever, preventing further injections and checkpoints. I suppose the idea is that either we apply a finite number of frames, or exit with a fatal error, right? Is there are non-fatal error that would leave a transaction hanging, or do we ever wait for the network in transaction context?

@MarinPostma MarinPostma force-pushed the same-thread-frame-injector branch 2 times, most recently from e581066 to 742fe9d Compare October 13, 2023 06:52
@MarinPostma MarinPostma added this pull request to the merge queue Oct 13, 2023
Merged via the queue into main with commit ebb7ae9 Oct 13, 2023
5 checks passed
@MarinPostma MarinPostma deleted the same-thread-frame-injector branch October 13, 2023 08:02
LucioFranco added a commit that referenced this pull request Oct 13, 2023
MarinPostma pushed a commit that referenced this pull request Oct 15, 2023
github-merge-queue bot pushed a commit that referenced this pull request Oct 16, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants