Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: support persist manifest efficientlly #1596

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

baojinri
Copy link
Contributor

@baojinri baojinri commented Nov 15, 2024

Rationale

close #1592

Detailed Changes

  • support merge manifest ssts background

Test Plan

CI

@github-actions github-actions bot added the feature New feature or request label Nov 15, 2024
@baojinri baojinri force-pushed the feat-support-persist-manifest-efficientlly branch 3 times, most recently from 96179be to 3ea66e8 Compare November 18, 2024 06:01
@baojinri baojinri force-pushed the feat-support-persist-manifest-efficientlly branch from 3ea66e8 to fe5f793 Compare November 18, 2024 06:13
@baojinri baojinri marked this pull request as ready for review November 18, 2024 06:18
@@ -29,13 +39,16 @@ use crate::{

pub const PREFIX_PATH: &str = "manifest";
pub const SNAPSHOT_FILENAME: &str = "snapshot";
pub const SST_PREFIX: &str = "sst";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Those files should not be named sst, sst is only used in LSM, and represent sorted string table, here those file are delta files based on snapshot.

}

let mut sst_files = Vec::with_capacity(paths.len());
for path in &paths {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could fetch those files concurrently, one after one is too slow.

.await
.context("failed to read sst file")?;
let pb_sst = pb_types::SstFile::decode(bytes).context("failed to decode sst file")?;
sst_files.push(SstFile::try_from(pb_sst)?);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to dedup files before merge, since there may exists old delta files, which are failed to delete in last merge.

.context("Failed to update manifest")?;

// 2. Delete the old sst files
for path in &paths {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Delete concurrently

self.store
.delete(path)
.await
.context("failed to delete sst file")?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add path name to log

tokio::spawn(async move {
ssts_merge.run().await;
});

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When start up, we also need to read delta files, if we only read snapshot, then part of sst won't be queried later?

Self {
channel_size: 100,
max_interval_seconds: 5,
merge_threshold: 50,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

10-20 is a reasonable choice.

pub async fn add_file(&self, id: FileId, meta: FileMeta) -> Result<()> {
let mut payload = self.payload.write().await;
let mut tmp_ssts = payload.files.clone();
let new_sst_path = Path::from(format!("{}/{id}", self.sst_path));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there are too many pending delta files, we should block the write here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support more efficient way to persist manifest of TimeMergeStorage
2 participants