Skip to content

Commit

Permalink
Fix init of WAL page header at startup (#8914)
Browse files Browse the repository at this point in the history
If the primary is started at an LSN within the first of a 16 MB WAL
segment, the "long XLOG page header" at the beginning of the segment was
not initialized correctly. That has gone unnnoticed, because under
normal circumstances, nothing looks at the page header. The WAL that is
streamed to the safekeepers starts at the new record's LSN, not at the
beginning of the page, so that bogus page header didn't propagate
elsewhere, and a primary server doesn't normally read the WAL its
written. Which is good because the contents of the page would be bogus
anyway, as it wouldn't contain any of the records before the LSN where
the new record is written.

Except that in the following cases a primary does read its own WAL:

1. When there are two-phase transactions in prepared state at
checkpoint. The checkpointer reads the two-phase state from the
XLOG_XACT_PREPARE record, and writes it to a file in pg_twophase/.

2. Logical decoding reads the WAL starting from the replication slot's
restart LSN.

This PR fixes the problem with two-phase transactions. For that, it's
sufficient to initialize the page header correctly. The checkpointer
only needs to read XLOG_XACT_PREPARE records that were generated after
the server startup, so it's still OK that older WAL is missing / bogus.

I have not investigated if we have a problem with logical decoding,
however. Let's deal with that separately.

Special thanks to @Lzjing-1997, who independently found the same bug
and opened a PR to fix it, although I did not use that PR.
  • Loading branch information
hlinnaka authored Sep 21, 2024
1 parent f03f7b3 commit 9a32aa8
Show file tree
Hide file tree
Showing 6 changed files with 34 additions and 13 deletions.
31 changes: 26 additions & 5 deletions test_runner/regress/test_twophase.py
Original file line number Diff line number Diff line change
Expand Up @@ -8,6 +8,7 @@
PgBin,
fork_at_current_lsn,
import_timeline_from_vanilla_postgres,
wait_for_wal_insert_lsn,
)


Expand All @@ -22,11 +23,6 @@ def twophase_test_on_timeline(env: NeonEnv):
conn = endpoint.connect()
cur = conn.cursor()

# FIXME: Switch to the next WAL segment, to work around the bug fixed in
# https://github.com/neondatabase/neon/pull/8914. When that is merged, this can be
# removed.
cur.execute("select pg_switch_wal()")

cur.execute("CREATE TABLE foo (t text)")

# Prepare a transaction that will insert a row
Expand Down Expand Up @@ -140,3 +136,28 @@ def test_twophase_nonzero_epoch(
vanilla_pg.stop() # don't need the original server anymore

twophase_test_on_timeline(env)


def test_twophase_at_wal_segment_start(neon_simple_env: NeonEnv):
"""
Same as 'test_twophase' test, but the server is started at an LSN at the beginning
of a WAL segment. We had a bug where we didn't initialize the "long XLOG page header"
at the beginning of the segment correctly, which was detected when the checkpointer
tried to read the XLOG_XACT_PREPARE record from the WAL, if that record was on the
very first page of a WAL segment and the server was started up at that first page.
"""
env = neon_simple_env
timeline_id = env.neon_cli.create_branch("test_twophase", "main")

endpoint = env.endpoints.create_start(
"test_twophase", config_lines=["max_prepared_transactions=5"]
)
endpoint.safe_psql("SELECT pg_switch_wal()")

# to avoid hitting https://github.com/neondatabase/neon/issues/9079, wait for the
# WAL to reach the pageserver.
wait_for_wal_insert_lsn(env, endpoint, env.initial_tenant, timeline_id)

endpoint.stop_and_destroy()

twophase_test_on_timeline(env)
2 changes: 1 addition & 1 deletion vendor/postgres-v14
2 changes: 1 addition & 1 deletion vendor/postgres-v15
2 changes: 1 addition & 1 deletion vendor/postgres-v16
2 changes: 1 addition & 1 deletion vendor/postgres-v17
8 changes: 4 additions & 4 deletions vendor/revisions.json
Original file line number Diff line number Diff line change
@@ -1,18 +1,18 @@
{
"v17": [
"17rc1",
"dadd6fe208bb906cc0a48980f2ab4e13c47ba3ad"
"2cf120e7393ca5f537c6a38b457585576dc035fc"
],
"v16": [
"16.4",
"d009084a745cb4d5e6de222c778b2a562c8b2767"
"1d7081a3b076ddf5086e0b118d4329820e6a7427"
],
"v15": [
"15.8",
"1dbd6f316416c8360bbd4f3d6db956cf70937cf0"
"16c3c6b64f1420a367a2a9b2510f20d94f791af8"
],
"v14": [
"14.13",
"f9c51c12438b20049b6905eb4e43d321defd6ff2"
"a38d15f3233a4c07f2bf3335fcbd874dd1f4e386"
]
}

1 comment on commit 9a32aa8

@github-actions
Copy link

@github-actions github-actions bot commented on 9a32aa8 Sep 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

5091 tests run: 4917 passed, 0 failed, 174 skipped (full report)


Flaky tests (10)

Postgres 17

Postgres 16

Postgres 15

Code coverage* (full report)

  • functions: 31.9% (7435 of 23317 functions)
  • lines: 49.9% (59940 of 120118 lines)

* collected from Rust tests only


The comment gets automatically updated with the latest test results
9a32aa8 at 2024-09-21T09:57:08.493Z :recycle:

Please sign in to comment.