Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

icinga2: Reset Event#ID before resending ack set event #243

Merged
merged 1 commit into from
Jul 18, 2024

Conversation

yhabteab
Copy link
Member

Event#Sync() synchronises an event row with the database only if its ID is set to no other value than 0. Therefore, we must reset the event ID for the acknowledgement set event generated in the catch-up phase before resending it.

@yhabteab yhabteab added the bug Something isn't working label Jul 17, 2024
@yhabteab yhabteab requested a review from oxzi July 17, 2024 14:27
@cla-bot cla-bot bot added the cla/signed CLA is signed by all contributors of a PR label Jul 17, 2024
Copy link
Member

@oxzi oxzi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried to retrace the steps how this can happened. Thanks for giving me some pointers, @yhabteab, as otherwise I have totally missed it.

  • The journey of our new fakeEv starts in checkMissedChanges, from where it will be send to catchupEventCh by reference
    case catchupEventCh <- &catchupEventMsg{eventMsg: &eventMsg{fakeEv, attrs.LastStateChange.Time()}}:
  • It ends up in the worker's catchup case..
    case catchupMsg, ok := <-catchupEventCh:
  • ..and will be eventually passed to the CallbackFn
    client.CallbackFn(ev)
  • In there, our adventurous event will be passed on to incident.ProcessEvent
    err := incident.ProcessEvent(subCtx, launcher.Db, launcher.Logs, launcher.RuntimeConfig, ev)
  • As it is a new incident, it gets .Synced
    err = utils.RunInTx(ctx, db, func(tx *sqlx.Tx) error { return ev.Sync(ctx, tx, db, obj.ID) })
  • And ends up in the database and gets its .ID set
    eventRow := NewEventRow(e, objectId)
    eventID, err := utils.InsertAndFetchId(ctx, tx, utils.BuildInsertStmtWithout(db, eventRow, "id"), eventRow)
    if err == nil {
    e.ID = eventID
    }
  • Back in checkMissedChanges, after sending the host service event, the fakeEv might be sent again
    case catchupEventCh <- &catchupEventMsg{eventMsg: &eventMsg{ev, attrs.LastStateChange.Time()}}:
    stateChangeEvents++
    if fakeEv.Type == event.TypeAcknowledgementSet {
    fakeEv.ID = 0 // We don't want to reference a previously sent event, so reset its ID!
    select {
    // Retry the AckSet event so that the author of the ack is set as the incident
    // manager if there was no existing incident before the above state change event.
    case catchupEventCh <- &catchupEventMsg{eventMsg: &eventMsg{fakeEv, attrs.LastStateChange.Time()}}:
  • Now, I am going to skip some steps as shown above until the event ends up in Sync
    if e.ID != 0 {
    return nil
    }
  • And here it goes silently unnoticed as it got an ID in the meantime. Yikes.

I am going to take another look tomorrow.

@yhabteab
Copy link
Member Author

If it is synced here, there should be no problems. The real problem, however, is that if there is an existing incident, that fake event will make its way to the fellowing scopes multiple times.

if err = ev.Sync(ctx, tx, i.db, i.Object.ID); err != nil {

if err = i.AddEvent(ctx, tx, ev); err != nil {
i.logger.Errorw("Can't insert incident event to the database", zap.Error(err))

When the incident finally notices that this ack set event is superfluous here ...

if err := i.processAcknowledgementEvent(ctx, tx, ev); err != nil {
if errors.Is(err, errSuperfluousAckEvent) {

It will simply return without committing the transaction, leaving the now dangling ev.ID set above.

Comment on lines 322 to 326
case catchupEventCh <- &catchupEventMsg{eventMsg: &eventMsg{ev, attrs.LastStateChange.Time()}}:
stateChangeEvents++
if fakeEv.Type == event.TypeAcknowledgementSet {
ackEvent := *fakeEv
ackEvent.ID = 0 // We don't want to reference a previously sent event, so reset its ID!
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

duplicate the whole event before sending it to any channel

By that, I meant before fakeEv is sent to any channel for the first time, i.e. before line 305. That way, the pointer wasn't yet passed anywhere else and duplicating it is safe for sure. So that would mean creating a second variable next to fakeEv and duplicating the event right after client.buildAcknowledgementEvent().

@julianbrost julianbrost merged commit 74e05ee into main Jul 18, 2024
12 checks passed
@julianbrost julianbrost deleted the do-not-reuse-event-id branch July 18, 2024 12:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working cla/signed CLA is signed by all contributors of a PR
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants