Monitors not updating in destination #191

Open
KaleFive opened this issue Oct 23, 2023 · 8 comments

@KaleFive

Overview of issue:
The documentation states: "The source organization will not be modified, but the destination organization will have resources created and updated by the sync command." However, my monitor is not being updated by the sync tool. Suppose I have resourceA in the source, which the sync tool creates in the destination as resourceB. If I edit resourceA, I expect that running the sync command will update resourceB in my destination. This does not happen.

Steps to reproduce the issue:

  1. Create resourceA in source
  2. Run sync tool to get equivalent (resourceB) in destination.
  3. Make an edit to the query of monitor in resourceA
  4. Run sync tool
  5. resourceB in destination is unchanged.

Describe the results you received:
resourceB does not update

Describe the results you expected:
I expect resourceB to update with the same changes that I made manually to resourceA

Additional information you deem important (e.g. issue happens only occasionally):

@skarimo
Member

skarimo commented Oct 24, 2023

Hello, thanks for reaching out.

Are you re-importing the monitors after modifying resourceA? Could you manually check the state files and see whether the correct query is in the source state files?

@KaleFive
Author

Yes - I am re-importing and upon checking the state I can see that the correct query is in the source state file.

@skarimo
Member

skarimo commented Oct 25, 2023

What version of the sync CLI are you using? Could you also share the JSON definition for the monitor from both the source and destination state files? I am curious to see what the queries look like and to test whether the diff might be getting suppressed.

If we cannot learn much from the above, you can open a ticket with our support team so we can gather more specific information, including the actual state files. Thanks.

@KaleFive
Author

Hi @skarimo
We're using version 0.4.1

Here is an example of the json for a monitor

Source

"107719379": {
    "id": 107719379,
    "org_id": 743021,
    "type": "query alert",
    "name": "Encountered an error when interacting with Elasticsearch",
    "message": "This monitor tracks for ANY errors when interacting with elasticsearch.\n\nCurrently uses a very low threshold since we have little traffic and we'd like to capture any errors possible and investigate.",
    "tags": [
      "hex-search"
    ],
    "query": "avg(last_15m):sum:trace.elasticsearch.query.errors{env:prod,!resource_name:put_/entities,!resource_name:put_/datasources ,!resource_name:put_/projects} by {resource_name}.as_rate() > 0.5",
    "options": {
      "thresholds": {
        "critical": 0.5
      },
      "notify_audit": false,
      "require_full_window": false,
      "notify_no_data": false,
      "renotify_interval": 0,
      "include_tags": false,
      "new_group_delay": 60,
      "silenced": {}
    },
    "multi": true,
    "created_at": 1673564013000,
    "created": "2023-01-12T22:53:33.735088+00:00",
    "modified": "2023-10-23T15:36:08.193632+00:00",
    "deleted": null,
    "restricted_roles": null,
    "priority": 5,
    "overall_state_modified": "2023-10-23T16:44:59+00:00",
    "overall_state": "OK",
    "creator": {
      "name": "redacted",
      "handle": "[email protected]",
      "email": "[email protected]",
      "id": 4685156
    },
    "matching_downtimes": []
  }

Destination

"107719379": {
    "id": 13603814,
    "org_id": 1000099300,
    "type": "query alert",
    "name": "Encountered an error when interacting with Elasticsearch",
    "message": "This monitor tracks for ANY errors when interacting with elasticsearch.\n\nCurrently uses a very low threshold since we have little traffic and we'd like to capture any errors possible and investigate.",
    "tags": [
      "hex-search"
    ],
    "query": "avg(last_15m):sum:trace.elasticsearch.query.errors{env:prod,!resource_name:put_/entities,!resource_name:put_/datasources ,!resource_name:put_/projects} by {resource_name}.as_rate() > 0.5",
    "options": {
      "thresholds": {
        "critical": 0.5
      },
      "notify_audit": false,
      "require_full_window": false,
      "notify_no_data": false,
      "renotify_interval": 0,
      "include_tags": false,
      "new_group_delay": 60,
      "silenced": {}
    },
    "multi": true,
    "created_at": 1698082005000,
    "created": "2023-10-23T17:26:45.587916+00:00",
    "modified": "2023-10-23T17:26:45.587916+00:00",
    "deleted": null,
    "restricted_roles": null,
    "priority": 5,
    "overall_state_modified": null,
    "overall_state": "No Data",
    "creator": {
      "name": "redacted",
      "handle": "[email protected]",
      "email": "[email protected]",
      "id": 1000665890
    }
  }

Not sure how related this is, but the source has "matching_downtimes": [] and the destination does not.

@skarimo
Member

skarimo commented Nov 21, 2023

Thanks for sharing those. The config looks identical, with the exception of matching_downtimes as you mentioned. That field is already handled, so it doesn't affect the diffs. See: https://github.com/DataDog/datadog-sync-cli/blob/main/datadog_sync/model/monitors.py#L28
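For anyone who wants to run the same comparison on their own state files, here is a minimal sketch of diffing two monitor definitions while skipping fields that are expected to differ between organizations. The function name `monitor_diff` and the `ASSUMED_IGNORED_FIELDS` set are hypothetical and chosen from the two definitions posted in this thread; they are not the tool's actual exclusion list, which lives in the file linked above. The query in the sample data is shortened for readability.

```python
# Fields assumed to differ legitimately between organizations. This set is a
# guess based on the two definitions in this thread, not the tool's real list.
ASSUMED_IGNORED_FIELDS = frozenset({
    "id", "org_id", "created", "created_at", "modified", "deleted",
    "creator", "overall_state", "overall_state_modified", "matching_downtimes",
})


def monitor_diff(source, destination, ignored=ASSUMED_IGNORED_FIELDS):
    """Return {field: (source_value, destination_value)} for differing fields."""
    keys = (set(source) | set(destination)) - set(ignored)
    return {
        key: (source.get(key), destination.get(key))
        for key in sorted(keys)
        if source.get(key) != destination.get(key)
    }


# Trimmed-down versions of the two definitions, with a shortened query.
source = {
    "id": 107719379,
    "type": "query alert",
    "query": "avg(last_15m):sum:trace.elasticsearch.query.errors{env:prod} > 0.5",
    "matching_downtimes": [],
}
destination = {
    "id": 13603814,
    "type": "query alert",
    "query": "avg(last_15m):sum:trace.elasticsearch.query.errors{env:prod} > 0.5",
}

print(monitor_diff(source, destination))  # prints {}
```

An empty result means the two definitions agree on every field that is not ignored, which matches what the posted state files show.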

Based on the outputs, I am inclined to think that the discrepancy is caused by changes made to the destination resources outside of the tool. For example, if you manually update the monitor in the destination org, that change is not reflected in the local state files, so the tool sees no diff and applies no changes. The tool currently does not do a "live refresh" (reading the current state from the destination rather than from the state files).

Does this seem like the root of the issue you are facing?

@KaleFive
Author

Thank you @skarimo. I do not believe that we manually make changes to our destination. In our case, the destination is an EU-specific domain and we treat it as a secondary to our US instance.

So that I understand the expected behavior:

  1. Can we rely on a uni-directional update path for our monitors, dashboards, etc.? Stated another way, if we keep the source (our US instance) as the main place we make updates, and only ever edit the destination (our EU instance) with the datadog-sync tool, should I expect changes to my monitors to be applied in the destination?
  2. What happens if we break from the above and accidentally manually update a monitor in the destination?

@KaleFive
Author

Hello @skarimo - checking to see if you saw my message above?

@skarimo
Member

skarimo commented Dec 19, 2023

👋 Thanks for the ping, missed the reply.

  1. Yes. If you rely on single-direction sync, the destination should always reflect the source.
  2. It will be updated to the source config once there is an actual diff between the source and destination state files. The caveat is that the destination monitor will not be updated until the same monitor changes on the source, because the tool lacks the live refresh of destination resources mentioned previously.
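The behavior described in the two answers above can be sketched with a toy model. This is only an illustration of a diff driven by local state files, assuming the tool compares the source state against its locally stored destination state; the class `SyncModel` and its methods are hypothetical and are not part of datadog-sync-cli.

```python
from dataclasses import dataclass, field


@dataclass
class SyncModel:
    """Toy model: diffs use local state files, never the live destination."""

    source_state: dict = field(default_factory=dict)       # local copy of the source
    destination_state: dict = field(default_factory=dict)  # local copy of the destination
    live_destination: dict = field(default_factory=dict)   # what the destination org holds

    def import_source(self, live_source):
        """Refresh the local source state from the source org."""
        self.source_state = dict(live_source)

    def sync(self):
        """Push the source config only if it differs from the local destination state."""
        if self.source_state == self.destination_state:
            return False  # no diff seen locally, so nothing is sent
        self.live_destination = dict(self.source_state)
        self.destination_state = dict(self.source_state)
        return True


model = SyncModel()
model.import_source({"query": "a > 0.5"})
model.sync()                                  # first sync creates the monitor

model.live_destination["query"] = "a > 0.9"   # manual edit in the destination org
model.sync()                                  # returns False: the edit is invisible locally

model.import_source({"query": "a > 0.7"})     # the source monitor changes
model.sync()                                  # returns True: destination matches source again
```

In this model the manual edit survives every sync until the source monitor itself changes, which is the caveat in answer 2.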
