Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(ecosystem): Adds Project Management integration SLOs #80719

Open
wants to merge 16 commits into
base: master
Choose a base branch
from

Conversation

GabeVillalobos
Copy link
Member

@GabeVillalobos GabeVillalobos commented Nov 14, 2024

WIP - Adds Project Management/Issues SLO metrics

This PR adds metric coverage to the majority of Project Management features:

  • Issue Creation (including alert rules) - src/sentry/api/endpoints/group_integration_details.py
  • Outbound status syncing - src/sentry/integrations/tasks/sync_status_outbound.py
  • Outbound assignment syncing - src/sentry/integrations/tasks/sync_assignee_outbound.py
  • Issue linking - src/sentry/api/endpoints/group_integration_details.py
  • Inbound assignment syncing
  • Installation metrics for ticketing integrations

Has Metrics, Needs Testing:

  • Inbound status syncing

Incidental Work:

  • Adds new tests for sync_outbound_assignment and sync_outbound_status tasks, which previously only had endpoint level testing.
  • Adds new test for the shared sync_group_assignee_inbound utility (src/sentry/integrations/utils/sync.py)

@github-actions github-actions bot added the Scope: Backend Automatically applied to PRs that change backend components label Nov 14, 2024
return Response(exc.field_errors, status=400)
except IntegrationError as e:
lifecycle.record_halt(e)
return Response({"non_field_errors": [str(e)]}, status=400)

Check warning

Code scanning / CodeQL

Information exposure through an exception Medium

Stack trace information
flows to this location and may be exposed to an external user.

Copilot Autofix AI 2 days ago

To fix the problem, we should avoid exposing the detailed exception message to the end user. Instead, we can log the detailed error message on the server for debugging purposes and return a generic error message to the user. This approach ensures that sensitive information is not exposed while still allowing developers to diagnose issues.

  • Modify the exception handling code to log the detailed error message.
  • Return a generic error message to the user instead of the detailed exception message.
Suggested changeset 1
src/sentry/api/endpoints/group_integration_details.py

Autofix patch

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/src/sentry/api/endpoints/group_integration_details.py b/src/sentry/api/endpoints/group_integration_details.py
--- a/src/sentry/api/endpoints/group_integration_details.py
+++ b/src/sentry/api/endpoints/group_integration_details.py
@@ -186,3 +186,4 @@
                 lifecycle.record_failure(e)
-                return Response({"non_field_errors": [str(e)]}, status=400)
+                self.log_error(e)
+                return Response({"non_field_errors": ["An internal error has occurred."]}, status=400)
 
@@ -249,2 +250,8 @@
 
+    def log_error(self, error: Exception) -> None:
+        # Log the detailed error message for debugging purposes
+        import logging
+        logger = logging.getLogger(__name__)
+        logger.error("Integration error occurred: %s", str(error))
+
     def post(self, request: Request, group, integration_id) -> Response:
EOF
@@ -186,3 +186,4 @@
lifecycle.record_failure(e)
return Response({"non_field_errors": [str(e)]}, status=400)
self.log_error(e)
return Response({"non_field_errors": ["An internal error has occurred."]}, status=400)

@@ -249,2 +250,8 @@

def log_error(self, error: Exception) -> None:
# Log the detailed error message for debugging purposes
import logging
logger = logging.getLogger(__name__)
logger.error("Integration error occurred: %s", str(error))

def post(self, request: Request, group, integration_id) -> Response:
Copilot is powered by AI and may make mistakes. Always verify output.
Positive Feedback
Negative Feedback

Provide additional feedback

Please help us improve GitHub Copilot by sharing more details about this comment.

Please select one or more of the options
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't new code, this is what allows errors to propagate up to the UI when ticket creation or linking fails on the integrator side (usually a config issue with the ticket contents)

return Response(exc.field_errors, status=400)
except IntegrationError as e:
lifecycle.record_halt(e)
return Response({"non_field_errors": [str(e)]}, status=400)

Check warning

Code scanning / CodeQL

Information exposure through an exception Medium

Stack trace information
flows to this location and may be exposed to an external user.

Copilot Autofix AI 2 days ago

To fix the problem, we need to ensure that detailed exception messages are not exposed to the end user. Instead, we should log the detailed error message on the server and return a generic error message to the user. This can be achieved by modifying the exception handling code to log the exception and return a generic error message.

  1. Import the logging module to log the detailed error message.
  2. Modify the exception handling blocks to log the exception and return a generic error message.
Suggested changeset 1
src/sentry/api/endpoints/group_integration_details.py

Autofix patch

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/src/sentry/api/endpoints/group_integration_details.py b/src/sentry/api/endpoints/group_integration_details.py
--- a/src/sentry/api/endpoints/group_integration_details.py
+++ b/src/sentry/api/endpoints/group_integration_details.py
@@ -27,2 +27,3 @@
 from sentry.users.models.user import User
+import logging
 
@@ -220,3 +221,4 @@
                 lifecycle.record_failure(e)
-                return Response({"non_field_errors": [str(e)]}, status=400)
+                logging.error("IntegrationError: %s", str(e))
+                return Response({"non_field_errors": ["An internal error has occurred."]}, status=400)
 
@@ -233,2 +235,3 @@
                 lifecycle.record_halt(exc)
+                logging.error("IntegrityError: %s", str(exc))
                 return Response({"non_field_errors": ["That issue is already linked"]}, status=400)
EOF
@@ -27,2 +27,3 @@
from sentry.users.models.user import User
import logging

@@ -220,3 +221,4 @@
lifecycle.record_failure(e)
return Response({"non_field_errors": [str(e)]}, status=400)
logging.error("IntegrationError: %s", str(e))
return Response({"non_field_errors": ["An internal error has occurred."]}, status=400)

@@ -233,2 +235,3 @@
lifecycle.record_halt(exc)
logging.error("IntegrityError: %s", str(exc))
return Response({"non_field_errors": ["That issue is already linked"]}, status=400)
Copilot is powered by AI and may make mistakes. Always verify output.
Positive Feedback
Negative Feedback

Provide additional feedback

Please help us improve GitHub Copilot by sharing more details about this comment.

Please select one or more of the options
@dataclass
class ProjectManagementEvent(IntegrationEventLifecycleMetric):
action_type: ProjectManagementActionType
integration: Integration | RpcIntegration
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Turns out we don't always have access to integration installations (specifically the inbound assignment code), so I made this use integration/rpc integration.

Comment on lines +135 to +144
logger.info(
"%s.rule_trigger.create_ticket.failure",
provider,
extra={
"rule_id": rule_id,
"provider": provider,
"integration_id": integration.id,
"error_message": str(e),
"exception_type": type(e).__name__,
},
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's nice being able to delete these logs in the future

Comment on lines +21 to +28
class BrokenJiraMock(MockJira):
"""
Subclass of MockJira client, which implements a broken create_issue method
to simulate a misconfigured alert rule for Jira.
"""

def create_issue(self, raw_form_data):
raise ApiInvalidRequestError("Invalid data entered")
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is gross, but was the easiest way to trigger a failure scenario.

Copy link

codecov bot commented Nov 14, 2024

Codecov Report

Attention: Patch coverage is 93.75000% with 10 lines in your changes missing coverage. Please review.

✅ All tests successful. No failed tests found.

Files with missing lines Patch % Lines
src/sentry/integrations/jira/utils/api.py 71.42% 5 Missing and 1 partial ⚠️
.../sentry/api/endpoints/group_integration_details.py 94.11% 1 Missing and 1 partial ⚠️
src/sentry/integrations/vsts/webhooks.py 81.81% 1 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master   #80719      +/-   ##
==========================================
+ Coverage   78.39%   78.42%   +0.02%     
==========================================
  Files        7206     7212       +6     
  Lines      319438   319787     +349     
  Branches    43991    44011      +20     
==========================================
+ Hits       250437   250777     +340     
- Misses      62620    62626       +6     
- Partials     6381     6384       +3     

Copy link
Member

@iamrajjoshi iamrajjoshi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i just have a few questions, but i think we are kind of aligned on "halt".

i have been thinking of fault as any unexpected condition that isn't user's fault.
with that definition, i have been considering halt as all other non-success conditions, which makes it probably the broadest in definition.

try:
installation.after_link_issue(external_issue, data=request.data)
except IntegrationFormError as exc:
lifecycle.record_halt(exc)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are these errors user config errors?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, typically an IntegrationFormError is an incorrectly configured Jira ticket creation config.

item for item in data["changelog"]["items"] if item["field"] == "assignee"
)
if not assignee_changed:
lifecycle.record_halt()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could this possibly be a success? are these normal scenarios where jira doesn't send us assignee?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's fair. Tbh though, I'm looking at moving this call into the shared util this calls as there are specific lifecycles we probably want to track there, so this is likely to change.

assignee = fields.get("assignee")

if assignee is None:
lifecycle.record_halt()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does this mean the ticket was unassigned? why does this have to be a halt?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, this is actually fine and should be recorded as a success.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Scope: Backend Automatically applied to PRs that change backend components
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants