Allow more customization in the OpenTelemetry collector configuration #573

Open
Tracked by #916
jvanz opened this issue Oct 29, 2024 · 4 comments · May be fixed by #581

@jvanz
Member

jvanz commented Oct 29, 2024

The Helm charts need to be updated to allow more customization of the OpenTelemetry (Optel) collector configuration, making it as flexible as possible. This will allow users to configure pipelines and exporters to send data to a Stackstate cluster.

This is necessary because the main point of integration with Stackstate is the Kubewarden Optel collector sending data to the Stackstate Optel collector. To accomplish this, the Kubewarden collector must be configured to receive data, pass it through a pipeline, and export the final data to the Stackstate collector. This requires configuration changes to the receivers, processors, exporters, and pipelines. The exporter can be the OTLP HTTP/gRPC exporter available in Optel.

In previous experiments, it was necessary to update the collector configuration after the Kubewarden installation; that is the problem this issue should address. The configuration used in those experiments, based on the Stackstate documentation, is provided below as an example.

image:
  repository: "otel/opentelemetry-collector-k8s"
extraEnvsFrom:
  - secretRef:
      name: open-telemetry-collector
mode: deployment
ports:
  metrics:
    enabled: true
presets:
  kubernetesAttributes:
    enabled: true
    extractAllPodLabels: true
config:
  extensions:
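    # Authenticates requests to the Stackstate OTLP endpoint; API_KEY comes from
    # the open-telemetry-collector secret referenced in extraEnvsFrom.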
    bearertokenauth:
      scheme: StackState
      token: "${env:API_KEY}"
  exporters:
    otlp/stackstate:
      auth:
        authenticator: bearertokenauth
      endpoint: <otlp-stackstate-endpoint>:443
  processors:
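    # Tail sampling caps exported traces at 500 spans/s, giving errors and slow
    # traces priority over the rest.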
    tail_sampling:
      decision_wait: 10s
      policies:
      - name: rate-limited-composite
        type: composite
        composite:
          max_total_spans_per_second: 500
          policy_order: [errors, slow-traces, rest]
          composite_sub_policy:
          - name: errors
            type: status_code
            status_code: 
              status_codes: [ ERROR ]
          - name: slow-traces
            type: latency
            latency:
              threshold_ms: 1000
          - name: rest
            type: always_sample
          rate_allocation:
          - policy: errors
            percent: 33
          - policy: slow-traces
            percent: 33
          - policy: rest
            percent: 34
    resource:
      attributes:
      - key: k8s.cluster.name
        action: upsert
        value: <your-cluster-name>
      - key: service.instance.id
        from_attribute: k8s.pod.uid
        action: insert
    filter/dropMissingK8sAttributes:
      error_mode: ignore
      traces:
        span:
          - resource.attributes["k8s.node.name"] == nil
          - resource.attributes["k8s.pod.uid"] == nil
          - resource.attributes["k8s.namespace.name"] == nil
          - resource.attributes["k8s.pod.name"] == nil
  connectors:
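    # spanmetrics derives metrics from spans; routing/traces fans every trace out
    # to both the sampling and spanmetrics pipelines (match_once: false).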
    spanmetrics:
      metrics_expiration: 5m
      namespace: otel_span
    routing/traces:
      error_mode: ignore
      match_once: false
      table: 
      - statement: route()
        pipelines: [traces/sampling, traces/spanmetrics]
  service:
    extensions:
      - health_check
      - bearertokenauth
    pipelines:
      traces:
        receivers: [otlp]
        processors: [filter/dropMissingK8sAttributes, memory_limiter, resource]
        exporters: [routing/traces]
      traces/spanmetrics:
        receivers: [routing/traces]
        processors: []
        exporters: [spanmetrics]
      traces/sampling:
        receivers: [routing/traces]
        processors: [tail_sampling, batch]
        exporters: [debug, otlp/stackstate]
      metrics:
        receivers: [otlp, spanmetrics, prometheus]
        processors: [memory_limiter, resource, batch]
        exporters: [debug, otlp/stackstate]

Warning

This is an example configuration and does not necessarily represent the final configuration.

Given the wide variety of possible Optel collector configurations, I propose, as the easiest solution, allowing users to completely overwrite the current default definition. This gives users the ability to customize the collector as they see fit, without needing a Helm chart update every time they want to add a new feature, such as a new pipeline, processor, or exporter. I understand that this may make it harder to define what is supported and what is not; that decision can be made during the course of working on this task.
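
As a rough sketch of the overwrite approach, the chart could expose a values block that, when set, replaces the default collector definition entirely. The key names below (telemetry.collectorConfigOverride and everything under it) are hypothetical and only illustrate the idea:

telemetry:
  # Hypothetical switch: when set, this block replaces the chart's
  # default OpenTelemetry collector definition verbatim.
  collectorConfigOverride:
    mode: deployment
    config:
      receivers:
        otlp:
          protocols:
            grpc: {}
      exporters:
        otlp/stackstate:
          endpoint: <otlp-stackstate-endpoint>:443
      service:
        pipelines:
          traces:
            receivers: [otlp]
            exporters: [otlp/stackstate]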

Acceptance Criteria

  • Define the best way to allow users to define their own Optel collector configuration for integration with Stackstate. Ideas include:
    • Adding values to cover the most common possibilities, as is currently done.
    • Allowing users to overwrite the collector configuration and configure it freely.
  • Implement the new configuration approach.
  • Add tests to cover the Optel collector configuration (a possible shape is sketched after this list).
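
For the last point, one option would be a helm-unittest case that renders the collector template with a user-supplied config and checks the result. The template name, values keys, and resource layout below are assumptions, not the chart's actual structure:

suite: opentelemetry collector configuration
templates:
  - opentelemetry-collector.yaml   # hypothetical template name
tests:
  - it: renders a user-supplied exporter
    set:
      telemetry:
        collectorConfigOverride:
          config:
            exporters:
              otlp/stackstate:
                endpoint: stackstate.example:443
    asserts:
      - exists:
          path: spec.config.exporters["otlp/stackstate"]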
@jvanz jvanz transferred this issue from kubewarden/kubewarden-controller Oct 29, 2024
@flavio flavio moved this to Todo in Kubewarden Oct 30, 2024
@flavio
Member

flavio commented Oct 30, 2024

Makes sense, moved to the TODO column

@jvanz jvanz self-assigned this Nov 1, 2024
@jvanz jvanz moved this from Todo to In Progress in Kubewarden Nov 1, 2024
@flavio flavio added this to the 1.19 milestone Nov 4, 2024
@flavio
Member

flavio commented Nov 4, 2024

@jvanz: while working on this issue you could also move to the new v1beta1 CRDs of OTel (see this issue)
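
For reference, the v1beta1 OpenTelemetryCollector CRD takes the collector configuration as structured YAML under spec.config (v1alpha1 used a plain string), which would pair well with a user-supplied config block. A minimal, illustrative sketch (names are placeholders):

apiVersion: opentelemetry.io/v1beta1
kind: OpenTelemetryCollector
metadata:
  name: kubewarden-collector   # placeholder name
spec:
  mode: deployment
  config:
    receivers:
      otlp:
        protocols:
          grpc: {}
    exporters:
      debug: {}
    service:
      pipelines:
        traces:
          receivers: [otlp]
          exporters: [debug]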

@jvanz jvanz linked a pull request Nov 5, 2024 that will close this issue
@jvanz
Member Author

jvanz commented Nov 6, 2024

I'm moving this to blocked until I restore access to my testing machines

@jvanz jvanz moved this from In Progress to Blocked in Kubewarden Nov 6, 2024
@jvanz jvanz moved this from Blocked to In Progress in Kubewarden Nov 7, 2024
@jvanz jvanz moved this from In Progress to Pending review in Kubewarden Nov 14, 2024
@flavio
Member

flavio commented Nov 15, 2024

Moving to blocked; we have to discuss how to move forward with this

@flavio flavio moved this from Pending review to Blocked in Kubewarden Nov 15, 2024