Reduced performance of sending logs #92

Open
KhafRuslan opened this issue Aug 2, 2024 · 5 comments

@KhafRuslan

KhafRuslan commented Aug 2, 2024

At a certain point, under heavy load, we ran into low throughput when sending logs via promtail.
[screenshot: promtail log send rate under load]
For comparison, here is the rate at which promtail reads logs from the file with the same configuration; in this screenshot promtail delivered all messages to Loki.
[screenshot: promtail log read rate]

Promtail client configuration:

clients:
  - url: http://127.0.0.1:3111/loki/api/v1/push
    batchwait: 1s
    batchsize: 100
    backoff_config:
      min_period: 100ms
      max_period: 5s
    external_labels:
      job: ${HOSTNAME}
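
As a side note on the client config above: promtail's batchsize is specified in bytes (the upstream default is roughly 1 MiB), so batchsize: 100 flushes a tiny batch on almost every entry. Below is a hedged sketch of a larger-batch variant for comparison only; the thread does not establish whether this setting contributed to the slowdown.

clients:
  - url: http://127.0.0.1:3111/loki/api/v1/push
    batchwait: 1s
    # batchsize is in bytes; ~1 MiB (1048576) is promtail's default.
    # 100 bytes forces a near-per-entry flush and many small HTTP pushes.
    batchsize: 1048576
    backoff_config:
      min_period: 100ms
      max_period: 5s
    external_labels:
      job: ${HOSTNAME}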

The solution was simple: we brought up a second Loki log receiver. After that, you can see the drop in the graph above, and the end result is the same.
[screenshot: send rate after adding a second receiver]
Average resource utilization of an instance was no higher than 30 percent.

@lmangani
Contributor

lmangani commented Aug 2, 2024

The qryn process is single-threaded, so you either need to scale out multiple writers/readers and distribute traffic across them to reach your desired capacity, or use the qryn otel-collector and write directly into ClickHouse at full speed. Remember that most of the performance is on the ClickHouse side.
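
For illustration, a minimal sketch of the scale-out option as a docker-compose fragment. The image name (qxip/qryn) and the CLICKHOUSE_SERVER/CLICKHOUSE_PORT/PORT environment variables are assumptions to be verified against the qryn docs; a load balancer (or separate promtail instances) would then split push traffic across the two writers, since listing several URLs under promtail's clients: duplicates logs to every endpoint rather than balancing between them.

services:
  qryn-writer-1:
    image: qxip/qryn:latest           # assumed image name; verify against the qryn docs
    environment:
      CLICKHOUSE_SERVER: clickhouse   # assumed env var names; verify against the qryn docs
      CLICKHOUSE_PORT: "8123"
      PORT: "3100"
    ports:
      - "3100:3100"
  qryn-writer-2:
    image: qxip/qryn:latest
    environment:
      CLICKHOUSE_SERVER: clickhouse
      CLICKHOUSE_PORT: "8123"
      PORT: "3101"
    ports:
      - "3101:3101"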

@KhafRuslan
Author

KhafRuslan commented Aug 3, 2024

Sorry, the panel descriptions were probably what caused the confusion. I do use the qryn otel-collector; that is where I ran into the problem. Single-receiver configuration:

receivers:
  loki:
    protocols:
      grpc:
        endpoint: 0.0.0.0:3200
      http:
        endpoint: 0.0.0.0:3100

processors:
  batch/logs:
    send_batch_size: 8600
    timeout: 400ms
  memory_limiter/logs:
    limit_percentage: 100
    check_interval: 2s

exporters:
  qryn:
    dsn: http://qryn-chp1...
    logs:
      format: raw
    retry_on_failure:
      enabled: true
      initial_interval: 5s
      max_elapsed_time: 300s
      max_interval: 30s
    sending_queue:
      queue_size: 1200
    timeout: 10s

service:
  extensions: [pprof, zpages, health_check]
  pipelines:
    logs:
      exporters: [qryn]
      processors: [batch/logs]
      receivers: [loki]
  telemetry:
    logs:
      level: "debug"
    metrics:
      address: 0.0.0.0:8888
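
One detail worth noting in the config above: memory_limiter/logs is declared under processors but is not referenced in the logs pipeline, so it never runs. If it was meant to be active, the collector convention is to place memory_limiter first in the pipeline; a sketch of that variant (not necessarily what was running here):

service:
  pipelines:
    logs:
      receivers: [loki]
      processors: [memory_limiter/logs, batch/logs]
      exporters: [qryn]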

@lmangani
Contributor

lmangani commented Aug 8, 2024

If you are using the otel-collector to ingest, then I would assume the bottleneck is either in the collector or in ClickHouse rather than in qryn itself. Did you observe any resource bottlenecks while operating the setup?

@KhafRuslan
Author

KhafRuslan commented Aug 9, 2024

The problem I ran into is not in qryn; it is with the qryn otel-collector. Perhaps I misunderstood your comment. I am not sure it is a resource problem, because everything works correctly when I bring up another receiver.

@lmangani
Contributor

lmangani commented Sep 4, 2024

I am not sure it is a resource problem, because everything works correctly when I bring up another receiver.

We definitely need to investigate this further to understand what the root cause is. Could you show the multi-receiver config too?
