[BALANCER] Implementation of BalancerBenchmarkApp #1736

Merged: garyparrot merged 9 commits into opensource4you:main on Jun 3, 2023

garyparrot
Collaborator

@garyparrot garyparrot commented May 13, 2023

Resolve #1693

This PR exposes the BalancerBenchmark functionality through the App.

How to run

 ./gradlew run --args="balancer_benchmark cost_profiling 
    --cluster.info /home/garyparrot/cluster-file3.bin 
    --cluster.bean /home/garyparrot/bean-file3.bin 
    --optimization.config /home/garyparrot/balancer.json"

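The main arguments to provide at the moment are:

  • --cluster.info: path to the serialized ClusterInfo file
  • --cluster.bean: path to the serialized ClusterBean file (the name may change depending on the actual implementation; the most recent issue discussion ended up not planning to serialize the ClusterBean)
  • --optimization.config: a JSON file describing the optimization problem
  • Other arguments required by the individual benchmark modes...

The Optimization Config JSON uses the same format as POST /balancer, for example: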

{
      "timeout": "10s",
      "balancer": "org.astraea.common.balancer.algorithms.GreedyBalancer",
      "balancerConfig": {},
      "clusterCosts": [
        { "cost": "org.astraea.common.cost.NetworkIngressCost", "weight": 1 },
        { "cost": "org.astraea.common.cost.NetworkEgressCost", "weight": 1 }
      ],
      "moveCosts": [
        "org.astraea.common.cost.ReplicaLeaderCost"
      ],
      "costConfig": {
        "max.migrated.leader.number": 100
      }
}
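
As a quick pre-flight check before handing a file to --optimization.config, something like the sketch below can confirm that the JSON parses and show which balancer and cost functions it names. This is not part of the PR: it is written in Python rather than the project's Java, `summarize` is a hypothetical helper, and the key names are simply the ones that appear in the example above, not a confirmed schema.

```python
import json

# The example config from this PR description, embedded so the sketch is self-contained.
CONFIG_TEXT = """
{
  "timeout": "10s",
  "balancer": "org.astraea.common.balancer.algorithms.GreedyBalancer",
  "balancerConfig": {},
  "clusterCosts": [
    { "cost": "org.astraea.common.cost.NetworkIngressCost", "weight": 1 },
    { "cost": "org.astraea.common.cost.NetworkEgressCost", "weight": 1 }
  ],
  "moveCosts": [
    "org.astraea.common.cost.ReplicaLeaderCost"
  ],
  "costConfig": {
    "max.migrated.leader.number": 100
  }
}
"""


def simple_name(class_name):
    """Return the part after the last dot of a fully qualified class name."""
    return class_name.rsplit(".", 1)[-1]


def summarize(config_text):
    """Hypothetical helper: parse the config and list what it names.

    Keys are read with defaults because this sketch does not know
    which of them the real parser treats as required.
    """
    config = json.loads(config_text)
    return {
        "timeout": config.get("timeout"),
        "balancer": simple_name(config.get("balancer", "")),
        "cluster_costs": {
            simple_name(entry["cost"]): entry["weight"]
            for entry in config.get("clusterCosts", [])
        },
        "move_costs": [simple_name(name) for name in config.get("moveCosts", [])],
        "cost_config": config.get("costConfig", {}),
    }


summary = summarize(CONFIG_TEXT)
for key, value in summary.items():
    print(f"{key}: {value}")
```

A malformed file makes `json.loads` raise `json.JSONDecodeError`, which is the main thing this check is meant to catch early.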

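During execution the results are written to stdout. To make them easy to present, the whole output is formatted as Markdown, so it can be copied and pasted directly into anything that renders Markdown.

Sample stdout output of the Experiment Benchmark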

Balancer Benchmark
===============================

* Version: 0.3.0-SNAPSHOT
* Build Time: 2023-05-13 01:50:03
* Revision: 200897cc78d6788623040ed94dce10c9439e006a
* Author: Zheng-Xian Li

## Balancing Problem

```json
{
      "timeout": "10s",
      "balancer": "org.astraea.common.balancer.algorithms.GreedyBalancer",
      "balancerConfig": {},
      "clusterCosts": [
        { "cost": "org.astraea.common.cost.NetworkIngressCost", "weight": 1 },
        { "cost": "org.astraea.common.cost.NetworkEgressCost", "weight": 1 }
      ],
      "moveCosts": [
        "org.astraea.common.cost.ReplicaLeaderCost"
      ],
      "costConfig": {
        "max.migrated.leader.number": 100
      }
}

```

* Execution: PT10S
* Balancer: org.astraea.common.balancer.algorithms.GreedyBalancer
* Balancer Configuration:
  * no config
* Cluster Cost Function: WeightCompositeClusterCostFunction[{"NetworkEgressCost" weight 1.0}, {"NetworkIngressCost" weight 1.0}]
* Move Cost Function: MoveCosts["ReplicaLeaderCost"]
* Cost Function Configuration:
  * "max.migrated.leader.number": 100

## ClusterInfo Summary

* ClusterId: MRu1V07RS1ut7d9xqXJrIg
* Topics: 1001
* Partition: 10150
* Replicas: 10150
* Broker Count: 6

## ClusterBean Summary

* Total Metrics: 113128
* Avg Metrics Per Broker: 18854.666667
* Broker Count: 6
* Metrics Start From: 2023-05-13T01:50:03.848
* Metrics End at: 2023-05-13T01:50:04.778
* Recorded Duration: PT0.93S

Balancer Experiment Result
===============================

* Attempted Trials: 5
* Solution Found Trials: 5
* No Solution Found Trials: 0

## ClusterCost Detail

* Initial ClusterCost
  > WeightCompositeClusterCost[{"NetworkEgressCost" cost 0.07873122891973776 weight 1.0 description {506.93 MB/SECOND, 439.45 MB/SECOND, 418.90 MB/SECOND, 430.92 MB/SECOND, 454.00 MB/SECOND, 647.00 MB/SECOND} }, {"NetworkIngressCost" cost 0.07420329522755623 weight 1.0 description {335.40 MB/SECOND, 282.84 MB/SECOND, 267.50 MB/SECOND, 273.01 MB/SECOND, 288.47 MB/SECOND, 482.49 MB/SECOND} }] = 0.07646726207364699
* Best ClusterCost
  > WeightCompositeClusterCost[{"NetworkEgressCost" cost 0.069468102436478 weight 1.0 description {519.58 MB/SECOND, 441.33 MB/SECOND, 425.36 MB/SECOND, 430.44 MB/SECOND, 453.87 MB/SECOND, 626.62 MB/SECOND} }, {"NetworkIngressCost" cost 0.06660853076417374 weight 1.0 description {348.07 MB/SECOND, 283.33 MB/SECOND, 272.29 MB/SECOND, 272.70 MB/SECOND, 288.03 MB/SECOND, 465.27 MB/SECOND} }] = 0.06803831660032586

## Statistics

* Initial Cost: 0.076467
* Min Cost: 0.06803831660032586
* Average Cost: 0.07182385955741408
* Max Cost: 0.07291358389975788
* Cost Variance: 3.5931495329665256E-6

## All Cost Values

```
0.06803831660032586
0.07259888692518332
0.07274871566376734
0.07281979469803597
0.07291358389975788
```
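
The Statistics section of the sample above can be reproduced from the five values listed under "All Cost Values". The sketch below (Python, not project code) recomputes them. The reported variance agrees with the population variance (dividing by N rather than N - 1); that is an observation from these numbers, not something confirmed against the implementation.

```python
import math
from statistics import fmean, pvariance

# The five trial results copied from "All Cost Values" in the sample output.
costs = [
    0.06803831660032586,
    0.07259888692518332,
    0.07274871566376734,
    0.07281979469803597,
    0.07291358389975788,
]

stats = {
    "min": min(costs),
    "average": fmean(costs),
    "max": max(costs),
    # Population variance: mean of squared deviations from the average.
    "variance": pvariance(costs),
}

for name, value in stats.items():
    print(f"{name}: {value}")

# Compare against the figures printed in the sample report.
matches_report = (
    math.isclose(stats["average"], 0.07182385955741408, rel_tol=1e-9)
    and math.isclose(stats["variance"], 3.5931495329665256e-6, rel_tol=1e-6)
)
print("matches report:", matches_report)
```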

Sample stdout output of the Cost Profiling Benchmark

Balancer Benchmark
===============================

* Version: 0.3.0-SNAPSHOT
* Build Time: 2023-05-13 01:52:35
* Revision: 200897cc78d6788623040ed94dce10c9439e006a
* Author: Zheng-Xian Li

## Balancing Problem

```json
{
      "timeout": "10s",
      "balancer": "org.astraea.common.balancer.algorithms.GreedyBalancer",
      "balancerConfig": {},
      "clusterCosts": [
        { "cost": "org.astraea.common.cost.NetworkIngressCost", "weight": 1 },
        { "cost": "org.astraea.common.cost.NetworkEgressCost", "weight": 1 }
      ],
      "moveCosts": [
        "org.astraea.common.cost.ReplicaLeaderCost"
      ],
      "costConfig": {
        "max.migrated.leader.number": 100
      }
}

```

* Execution: PT10S
* Balancer: org.astraea.common.balancer.algorithms.GreedyBalancer
* Balancer Configuration:
  * no config
* Cluster Cost Function: WeightCompositeClusterCostFunction[{"NetworkEgressCost" weight 1.0}, {"NetworkIngressCost" weight 1.0}]
* Move Cost Function: MoveCosts["ReplicaLeaderCost"]
* Cost Function Configuration:
  * "max.migrated.leader.number": 100

## ClusterInfo Summary

* ClusterId: MRu1V07RS1ut7d9xqXJrIg
* Topics: 1001
* Partition: 10150
* Replicas: 10150
* Broker Count: 6

## ClusterBean Summary

* Total Metrics: 113128
* Avg Metrics Per Broker: 18854.666667
* Broker Count: 6
* Metrics Start From: 2023-05-13T01:52:36.521
* Metrics End at: 2023-05-13T01:52:37.644
* Recorded Duration: PT1.123S

Balancer Cost Profiling Result
===============================

* Initial Cost Value: 0.076467
  > WeightCompositeClusterCost[{"NetworkIngressCost" cost 0.07420329522755623 weight 1.0 description {335.40 MB/SECOND, 282.84 MB/SECOND, 267.50 MB/SECOND, 273.01 MB/SECOND, 288.47 MB/SECOND, 482.49 MB/SECOND} }, {"NetworkEgressCost" cost 0.07873122891973776 weight 1.0 description {506.93 MB/SECOND, 439.45 MB/SECOND, 418.90 MB/SECOND, 430.92 MB/SECOND, 454.00 MB/SECOND, 647.00 MB/SECOND} }] = 0.07646726207364699

* Best Cost Value: 0.0725145501110046
  > WeightCompositeClusterCost[{"NetworkIngressCost" cost 0.07097332616381534 weight 1.0 description {335.93 MB/SECOND, 284.48 MB/SECOND, 272.08 MB/SECOND, 271.91 MB/SECOND, 287.76 MB/SECOND, 477.54 MB/SECOND} }, {"NetworkEgressCost" cost 0.07405577405819387 weight 1.0 description {507.74 MB/SECOND, 441.89 MB/SECOND, 424.54 MB/SECOND, 430.94 MB/SECOND, 453.01 MB/SECOND, 639.09 MB/SECOND} }] = 0.0725145501110046

## Runtime Statistics

* Execution Time: PT10.017272315S
* Average Iteration Time: 13.741 ms
* Average Balancer Operation Time: 2.270 ms
* Average ClusterCost Processing Time: 13.691 ms
* Average MoveCost Processing Time: 9.950 ms
* Total ClusterCost Evaluation: 81
* Total MoveCost Evaluation: 729

## Detail

* Cost Profiling Result (ClusterCost Only) in CSV: /tmp/cost-profiling-1683914027792-bd54.csv
* Cost Profiling Result (All) in CSV: /tmp/cost-profiling-1683914027792-bd54-verbose.csv
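
For readers checking the composite cost lines, the totals in both samples are consistent with a weight-normalized average of the individual cost values. The sketch below (Python, with a hypothetical `weighted_composite` helper) reproduces the initial and best values of the profiling run. Because both weights are 1.0 here, these numbers cannot tell a weight-normalized average apart from a plain mean, so treat the formula as an assumption rather than the documented behavior of WeightCompositeClusterCost.

```python
import math


def weighted_composite(costs_and_weights):
    """Assumed formula: sum(cost * weight) / sum(weight)."""
    total_weight = sum(weight for _, weight in costs_and_weights)
    return sum(cost * weight for cost, weight in costs_and_weights) / total_weight


# (cost, weight) pairs copied from the Cost Profiling sample output:
# NetworkIngressCost first, NetworkEgressCost second.
initial = weighted_composite([
    (0.07420329522755623, 1.0),
    (0.07873122891973776, 1.0),
])
best = weighted_composite([
    (0.07097332616381534, 1.0),
    (0.07405577405819387, 1.0),
])

print("initial:", initial)
print("best:", best)
print("improvement:", initial - best)
```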

Remaining work

  • ClusterInfo serialization
  • ClusterBean serialization

@garyparrot garyparrot self-assigned this May 13, 2023
@chia7712
Contributor

> --cluster.info /home/garyparrot/cluster-file3.bin
> --cluster.bean /home/garyparrot/bean-file3.bin

Do we have a convenient way to generate these two sources yet?

@garyparrot
Collaborator Author

> Do we have a convenient way to generate these two sources yet?

Not in the project at the moment. I tested with my own serialization.

@chia7712
Contributor

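> Not in the project at the moment. I tested with my own serialization.

I see. Then this also needs an issue to track it. Could you please open one and describe how your own tool is used?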

@garyparrot
Collaborator Author

garyparrot commented May 13, 2023

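> I see. Then this also needs an issue to track it. Could you please open one

As I recall, the ClusterInfo and ClusterBean issues have already been filed (#1710, #1704).

> and describe how your own tool is used?

I wrote my tool as a stopgap at the time, and it only works under very specific conditions, so it probably has little reference value.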

@chia7712
Contributor

> As I recall, the ClusterInfo and ClusterBean issues have already been filed (#1710, #1704).

The content is not quite the same, but no worries, I have opened an issue for it: #1740

@garyparrot garyparrot marked this pull request as ready for review June 2, 2023 09:18
@garyparrot
Collaborator Author

garyparrot commented Jun 2, 2023

./gradlew run --args="balancer_benchmark cost_profiling --optimization.config /tmp/balancer.json --cluster.info /tmp/cluster.bin --cluster.bean /tmp/beans.bin"

@chia7712 The deserialization support has been added. Please take another look, thanks 🙏

bench.zip

@garyparrot garyparrot requested a review from chia7712 June 2, 2023 09:24
Contributor

@chia7712 chia7712 left a comment

LGTM

@garyparrot garyparrot merged commit 61c158c into opensource4you:main Jun 3, 2023
Development

Successfully merging this pull request may close these issues.

[BALANCER] Enable BalancerBenchmark to Astraea app
2 participants