Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HOTFIX: XDPBenchResults: split inheritance tree #385

Merged
merged 2 commits into from
Oct 30, 2024

Conversation

enhaut
Copy link
Member

@enhaut enhaut commented Oct 24, 2024

Description

XDPBenchMeasurements results used to use FlowMeasurementResults as a base class overriding some of its methods because xdp-bench tool doesn't measure CPU usage and so FlowMeasurementResults's CPU metrics were set to None.

Recent changes from [0] now expects FlowMeasurementResults to use both perf and CPU results. Especially
.{start,end}_timestamp properties expect CPU metrics to be set and if not, it crashes since not using CPU metrics is "masked" by setting None instead of regular result containers.

[0] #382

Tests

XDP drop ipv4+ipv6 test in J:10094508

without fix, running any xdp recipe crashed on:

Traceback (most recent call last):
  File "./do-my-test", line 35, in main
    ctl.run(recipe, multimatch=bool(params.get("MULTIMATCH", False)))
  File "/lnst/lib/python3.9/site-packages/lnst/Controller/Controller.py", line 172, in run
    recipe.test()
  File "/lnst/lib/python3.9/site-packages/lnst/Recipes/ENRT/BaseEnrtRecipe.py", line 210, in test
    self.do_tests(recipe_config)
  File "/lnst/lib/python3.9/site-packages/lnst/Recipes/ENRT/BaseEnrtRecipe.py", line 332, in do_tests
    self.do_perf_tests(recipe_config)
  File "/lnst/lib/python3.9/site-packages/lnst/Recipes/ENRT/BaseEnrtRecipe.py", line 358, in do_perf_tests
    self.perf_report_and_evaluate(result)
  File "/lnst/lib/python3.9/site-packages/lnst/RecipeCommon/Perf/Recipe.py", line 207, in perf_report_and_evaluate
    aligned_results = results.time_aligned_results
  File "/lnst/lib/python3.9/site-packages/lnst/RecipeCommon/Perf/Recipe.py", line 127, in time_aligned_results
    real_times = self._get_measurement_timestamps(iteration_results_group)
  File "/lnst/lib/python3.9/site-packages/lnst/RecipeCommon/Perf/Recipe.py", line 112, in _get_measurement_timestamps
    return flows[-1].warmup_end, flows[-1].warmdown_start
  File "/lnst/lib/python3.9/site-packages/lnst/RecipeCommon/Perf/Measurements/Results/FlowMeasurementResults.py", line 87, in warmup_end
    return self.start_timestamp+self.warmup_duration
  File "/lnst/lib/python3.9/site-packages/lnst/RecipeCommon/Perf/Measurements/Results/FlowMeasurementResults.py", line 68, in start_timestamp
    self.generator_cpu_stats.start_timestamp,
AttributeError: 'NoneType' object has no attribute 'start_timestamp'

Now it works as expected.

`XDPBenchMeasurements` results used to use `FlowMeasurementResults`
as a base class overriding some of its methods because xdp-bench
tool doesn't measure CPU usage and so `FlowMeasurementResults`
CPU metrics were set to `None`.

Recent changes from [0] now expects `FlowMeasurementResults` to
use both perf and CPU results. Especially
`.{start,end}_timestamp` properties expect CPU metrics to be set
and if not, it crashes since not using CPU metrics is "masked"
by setting `None` instead of regular result containers.

[0] LNST-project#382
Copy link
Collaborator

@olichtne olichtne left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think changing the base class type makes it so that the measurement results aren't properly exported via the Json run summary formatter and so the type should be added there as well.

The test job shows:

    {
        "result": "PASS",
        "type": "measurement",
        "measurement_type": "cpu",
        "data": {
            "utilization": 658.9386350727488
        }
    },
    {
        "result": "PASS",
        "type": "measurement",
        "measurement_type": "cpu",
        "data": {
            "utilization": 86.08423280166635
        }
    },
    {
        "result": "PASS",
        "type": "unknown",
        "description": "Flow(\n    type=xdp,\n    generator=Host(machine_id=host1),\n    generator_bind=192.168.101.1,\n    generator_nic=Device(machine=host1, id=eth0, name=eno12399, ifindex=4),\n    generator_port=12000,\n    receiver=Host(machine_id=host2),\n    receiver_bind=192.168.101.2,\n    receiver_nic=Device(machine=host2, id=eth0, name=eno12399, ifindex=4),\n    receiver_port=12000,\n    msg_size=60,\n    duration=60,\n    parallel_streams=1,\n    generator_cpupin=[0, 2, 4, 6, 8],\n    receiver_cpupin=[0, 2, 4, 6, 8],\n    aggregated_flow=False\n    warmup_duration=3,\n)\nGenerator generated (generator_results): 37,172,469.218145 packets per second.\nReceiver processed (receiver_results): 37,149,679.712138 packets per second."
    },

where the unknown result is the xdp measurement result

`JsonRunSummaryFormatter` requires measurement results to be
instance of `MeasurementResult` class.
@enhaut
Copy link
Member Author

enhaut commented Oct 30, 2024

I think changing the base class type makes it so that the measurement results aren't properly exported via the Json run summary formatter and so the type should be added there as well.

The test job shows:

    {
        "result": "PASS",
        "type": "measurement",
        "measurement_type": "cpu",
        "data": {
            "utilization": 658.9386350727488
        }
    },
    {
        "result": "PASS",
        "type": "measurement",
        "measurement_type": "cpu",
        "data": {
            "utilization": 86.08423280166635
        }
    },
    {
        "result": "PASS",
        "type": "unknown",
        "description": "Flow(\n    type=xdp,\n    generator=Host(machine_id=host1),\n    generator_bind=192.168.101.1,\n    generator_nic=Device(machine=host1, id=eth0, name=eno12399, ifindex=4),\n    generator_port=12000,\n    receiver=Host(machine_id=host2),\n    receiver_bind=192.168.101.2,\n    receiver_nic=Device(machine=host2, id=eth0, name=eno12399, ifindex=4),\n    receiver_port=12000,\n    msg_size=60,\n    duration=60,\n    parallel_streams=1,\n    generator_cpupin=[0, 2, 4, 6, 8],\n    receiver_cpupin=[0, 2, 4, 6, 8],\n    aggregated_flow=False\n    warmup_duration=3,\n)\nGenerator generated (generator_results): 37,172,469.218145 packets per second.\nReceiver processed (receiver_results): 37,149,679.712138 packets per second."
    },

where the unknown result is the xdp measurement result

Looking at the older results, this apparently didn't work for quite a while :D
Should be fixed, test run in J:10094508

@enhaut enhaut requested a review from olichtne October 30, 2024 07:07
@olichtne olichtne merged commit d6c1046 into LNST-project:master Oct 30, 2024
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants