linuxperf: Copy debug symbols if possible #329

Merged
merged 3 commits into from
Sep 26, 2023

Conversation

Kuba314
Contributor

@Kuba314 Kuba314 commented Sep 13, 2023

Description

perf archive archives the symbols referenced from a perf.data file so that the file can be closely examined on a different machine.

Also make linuxperf accept only a single list of cpus. Tester should filter a subset of those cpus manually later during perf report.

Tests

J:8306972

Reviews

@jtluka

Closes: #326

@Kuba314
Contributor Author

Kuba314 commented Sep 13, 2023

From the test I see that our infrastructure explicitly forbids installation of kernel-debuginfo to save space. Maybe we should consider looking into that.

For now though, I have created a job where I didn't disable the installation of kernel-debuginfo. The job ID is J:8307111.

@Kuba314
Contributor Author

Kuba314 commented Sep 13, 2023

Seems that kernel-debuginfo is downloaded, but not installed... Not really sure how to fix this.

@jtluka
Collaborator

jtluka commented Sep 14, 2023

What I had in mind was to completely get rid of generating the profiled cpus from dev_intr_cpu and perf_tool_cpu. Simply do not specify --cpu for perf record. That way we profile the whole system and can fetch the relevant cpus whenever we want. It is ok to keep the override parameter, but otherwise we should have no logic around the profiled cpus.

lnst/Tests/LinuxPerf.py (outdated)
@olichtne
Collaborator

Leaving out a host is the same as specifying it with an empty list of cpus, which is the same as measuring every CPU.

so that means that there is now no way to actually "skip" a host from having perf running... you have two options:

  1. don't perf measure anywhere
  2. measure perf everywhere, with the choice of 1-N cpus...

not saying this is a problem... just asking if this is intentional?

@Kuba314
Contributor Author

Kuba314 commented Sep 19, 2023

not saying this is a problem... just asking if this is intentional?

wasn't really intentional, but I don't know how else would a user want to say to measure on every CPU. I guess the logical thing would be to pass "host1": None, or maybe have a magic constant - "host1": "all"? Passing an empty list would then be ignored, because it'd mean don't measure any CPU.

Alternatively we could rename the param to linuxperf_cpu_filter to make "host1": None mean no filter -> match all CPUs. Not sure which is best honestly. @olichtne

@Kuba314
Contributor Author

Kuba314 commented Sep 19, 2023

Testing in internal lab with RHEL for some reason fails on the perf archive stage:

2023-09-19 18:50:36           (host1)        -   DEBUG: Executing: "perf archive /perf.data"
2023-09-19 18:50:37           (host1)        -   DEBUG: 
    Stdout:
    ----------------------------
    Now please run:
    
    $ tar xvf /perf.data.tar.bz2 -C ~/.debug
    
    wherever you need to run 'perf report' on.
    ----------------------------
2023-09-19 18:50:37           (host1)        -   DEBUG: 
    Stderr:
    ----------------------------
    tar: .build-id/a5/c4c91bbf5dd49e30781c0d4b9280123a8d2baf: Cannot stat: No such file or directory
    tar: .build-id/31/f2a86084da882dfe4ecc1fe2a9eca8ce9416fd: Cannot stat: No such file or directory
    tar: .build-id/8f/f3b7d14abcc882a7201334e28be9bc785cd741: Cannot stat: No such file or directory
    tar: .build-id/c4/1114caace3681cd1b867b88b5cdf017b49c9a6: Cannot stat: No such file or directory
    tar: .build-id/5f/ee22d848d771c93de9edf3896cb21f3dcdacd4: Cannot stat: No such file or directory
    tar: .build-id/c6/66f5af3805323607ffc9753c13d0b95a9e3529: Cannot stat: No such file or directory
    tar: .build-id/f0/2a71bc066726654c9a58bed8e9f0ce12868332: Cannot stat: No such file or directory
    tar: .build-id/55/1b4ac05e0b7c321c4760eb5cc45ebaef407235: Cannot stat: No such file or directory
    tar: .build-id/32/2061e377107f54453d1cd9be751da5cd1e2318: Cannot stat: No such file or directory
    tar: .build-id/6e/ee4b0aa096db0754a0fa67d7c7451dfdda007b: Cannot stat: No such file or directory
    tar: .build-id/b8/2fc86b738193e46d0132830dfb0e853d6c8ad5: Cannot stat: No such file or directory
    tar: .build-id/8c/b75b8849063f4e659c76d6c52cda54b5060b7d: Cannot stat: No such file or directory
    tar: .build-id/6f/aac91a1cfe1bdfe2fc35921e7bbf1138f0c209: Cannot stat: No such file or directory
    tar: Exiting with failure status due to previous errors
    ----------------------------

I have no clue what's going on. Locally it just works and doesn't display these error messages on stderr. I'm testing in docker fedora36 container, running on my fedora39 host. One thing I noticed is that for some reason perf in my container doesn't stop after it receives an interrupt signal, it just times out and gets killed after 2 minutes by lnst. Not sure if that's related at all. I have no clue what to do now.

@olichtne
Collaborator

I have no clue what to do now.

does running the same command by hand work ok?

@olichtne
Collaborator

perf archive /perf.data

is the /perf.data here intentional? that seems like it's generating the file under the root of file dir hierarchy - /... could there be an issue with that as well?

@olichtne
Collaborator

wasn't really intentional, but I don't know how else would a user want to say to measure on every CPU.

so... i'm not sure we have a usecase for "skip a host from perf measurement" @jtluka maybe sees one? IMO the do_linuxperf_measurement = True sort of already implies we're debugging and we don't mind extra data...

BUT, purely for the "discussion":

there's two approaches based on what we want the DEFAULT to be:

  1. if we want the default to measure everything... we want to make it easiest to specify "everything" and that would mean that we simply pass do_linux_perf=True and we specify nothing for linuxperf_cpus... in other words the "default" specification can look like linuxperf_cpus = {host1: None, host2: [], ...} or it could even be linuxperf_cpus = {} and this is basically your "filter" proposal - None means, no filter, so measure everything...
  2. we want the default to measure NOTHING... so that means that specifying "measure everything" is going to have to be explicit and "long" - linuxperf_cpus = {host1: [1, 2, 3, ...], host2: [...] } and so that makes the default value be just linuxperf_cpus = {host1: [], ...} as an empty list in this case means that we're measuring nothing.

in all of the possible value cases the algorithm works the same way which is the benefit here... the "special case" doesn't, that requires an explicit condition branch... which i don't like.

IMO.. both solutions are ok, i lean closer to the filter one as i believe specifying "nothing to measure everything" is more handy and the "specify a little to measure nothing" is a good compromise.

The explicit specification of case 2 is easier to understand, but more "wordy" - the case "nothing to measure nothing" is short, but it's a rare case, and the "everything to measure everything" is long, and that is a more common case than "nothing to do nothing" imo...
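The "filter" reading discussed above can be sketched roughly as follows. This is an illustration only, not LNST code: `resolve_cpus` and its arguments are hypothetical names, and it follows the variant where a missing host or `None` means "no filter, measure every CPU" while an explicit empty list means "measure nothing".

```python
from typing import Dict, List, Optional


def resolve_cpus(
    cpu_filter: Dict[str, Optional[List[int]]],
    host: str,
    available: List[int],
) -> List[int]:
    """Return the CPUs to profile on `host` under the "filter" semantics.

    Hypothetical helper: a missing host or None means "no filter", so every
    available CPU is measured. An explicit list (even an empty one) is used
    as given, restricted to CPUs that actually exist on the host.
    """
    wanted = cpu_filter.get(host)
    if wanted is None:
        return list(available)
    return [cpu for cpu in available if cpu in wanted]
```

With this shape, `linuxperf_cpus = {}` measures everything, and the only branch is the `None` check; the rest of the logic is the same for every value.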

@jtluka
Collaborator

jtluka commented Sep 20, 2023

To be honest, I think we're overcomplicating this. If a user needs debugging data, the most straightforward way to do that should be just specifying do_linuxperf_measurement=True. I don't think a user would bother with specifying individual cpus, simply because you can filter them out later using perf report --cpu <list> -i perf.data. The same goes for specifying/bypassing hosts. I don't think a user would bother with that, either.

@olichtne
Collaborator

right.... so do we want to just measure all of the cpus whenever we do do_linuxperf_measurement=True? that would simplify some code...

@jtluka
Collaborator

jtluka commented Sep 20, 2023

right.... so do we want to just measure all of the cpus whenever we do do_linuxperf_measurement=True? that would simplify some code...

That was the original idea. Jakub decided to keep the linuxperf_cpus parameter, first I thought it was a good idea, but based on this discussion I'm more inclined to just remove it for the sake of code simplicity.

@Kuba314
Contributor Author

Kuba314 commented Sep 20, 2023

I removed the cpu params completely.

does running the same command by hand work ok?

I think I tried already, but it showed the same errors. I will verify this.

that seems like it's generating the file under the root of file dir hierarchy - /... could there be an issue with that as well?

not sure tbh, perf-report outputs to the current directory and since perf-record is executed on agents through the RPC mechanism, I'm guessing it picked cwd as root. Not sure if that's an issue... (or how I would fix this, maybe cding before running the command?)

@olichtne
Collaborator

not sure tbh, perf-report outputs to the current directory and since perf-record is executed on agents through the RPC mechanism, I'm guessing it picked cwd as root. Not sure if that's an issue... (or how I would fix this, maybe cding before running the command?)

yeah... looking at it closer, the filename is defined explicitly here: https://github.com/LNST-project/lnst/pull/329/files#diff-2a2c8f8f5fbe4a6f37e6cd91f5fd4e17686793c5b05a42b7f3e418b0f2000d46R41 and that comes from the value generated here: https://github.com/LNST-project/lnst/pull/329/files#diff-3fea1a13becb675a3e6af752edaa772181df47ae5a0c810c6571db6db0eb98a8R83 and that defines just the file without the path...

The Agent runs as a systemd service meaning that you're correct - `cwd == '/'`.

That seems like it is probably fine since the record file is created ok, just the archive has an issue... So that's probably not too important...

I do wonder though, when you're running locally with containers, is the agent running as a service as well? is the path potentially different?

Collaborator

@jtluka jtluka left a comment

After removal of cpu parameter, codewise looks ok. Once we resolve the beaker run issue I approve.

@Kuba314 Kuba314 force-pushed the linuxperf-archive branch 2 times, most recently from 5151598 to e162705 Compare September 21, 2023 23:09
@Kuba314
Contributor Author

Kuba314 commented Sep 22, 2023

perf record needed HOME env variable set to correctly store debug info to ~/.debug (later read by perf archive). The variable was not set. Got it working at J:8341378. The perf-archive command took around 70 seconds and generated a 350MB file for each host and iteration.
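A minimal sketch of the fix described above, assuming the agent launches perf through a subprocess-style call that accepts an environment mapping. `perf_env` and the `/root` fallback are hypothetical; the only fact taken from this thread is that perf record needs HOME set so it can store debug info under ~/.debug.

```python
import os
from typing import Dict, Mapping, Optional


def perf_env(
    base: Optional[Mapping[str, str]] = None,
    fallback_home: str = "/root",  # assumption: the agent service runs as root
) -> Dict[str, str]:
    """Return a copy of the environment that is guaranteed to define HOME.

    A process started by systemd may have no HOME at all, in which case
    perf record cannot populate ~/.debug for perf archive to read later.
    """
    env = dict(os.environ if base is None else base)
    if not env.get("HOME"):
        env["HOME"] = fallback_home
    return env
```

The returned mapping would then be passed as `env=` to whatever call starts perf record and perf archive, so that both see the same ~/.debug directory.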

@olichtne
Collaborator

perf-archive command took around 70 seconds and generated a 350MB file

and each perf iteration, the job you linked has perf_duration=3;perf_iterations=2 and the 70s perf archive happens twice...

So to use this for a normal test run where perf_duration=60;perf_iterations=5 that would mean that we effectively double the duration of each perf iteration and so instead of a 5 minute run we're doing a roughly 10 minute run.

the question is... are we ok with this? we're only using the perf command rarely for debugging purposes and it already affects performance anyway so this may be ok.

BUT, if we see this as too much... is it possible to run this just once at the end? Not related to LNST implementation... just from the point of view of the perf command... we're adding this to have the debug info generated... that should be the same for all iterations so maybe that's ok to have just once? Or is it that we need it for every iteration?
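To put rough numbers on the two options, here is a back-of-the-envelope sketch. It assumes that perf archive keeps taking about 70 seconds regardless of how long the recording ran, which this thread has only measured for a 3 second iteration; `total_seconds` is a hypothetical helper.

```python
def total_seconds(
    perf_duration: int,
    perf_iterations: int,
    archive_seconds: int,
    archive_every_iteration: bool = True,
) -> int:
    """Estimate wall-clock time spent measuring plus archiving."""
    archives = perf_iterations if archive_every_iteration else 1
    return perf_duration * perf_iterations + archive_seconds * archives


print(total_seconds(60, 5, 70))         # 650 s, about 10.8 minutes
print(total_seconds(60, 5, 70, False))  # 370 s, about 6.2 minutes
```

Under that assumption, archiving once at the end would add a bit over a minute to the 300 second baseline, instead of roughly doubling it.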

@jtluka?

@jtluka
Collaborator

jtluka commented Sep 22, 2023

I took a look at generated archive file. And I'm confused. Unpacked file would take ~1GB that is basically the same as if I installed it from rpm.

In the output below, the biggest file is vmlinux/elf with 936 MB. I'm not quite sure why the elf file is included.

[igyn@localhost tmp]$ ls -lh linuxperf-data/host1/perf.data.1.tar.bz2
-rw-r--r--. 1 igyn igyn 329M Sep 22 01:39 linuxperf-data/host1/perf.data.1.tar.bz2
[igyn@localhost tmp]$ tar -tvjf linuxperf-data/host1/perf.data.1.tar.bz2
lrwxrwxrwx root/root         0 2023-09-22 01:38 .build-id/a5/c4c91bbf5dd49e30781c0d4b9280123a8d2baf -> ../../usr/lib/debug/usr/lib/modules/4.18.0-425.3.1.el8.x86_64/vmlinux/a5c4c91bbf5dd49e30781c0d4b9280123a8d2baf
drwxr-xr-x root/root         0 2023-09-22 01:38 usr/lib/debug/usr/lib/modules/4.18.0-425.3.1.el8.x86_64/vmlinux/a5c4c91bbf5dd49e30781c0d4b9280123a8d2baf/
-rwxr-xr-x root/root 937725200 2022-09-30 18:25 usr/lib/debug/usr/lib/modules/4.18.0-425.3.1.el8.x86_64/vmlinux/a5c4c91bbf5dd49e30781c0d4b9280123a8d2baf/elf
hrwxr-xr-x root/root         0 2022-09-30 18:25 usr/lib/debug/usr/lib/modules/4.18.0-425.3.1.el8.x86_64/vmlinux/a5c4c91bbf5dd49e30781c0d4b9280123a8d2baf/debug link to usr/lib/debug/usr/lib/modules/4.18.0-425.3.1.el8.x86_64/vmlinux/a5c4c91bbf5dd49e30781c0d4b9280123a8d2baf/elf
-rw-r--r-- root/root         0 2023-09-22 01:38 usr/lib/debug/usr/lib/modules/4.18.0-425.3.1.el8.x86_64/vmlinux/a5c4c91bbf5dd49e30781c0d4b9280123a8d2baf/probes
lrwxrwxrwx root/root         0 2023-09-22 01:38 .build-id/57/61e5d3532a4e7010008b7f22ac5de337f56b33 -> ../../usr/lib/modules/4.18.0-425.3.1.el8.x86_64/kernel/arch/x86/kvm/kvm.ko.xz/5761e5d3532a4e7010008b7f22ac5de337f56b33
drwxr-xr-x root/root         0 2023-09-22 01:38 usr/lib/modules/4.18.0-425.3.1.el8.x86_64/kernel/arch/x86/kvm/kvm.ko.xz/5761e5d3532a4e7010008b7f22ac5de337f56b33/
-rw-r--r-- root/root    294892 2022-09-30 18:28 usr/lib/modules/4.18.0-425.3.1.el8.x86_64/kernel/arch/x86/kvm/kvm.ko.xz/5761e5d3532a4e7010008b7f22ac5de337f56b33/elf
-r--r--r-- root/root  15865200 2022-09-30 18:20 usr/lib/modules/4.18.0-425.3.1.el8.x86_64/kernel/arch/x86/kvm/kvm.ko.xz/5761e5d3532a4e7010008b7f22ac5de337f56b33/debug
-rw-r--r-- root/root         0 2023-09-22 01:38 usr/lib/modules/4.18.0-425.3.1.el8.x86_64/kernel/arch/x86/kvm/kvm.ko.xz/5761e5d3532a4e7010008b7f22ac5de337f56b33/probes
lrwxrwxrwx root/root         0 2023-09-22 01:38 .build-id/c8/00307b5e3baf8ac71d3fb936c236ef0827bf7c -> ../../usr/lib/modules/4.18.0-425.3.1.el8.x86_64/kernel/fs/xfs/xfs.ko.xz/c800307b5e3baf8ac71d3fb936c236ef0827bf7c
drwxr-xr-x root/root         0 2023-09-22 01:38 usr/lib/modules/4.18.0-425.3.1.el8.x86_64/kernel/fs/xfs/xfs.ko.xz/c800307b5e3baf8ac71d3fb936c236ef0827bf7c/
-rw-r--r-- root/root    466576 2022-09-30 18:28 usr/lib/modules/4.18.0-425.3.1.el8.x86_64/kernel/fs/xfs/xfs.ko.xz/c800307b5e3baf8ac71d3fb936c236ef0827bf7c/elf
-r--r--r-- root/root  25333856 2022-09-30 18:23 usr/lib/modules/4.18.0-425.3.1.el8.x86_64/kernel/fs/xfs/xfs.ko.xz/c800307b5e3baf8ac71d3fb936c236ef0827bf7c/debug
-rw-r--r-- root/root         0 2023-09-22 01:38 usr/lib/modules/4.18.0-425.3.1.el8.x86_64/kernel/fs/xfs/xfs.ko.xz/c800307b5e3baf8ac71d3fb936c236ef0827bf7c/probes
lrwxrwxrwx root/root         0 2023-09-22 01:38 .build-id/ed/a8e5128e04d44b5de0e4fe695732dad26de1e5 -> ../../usr/lib/modules/4.18.0-425.3.1.el8.x86_64/kernel/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.ko.xz/eda8e5128e04d44b5de0e4fe695732dad26de1e5
drwxr-xr-x root/root         0 2023-09-22 01:38 usr/lib/modules/4.18.0-425.3.1.el8.x86_64/kernel/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.ko.xz/eda8e5128e04d44b5de0e4fe695732dad26de1e5/
-rw-r--r-- root/root    583344 2022-09-30 18:28 usr/lib/modules/4.18.0-425.3.1.el8.x86_64/kernel/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.ko.xz/eda8e5128e04d44b5de0e4fe695732dad26de1e5/elf
-r--r--r-- root/root  66041240 2022-09-30 18:22 usr/lib/modules/4.18.0-425.3.1.el8.x86_64/kernel/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.ko.xz/eda8e5128e04d44b5de0e4fe695732dad26de1e5/debug
-rw-r--r-- root/root         0 2023-09-22 01:38 usr/lib/modules/4.18.0-425.3.1.el8.x86_64/kernel/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.ko.xz/eda8e5128e04d44b5de0e4fe695732dad26de1e5/probes
lrwxrwxrwx root/root         0 2023-09-22 01:38 .build-id/37/86f4a04961613b56232a6c7c93cd38bde936e5 -> ../../usr/lib/modules/4.18.0-425.3.1.el8.x86_64/kernel/drivers/net/ethernet/broadcom/tg3.ko.xz/3786f4a04961613b56232a6c7c93cd38bde936e5
drwxr-xr-x root/root         0 2023-09-22 01:38 usr/lib/modules/4.18.0-425.3.1.el8.x86_64/kernel/drivers/net/ethernet/broadcom/tg3.ko.xz/3786f4a04961613b56232a6c7c93cd38bde936e5/
-rw-r--r-- root/root     81224 2022-09-30 18:28 usr/lib/modules/4.18.0-425.3.1.el8.x86_64/kernel/drivers/net/ethernet/broadcom/tg3.ko.xz/3786f4a04961613b56232a6c7c93cd38bde936e5/elf
-r--r--r-- root/root   3435864 2022-09-30 18:22 usr/lib/modules/4.18.0-425.3.1.el8.x86_64/kernel/drivers/net/ethernet/broadcom/tg3.ko.xz/3786f4a04961613b56232a6c7c93cd38bde936e5/debug
-rw-r--r-- root/root         0 2023-09-22 01:38 usr/lib/modules/4.18.0-425.3.1.el8.x86_64/kernel/drivers/net/ethernet/broadcom/tg3.ko.xz/3786f4a04961613b56232a6c7c93cd38bde936e5/probes
lrwxrwxrwx root/root         0 2023-09-22 01:38 .build-id/4e/8f5fea4e3ed7bc4ba437db3149667b38d7bb14 -> ../../usr/lib/modules/4.18.0-425.3.1.el8.x86_64/kernel/drivers/net/ethernet/intel/i40e/i40e.ko.xz/4e8f5fea4e3ed7bc4ba437db3149667b38d7bb14
drwxr-xr-x root/root         0 2023-09-22 01:38 usr/lib/modules/4.18.0-425.3.1.el8.x86_64/kernel/drivers/net/ethernet/intel/i40e/i40e.ko.xz/4e8f5fea4e3ed7bc4ba437db3149667b38d7bb14/
-rw-r--r-- root/root    194680 2022-09-30 18:28 usr/lib/modules/4.18.0-425.3.1.el8.x86_64/kernel/drivers/net/ethernet/intel/i40e/i40e.ko.xz/4e8f5fea4e3ed7bc4ba437db3149667b38d7bb14/elf
-r--r--r-- root/root   9071504 2022-09-30 18:22 usr/lib/modules/4.18.0-425.3.1.el8.x86_64/kernel/drivers/net/ethernet/intel/i40e/i40e.ko.xz/4e8f5fea4e3ed7bc4ba437db3149667b38d7bb14/debug
-rw-r--r-- root/root         0 2023-09-22 01:38 usr/lib/modules/4.18.0-425.3.1.el8.x86_64/kernel/drivers/net/ethernet/intel/i40e/i40e.ko.xz/4e8f5fea4e3ed7bc4ba437db3149667b38d7bb14/probes
lrwxrwxrwx root/root         0 2023-09-22 01:38 .build-id/2d/b55ed37defbcce6d16f0457cfa4dc162ec7c8e -> ../../usr/lib/modules/4.18.0-425.3.1.el8.x86_64/kernel/drivers/scsi/megaraid/megaraid_sas.ko.xz/2db55ed37defbcce6d16f0457cfa4dc162ec7c8e
drwxr-xr-x root/root         0 2023-09-22 01:38 usr/lib/modules/4.18.0-425.3.1.el8.x86_64/kernel/drivers/scsi/megaraid/megaraid_sas.ko.xz/2db55ed37defbcce6d16f0457cfa4dc162ec7c8e/
-rw-r--r-- root/root     71988 2022-09-30 18:28 usr/lib/modules/4.18.0-425.3.1.el8.x86_64/kernel/drivers/scsi/megaraid/megaraid_sas.ko.xz/2db55ed37defbcce6d16f0457cfa4dc162ec7c8e/elf
-r--r--r-- root/root   2228512 2022-09-30 18:23 usr/lib/modules/4.18.0-425.3.1.el8.x86_64/kernel/drivers/scsi/megaraid/megaraid_sas.ko.xz/2db55ed37defbcce6d16f0457cfa4dc162ec7c8e/debug
-rw-r--r-- root/root         0 2023-09-22 01:38 usr/lib/modules/4.18.0-425.3.1.el8.x86_64/kernel/drivers/scsi/megaraid/megaraid_sas.ko.xz/2db55ed37defbcce6d16f0457cfa4dc162ec7c8e/probes
lrwxrwxrwx root/root         0 2023-09-22 01:38 .build-id/d5/1a972e63f40b8c0a32d2f8a1533563959d179a -> ../../usr/lib64/libcrypto.so.1.1.1k/d51a972e63f40b8c0a32d2f8a1533563959d179a
drwxr-xr-x root/root         0 2023-09-22 01:38 usr/lib64/libcrypto.so.1.1.1k/d51a972e63f40b8c0a32d2f8a1533563959d179a/
-rwxr-xr-x root/root   3083672 2023-02-08 17:15 usr/lib64/libcrypto.so.1.1.1k/d51a972e63f40b8c0a32d2f8a1533563959d179a/elf
-rw-r--r-- root/root         0 2023-09-22 01:38 usr/lib64/libcrypto.so.1.1.1k/d51a972e63f40b8c0a32d2f8a1533563959d179a/probes
lrwxrwxrwx root/root         0 2023-09-22 01:38 .build-id/31/f2a86084da882dfe4ecc1fe2a9eca8ce9416fd -> ../../usr/lib64/libc-2.28.so/31f2a86084da882dfe4ecc1fe2a9eca8ce9416fd
drwxr-xr-x root/root         0 2023-09-22 01:38 usr/lib64/libc-2.28.so/31f2a86084da882dfe4ecc1fe2a9eca8ce9416fd/
-rwxr-xr-x root/root   2089984 2023-01-23 09:32 usr/lib64/libc-2.28.so/31f2a86084da882dfe4ecc1fe2a9eca8ce9416fd/elf
-rw-r--r-- root/root      5648 2023-09-22 01:38 usr/lib64/libc-2.28.so/31f2a86084da882dfe4ecc1fe2a9eca8ce9416fd/probes
lrwxrwxrwx root/root         0 2023-09-22 01:38 .build-id/c4/1114caace3681cd1b867b88b5cdf017b49c9a6 -> ../../[vdso]/c41114caace3681cd1b867b88b5cdf017b49c9a6
drwxr-xr-x root/root         0 2023-09-22 01:38 [vdso]/c41114caace3681cd1b867b88b5cdf017b49c9a6/
-rw------- root/root      8192 2023-09-22 01:38 [vdso]/c41114caace3681cd1b867b88b5cdf017b49c9a6/vdso
-rw-r--r-- root/root         0 2023-09-22 01:38 [vdso]/c41114caace3681cd1b867b88b5cdf017b49c9a6/probes
lrwxrwxrwx root/root         0 2023-09-22 01:38 .build-id/55/1b4ac05e0b7c321c4760eb5cc45ebaef407235 -> ../../usr/lib64/python3.9/lib-dynload/_json.cpython-39-x86_64-linux-gnu.so/551b4ac05e0b7c321c4760eb5cc45ebaef407235
drwxr-xr-x root/root         0 2023-09-22 01:38 usr/lib64/python3.9/lib-dynload/_json.cpython-39-x86_64-linux-gnu.so/551b4ac05e0b7c321c4760eb5cc45ebaef407235/
-rwxr-xr-x root/root     78960 2022-12-21 16:59 usr/lib64/python3.9/lib-dynload/_json.cpython-39-x86_64-linux-gnu.so/551b4ac05e0b7c321c4760eb5cc45ebaef407235/elf
-rw-r--r-- root/root         0 2023-09-22 01:38 usr/lib64/python3.9/lib-dynload/_json.cpython-39-x86_64-linux-gnu.so/551b4ac05e0b7c321c4760eb5cc45ebaef407235/probes
lrwxrwxrwx root/root         0 2023-09-22 01:38 .build-id/32/2061e377107f54453d1cd9be751da5cd1e2318 -> ../../usr/lib64/python3.9/lib-dynload/_pickle.cpython-39-x86_64-linux-gnu.so/322061e377107f54453d1cd9be751da5cd1e2318
drwxr-xr-x root/root         0 2023-09-22 01:38 usr/lib64/python3.9/lib-dynload/_pickle.cpython-39-x86_64-linux-gnu.so/322061e377107f54453d1cd9be751da5cd1e2318/
-rwxr-xr-x root/root    144864 2022-12-21 16:59 usr/lib64/python3.9/lib-dynload/_pickle.cpython-39-x86_64-linux-gnu.so/322061e377107f54453d1cd9be751da5cd1e2318/elf
-rw-r--r-- root/root         0 2023-09-22 01:38 usr/lib64/python3.9/lib-dynload/_pickle.cpython-39-x86_64-linux-gnu.so/322061e377107f54453d1cd9be751da5cd1e2318/probes
lrwxrwxrwx root/root         0 2023-09-22 01:38 .build-id/b8/2fc86b738193e46d0132830dfb0e853d6c8ad5 -> ../../usr/lib64/libpython3.9.so.1.0/b82fc86b738193e46d0132830dfb0e853d6c8ad5
drwxr-xr-x root/root         0 2023-09-22 01:38 usr/lib64/libpython3.9.so.1.0/b82fc86b738193e46d0132830dfb0e853d6c8ad5/
-rwxr-xr-x root/root   3997072 2022-12-21 16:59 usr/lib64/libpython3.9.so.1.0/b82fc86b738193e46d0132830dfb0e853d6c8ad5/elf
-rw-r--r-- root/root      1159 2023-09-22 01:38 usr/lib64/libpython3.9.so.1.0/b82fc86b738193e46d0132830dfb0e853d6c8ad5/probes
lrwxrwxrwx root/root         0 2023-09-22 01:38 .build-id/8c/b75b8849063f4e659c76d6c52cda54b5060b7d -> ../../usr/bin/perf/8cb75b8849063f4e659c76d6c52cda54b5060b7d
drwxr-xr-x root/root         0 2023-09-22 01:38 usr/bin/perf/8cb75b8849063f4e659c76d6c52cda54b5060b7d/
-rwxr-xr-x root/root  10883592 2023-04-05 19:56 usr/bin/perf/8cb75b8849063f4e659c76d6c52cda54b5060b7d/elf
-rw-r--r-- root/root        80 2023-09-22 01:38 usr/bin/perf/8cb75b8849063f4e659c76d6c52cda54b5060b7d/probes
lrwxrwxrwx root/root         0 2023-09-22 01:38 .build-id/c6/66f5af3805323607ffc9753c13d0b95a9e3529 -> ../../usr/lib64/libglib-2.0.so.0.5600.4/c666f5af3805323607ffc9753c13d0b95a9e3529
drwxr-xr-x root/root         0 2023-09-22 01:38 usr/lib64/libglib-2.0.so.0.5600.4/c666f5af3805323607ffc9753c13d0b95a9e3529/
-rwxr-xr-x root/root   1171912 2023-01-03 19:24 usr/lib64/libglib-2.0.so.0.5600.4/c666f5af3805323607ffc9753c13d0b95a9e3529/elf
-rw-r--r-- root/root      8135 2023-09-22 01:38 usr/lib64/libglib-2.0.so.0.5600.4/c666f5af3805323607ffc9753c13d0b95a9e3529/probes
lrwxrwxrwx root/root         0 2023-09-22 01:38 .build-id/f0/2a71bc066726654c9a58bed8e9f0ce12868332 -> ../../usr/lib64/libpython3.6m.so.1.0/f02a71bc066726654c9a58bed8e9f0ce12868332
drwxr-xr-x root/root         0 2023-09-22 01:38 usr/lib64/libpython3.6m.so.1.0/f02a71bc066726654c9a58bed8e9f0ce12868332/
-rwxr-xr-x root/root   3286072 2023-01-24 04:32 usr/lib64/libpython3.6m.so.1.0/f02a71bc066726654c9a58bed8e9f0ce12868332/elf
-rw-r--r-- root/root       697 2023-09-22 01:38 usr/lib64/libpython3.6m.so.1.0/f02a71bc066726654c9a58bed8e9f0ce12868332/probes
lrwxrwxrwx root/root         0 2023-09-22 01:38 .build-id/52/108d6a5a11d7c45c2dd7192b18e57d45877575 -> ../../usr/lib64/python3.6/site-packages/perf.cpython-36m-x86_64-linux-gnu.so/52108d6a5a11d7c45c2dd7192b18e57d45877575
drwxr-xr-x root/root         0 2023-09-22 01:38 usr/lib64/python3.6/site-packages/perf.cpython-36m-x86_64-linux-gnu.so/52108d6a5a11d7c45c2dd7192b18e57d45877575/
-rwxr-xr-x root/root    380560 2023-04-05 19:56 usr/lib64/python3.6/site-packages/perf.cpython-36m-x86_64-linux-gnu.so/52108d6a5a11d7c45c2dd7192b18e57d45877575/elf
-rw-r--r-- root/root         0 2023-09-22 01:38 usr/lib64/python3.6/site-packages/perf.cpython-36m-x86_64-linux-gnu.so/52108d6a5a11d7c45c2dd7192b18e57d45877575/probes
lrwxrwxrwx root/root         0 2023-09-22 01:38 .build-id/e7/e1ca30c84f66da46dba3d94cb1767776bf282c -> ../../usr/bin/restraintd/e7e1ca30c84f66da46dba3d94cb1767776bf282c
drwxr-xr-x root/root         0 2023-09-22 01:38 usr/bin/restraintd/e7e1ca30c84f66da46dba3d94cb1767776bf282c/
-rwxr-xr-x root/root   8650760 2023-05-15 20:34 usr/bin/restraintd/e7e1ca30c84f66da46dba3d94cb1767776bf282c/elf
-rw-r--r-- root/root         0 2023-09-22 01:38 usr/bin/restraintd/e7e1ca30c84f66da46dba3d94cb1767776bf282c/probes
lrwxrwxrwx root/root         0 2023-09-22 01:38 .build-id/6f/aac91a1cfe1bdfe2fc35921e7bbf1138f0c209 -> ../../usr/local/lib/libiperf.so.0.0.0/6faac91a1cfe1bdfe2fc35921e7bbf1138f0c209
drwxr-xr-x root/root         0 2023-09-22 01:38 usr/local/lib/libiperf.so.0.0.0/6faac91a1cfe1bdfe2fc35921e7bbf1138f0c209/
-rwxr-xr-x root/root    631160 2023-09-22 01:37 usr/local/lib/libiperf.so.0.0.0/6faac91a1cfe1bdfe2fc35921e7bbf1138f0c209/elf
-rw-r--r-- root/root         0 2023-09-22 01:38 usr/local/lib/libiperf.so.0.0.0/6faac91a1cfe1bdfe2fc35921e7bbf1138f0c209/probes
[igyn@localhost tmp]$ tar -tvhjf linuxperf-data/host1/perf.data.1.tar.bz2
lrwxrwxrwx root/root         0 2023-09-22 01:38 .build-id/a5/c4c91bbf5dd49e30781c0d4b9280123a8d2baf -> ../../usr/lib/debug/usr/lib/modules/4.18.0-425.3.1.el8.x86_64/vmlinux/a5c4c91bbf5dd49e30781c0d4b9280123a8d2baf
drwxr-xr-x root/root         0 2023-09-22 01:38 usr/lib/debug/usr/lib/modules/4.18.0-425.3.1.el8.x86_64/vmlinux/a5c4c91bbf5dd49e30781c0d4b9280123a8d2baf/
-rwxr-xr-x root/root 937725200 2022-09-30 18:25 usr/lib/debug/usr/lib/modules/4.18.0-425.3.1.el8.x86_64/vmlinux/a5c4c91bbf5dd49e30781c0d4b9280123a8d2baf/elf

@jtluka
Collaborator

jtluka commented Sep 22, 2023

@Kuba314 could you please try following:

  • do not install kernel-debuginfo in the beaker job
  • on a fork of this test branch remove the check for kernel-debuginfo installation
    With the generated files check if the addresses are resolved.

I'm also wondering if this can be achieved also with just saving the kallsyms file and then running perf report --kallsyms kallsyms.

@Kuba314
Contributor Author

Kuba314 commented Sep 22, 2023

@jtluka J:8346661. Downloaded the tar file onto my local machine, ran perf report and saw addresses, extracted the debug-symbols archive, ran perf report again and I did see resolved symbols for everything (except like 5 addresses coming from perf?). Seems installing kernel-debuginfo is not necessary.

I'm not quite sure why the elf file is included.

.build-id/a5/c4c91bbf5dd49e30781c0d4b9280123a8d2baf seems to be a symlink to a directory in which the elf file lies. Keep in mind perf archive does nothing fancy probably. This is the source code I think: https://github.com/torvalds/linux/blob/master/tools/perf/perf-archive.sh.

Or is it that we need it for every iteration?
@olichtne I don't think so - every iteration should be the same. But we need to distinguish between perf_iterations=2 and perf_tests=["tcp_stream", "udp_stream"] and other params causing multiple tests being executed sequentially, because the tester probably intends to get (at least) one perf file for each of those, while they don't expect getting a perf file for each iteration which just repeats the test.

Another option would be to run perf each time, but somehow combine all the debug symbols after the recipe ends into a single tar file. I have looked into merging tar files a bit and found only concatenation, which doesn't solve our problem.
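One way the "combine all the debug symbols into a single tar file" idea could look is sketched below with Python's standard `tarfile` module. This is not something the PR implements; `merge_archives` is a hypothetical helper, and it assumes that two members with the same path in different archives are identical (which should hold for build-id directories), so the first copy wins.

```python
import tarfile
from typing import Iterable, Set


def merge_archives(sources: Iterable[str], destination: str) -> int:
    """Merge tar archives into one, keeping the first copy of each path.

    Returns the number of members written to `destination`.
    """
    seen: Set[str] = set()
    written = 0
    with tarfile.open(destination, "w:bz2") as dst:
        for path in sources:
            with tarfile.open(path, "r:*") as src:
                for member in src:
                    if member.name in seen:
                        continue  # already copied from an earlier archive
                    seen.add(member.name)
                    # Only regular files carry data; links and directories
                    # are copied as header-only entries.
                    data = src.extractfile(member) if member.isreg() else None
                    dst.addfile(member, data)
                    written += 1
    return written
```

Unlike plain concatenation, this avoids storing the same large elf file once per iteration, at the cost of recompressing everything after the recipe ends.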

@jtluka
Collaborator

jtluka commented Sep 25, 2023

@jtluka J:8346661. Downloaded the tar file onto my local machine, ran perf report and saw addresses, extracted the debug-symbols archive, ran perf report again and I did see resolved symbols for everything (except like 5 addresses coming from perf?). Seems installing kernel-debuginfo is not necessary.

hmm, I tried it on my system and it does not work, how do you extract the files?

I was able to resolve the symbols only by specifying --kallsyms parameter but it does not work for me automatically.

@Kuba314
Contributor Author

Kuba314 commented Sep 25, 2023

how do you extract the files?

@jtluka The debug files have to be stored under ~/.debug/ for perf report to find them. This fact is mentioned in the output of perf archive. The command I used is tar xvf perf.data.1.tar.bz2 -C ~/.debug.
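For anyone scripting this step, the tar command above could be mirrored in Python roughly like this. `extract_debug_archive` is a hypothetical helper, not part of LNST; it defaults to ~/.debug because that is where perf report looks, and it should only be used on archives from a trusted source.

```python
import tarfile
from pathlib import Path
from typing import Optional


def extract_debug_archive(archive: str, debug_dir: Optional[str] = None) -> Path:
    """Unpack a perf archive tarball into the directory perf report reads.

    Roughly equivalent to `tar xvf <archive> -C ~/.debug`, except that the
    target directory is created first if it does not exist yet.
    """
    target = Path(debug_dir) if debug_dir is not None else Path.home() / ".debug"
    target.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive, "r:*") as tar:
        tar.extractall(target)
    return target
```

After extraction, running perf report on the matching perf.data file on the same machine should pick up the symbols without extra options.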

@jtluka
Collaborator

jtluka commented Sep 26, 2023

how do you extract the files?

@jtluka The debug files have to be stored under ~/.debug/ for perf report to find them. This fact is mentioned in the output of perf archive. The command I used is tar xvf perf.data.1.tar.bz2 -C ~/.debug.

cool, works now. I think we should be all set now. thanks for trying this out.

It's no longer needed to distinguish different perf.data files from each
other.
@Kuba314
Contributor Author

Kuba314 commented Sep 26, 2023

This is now ready for a final review @jtluka (test at J:8357506)

@Kuba314
Contributor Author

Kuba314 commented Sep 26, 2023

Bad news, vmlinux/...../elf got included again... Now the kernel symbols are fetched from [kernel.vmlinux] instead of [kernel.kallsyms]. Can I somehow force perf record to not reach for kernel.vmlinux and take kernel.kallsyms instead? @jtluka

@Kuba314
Contributor Author

Kuba314 commented Sep 26, 2023

I forgot to not install kernel-debuginfo. Testing at J:8358579.

@olichtne
Collaborator

I forgot to not install kernel-debuginfo. Testing at J:8358579.

from what i can see here, the file is still 1.7GB

interestingly the perf command on host1 takes 5 seconds to finish, but a minute on host2...

now... i'm not sure we care though... i imagine that the main usecase for this is for debugging purposes when someone wants to run just one test and gather as much data to analyze as possible...

so if @jtluka is ok with this file size and duration i'm for merging this now... or we can spend a bit more time trying to understand why this happens...

@jtluka
Collaborator

jtluka commented Sep 26, 2023

I forgot to not install kernel-debuginfo. Testing at J:8358579.

from what i can see here, the file is still 1.7GB

second host installed the debuginfo. it was removed only on the first host. so, it is a setup issue.

nevertheless, imo this is ok to merge in.

@olichtne olichtne merged commit a43b6c9 into LNST-project:master Sep 26, 2023
2 checks passed
@Kuba314 Kuba314 deleted the linuxperf-archive branch September 27, 2023 13:13
Development

Successfully merging this pull request may close these issues.

archive symbols for linuxperf record