linuxperf: Copy debug symbols if possible #329
Conversation
From the test I see that our infrastructure explicitly forbids installation of […]. For now though, I have created a job where I didn't disable the installation of […].
Seems that […].
What I had in mind was to completely get rid of generating the profiled cpus from dev_intr_cpu and perf_tool_cpu, and simply not specify --cpu for perf record. That way we profile the whole system and can fetch the relevant cpus whenever we want. It is ok to keep the override parameter, but otherwise we should have no logic around the profiled cpus.
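For reference, a minimal sketch of that approach with the perf CLI (the exact options LNST passes are not shown in this thread, so treat the flags below as illustrative):

```shell
# Record on all CPUs (no --cpu filter): -a enables system-wide profiling.
perf record -a -g -o perf.data -- sleep 10

# Narrow the view to the relevant CPUs at report time instead,
# e.g. only samples taken on CPUs 2 and 3:
perf report -i perf.data --cpu 2,3
```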
so that means that there is now no way to actually "skip" a host from having perf running... you have two options: […]
not saying this is a problem... just asking if this is intentional?
wasn't really intentional, but I don't know how else a user would say to measure on every CPU. I guess the logical thing would be to pass […]. Alternatively we could rename the param to […].
Testing in internal lab with RHEL for some reason fails on the […].
I have no clue what's going on. Locally it just works and doesn't display these error messages on stderr. I'm testing in a docker fedora36 container running on my fedora39 host. One thing I noticed is that for some reason […].
does running the same command by hand work ok?
is the […]?
so... i'm not sure we have a usecase for "skip a host from perf measurement", @jtluka maybe sees one? IMO the […]. BUT, purely for the "discussion": there are two approaches based on what we want the DEFAULT to be: […]
in all of the possible value cases the algorithm works the same way, which is the benefit here... the "special case" doesn't, that requires an explicit condition branch... which i don't like. IMO both solutions are ok; i lean closer to the filter one, as i believe specifying "nothing to measure everything" is more handy and "specify a little to measure nothing" is a good compromise. The explicit specification of case 2 is easier to understand, but more "wordy" - the "nothing to measure nothing" case is short, but it's a rare case, and "everything to measure everything" is long, and that is a more common case than "nothing to do nothing" imo...
To be honest, I think we're overcomplicating this. If a user needs debugging data, the most straightforward way to get it should be just specifying […].
right.... so do we want to just measure all of the cpus whenever we do […]?
That was the original idea. Jakub decided to keep the linuxperf_cpus parameter; at first I thought it was a good idea, but based on this discussion I'm more inclined to just remove it for the sake of code simplicity.
Force-pushed from a19207d to 1753bdc
I removed the cpu params completely.
I think I tried already, but it showed the same errors. I will verify this.
not sure tbh, perf-report outputs to the current directory and since perf-record is executed on agents through the RPC mechanism, I'm guessing it picked cwd as root. Not sure if that's an issue... (or how I would fix this, maybe […])
yeah... looking at it closer, the […]. The Agent runs as a systemd service, meaning that you're correct - `cwd == '/'`. That seems like it is probably fine since the record file is created ok, just the archive has an issue... So that's probably not too important... I do wonder though, when you're running locally with containers, is the agent running as a service as well? is the path potentially different?
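Since `perf archive` writes its tarball into the current working directory, one way to sidestep the `cwd == '/'` issue would be to change into the directory holding the record file before archiving (a sketch; the actual path the agent uses is not shown in this thread):

```shell
# hypothetical location of the record file on the agent
cd /tmp/lnst-perf
perf archive perf.data   # produces perf.data.tar.bz2 in this directory
```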
After removal of the cpu parameter, codewise it looks ok. Once we resolve the beaker run issue, I approve.
Force-pushed from 5151598 to e162705
and each perf iteration, the job you linked as […]. So to use this for a normal test run where […], the question is... are we ok with this? we're only using the perf command rarely for debugging purposes and it already affects performance anyway, so this may be ok. BUT, if we see this as too much... is it possible to run this just once at the end? Not related to the LNST implementation... just from the point of view of the […]
I took a look at the generated archive file. And I'm confused. Unpacked, the file would take ~1GB, which is basically the same as if I installed it from rpm. In the output below, the biggest file is vmlinux/elf with 936 MB. I'm not quite sure why the elf file is included.
@Kuba314 could you please try the following: […]
I'm also wondering if this can be achieved with just saving the kallsyms file and then running […]
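As a sketch of that kallsyms idea: snapshot the kernel symbol table on the profiled host and point `perf report` at it explicitly on the analysis machine (paths are illustrative):

```shell
# On the profiled host, save the kernel symbol table:
cp /proc/kallsyms kallsyms

# On the analysis machine, resolve kernel symbols from that snapshot:
perf report -i perf.data --kallsyms=./kallsyms
```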
@jtluka J:8346661. Downloaded the tar file onto my local machine, ran […]
Another option would be to run perf each time, but somehow combine all the debug symbols after the recipe ends into a single tar file. I have looked into merging tar files a bit and found only concatenation, which doesn't solve our problem. |
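For the record, member-level merging (as opposed to plain concatenation) is possible with Python's tarfile module. A sketch with a hypothetical `merge_tars` helper that skips duplicate member names, so identical debug-symbol files stored in several per-iteration archives are kept only once:

```python
import tarfile

def merge_tars(inputs, output):
    """Merge members of several tar archives into one, keeping only the
    first occurrence of each member name (e.g. identical debug files)."""
    seen = set()
    with tarfile.open(output, "w") as out:
        for path in inputs:
            with tarfile.open(path) as src:
                for member in src.getmembers():
                    if member.name in seen:
                        continue
                    seen.add(member.name)
                    # extractfile() returns None for non-regular members
                    # (directories, symlinks), which addfile() accepts.
                    fobj = src.extractfile(member) if member.isfile() else None
                    out.addfile(member, fobj)
```

Whether this is worth the complexity over a single archive run at the end is a separate question, but it shows duplicate symbols would not have to be stored twice.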
hmm, I tried it on my system and it does not work, how do you extract the files? I was able to resolve the symbols only by specifying the --kallsyms parameter, but it does not work for me automatically.
@jtluka The debug files have to be stored under […].
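This matches perf's documented workflow: the tarball produced by `perf archive` is meant to be unpacked into `~/.debug` on the analysis machine, which is where `perf report` looks for the archived symbols:

```shell
mkdir -p ~/.debug
tar xf perf.data.tar.bz2 -C ~/.debug
perf report -i perf.data   # symbols now resolve from ~/.debug
```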
cool, works now. I think we should be all set now. thanks for trying this out.
It's no longer needed to distinguish different perf.data files from each other.
Force-pushed from e162705 to d519a76
This is now ready for a final review @jtluka (test at […])
Bad news: […]
I forgot to not install […].
from what i can see here, the file is still 1.7GB. interestingly, the perf command on […]. now... i'm not sure we care though... i imagine that the main usecase for this is for debugging purposes when someone wants to run just one test and gather as much data to analyze as possible... so if @jtluka is ok with this file size and duration i'm for merging this now... or we can spend a bit more time trying to understand why this happens...
the second host installed the debuginfo; it was removed only on the first host. so, it is a setup issue. nevertheless, imo this is ok to merge in.
Description
`perf archive` archives symbols referenced from `perf.data` files so that they can be closely examined on a different machine. Also make linuxperf accept only a single list of cpus. The tester should filter a subset of those cpus manually later during `perf report`.
Tests
J:8306972
Reviews
@jtluka
Closes: #326