
Use as python library #660

Open
Badg opened this issue Mar 31, 2024 · 3 comments

Comments

Badg commented Mar 31, 2024

Hello!

There was some discussion in this post about potentially shipping python bindings to py-spy to support its use as a library. I'd like to formally request this feature :)

My use case is as follows: I'm starting to design the controlplane for an application I'm working on. The local controlplane service is written in python, is started by systemd, and manages launching, monitoring, etc. for some number of identical application processes. Each of the application processes would be a child process of the controlplane service. Meanwhile, the controlplane service contacts the remote controlplane coordinator, which devs can connect to. I'd like to completely disable remote access to the individual app servers, and only allow remote operations through explicitly programmed RPC calls between the controlplane coordinator and the controlplane service.

"All well and good", you say, "but where does py-spy come into this?" Well, I'd like to expose py-spy as an RPC call. So a dev could, for example, log into the controlplane coordinator (via cloudflare zero-trust), and then get a dump from a particular process on a particular app server. Or start a profiling run. Etc.

"Okay, but you can do that already. Why the library use?" Simple: if I can use py-spy as a python library, then it'll run with the same PID as the controlplane service, which means that the application processes will all be child processes -- which in turn means that I won't need to run the controlplane as a sudoer, which is a huge win from a security standpoint.

ankush commented Apr 12, 2024

By dump, do you mean just the current stacktrace?

If so, you can probably implement it with signals fairly easily. Your master process can send a signal (let's say SIGUSR1) to all child processes, and they can dump their stacktraces to stderr or somewhere else for you to extract. You don't even have to write a ton of code for this: https://docs.python.org/3/library/faulthandler.html#faulthandler.register
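
A minimal sketch of that approach (SIGUSR1 and the stderr destination are just example choices):

```python
# In each child (application) process, at startup:
import faulthandler
import signal
import sys

# On SIGUSR1, dump the traceback of every thread to stderr.
# Pass file=<open file object> instead to redirect the output elsewhere.
faulthandler.register(signal.SIGUSR1, file=sys.stderr, all_threads=True)

# In the master process, whenever you want a dump from a child:
# import os
# os.kill(child_pid, signal.SIGUSR1)  # child_pid is a placeholder
```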

That being said, I still want to see library access. It can unlock a lot of new use cases.


Badg commented Apr 12, 2024

Hooking into signals would certainly be one option for getting a stacktrace of all running threads, and it's actually something I've done before (though it has been a long time, so thanks for the reminder!). I seem to remember it having some resiliency issues when the main thread is deadlocked, though I might be mis-remembering that.

At any rate, as you say, library access would open up a lot of new use cases. And in fact, I'd really like to be able to start and stop remote profiling this way, so I can avoid something like New Relic, which is both super expensive and super heavyweight. Help with troubleshooting is all well and good, but for my architecture it's of limited use (it's trivial for me to just destroy the underlying process/instance and start up a new one, all with zero downtime, because of my load balancing setup). But remote profiling... now that's something that gets super interesting.


kvelicka commented Jul 4, 2024

I'd like to add some motivation for the library use case - we've tried using py-spy on a long-running TCP server application of ours and were quite happy with the results we got (certainly better than our internal tooling) - so I'd like to say "great job" to the creators and maintainers of py-spy!

We did face some issues operating py-spy, however. We'd like to run it 24/7, and currently that doesn't really seem possible, since you either:

  • run py-spy interactively ("top" mode), which lets you inspect the python process live but (AFAIU) doesn't let you write that information out for future reference/investigation, or
  • run py-spy in record mode, which gives us a very helpful flamegraph, but we couldn't find a way to do this on a continuous basis - that is, we can have py-spy record profiling information for a predetermined period of time, but the recorded information doesn't seem to be accessible/readable before the recording period is up, and once it is up py-spy has to be invoked again, usually leaving a gap between recordings too

So, for us, either (or even both) of the following modes of operation would be extremely interesting:

  • running py-spy from within the same python process, e.g. on a greenthread/greenlet - this would be the simplest from a deployment POV and carries a constant overhead (as long as py-spy is running), but we are happy to trade [reasonable] overhead for the ability to monitor/record the activity of our Python processes continuously
  • still running py-spy as some sort of sidecar, but in a continuous(-ish?) recording mode - for example, py-spy could dump profiling data in some form on a periodic (say, per-minute) basis, which we could then summarise/aggregate at a later point in time (see the wrapper sketched below for roughly what we have in mind)
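
To sketch the sidecar option a bit more concretely (a rough illustration only - the PID, output directory, and 60-second window are placeholders): a thin wrapper could invoke `py-spy record` back-to-back in fixed-length chunks, so each chunk becomes readable as soon as its window ends and the gap between chunks shrinks to py-spy's attach time.

```python
import subprocess
import time

def record_in_chunks(pid: int, out_dir: str = "/var/log/pyspy") -> None:
    """Approximate continuous recording by looping fixed-length py-spy runs."""
    while True:
        stamp = time.strftime("%Y%m%dT%H%M%S")
        out_file = f"{out_dir}/profile-{stamp}.svg"
        # Record for 60 seconds, write one flamegraph, then immediately
        # start the next chunk. Summarising/aggregating across chunks
        # would happen in a separate post-processing step.
        subprocess.run(
            ["py-spy", "record", "--pid", str(pid),
             "--duration", "60", "--output", out_file],
            check=False,
        )
```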
