Tracing nameko rpc calls

I’ve got code in which a service (X) rpc calls a service (Y). When client © calls X (which then calls Y), Y takes about 6ms to respond, but X doesn’t return to the client for another almost 150ms when the client uses a loop to call X repeatedly. Strangely, if I put a sleep for 100ms between calls, the calls to X return in about 70ms or less.

So I’m trying to figure out how I can find where the delay is. I suspect vanilla rabbitmq or perhaps persistence to disk is getting in the way. This is on a 2x 10 core system with 128GB of ram, so I don’t think it’s a hardware issue.

I also tried cProfiling the calls, but that doesn’t seem to be granular enough. X doesn’t do anything much with the Y data but clean it up and pass it back. I wonder if the size of the data is a factor (though we’re talking 1-2kb tops).

I also mocked up a similar cascading call that just returns a simple string… returns in about 9ms and returns in about 4ms – the X call includes the Y call. So that seems on-par with what I’d expect.

Any thoughts on how to better trace the calls to figure out where the delay is? I’m hoping there is a debug flag that might help to call out some metrics around nameko’s interactions wtih rabbitmq.



I ended up finding the problem. It was a delay in the database call waiting for the DB to become consistent. The issue was found after using the highly technically advanced method of injecting print statements into the code and running it.
Also on a positive note, I learned more about DependencyProviders and how they can be used in the worker lifecycle. So that’s cool!