Multi-threading in RPC Calls

Hi guys -

I’ve said it before - but it’s worth repeating - congrats on creating and maintaining such an easy-to-use framework.

I am trying to understand, and/or get feedback from people with experience, on the following: what could be the problem with processing an incoming RPC request using a multi-threaded approach? In my specific case the situation is even murkier, as I don’t need to retrieve the responses - the results get written to a database.

For illustration purposes, the situation would be something like this:

from nameko.rpc import rpc, RpcProxy


class ServiceY:
    name = "service_y"

    @rpc
    def do_something_mp(self):
        # do something multi-threaded/multi-processed
        # write results to the db
        return True


class ServiceX:
    name = "service_x"

    y = RpcProxy("service_y")

    @rpc
    def do_something(self, value):
        self.y.do_something_mp.call_async()
        return True

Can someone shed some light on what happens to requests that are never “recovered” (just doing call_async without ever calling result())? My understanding is that it shouldn’t be a problem and RabbitMQ will manage.

More generally, I am having a hard time understanding what will happen from a threading perspective. I know that ServiceY will spawn a new worker with its own thread, but it feels like (well, I kind of saw it) there is a risk of a fork bomb…

Can someone share some insights/ideas on how to investigate this situation?

Thanks in advance,

Colin

Hi Colin,

call_async is intended to let your service method carry on with something else while you wait for a result, rather than blocking. It sounds like you don’t really need RPC. If you’ve got no reason to return something to the caller, you could use an event or a simple message to carry your payload.

That said, if you do use call_async and never recover the result, everything will be fine. The calling service will receive the result, but you’re not obliged to consume it. There’s no difference over the wire from a blocking call.

Re: threading. Nameko runs every worker (i.e. service method execution) in its own Eventlet greenthread. You probably don’t want to have that greenthread spawn a whole bunch of Python threads, or even manage something with multiprocessing. It’d probably work, but it’s a bit messy.

I would generally prefer to use Nameko’s concurrency “natively” – that is, spawn a whole bunch more messages or RPC calls, one to handle each concurrent thing. If you need more than one CPU processing concurrently (i.e. the multiprocessing case) then just spin up more service instances.

I find this pattern generally easier to understand and more observable, because every sub-thing is a worker execution in its own right. It should be noted that it’s less performant though, because you’re incurring the overhead of spawning a whole new worker.

Thanks for taking the time to answer, Matt. As usual, very clear and helpful.

you could use an event or a simple message to carry your payload >> Are there any benefits to that approach versus a standard RPC call? I haven’t read in detail how the pub/sub works, but I would imagine that - the same way I would “clog the pipe” if I were to send thousands of async messages with no one to consume/service them - if my remote service is not doing its job properly, no one will acknowledge and respond to these events, right?

Regarding the overall architecture, I have simplified things and now just delegate the processing of the incoming request to a library call that is in charge of writing to the cache or db. That being said, it seems that with a high enough number of concurrent requests, I manage to kill the process in which ServiceY is running…
I’d be grateful if you could confirm the following is true:
a new request is made to the service
→ a new worker is spawned, running in its own/separate thread
→ dependencies get injected into the worker
→ in my case, dependency = an instance of a class in charge of performing computations + writing to the DB, so all of this happens in a thread-safe environment, as the worker is still waiting for this lib call to return
→ piling-up requests will wait in the RPC queue for some worker to die so the service can spawn new instances

I apologize if this is a dumb question, but I want to make sure I am digging in the right places to solve my problem.

Thanks in advance,

Colin